Goal: The FPGA-based accelerator design flow has matured to the point where software engineers can benefit from it without in-depth knowledge of hardware details. This series of blog posts explains different methodologies for implementing efficient hardware accelerators for a wide range of compute-intensive tasks using the Xilinx Vitis IDE platform.

FPGA-based accelerators are attracting attention from industry and software developers, both in large cloud-based systems and in small embedded systems at the edge. These systems usually treat the FPGA as a reconfigurable computing core alongside other cores, such as CPUs and GPUs, in a heterogeneous computing architecture. Software engineers, as the primary programmers of these systems, should be able to program, evaluate and debug their applications while leveraging the FPGA's high performance and low power consumption.

[Figure: Embedded System]

Heterogeneous application development is not so difficult when there are no FPGAs in the system, because CPUs and GPUs have their own ISA models (over fixed hardware architectures) along with corresponding compiler toolsets that make them easy to program and debug. Treating the FPGA as a core, however, makes application development a real challenge, because FPGAs traditionally have their own design flow involving computing architecture and microarchitecture design. This design flow requires substantial knowledge of low-level hardware design techniques, which complicates development.

[Figure: Heterogeneity]

Conventionally, hardware designers use a hardware description language (HDL) such as VHDL to create clock-cycle-accurate circuits. Although this approach can produce the most efficient design, writing HDL code is tedious and error-prone, and the resulting designs are difficult to verify and debug. To address these issues, industry and academia have proposed the high-level synthesis (HLS) design flow along with the required toolset. The Xilinx Vivado-HLS toolset is a typical example of this approach. HLS enables designers to convert an algorithm into the corresponding HDL code without worrying about the hardware details. In this approach, C, C++, OpenCL or SystemC is used to describe the different modules in a design. These descriptions are synthesised into HDL code as hardware IP cores. A graphical environment such as Vivado is then used to connect these cores to each other and to the other components in the embedded system. However, communication between the design modules and the rest of the embedded system, such as the main memory and software functions, requires special attention.
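To make the HLS idea more concrete, here is a minimal sketch of what a C++ module description might look like. The kernel name `vadd`, its signature, and the bundle names `gmem0`/`gmem1` are assumptions chosen for illustration and do not come from a specific design. The `#pragma HLS` lines are hints to the HLS tool; an ordinary C++ compiler ignores them (possibly with a warning), which is what lets the same function be checked in software first.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical kernel: element-wise addition of two integer vectors.
// extern "C" keeps the symbol name unmangled, which the Vitis kernel flow
// expects; a standalone HLS project may not need it.
extern "C" void vadd(const int32_t* a, const int32_t* b, int32_t* out, int n) {
// Ask the tool to map the pointer arguments to AXI master ports so the
// kernel can read and write main memory (bundle names are assumptions).
#pragma HLS INTERFACE m_axi port = a bundle = gmem0
#pragma HLS INTERFACE m_axi port = b bundle = gmem1
#pragma HLS INTERFACE m_axi port = out bundle = gmem0
    for (int i = 0; i < n; ++i) {
// Request a pipelined loop that starts a new iteration every clock cycle.
#pragma HLS PIPELINE II = 1
        out[i] = a[i] + b[i];
    }
}

// Software testbench: runs the kernel as plain C++ and compares it with a
// reference computation. Returns the number of mismatching elements.
int vadd_testbench(int n) {
    assert(n >= 0);
    std::vector<int32_t> a(n), b(n), out(n, 0);
    for (int i = 0; i < n; ++i) {
        a[i] = i;
        b[i] = 2 * i;
    }
    vadd(a.data(), b.data(), out.data(), n);

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (out[i] != a[i] + b[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Calling `vadd_testbench(64)` on the host should return 0 if the description is functionally correct. Whether the tool actually reaches the requested initiation interval depends on the memory interface and the target device, so the pragmas are best read as requests rather than guarantees.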


An FPGA-based embedded system commonly consists of processors, FPGAs, other hardware accelerators and the main memory. A set of baremetal APIs or an operating system, such as Linux or FreeRTOS, usually orchestrates all the communication among the computational cores and memories. FPGA cores are not exempt from this scheme, and this is where the platform-based design concept comes into play. These communications require proper protocols for moving data around, together with a corresponding API and runtime system. Designing and developing these APIs can be tedious, especially if the design targets an OS-based environment. To address this problem, researchers have proposed the platform-based design methodology. In this approach, an abstraction layer called a platform encapsulates the hardware details and the target operating system, along with the runtime system for communication among hardware and software modules.

The concept of platform-based design in embedded systems is not new and has been the main focus of many research activities [1][2]. More recently, it has been widely adopted by industry. Xilinx used this concept in the SDSoC and SDAccel design flows and, most recently, in Vitis, the Xilinx unified software platform.

To develop a software application in Vitis that uses the embedded FPGA as an accelerator, we need a platform that encapsulates the underlying FPGA hardware, the OS (or baremetal APIs) and the Xilinx Runtime (XRT). We should also follow a programming model for the accelerator and its communication with software. This programming model is mainly based on the OpenCL framework.
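The host-side pattern that the OpenCL framework encourages can be sketched without any real hardware. The code below is only a software mock: `DeviceBuffer`, `CommandQueue`, `scale_kernel` and `run_on_accelerator` are hypothetical names invented for this illustration, not part of OpenCL or XRT. The comments mention the standard OpenCL calls each step roughly corresponds to. The point is the order of operations: allocate device buffers, enqueue a write, enqueue the kernel, enqueue a read, then wait for the queue to finish.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for device memory (roughly what clCreateBuffer provides).
struct DeviceBuffer {
    std::vector<int32_t> storage;
    explicit DeviceBuffer(std::size_t n) : storage(n, 0) {}
};

// Hypothetical stand-in for an in-order command queue. Commands are only
// recorded when enqueued and are executed later by finish(), mirroring the
// enqueue-then-wait style of an OpenCL host program.
class CommandQueue {
public:
    // Roughly corresponds to clEnqueueWriteBuffer (host to device copy).
    void enqueue_write(DeviceBuffer& dst, const std::vector<int32_t>& src) {
        commands_.push_back([&dst, &src] {
            assert(src.size() == dst.storage.size());
            std::copy(src.begin(), src.end(), dst.storage.begin());
        });
    }
    // Roughly corresponds to clSetKernelArg followed by clEnqueueNDRangeKernel.
    void enqueue_kernel(std::function<void()> kernel) {
        commands_.push_back(std::move(kernel));
    }
    // Roughly corresponds to clEnqueueReadBuffer (device to host copy).
    void enqueue_read(const DeviceBuffer& src, std::vector<int32_t>& dst) {
        commands_.push_back([&src, &dst] { dst = src.storage; });
    }
    // Roughly corresponds to clFinish: run every pending command in order.
    void finish() {
        for (auto& command : commands_) {
            command();
        }
        commands_.clear();
    }

private:
    std::vector<std::function<void()>> commands_;
};

// Hypothetical accelerator kernel, modelled here as a plain C++ function.
void scale_kernel(const int32_t* in, int32_t* out, int n, int32_t factor) {
    for (int i = 0; i < n; ++i) {
        out[i] = in[i] * factor;
    }
}

// Host-side flow: buffers, write, kernel, read, wait.
std::vector<int32_t> run_on_accelerator(const std::vector<int32_t>& input,
                                        int32_t factor) {
    DeviceBuffer in_buf(input.size());
    DeviceBuffer out_buf(input.size());
    std::vector<int32_t> result;

    CommandQueue queue;
    queue.enqueue_write(in_buf, input);
    queue.enqueue_kernel([&] {
        scale_kernel(in_buf.storage.data(), out_buf.storage.data(),
                     static_cast<int>(input.size()), factor);
    });
    queue.enqueue_read(out_buf, result);
    queue.finish();  // nothing has executed before this point
    return result;
}
```

In a real Vitis application, the platform and XRT would sit behind the actual OpenCL calls, and the kernel would be loaded from a compiled FPGA binary rather than called as a C++ function. The mock only shows why the programming model separates host memory from device memory and why the host has to wait on the queue before it uses the results.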

In future posts, I will explain how to create a platform and use it to implement efficient accelerators on the Ultra96v2, a heterogeneous FPGA-based embedded system.

References:
[1] Sangiovanni-Vincentelli, Alberto & Martin, Grant. (2001). Platform-Based Design and Software Design Methodology for Embedded Systems. IEEE Design & Test of Computers, 18, 23-33.
[2] Bhattacharya, Raktim. (2015). Model & Platform Based Design of Embedded Systems. DOI: 10.13140/RG.2.1.2670.9922.