Goal: The FPGA-based accelerator design flow has reached a stage where software engineers can benefit from it without in-depth knowledge of hardware details. This series of blog posts explains different methodologies for implementing efficient hardware accelerators for a wide range of compute-intensive tasks, using the Xilinx Vitis IDE platform.
FPGA-based accelerators are attracting attention from industry and software developers, both in large cloud-based systems and in small embedded systems at the edge. These systems usually treat the FPGA as a reconfigurable computing core alongside other cores, such as CPUs and GPUs, in a heterogeneous computing architecture. Software engineers, as the primary programmers of these systems, should be able to program, evaluate and debug their applications while leveraging the FPGA's high performance and low power consumption.
CPUs and GPUs have their own ISA models (over fixed hardware architectures). These models are supported by corresponding compiler toolsets that make them easy to program and debug. Treating the FPGA as a core, however, makes application development a real challenge, because FPGAs traditionally have their own design flow. This flow requires designing computing architectures and microarchitectures. It therefore demands extensive knowledge of low-level hardware design techniques, which complicates the development flow.
Conventionally, hardware designers use a hardware description language (HDL) such as VHDL to provide a cycle-accurate RTL description of a given task. Although this approach can produce the most efficient design, writing HDL code is tedious and error-prone, and the resulting code is difficult to verify and debug. To address these issues, industry and academia have proposed the high-level synthesis (HLS) design flow along with the required toolset. The Xilinx Vivado-HLS toolset is a typical example of this approach. HLS enables designers to convert an algorithm into the corresponding HDL code without worrying about the hardware details. In this approach, the C, C++, OpenCL, and SystemC languages are used to describe the different modules in a design. These descriptions are synthesised into HDL code as hardware IP cores. A low-level design environment such as Vivado is then used to connect these cores to each other and to the other components in an embedded system. However, communication between the design modules and the rest of the embedded system, such as the main memory and software functions, requires special attention.
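To make the HLS idea concrete, here is a minimal sketch of the kind of C++ description an HLS tool takes as input. The kernel name `vadd`, the vector length `N` and the testbench function are assumptions chosen for illustration; they do not come from any Xilinx example. The `PIPELINE` pragma is a hint to the HLS tool, and an ordinary C++ compiler simply ignores it, so the same source can be checked on a CPU before synthesis.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vector length, chosen only for this illustration.
constexpr std::size_t N = 1024;

// Top-level function that an HLS tool would synthesise into an IP core.
// extern "C" keeps the symbol name unmangled, which HLS flows commonly expect.
extern "C" void vadd(const int* a, const int* b, int* out, int n) {
    for (int i = 0; i < n; ++i) {
// Hint for the HLS tool: try to start one loop iteration per clock cycle.
// A regular C++ compiler ignores this pragma.
#pragma HLS PIPELINE II=1
        out[i] = a[i] + b[i];
    }
}

// Software testbench: runs the kernel on the CPU and compares the
// result with values computed directly.
bool vadd_testbench() {
    std::vector<int> a(N), b(N), out(N, 0);
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < N; ++i) {
        a[i] = static_cast<int>(i);
        b[i] = static_cast<int>(2 * i);
    }
    vadd(a.data(), b.data(), out.data(), static_cast<int>(N));
    for (std::size_t i = 0; i < N; ++i) {
        if (out[i] != static_cast<int>(3 * i)) {
            return false;
        }
    }
    return true;
}
```

Calling `vadd_testbench()` from a `main` function checks the algorithm in software. The point of the sketch is that the description contains no clocks, registers or state machines; those are left to the HLS tool.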
An FPGA-based embedded system commonly consists of processors, FPGAs, other hardware accelerators and the main memory. A set of baremetal APIs or an operating system, such as Linux or FreeRTOS, usually orchestrates all the communication among the computational cores and memories. FPGA cores are not isolated from this scheme, and this is where the platform-based design concept comes into play.
Communication among computing cores, accelerators, memories and other modules in an embedded system requires proper protocols for moving data around. These protocols should be supported by a set of APIs and a runtime system. Designing and developing these APIs can be tedious if the design is developed in an OS-based environment. To address this problem, researchers have proposed the platform-based design methodology. In this approach, an abstraction layer called a platform encapsulates the hardware details and the target operating system, along with the runtime system that handles communication among hardware and software modules.
The concept of platform-based design in embedded systems is not new and has been the main focus of many research activities. More recently, industry has adopted the concept widely. Xilinx used it in the SDSoC and SDAccel design flows and, most recently, in Vitis, the Xilinx unified software platform.
To develop a software application in Vitis that uses the embedded FPGA as an accelerator, we need a platform that encapsulates the underlying FPGA hardware, the OS (or baremetal APIs), and the Xilinx runtime (XRT). We should also follow a programming model for the accelerator and its communication with software. This programming model is mainly based on the OpenCL framework.
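The host-side steps of an OpenCL-style programming model can be sketched in plain C++ as follows. This is not real OpenCL or XRT code: `DeviceBuffer`, `copy_to_device`, `run_kernel`, `copy_to_host` and `offload_vadd` are hypothetical stand-ins that only mimic the order of operations, so the sketch runs on any machine without an FPGA. The comments name the standard OpenCL calls that usually play each role in a real host program.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a buffer in device (FPGA-side) memory.
// In OpenCL such a buffer is usually created with clCreateBuffer.
struct DeviceBuffer {
    std::vector<int> data;
};

// Host-to-device transfer; in OpenCL, typically clEnqueueWriteBuffer.
void copy_to_device(const std::vector<int>& host, DeviceBuffer& dev) {
    dev.data = host;
}

// Device-to-host transfer; in OpenCL, typically clEnqueueReadBuffer.
void copy_to_host(const DeviceBuffer& dev, std::vector<int>& host) {
    host = dev.data;
}

// Stand-in for the accelerated kernel. On a real platform this work runs
// in the FPGA fabric and is launched from the host, for example with
// clSetKernelArg followed by clEnqueueNDRangeKernel.
void run_kernel(const DeviceBuffer& a, const DeviceBuffer& b, DeviceBuffer& out) {
    out.data.resize(a.data.size());
    for (std::size_t i = 0; i < a.data.size(); ++i) {
        out.data[i] = a.data[i] + b.data[i];
    }
}

// The host program: allocate, transfer in, launch, transfer out.
std::vector<int> offload_vadd(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    DeviceBuffer dev_a, dev_b, dev_out;   // 1. allocate device buffers
    copy_to_device(a, dev_a);             // 2. move inputs to the device
    copy_to_device(b, dev_b);
    run_kernel(dev_a, dev_b, dev_out);    // 3. launch the kernel
    std::vector<int> result;
    copy_to_host(dev_out, result);        // 4. read the result back
    return result;
}
```

For example, `offload_vadd({1, 2, 3}, {10, 20, 30})` returns a vector holding 11, 22 and 33. In a real design the platform and runtime provide the data movement and kernel launch that these stand-ins only simulate.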
In future posts, I will explain how to create a platform and use it to implement efficient accelerators on the Ultra96v2, a heterogeneous FPGA-based embedded system.
Vincentelli, Alberto & Martin, Grant. (2001). Platform-Based Design and Software Design Methodology for Embedded Systems. IEEE Design & Test of Computers, 18, 23-33.
Bhattacharya, Raktim. (2015). Model & Platform Based Design of Embedded Systems. doi:10.13140/RG.2.1.2670.9922.