Digital System Design with High-Level Synthesis for FPGA: Combinational Circuits
Udemy Course
Writing a C/C++ program for HLS requires its own coding style and techniques. In this series of blog posts I am going to explain a few techniques for modifying an existing C/C++ program so that it can be synthesised by Vivado-HLS.
In each blog, I will use a simple example to discuss the ideas.
The first example is vector addition. The following simple C code implements this example:
[code language="C"]
#define DATA_LENGTH 1024

void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  for (int i = 0; i < DATA_LENGTH; i++) {
    c[i] = a[i] + b[i];
  }
}
[/code]
To modify this code for HLS, two tasks must be done: transferring the data between the main memory and the FPGA, and performing the addition efficiently.
For the first task, we can use memcpy to copy the inputs into local arrays and the result back out, as follows:
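[code language="C"]
#include <string.h>

#define DATA_LENGTH 1024

void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  /* Local copies of the operands and the result */
  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];

  /* Copy the inputs from main memory into the local arrays */
  memcpy(a_local, (const int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (const int*)b, DATA_LENGTH*sizeof(int));

  ....

  /* Copy the result from the local array back to main memory */
  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
[/code]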
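Each memcpy call above simply moves DATA_LENGTH consecutive words between an argument array and a local array. As a rough illustration of what that copy amounts to, the sketch below writes it as an explicit loop and checks it against memcpy. The names `copy_words` and `copies_match` are hypothetical helpers introduced only for this example; they are not part of the original code or of any Vivado-HLS library.

```c
#include <string.h>

#define DATA_LENGTH 1024

/* Hypothetical helper: copies n ints from src to dst, one element per
   iteration. For non-overlapping buffers this has the same effect as
   memcpy(dst, src, n * sizeof(int)). */
void copy_words(int *dst, const int *src, int n)
{
    for (int i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}

/* Hypothetical check: returns 1 when the loop copy and memcpy
   produce identical buffers for the given input, 0 otherwise. */
int copies_match(const int src[DATA_LENGTH])
{
    int via_loop[DATA_LENGTH];
    int via_memcpy[DATA_LENGTH];

    copy_words(via_loop, src, DATA_LENGTH);
    memcpy(via_memcpy, src, DATA_LENGTH * sizeof(int));

    return memcmp(via_loop, via_memcpy, sizeof(via_loop)) == 0;
}
```

Either form describes the same data movement; this post keeps memcpy because it states the block transfer in a single call.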
For the computation, we can keep the loop and add compiler directives to it to increase its performance.
The PIPELINE directive lets a new loop iteration start before the previous one has finished, which improves the throughput of the computation.
[code language="C"]
#include <string.h>

#define DATA_LENGTH 1024

void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];

  memcpy(a_local, (const int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (const int*)b, DATA_LENGTH*sizeof(int));

  /* The addition works on the local copies, not on the arguments */
  for (int i = 0; i < DATA_LENGTH; i++) {
#pragma HLS PIPELINE
    c_local[i] = a_local[i] + b_local[i];
  }

  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
[/code]
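To get a feel for why pipelining the loop helps, a common first-order model is useful: a loop of N iterations that each take L cycles needs about N*L cycles when run sequentially, and about L + II*(N-1) cycles when pipelined with an initiation interval of II. The sketch below encodes that model. The function names are hypothetical, and the values used in the note after it (an iteration latency of 3 cycles, II = 1) are illustrative assumptions, not figures reported by Vivado-HLS for this design.

```c
#define DATA_LENGTH 1024

/* Sequential loop: each iteration finishes before the next one starts. */
long sequential_cycles(long trip_count, long iter_latency)
{
    return trip_count * iter_latency;
}

/* Pipelined loop: a new iteration starts every `ii` cycles, so after the
   first iteration completes, one more result arrives every `ii` cycles. */
long pipelined_cycles(long trip_count, long iter_latency, long ii)
{
    return iter_latency + ii * (trip_count - 1);
}
```

Under these assumptions, `sequential_cycles(DATA_LENGTH, 3)` gives 3072 cycles and `pipelined_cycles(DATA_LENGTH, 3, 1)` gives 1026 cycles. The actual numbers for a design should be read from the synthesis report.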
Finally, to transfer data between the main memory and the FPGA, proper interfaces should be defined for the top function arguments. The following code shows the synthesisable C code.
[code language="C"]
#include <string.h>

#define DATA_LENGTH 1024

void vector_addition(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
#pragma HLS INTERFACE s_axilite port=return bundle=control_bus
#pragma HLS INTERFACE s_axilite port=a bundle=control_bus
#pragma HLS INTERFACE s_axilite port=b bundle=control_bus
#pragma HLS INTERFACE s_axilite port=c bundle=control_bus

#pragma HLS INTERFACE m_axi depth=16 port=a bundle=bus_port
#pragma HLS INTERFACE m_axi depth=16 port=b bundle=bus_port
#pragma HLS INTERFACE m_axi depth=16 port=c bundle=bus_port

  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];

  memcpy(a_local, (int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (int*)b, DATA_LENGTH*sizeof(int));

  for (int i = 0; i < DATA_LENGTH; i++) {
#pragma HLS PIPELINE
    c_local[i] = a_local[i] + b_local[i];
  }

  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
[/code]
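Before synthesis, it is worth checking the function in plain C. The following is a minimal testbench sketch under a few assumptions: the top function is repeated here without its pragmas so that the block compiles on its own with an ordinary C compiler, `run_testbench` is a hypothetical name, and the input pattern is arbitrary. In a real project the testbench would call the design file's own `vector_addition` instead of this stand-in.

```c
#include <string.h>

#define DATA_LENGTH 1024

/* Stand-in for the top function above, with the HLS pragmas omitted
   (assumption: they do not change the functional behaviour in C). */
void vector_addition(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH])
{
    int a_local[DATA_LENGTH];
    int b_local[DATA_LENGTH];
    int c_local[DATA_LENGTH];

    memcpy(a_local, a, DATA_LENGTH * sizeof(int));
    memcpy(b_local, b, DATA_LENGTH * sizeof(int));

    for (int i = 0; i < DATA_LENGTH; i++) {
        c_local[i] = a_local[i] + b_local[i];
    }

    memcpy(c, c_local, DATA_LENGTH * sizeof(int));
}

/* Hypothetical testbench routine: fills the inputs with a known pattern,
   runs the function, and returns the number of mismatching elements. */
int run_testbench(void)
{
    static int a[DATA_LENGTH];
    static int b[DATA_LENGTH];
    static int c[DATA_LENGTH];
    int errors = 0;

    for (int i = 0; i < DATA_LENGTH; i++) {
        a[i] = i;
        b[i] = 2 * i + 1;
    }

    vector_addition(a, b, c);

    for (int i = 0; i < DATA_LENGTH; i++) {
        if (c[i] != a[i] + b[i]) {
            errors++;
        }
    }
    return errors;
}
```

A `main` function that returns the value of `run_testbench()` would report success with a zero exit code, which is the usual convention for a C testbench.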