Writing a C/C++ program for high-level synthesis (HLS) requires its own coding style and techniques. In this series of blog posts, I am going to explain a few techniques for modifying an existing C/C++ program so that it can be synthesised by Vivado-HLS.

In each post, I will use a simple example to discuss the ideas.

The first example is vector addition. The following simple C code represents this example:

#define DATA_LENGTH 1024
void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  for (int i = 0; i < DATA_LENGTH; i++) {
    c[i] = a[i]+b[i];
  }
}

To modify this code for HLS, two tasks must be done: transferring the data between the main memory and the FPGA, and performing the addition efficiently.

For the first task, we can use memcpy to copy the inputs into local arrays and the result back out, as follows:

#include <string.h>
#define DATA_LENGTH 1024
void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];
  memcpy(a_local, (const int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (const int*)b, DATA_LENGTH*sizeof(int));
  ....

  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
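
To make it clearer what each memcpy call amounts to, here is a minimal sketch of the same copy written as an explicit loop. The helper name copy_ints is hypothetical and not part of the original code. It is only meant to show the behaviour of the copy, not how the tool implements it. As far as I know, the tool can typically turn such a whole-array copy on a memory-mapped port into a burst transfer, but that depends on the tool version and the interface settings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the original code): copies n ints from src
   to dst one element at a time. For non-overlapping buffers this has the
   same effect as memcpy(dst, src, n * sizeof(int)). */
void copy_ints(int *dst, const int *src, size_t n) {
  assert(dst != NULL && src != NULL);
  for (size_t i = 0; i < n; i++) {
    dst[i] = src[i];
  }
}
```

With this helper, copy_ints(a_local, a, DATA_LENGTH) would fill a_local in the same way as the first memcpy call above.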

For the computation, we can keep the loop and add compiler directives to it to increase performance.

The PIPELINE directive improves the throughput of the loop by letting a new iteration start before the previous one has finished.

#include <string.h>
#define DATA_LENGTH 1024
void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];
  memcpy(a_local, (const int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (const int*)b, DATA_LENGTH*sizeof(int));

  /* Compute on the local copies; the pragma pipelines the loop body. */
  for (int i = 0; i < DATA_LENGTH; i++) {
#pragma HLS PIPELINE
    c_local[i] = a_local[i]+b_local[i];
  }

  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
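
To get a feeling for why pipelining the loop helps, the usual back-of-the-envelope cycle estimates can be sketched as two small functions. The function names are hypothetical, and the numbers they produce are only rough estimates. The real latency and initiation interval come from the synthesis report.

```c
#include <assert.h>

/* Rough estimate for a loop that is NOT pipelined: each iteration
   finishes before the next one starts. */
unsigned long sequential_cycles(unsigned long trip_count,
                                unsigned long iteration_latency) {
  return trip_count * iteration_latency;
}

/* Rough estimate for a pipelined loop: a new iteration starts every
   `ii` cycles (the initiation interval), and the last iteration needs
   `iteration_latency` cycles to finish after it starts. */
unsigned long pipelined_cycles(unsigned long trip_count,
                               unsigned long iteration_latency,
                               unsigned long ii) {
  assert(trip_count > 0 && ii > 0);
  return (trip_count - 1) * ii + iteration_latency;
}
```

For example, if we assume that one iteration of the addition loop takes 3 cycles (an illustrative figure, not a measured one) and that the tool reaches an initiation interval of 1, then sequential_cycles(1024, 3) gives 3072 cycles while pipelined_cycles(1024, 3, 1) gives 1026 cycles.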

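Finally, to transfer data between the main memory and the FPGA, proper interfaces must be defined for the arguments of the top-level function. The following code shows the synthesisable C code. In it, the m_axi pragmas place the three arrays on an AXI master port (bus_port) through which the design reads and writes main memory, and the s_axilite pragmas expose the function's control signals and arguments on an AXI-Lite slave port (control_bus).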

#include <string.h>
#define DATA_LENGTH 1024
void vector_addition(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {

#pragma HLS INTERFACE s_axilite port=return bundle=control_bus

#pragma HLS INTERFACE s_axilite port=a bundle=control_bus
#pragma HLS INTERFACE s_axilite port=b bundle=control_bus
#pragma HLS INTERFACE s_axilite port=c bundle=control_bus

#pragma HLS INTERFACE m_axi depth=16 port=a bundle=bus_port
#pragma HLS INTERFACE m_axi depth=16 port=b bundle=bus_port
#pragma HLS INTERFACE m_axi depth=16 port=c bundle=bus_port

  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];

  memcpy(a_local, (int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (int*)b, DATA_LENGTH*sizeof(int));

  for (int i = 0; i < DATA_LENGTH; i++) {
#pragma HLS PIPELINE
    c_local[i] = a_local[i]+b_local[i];
  }

  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
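
Before synthesis, it is worth checking the function on the host with a small test bench. The following is a minimal sketch of such a check, not a complete test bench. The routine name check_vector_addition and the input values are my own assumptions, and the kernel is repeated here without its HLS pragmas only so that the example compiles on its own with an ordinary C compiler.

```c
#include <string.h>

#define DATA_LENGTH 1024

/* Host-side copy of the kernel above, with the HLS pragmas left out
   because an ordinary host compiler does not act on them. */
void vector_addition(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];

  memcpy(a_local, a, DATA_LENGTH * sizeof(int));
  memcpy(b_local, b, DATA_LENGTH * sizeof(int));

  for (int i = 0; i < DATA_LENGTH; i++) {
    c_local[i] = a_local[i] + b_local[i];
  }

  memcpy(c, c_local, DATA_LENGTH * sizeof(int));
}

/* Hypothetical test routine: fills the inputs with known values, runs the
   kernel and returns the number of mismatching elements (0 means pass). */
int check_vector_addition(void) {
  static int a[DATA_LENGTH];
  static int b[DATA_LENGTH];
  static int c[DATA_LENGTH];
  int errors = 0;

  for (int i = 0; i < DATA_LENGTH; i++) {
    a[i] = i;
    b[i] = 2 * i;
    c[i] = -1; /* sentinel, so an element that is never written is detected */
  }

  vector_addition(a, b, c);

  for (int i = 0; i < DATA_LENGTH; i++) {
    if (c[i] != a[i] + b[i]) {
      errors++;
    }
  }
  return errors;
}
```

In a test bench file, main would call check_vector_addition() and return a nonzero value if it reports any errors, since the C simulation treats a nonzero return value from main as a failed test.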