Digital System Design with High-Level Synthesis for FPGA: Combinational Circuits

Udemy Course

Writing a C/C++ program for HLS requires its own coding style and techniques. In this series of blog posts I explain a few techniques for modifying an existing C/C++ program so that Vivado HLS can synthesise it.

In each post, I will use a simple example to discuss the ideas.

The first example is vector addition. The following simple C code represents this example:

[code language="C"]
#define DATA_LENGTH 1024
void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  for (int i = 0; i < DATA_LENGTH; i++) {
    c[i] = a[i]+b[i];
  }
}
[/code]

To modify this code for HLS, two tasks must be done: transferring the data between the main memory and the FPGA, and performing the addition efficiently.

For the first task we can use memcpy to copy the vectors into local buffers, as follows:

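[code language="C"]
#include <string.h>
#define DATA_LENGTH 1024
void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  // Local copies of the input and output vectors
  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];

  // Copy the inputs from main memory into the local buffers
  memcpy(a_local, (const int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (const int*)b, DATA_LENGTH*sizeof(int));
  ....

  // Copy the result back to main memory
  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
[/code]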

For the computation we can keep the loop and add compiler directives to it to increase performance.

The pipeline directive improves the computation efficiency by letting successive loop iterations overlap instead of running one after another.

[code language="C"]
#include <string.h>
#define DATA_LENGTH 1024
void vector_add(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];

  memcpy(a_local, (const int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (const int*)b, DATA_LENGTH*sizeof(int));

  // The addition works on the local buffers, not on the top-level arguments
  for (int i = 0; i < DATA_LENGTH; i++) {
#pragma HLS PIPELINE
    c_local[i] = a_local[i]+b_local[i];
  }

  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
[/code]
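To get a feel for what the pipeline directive buys, the cycle count of a loop can be estimated with the usual pipelining formula: the first result appears after the latency of one iteration, and a new iteration then starts every II (initiation interval) cycles. The sketch below is only a back-of-the-envelope model; the function names are made up for this illustration, and the latency of one iteration is an assumption that the synthesis report would replace with the real figure.

```c
/* Rough cycle model for a loop, before and after pipelining.
 * Both function names are hypothetical helpers for this illustration;
 * they are not part of any HLS library. */

/* Iterations run strictly one after another. */
unsigned long sequential_loop_cycles(unsigned long trip_count,
                                     unsigned long iteration_latency) {
    return trip_count * iteration_latency;
}

/* Pipelined loop: one full iteration latency for the first result,
 * then one new iteration starts every `ii` cycles. */
unsigned long pipelined_loop_cycles(unsigned long trip_count,
                                    unsigned long ii,
                                    unsigned long iteration_latency) {
    if (trip_count == 0) {
        return 0;
    }
    return iteration_latency + ii * (trip_count - 1);
}
```

For example, assuming one iteration of the addition loop takes 3 cycles (an illustrative number, not a measured one) and the tool reaches II = 1, `pipelined_loop_cycles(1024, 1, 3)` gives 1026 cycles, whereas `sequential_loop_cycles(1024, 3)` gives 3072.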

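Finally, to transfer data between the main memory and the FPGA, proper interfaces must be defined for the arguments of the top function. Here the arguments are mapped to an AXI master port (m_axi) for the data and to an AXI4-Lite slave port (s_axilite) for control. The following code shows the synthesisable C code.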

[code language="C"]
#include <string.h>
#define DATA_LENGTH 1024
void vector_addition(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {

#pragma HLS INTERFACE s_axilite port=return bundle=control_bus

#pragma HLS INTERFACE s_axilite port=a bundle=control_bus
#pragma HLS INTERFACE s_axilite port=b bundle=control_bus
#pragma HLS INTERFACE s_axilite port=c bundle=control_bus

#pragma HLS INTERFACE m_axi depth=16 port=a bundle=bus_port
#pragma HLS INTERFACE m_axi depth=16 port=b bundle=bus_port
#pragma HLS INTERFACE m_axi depth=16 port=c bundle=bus_port

  int a_local[DATA_LENGTH];
  int b_local[DATA_LENGTH];
  int c_local[DATA_LENGTH];

  memcpy(a_local, (int*)a, DATA_LENGTH*sizeof(int));
  memcpy(b_local, (int*)b, DATA_LENGTH*sizeof(int));

  for (int i = 0; i < DATA_LENGTH; i++) {
#pragma HLS PIPELINE
    c_local[i] = a_local[i]+b_local[i];
  }

  memcpy((int*)c, c_local, DATA_LENGTH*sizeof(int));
}
[/code]
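Before synthesis it is worth checking the function in plain C. The sketch below is one possible self-check, not part of the original design: `check_vector_addition` is a hypothetical helper name, the test values are arbitrary, and the body of `vector_addition` is repeated without its pragmas only so that the example compiles on its own. In a real project the check would be linked against the synthesisable source file instead.

```c
#include <string.h>

#define DATA_LENGTH 1024

/* Stand-in copy of the top function, pragmas omitted, so that this
 * sketch is self-contained. */
void vector_addition(int a[DATA_LENGTH], int b[DATA_LENGTH], int c[DATA_LENGTH]) {
    int a_local[DATA_LENGTH];
    int b_local[DATA_LENGTH];
    int c_local[DATA_LENGTH];

    memcpy(a_local, a, DATA_LENGTH * sizeof(int));
    memcpy(b_local, b, DATA_LENGTH * sizeof(int));

    for (int i = 0; i < DATA_LENGTH; i++) {
        c_local[i] = a_local[i] + b_local[i];
    }

    memcpy(c, c_local, DATA_LENGTH * sizeof(int));
}

/* Hypothetical helper: fills two input vectors with known values, runs
 * the function, and returns the number of mismatching elements
 * (0 means every element of c equals a + b). */
int check_vector_addition(void) {
    static int a[DATA_LENGTH];
    static int b[DATA_LENGTH];
    static int c[DATA_LENGTH];
    int mismatches = 0;

    for (int i = 0; i < DATA_LENGTH; i++) {
        a[i] = i;          /* arbitrary test pattern */
        b[i] = 2 * i + 1;  /* arbitrary test pattern */
    }

    vector_addition(a, b, c);

    for (int i = 0; i < DATA_LENGTH; i++) {
        if (c[i] != a[i] + b[i]) {
            mismatches++;
        }
    }
    return mismatches;
}
```

A testbench `main` could simply `return check_vector_addition();`, since the HLS C simulation treats a non-zero return value as a failed test.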


2 thoughts on “How to modify a C program for HLS – 00: Adding Compiler Directives”
  1. I am not very strong on the programming side. With basic knowledge I have written code for a pseudo-random binary sequence (PRBS) using a buffer, but it shows an error. I am trying this task for my official work. Please guide me.
    I want to run the PRBS with a buffer in HLS and create an IP from it. Then I need to communicate between the IP and the SDK. Could anyone please help and debug my code?
    int PRBS () {
    {
    #pragma HLS INTERFACE s_axilite port=return bundle=a
    uint32 buff[];
    static unsigned lfsr = 0xCD;
    int bit, i, count;
    for ( i = 0; i > 0) ^ (lfsr >> 2) ^ (lfsr >> 3) ^ (lfsr >> 4) ) & 1;
    lfsr = (lfsr >> 1) | (bit << 7);
    buff[] =bit;
    }
    memcpy(buff+i, bit, bit* sizeof(int));
    buff[0] +=1;
    for (i = 0; i < 50; i++) {
    buff[i+1]= buff[i]+ 1;
    }
    printf("%d", &buff);
    }
    }

  2. I have a question regarding DDR memory.

    I have a large set of coefficients belonging to a neural network that I want to store in DDR memory and access through a DMA.

    How can I define this structure in my HLS design, both for accessing the coefficients in my C simulation and for the pragmas it requires?
