Recently, I have started using FPGAs (e.g., Zynq) to run neural networks (NNs) defined in Caffe. My first step is performing NN inference on the FPGA. To do this, and to integrate the FPGA platform into Caffe, I have started studying the Caffe C++ code and structure. Through a series of blog posts, I will try to explain my understanding using a few simple examples.
The goal of these posts is not to use Caffe efficiently to implement an application, but to become familiar with the Caffe code. Therefore, this post is written for code developers, not for application developers. In addition, I assume the reader is already familiar with basic Caffe concepts such as nets, blobs, and layers, which are explained on the Caffe website.
In this first post, I am going to define the convolutional neural network (CNN). Although there are many books and articles explaining CNNs, their concepts, and their applications, here I try to keep everything just simple enough to help in understanding the Caffe structure and how to add an FPGA back-end to it.
Almost all articles explaining CNNs start from the neural-network (NN) concept; here, however, I have decided to start with convolution. This approach lets people without NN background knowledge begin with real experiments early on.
What is an image convolution?
First, what is a convolution? In general, convolution is a binary operator that combines two input functions and generates a new function highlighting a feature of one of the input functions. The function whose features are to be highlighted is called the main function, and the second function is called the kernel.
In image processing, convolution is used to apply different filters to an image, such as blurring, sharpening, edge detection, and so on.
The following figure shows how a kernel is applied to an input image.
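To make the operation concrete, here is a minimal, framework-free sketch of a valid (no-padding) 2D convolution in plain C++. Note that, like most deep-learning frameworks, it does not flip the kernel (strictly speaking, it computes a cross-correlation); the function and variable names are illustrative, not taken from Caffe:

```cpp
#include <cstddef>
#include <vector>

// Valid (no-padding) 2D convolution sketch: slide a K x K kernel over the
// image and sum the element-wise products at each position.
// Output size is (H - K + 1) x (W - K + 1).
std::vector<std::vector<float>> convolve2d(
    const std::vector<std::vector<float>>& image,
    const std::vector<std::vector<float>>& kernel) {
    const std::size_t H = image.size(), W = image[0].size();
    const std::size_t K = kernel.size();  // assume a square kernel
    std::vector<std::vector<float>> out(H - K + 1,
                                        std::vector<float>(W - K + 1, 0.0f));
    for (std::size_t r = 0; r < out.size(); ++r)
        for (std::size_t c = 0; c < out[0].size(); ++c)
            for (std::size_t i = 0; i < K; ++i)
                for (std::size_t j = 0; j < K; ++j)
                    out[r][c] += image[r + i][c + j] * kernel[i][j];
    return out;
}
```

For example, applying the 3x3 Sobel-x kernel used later in this post to a 3x3 image produces a single output value, the weighted sum of all nine pixels.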
How to write a simple convolution in Caffe?
Step 00: Include required header files
#include <caffe/caffe.hpp>
#include <opencv2/highgui/highgui.hpp>
Step 01: Select CPU or GPU
#ifdef CPU_ONLY
Caffe::set_mode(Caffe::CPU);
#else
Caffe::set_mode(Caffe::GPU);
#endif
Step 02: Define a network
shared_ptr<Net<float> > net_;
Step 03: Load the network from a file
net_.reset(new Net<float>(model_file, TEST));
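For reference, a minimal model_file for this example could look like the following prototxt. This is an illustrative sketch, not the exact file from the post: the blob name "data" and layer name "conv" match the names used in this post, while the remaining parameters are assumptions for a single-channel 3x3 Sobel filter.

```
name: "SimpleConv"
input: "data"
input_shape { dim: 1 dim: 1 dim: 1 dim: 1 }  # reshaped at run time to the image size
layer {
  name: "conv"
  type: "Convolution"
  bottom: "data"
  top: "conv"
  convolution_param {
    num_output: 1      # one output feature map
    kernel_size: 3     # 3x3 kernel, matching the nine Sobel weights below
    stride: 1
    pad: 1             # keep the output the same size as the input
    bias_term: false   # the Sobel filter has no bias
  }
}
```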
Step 04: assign weights for the Sobel filter
shared_ptr<Layer<float> > conv_layer = net_->layer_by_name("conv");
float* weights = conv_layer->blobs()[0]->mutable_cpu_data();
weights[0] = -1; weights[1] = 0; weights[2] = 1;
weights[3] = -2; weights[4] = 0; weights[5] = 2;
weights[6] = -1; weights[7] = 0; weights[8] = 1;
Step 05: read the input image
string image_file = argv[1];
cv::Mat img = cv::imread(image_file, -1);
Step 06: reshape the input blob to the size of the input image
shared_ptr<Blob<float> > input_blob = net_->blob_by_name("data");
int num_channels_ = input_blob->channels();
input_blob->Reshape(1, num_channels_, img.rows, img.cols);
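A side note on blob memory: Caffe stores a blob's data contiguously in row-major N x C x H x W order, so element (n, c, h, w) lives at flat index ((n*C + c)*H + h)*W + w. This is what Blob::offset() computes, and it is what makes the raw-pointer copies in the following steps work. A small illustrative sketch:

```cpp
#include <cstddef>

// Flat index of element (n, c, h, w) in a row-major N x C x H x W buffer,
// mirroring how Caffe lays out blob data. Names are illustrative.
std::size_t blob_offset(std::size_t n, std::size_t c,
                        std::size_t h, std::size_t w,
                        std::size_t C, std::size_t H, std::size_t W) {
    return ((n * C + c) * H + h) * W + w;
}
```

For a blob of shape 1 x 3 x H x W, for instance, the three channels occupy three consecutive H*W slices of the buffer, which is why a single-channel image can be wrapped directly around the blob's data pointer below.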
Step 07: reshape the whole network correspondingly
net_->Reshape();
Step 08: copy the input image to the network input blob
int width = input_blob->width();
int height = input_blob->height();
float* input_data = input_blob->mutable_cpu_data();
cv::Mat channel(height, width, CV_32FC1, input_data);
img.convertTo(channel, CV_32FC1);
Step 09: run the NN inference
net_->Forward();
Step 10: get the output and save in a file
Blob<float>* output_layer = net_->output_blobs()[0];
int num_out_channels_ = output_layer->channels();
width = output_layer->width();
height = output_layer->height();
float* output_data = output_layer->mutable_cpu_data();
cv::Mat outputImage(height, width, CV_32FC1, output_data);
cv::Mat outputImage8U;
outputImage.convertTo(outputImage8U, CV_8U);
imwrite("output_Image.jpg", outputImage8U);
If the input image is
Then the output would be
The complete code can be found here.