Wed. Dec 6th, 2023

# Covariance

The description of the covariance is taken from .

Covariance is one of the operator in statistics and data mining. It takes an input data as a 2D matrix and calculate the covariance of each two column. Therefore, the output is a symmetric matrix.

data:  is an $n\times m$ matrix that contains the input data

cove : is an $m\times m$ matrix that contains the results. $cov(i, j)=\frac{\sum^{N-1}_{k=0}{(data(k,i)-mean(i))(data(k,j)-mean(j))}}{N-1}$

where $mean(i)=\frac{\sum^{N-1}_{k=0}{data(k,i)}}{N}$

This description can be divided into three kernels :  k_mean, k_reduce and k_covar as shown in the following figure.

There are three memory objects shares the data between kernels and host program: d_data, d_mean and d_covar

The k_mean kernel calculates the mean value of each column in the d_data 2D  matrix. The k_mean kernel subtract the corresponding mean from each element in the d_data. Finally, d_covar calculate the covariance.

The three kernels are running sequentially one after another as shown in the figure.

The corresponding unoptimised code can be found at here.

References

 Tomofumi Yuki, Louis-Noel Pouchet, “PolyBench 4.2.1 (pre-release),” May 20, 2016, [online] http://web.cse.ohio-state.edu/~pouchet/software/polybench/

 S. Grauer-Gray, L. Xu, R. Searles, S. Ayalasomayajula and J. Cavazos, “Auto-tuning a high-level language targeted to GPU codes,” Innovative Parallel Computing (InPar), 2012, San Jose, CA, 2012, pp. 1-10. [online] https://cavazos-lab.github.io/PolyBench-ACC/