Over the past two days I spent some time working out how to call CUDA programs from MATLAB. I found little information online, and the white paper provided by NVIDIA is not detailed enough, so I am summarizing the development process here in the hope that it is useful.
In general, there are two ways to call a CUDA program from MATLAB. The first is to build the CUDA program as a DLL using Kaiyong's CUDA Wizard and then call it from MATLAB through a MEX file built with MATLAB's mex compiler. The second is to write the CUDA program as a standard MEX file, as described in the white paper, and compile it with mex into a binary that MATLAB calls. Both methods produce binaries ending in .mexw32 or .mexw64 that can be used directly from an M-file.
1. MATLAB White Paper Method
A. If the code only uses library functions such as CUFFT and has no kernels of its own, write the CUDA calls as ordinary C code in a standard MEX file. The following command then builds the binary:
mex fft2_cuda.c -IC:/CUDA/include -LC:/CUDA/lib -lcudart -lcufft
B. If the code contains kernel functions, the readme says the following command can be used, but it always reports errors for me. I am not sure whether the problem lies in nvmex.pl or in my VS or MATLAB setup (I use VS2005 SP1 and MATLAB R2010).
nvmex -f nvmexopts.bat sheze.cu -IC:/CUDA/include -LC:/CUDA/lib -lcufft -lcudart
C. I downloaded a very good example from the Internet. It contains the MATLAB compile script my_compile.m, the CUDA file test.cu, and the standard MEX file test.cpp. Run my_compile test.cpp first; this generates a binary MEX file that MATLAB can call.
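my_compile.m
///////////////////////
function my_compile(varargin)
% Set up the Visual Studio 2005 environment, then compile test.cu to test.obj.
!"%VS80COMNTOOLS%vsvars32.bat" & nvcc -c -arch compute_13 test.cu
% Read the CUDA library path and strip surrounding double quotes, if any.
n = getenv('CUDA_LIB_PATH');
if n(1) == '"', n = n(2:end); end
if n(end) == '"', n = n(1:end-1); end
% Link the object file and the CUDA runtime with the MEX source passed in.
mex(['-L' n], '-lcudart', 'test.obj', varargin{:});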
test.cu
//////////////////
extern "C" void gpuadd(double *a, double *b, double *c);

// Kernel: each thread adds one pair of elements.
__global__ void vecadd(double *a, double *b, double *c)
{
    int i = threadIdx.x;
    c[i] = a[i] + b[i];
}

// Host wrapper: copies two 5-element vectors to the GPU, adds them, copies the result back.
void gpuadd(double *a, double *b, double *c)
{
    double *ad, *bd, *cd;
    cudaMalloc((void **)&ad, 5 * sizeof(double));
    cudaMalloc((void **)&bd, 5 * sizeof(double));
    cudaMalloc((void **)&cd, 5 * sizeof(double));
    cudaMemcpy(ad, a, 5 * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(bd, b, 5 * sizeof(double), cudaMemcpyHostToDevice);
    vecadd<<<1, 5>>>(ad, bd, cd);  // one block of 5 threads
    cudaMemcpy(c, cd, 5 * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(ad); cudaFree(bd); cudaFree(cd);
}
test.cpp
/////////////
#include "mex.h"

extern "C" void gpuadd(double *a, double *b, double *c);

// MEX entry point: c = test(a, b), where a and b each have 5 elements.
void mexFunction(
    int nlhs, mxArray *plhs[],
    int nrhs, const mxArray *prhs[])
{
    double *c;
    if (mxGetNumberOfElements(prhs[0]) != 5) mexErrMsgTxt("Wrong number of elements in a!");
    if (mxGetNumberOfElements(prhs[1]) != 5) mexErrMsgTxt("Wrong number of elements in b!");
    // Create the 1x5 output matrix and get a pointer to its data.
    c = (double *)mxGetData(plhs[0] = mxCreateDoubleMatrix(1, 5, mxREAL));
    gpuadd((double *)mxGetData(prhs[0]), (double *)mxGetData(prhs[1]), c);
}
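The example above hard-codes five elements, and its kernel indexes with threadIdx.x only, so it fits in a single block. As a rough sketch of how the host side might be generalized to a length n, the block count is usually obtained by ceiling division; the kernel would then index with blockIdx.x * blockDim.x + threadIdx.x and skip indices i >= n. The helper names blocks_for and vecadd_reference below are hypothetical and not part of the downloaded code. The second one is only a CPU reference for checking what gpuadd returns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of blocks needed to cover n elements
// when each block has threads_per_block threads (ceiling division).
std::size_t blocks_for(std::size_t n, std::size_t threads_per_block) {
    assert(threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

// Hypothetical CPU reference: element-wise sum of two equal-length vectors,
// useful for comparing against the result copied back from the GPU.
std::vector<double> vecadd_reference(const std::vector<double>& a,
                                     const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}
```

With these, a launch for n elements would look like vecadd<<<blocks_for(n, 256), 256>>>(ad, bd, cd, n), assuming the kernel is given an extra length parameter for the bounds check.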
2. Dynamic Library Method
First, follow Kaiyong's DLL development method to wrap the CUDA program in a DLL. Then create a standard MEX file containing mexFunction and call the CUDA function exported by the DLL from it. The compilation is similar:
mex -L. -ltest test.c
3. Compile the CUDA program directly into an EXE file and call it from MATLAB with the system function.
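For the EXE route, data has to cross the process boundary as text or files. One possible arrangement, shown here only as a sketch, passes the input values as command-line arguments and prints the result to standard output, which MATLAB's system function can return as its second output. The helper names parse_values and format_values are hypothetical; the CUDA computation itself would sit between them in main.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: convert command-line arguments argv[1..argc-1] to doubles.
std::vector<double> parse_values(int argc, const char* const argv[]) {
    assert(argv != nullptr);
    std::vector<double> values;
    for (int i = 1; i < argc; ++i) {
        values.push_back(std::strtod(argv[i], nullptr));
    }
    return values;
}

// Hypothetical helper: format results as space-separated text for stdout.
// 17 significant digits are enough for a double to survive the round trip.
std::string format_values(const std::vector<double>& values) {
    std::ostringstream out;
    out.precision(17);
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i > 0) out << ' ';
        out << values[i];
    }
    return out.str();
}
```

On the MATLAB side, something like [status, out] = system('test.exe 1 2 3 4 5'); c = sscanf(out, '%f'); would then recover the numbers, assuming the executable is named test.exe and can be found from the current folder.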