[Reprint] Calling a CUDA .cu file from a CPP file for GPU-accelerated programming

Source: Internet
Author: User

Transferred from: http://m.blog.csdn.net/blog/oHanTanYanYing/39855829

This article describes how a CPP file can call into a CUDA .cu file for GPU-accelerated programming. It assumes CUDA is already configured; if you have questions about configuring CUDA, see my earlier article on that topic. CUDA 6.5, which supports VS2013, has now been released, so I recommend using the latest version: VS2013 is much more pleasant to work with, and the configuration steps are no different. The configuration article did not solve the error hints that the editor occasionally shows on CUDA functions. They have no effect on compilation, but they are annoying if you are a perfectionist; I will post an update once I have looked into it.

There are two ways to call a CUDA .cu file from a CPP file. This article covers the first: create a CUDA project from the VS2013 template (the template appears after installing CUDA 6.5) and then add a CPP file to it. The second, adding .cu files to an existing MFC or Win32 project and calling them from there, is essentially the same but more troublesome; I will write it up when I have time.

Before getting started, a word on how CUDA is used for acceleration. The overall approach is simple, and the process generally goes as follows:

Allocate video card memory, copy the host data to be processed into video card memory, process the data on the video card, then copy the processed data from video card memory back to host memory.

With that, on to the main topic.

First create a CUDA project. The new project contains a .cu file; replace its contents with the following:

#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include "Main.h"

inline void checkCudaErrors(cudaError err) // error handling function
{
    if (cudaSuccess != err)
    {
        fprintf(stderr, "CUDA Runtime API error: %s.\n", cudaGetErrorString(err));
        return;
    }
}

__global__ void add(int *a, int *b, int *c) // processing kernel function
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    for (size_t k = 0; k < 50000; k++)
    {
        c[tid] = a[tid] + b[tid];
    }
}

extern "C" int runtest(int *host_a, int *host_b, int *host_c)
{
    int *dev_a, *dev_b, *dev_c;

    checkCudaErrors(cudaMalloc((void**)&dev_a, sizeof(int) * datasize)); // allocate video card memory
    checkCudaErrors(cudaMalloc((void**)&dev_b, sizeof(int) * datasize));
    checkCudaErrors(cudaMalloc((void**)&dev_c, sizeof(int) * datasize));

    checkCudaErrors(cudaMemcpy(dev_a, host_a, sizeof(int) * datasize, cudaMemcpyHostToDevice)); // copy the host data to be processed into video card memory
    checkCudaErrors(cudaMemcpy(dev_b, host_b, sizeof(int) * datasize, cudaMemcpyHostToDevice));

    // Launch the kernel on the video card. The launch configuration was garbled
    // in this reprint; 100 threads per block is assumed here, and datasize must
    // be a multiple of it because the kernel does no bounds check.
    add<<<datasize / 100, 100>>>(dev_a, dev_b, dev_c);
    checkCudaErrors(cudaMemcpy(host_c, dev_c, sizeof(int) * datasize, cudaMemcpyDeviceToHost)); // copy the processed data back to the host

    cudaFree(dev_a); // clean up the video card memory
    cudaFree(dev_b);
    cudaFree(dev_c);
    return 0;
}
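The launch in the listing above divides datasize by the threads-per-block value, so it covers every element only when datasize is an exact multiple of that value. A common workaround is to round the block count up and have each thread check its index before touching memory. The following is a minimal host-side sketch of that arithmetic, not part of the original post; blocksFor and inRange are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not in the original post): the number of blocks needed
// so that blocks * threadsPerBlock >= n, i.e. integer division rounded up.
inline std::size_t blocksFor(std::size_t n, std::size_t threadsPerBlock)
{
    assert(threadsPerBlock > 0);
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Host-side stand-in for the guard a kernel would apply to its thread index:
// threads whose index falls past the end of the data should do nothing.
inline bool inRange(std::size_t tid, std::size_t n)
{
    return tid < n;
}
```

With a helper like this, the grid size passed to the launch would come from blocksFor, and the kernel would need the element count as an extra parameter so that it can skip indices for which inRange is false.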

Then add a Main.h file to the project with the following contents:

#include <time.h>   // time-related header; clock() is used to measure processing speed
#include <stdio.h>  // printf, fprintf, getchar
#include <iostream>
#define datasize 50000

Next comes main.cpp, the implementation file, which calls into the CUDA .cu file from CPP. Its contents are as follows:

#include "Main.h"

extern "C" int runtest(int *host_a, int *host_b, int *host_c); // video card processing function, defined in the .cu file

int main()
{
    int a[datasize], b[datasize], c[datasize];
    for (size_t i = 0; i < datasize; i++)
    {
        a[i] = i;
        b[i] = i * i;
    }

    long now1 = clock(); // record the start time of GPU processing
    runtest(a, b, c);    // call the GPU-accelerated routine
    printf("GPU Run time:%dms\n", int(((double)(clock() - now1)) / CLOCKS_PER_SEC * 1000)); // output GPU processing time

    long now2 = clock(); // record the start time of CPU processing
    for (size_t i = 0; i < datasize; i++)
    {
        for (size_t k = 0; k < 50000; k++)
        {
            c[i] = (a[i] + b[i]);
        }
    }
    printf("CPU Run time:%dms\n", int(((double)(clock() - now2)) / CLOCKS_PER_SEC * 1000)); // output CPU processing time

    /*for (size_t i = 0; i < datasize; i++) // view the calculation results (loop bound lost in this reprint; datasize assumed)
    {
        printf("%d+%d=%d\n", a[i], b[i], c[i]);
    }*/
    getchar();
    return 0;
}

Note that the CUDA function to be called should be defined with an extern "C" declaration, and declared the same way in the CPP file (extern "C" int runtest(int *host_a, int *host_b, int *host_c);) before it is called. This keeps the symbol name unmangled, so the .cu and .cpp sides agree on it at link time.
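To make the declaration pattern concrete, here is a minimal single-file sketch. It is an illustration only: the body of runtest is a CPU stand-in for the definition that would live in the .cu file, and kCount is a hypothetical small size chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kCount = 4; // hypothetical small size, for illustration only

// Declaration as the .cpp caller would write it. extern "C" turns off C++
// name mangling, so the linker looks for the plain symbol "runtest".
extern "C" int runtest(int *host_a, int *host_b, int *host_c);

// CPU stand-in for the definition in the .cu file (assumption: the real body
// allocates video card memory, launches the kernel, and copies results back).
extern "C" int runtest(int *host_a, int *host_b, int *host_c)
{
    assert(host_a && host_b && host_c);
    for (std::size_t i = 0; i < kCount; ++i)
    {
        host_c[i] = host_a[i] + host_b[i];
    }
    return 0;
}
```

The declaration and the definition must agree on both the signature and the extern "C" linkage; if only one side carries it, the two sides refer to different symbol names and the link step fails with an unresolved external.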

That concludes the first part. Compile and run the program, and you can see that the GPU really is much faster than the CPU at this kind of heavily parallel computation. The other method mentioned earlier will have to wait until next time; the holiday is over, so ...

It has been half a year since I wrote the article above. To finally close that gap, the other method is covered in a separate blog post.

[Blogger's note]: I tried this, and it works in my setup: Visual Studio 2010 + CUDA 6.0