Getting started with CUDA programming on Ubuntu 9.04


A while ago I finished both the ant colony algorithm and an improved K-means algorithm, and then turned to CUDA programming. After reading an introduction to CUDA, I assumed it would be easy to pick up for anyone who knows C; in practice, you also need some knowledge of GPU architecture to write a good program. Having read the book "CUDA for GPU High-Performance Computing", my impression is that it reads more like a manuscript, compiled from many earlier documents. Because it collects the wisdom of many contributors, the explanations are good, though the ordering is not; still, having it is better than not. After working through it, I have some confidence in CUDA programming, and I recommend taking a look at it first.

Reading a book is one thing; writing a program is another. I set up the environment in the previous article, but I still didn't know how to create a CUDA project or how to start writing a program. Fortunately, the CUDA SDK provides many examples for our reference, and that is where my first CUDA program began.

The CUDA SDK examples all live under the src directory, and each example has its own subdirectory, such as deviceQuery. Each subdirectory also contains a Makefile for compiling that single project. To compile all the examples at once, run sudo make in the root directory of the CUDA SDK; the compiled executables then appear under <cuda_sdk_home>/bin/Linux/release, where you can run them to see the results.
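For reference, the build-and-run steps might look like the following in a terminal. The SDK root path here is only an example — substitute wherever you unpacked your SDK:

```shell
# Build every SDK example from the SDK root (path is an assumption)
cd ~/NVIDIA_CUDA_SDK
sudo make

# The compiled binaries land under bin/Linux/release;
# run deviceQuery to inspect the GPU
cd bin/Linux/release
./deviceQuery
```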

This is the output of deviceQuery (screenshot omitted):

These examples make a good starting point for our own projects. The SDK includes a template example: delete its .cu and .cpp files and clear out the obj directory, and you have an empty CUDA project. Write your program under src, change the name of the output file in the Makefile, and compile; the generated executable ends up in <cuda_sdk_home>/bin/Linux/release. Here is a test program that performs vector addition:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <cuda_runtime.h>
#include <cutil.h>

#define VEC_SIZE 16

// Kernel function
__global__ void vecAdd(float *d_a, float *d_b, float *d_c)
{
    int index = threadIdx.x;
    d_c[index] = d_a[index] + d_b[index];
}

int main()
{
    // Size of the buffers to allocate
    size_t size = VEC_SIZE * sizeof(float);

    // Allocate host memory
    float *h_a = (float *) malloc(size);
    float *h_b = (float *) malloc(size);
    float *h_c = (float *) malloc(size);

    // Initialization
    for (int i = 0; i < VEC_SIZE; ++i)
    {
        h_a[i] = 1.0;
        h_b[i] = 2.0;
    }

    // Copy the host data to the device
    float *d_a;
    cudaMalloc((void **) &d_a, size);
    cudaMemcpy(d_a, h_a, size, cudaMemcpyHostToDevice);

    float *d_b;
    cudaMalloc((void **) &d_b, size);
    cudaMemcpy(d_b, h_b, size, cudaMemcpyHostToDevice);

    // Allocate space for the result
    float *d_c;
    cudaMalloc((void **) &d_c, size);

    // Launch one block of 16 threads
    dim3 dimBlock(16);
    vecAdd<<<1, dimBlock>>>(d_a, d_b, d_c);

    // Copy the result back to host memory
    cudaMemcpy(h_c, d_c, size, cudaMemcpyDeviceToHost);

    // Print the result
    for (int j = 0; j < VEC_SIZE; ++j)
    {
        printf("%f\t", h_c[j]);
    }

    // Free device and host memory
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);

    free(h_a);
    free(h_b);
    free(h_c);

    return 0;
}

 
