This article introduces how to write CUDA programs in Python, with working examples you can use as a reference.
There are two main ways to write CUDA programs in Python:
* Numba
* PyCUDA
NumbaPro is now deprecated; its features were split out and integrated into Accelerate and Numba, respectively.
Example
Numba
Numba optimizes Python code through a just-in-time (JIT) compilation mechanism. It can optimize for the native hardware environment, supports both CPU and GPU targets, and integrates with NumPy so that Python code can run on the GPU. You simply add the relevant decorator above the function, as shown below:
```python
import numpy as np
from timeit import default_timer as timer
from numba import vectorize

@vectorize(["float32(float32, float32)"], target='cuda')
def vectorAdd(a, b):
    return a + b

def main():
    N = 320000000

    A = np.ones(N, dtype=np.float32)
    B = np.ones(N, dtype=np.float32)
    C = np.zeros(N, dtype=np.float32)

    start = timer()
    C = vectorAdd(A, B)
    vectorAdd_time = timer() - start

    print("c[:5] = " + str(C[:5]))
    print("c[-5:] = " + str(C[-5:]))
    print("vectorAdd took %f seconds" % vectorAdd_time)

if __name__ == '__main__':
    main()
```
PyCUDA
PyCUDA's kernel functions are written in CUDA C/C++ and compiled dynamically into GPU microcode at run time; the Python code interacts with the GPU code as follows:
```python
import pycuda.autoinit
import pycuda.driver as drv
import numpy as np
from timeit import default_timer as timer
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void func(float *a, float *b, size_t N)
{
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= N)
  {
    return;
  }
  float temp_a = a[i];
  float temp_b = b[i];
  a[i] = (temp_a * 10 + 2) * ((temp_b + 2) * 10 - 5) * 5;
  // a[i] = a[i] + b[i];
}
""")

func = mod.get_function("func")

def test(N):
    print("N = %d" % N)

    N = np.int32(N)

    a = np.random.randn(N).astype(np.float32)
    b = np.random.randn(N).astype(np.float32)
    # copy a to aa
    aa = np.empty_like(a)
    aa[:] = a
    # GPU run
    nTheads = 256  # threads per block
    nBlocks = int((N + nTheads - 1) / nTheads)
    start = timer()
    func(
        drv.InOut(a), drv.In(b), N,
        block=(nTheads, 1, 1),
        grid=(nBlocks, 1))
    run_time = timer() - start
    print("gpu run time %f seconds" % run_time)
    # CPU run
    start = timer()
    aa = (aa * 10 + 2) * ((b + 2) * 10 - 5) * 5
    run_time = timer() - start
    print("cpu run time %f seconds" % run_time)
    # check result
    r = a - aa
    print(min(r), max(r))

def main():
    for n in range(1, 10):
        N = 1024 * 1024 * (n * 10)
        print("------------%d---------------" % N)
        test(N)

if __name__ == '__main__':
    main()
```
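The `nBlocks` computation above is just a ceiling division, so that every element is covered by some thread even when N is not a multiple of the block size. A quick stand-alone check of that arithmetic (the helper name is mine, not from the article):

```python
def blocks_needed(n, threads_per_block):
    # Ceiling division: the smallest number of blocks whose combined
    # threads cover all n elements.
    return (n + threads_per_block - 1) // threads_per_block

print(blocks_needed(1024 * 1024, 256))  # → 4096
print(blocks_needed(1000, 256))         # → 4 (last block is only partly used)
```

This is why the kernel starts with the `if (i >= N) return;` guard: the final block may contain threads whose index falls past the end of the array.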
Comparison
Numba uses decorators to mark functions for acceleration (you can also write kernel functions in Python itself), which is similar in spirit to OpenACC. PyCUDA requires you to write the kernel yourself; it is compiled at run time, and the underlying implementation is based on C/C++. In testing, the two approaches are almost equally fast. However, Numba is more of a black box: you don't know exactly what happens inside, whereas PyCUDA is very transparent. The two approaches therefore suit different situations:
* If you just want to accelerate your algorithm and don't care about CUDA programming itself, it is better to use Numba directly.
* If you want to learn or study CUDA programming, or to experiment with the feasibility of an algorithm in CUDA, use PyCUDA.
* If the program you write will be ported to C/C++ in the future, you should use PyCUDA, because the kernels you write with PyCUDA are already written in CUDA C/C++.