CUDA kernel function parameters explained: kernel<<<Dg, Db, Ns, S>>>(param list)

Source: Internet
Author: User

A kernel function is a program that runs on every thread of the GPU. It must be declared with the __global__ function type qualifier, in the following form:

__global__ void kernel (param list) {}  
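As a minimal sketch of such a definition, the kernel below adds two arrays element by element. The name vec_add, its parameter list, and the one-dimensional indexing scheme are illustrative assumptions and do not come from the original text.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each thread adds one pair of elements.
// A kernel is declared __global__ and must return void.
__global__ void vec_add(const float *a, const float *b, float *c, int n)
{
    // Global index of this thread in a one-dimensional launch.
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // The launch may create more threads than there are elements,
    // so guard the memory access.
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
```

Every thread executes the same body; the built-in variables blockIdx, blockDim, and threadIdx are what let each thread pick a different element.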

A kernel is called from host code (devices of compute capability 3.5 and later can also launch kernels from device code), and its execution configuration must be specified at the call. The call takes the following form:

kernel<<<Dg, Db, Ns, S>>>(param list);

The <<<>>> operator carries the execution configuration of the kernel. It specifies how the kernel is launched at run time: how many threads execute it and how those threads are organized.
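The sketch below shows one common way to pick the two mandatory configuration values for a one-dimensional problem. The kernel scale, the helpers blocks_for and scale_on_gpu, and the choice of 256 threads per block are all assumptions made for illustration.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies every element of data by factor.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Number of blocks needed so that blocks * threads_per_block >= n.
int blocks_for(int n, int threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

// Host helper: copies n floats to the device, scales them, copies them back.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t scale_on_gpu(float *host_data, float factor, int n)
{
    float *dev_data = nullptr;
    size_t bytes = (size_t)n * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&dev_data, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(dev_data, host_data, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;                    // threads per block (assumed)
        int blocks = blocks_for(n, threads);  // blocks in the grid
        scale<<<blocks, threads>>>(dev_data, factor, n);
        err = cudaGetLastError();             // reports launch failures
    }
    if (err == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(host_data, dev_data, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(dev_data);
    return err;
}
```

Rounding the block count up means the last block may contain idle threads, which is why the kernel checks i < n before touching memory.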

The full execution configuration of the <<<>>> operator is <<<Dg, Db, Ns, S>>>:

    • Dg defines the dimensions of the grid, that is, how many blocks the grid contains. It is of type dim3. dim3 Dg(Dg.x, Dg.y, 1) specifies a grid with Dg.x blocks in each row and Dg.y blocks in each column; the third dimension is fixed at 1 (a kernel launch currently has only one grid). The grid contains Dg.x*Dg.y blocks in total, and Dg.x and Dg.y can each be at most 65535.
    • Db defines the dimensions of each block, that is, how many threads a block contains. It is of type dim3. dim3 Db(Db.x, Db.y, Db.z) specifies a block with Db.x threads in each row, Db.y threads in each column, and a height of Db.z. Db.x and Db.y can each be at most 512, and Db.z at most 64. A block contains Db.x*Db.y*Db.z threads in total, and on hardware of compute capability 1.x this product cannot exceed 512. (The figures 768, for compute capability 1.0 and 1.1, and 1024, for 1.2 and 1.3, are the maximum number of threads resident on one multiprocessor, not per-block limits.) Hardware of compute capability 2.0 and later allows up to 1024 threads per block.
    • Ns is an optional argument of type size_t that sets the number of bytes of shared memory dynamically allocated for each block, in addition to the statically allocated shared memory. It is 0, or omitted, when no dynamic allocation is needed.
    • S is an optional argument of type cudaStream_t, with a default value of 0, that specifies which stream the kernel is launched in.
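To make the four parameters concrete, here is a hedged sketch that uses all of them in a single launch: a two-dimensional grid and block, dynamically sized shared memory, and a non-default stream. The kernel tile_sum, the helper matrix_sum, and the 16 x 16 block shape are assumptions chosen for illustration; the reduction relies on the block holding a power-of-two number of threads.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each block sums its tile of a width x height matrix
// and writes one partial sum. Assumes blockDim.x * blockDim.y is a power of two.
__global__ void tile_sum(const float *in, float *partial, int width, int height)
{
    // Dynamically allocated shared memory; its size in bytes is the Ns argument.
    extern __shared__ float cache[];

    int x = blockIdx.x * blockDim.x + threadIdx.x;   // column in the matrix
    int y = blockIdx.y * blockDim.y + threadIdx.y;   // row in the matrix
    int t = threadIdx.y * blockDim.x + threadIdx.x;  // linear index in the block
    int threads = blockDim.x * blockDim.y;

    // Threads that fall outside the matrix contribute zero.
    cache[t] = (x < width && y < height) ? in[y * width + x] : 0.0f;
    __syncthreads();

    // Tree reduction in shared memory.
    for (int stride = threads / 2; stride > 0; stride /= 2) {
        if (t < stride) cache[t] += cache[t + stride];
        __syncthreads();
    }
    if (t == 0) partial[blockIdx.y * gridDim.x + blockIdx.x] = cache[0];
}

// Host helper: stores the sum of all elements in *result.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t matrix_sum(const float *host_in, int width, int height, float *result)
{
    dim3 Db(16, 16, 1);                          // 256 threads per block
    dim3 Dg((width + Db.x - 1) / Db.x,           // blocks per row
            (height + Db.y - 1) / Db.y, 1);      // blocks per column
    size_t Ns = Db.x * Db.y * sizeof(float);     // shared memory per block
    size_t blocks = (size_t)Dg.x * Dg.y;
    size_t in_bytes = (size_t)width * height * sizeof(float);

    std::vector<float> partial(blocks);
    float *d_in = nullptr, *d_partial = nullptr;
    cudaStream_t S = nullptr;

    cudaError_t err = cudaStreamCreate(&S);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_in, in_bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_partial, blocks * sizeof(float));
    if (err == cudaSuccess)
        err = cudaMemcpyAsync(d_in, host_in, in_bytes, cudaMemcpyHostToDevice, S);
    if (err == cudaSuccess) {
        tile_sum<<<Dg, Db, Ns, S>>>(d_in, d_partial, width, height);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpyAsync(partial.data(), d_partial, blocks * sizeof(float),
                              cudaMemcpyDeviceToHost, S);
    if (err == cudaSuccess) err = cudaStreamSynchronize(S);  // wait for stream S

    if (err == cudaSuccess) {
        float total = 0.0f;
        for (float p : partial) total += p;      // combine per-block sums on the host
        *result = total;
    }
    cudaFree(d_partial);
    cudaFree(d_in);
    if (S) cudaStreamDestroy(S);
    return err;
}
```

Because the copies and the kernel are all issued to the same stream S, they execute in order, and a single cudaStreamSynchronize call is enough before the host reads the partial sums.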

