AMD OpenCL University Course (4)

Kernel Object:

A kernel is a function in the program code that can be executed on an OpenCL device. A kernel object encapsulates a kernel function together with the argument values it is run with.

The kernel object is created with clCreateKernel from a program object and the name of a kernel function. Note: a kernel function with that name must exist in the program's source code.

Compile at runtime:

Compiling programs and creating kernel objects at runtime incurs a time overhead, but it is flexible and lets the same code adapt to different OpenCL hardware platforms. The program needs to be compiled only once, and a kernel object can be run repeatedly once it has been created.

After creating a kernel, we need to set its arguments with clSetKernelArg before running it. After a run, we can set the arguments again and run the kernel once more.

arg_index specifies the position of the argument in the kernel function's parameter list (the first parameter is 0, the second is 1, and so on). Both memory objects and single values can be passed as kernel arguments. The following are two examples of setting kernel arguments:

clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&d_iimage);

clSetKernelArg(kernel, 1, sizeof(int), (void *)& );

Before running the kernel, let's first look at the thread structure in OpenCL:

In a massively parallel program, each thread usually processes one part of the problem. In vector addition, for example, we add the corresponding elements of two vectors, so each thread can perform one addition.

The following shows a vector addition of 16 elements: two input buffers A and B, and one output buffer C.

In this case, we can create a one-dimensional thread structure to match this problem.

Each thread uses its own thread ID as an index and adds the corresponding elements.
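The one-dimensional mapping described here can be sketched in plain C. This is a host-side emulation rather than OpenCL API code: the serial loop in run_1d_range stands in for the runtime launching one work-item per index, and the names vec_add_work_item and run_1d_range are hypothetical. The leading comment shows roughly what the corresponding OpenCL C kernel would look like.

```c
#include <stddef.h>

/*
 * Corresponding OpenCL C kernel, for comparison only:
 *
 *   __kernel void vec_add(__global const int *a,
 *                         __global const int *b,
 *                         __global int *c)
 *   {
 *       int i = get_global_id(0);
 *       c[i] = a[i] + b[i];
 *   }
 */

/* Body of one emulated work-item: gid plays the role of get_global_id(0). */
void vec_add_work_item(size_t gid, const int *a, const int *b, int *c)
{
    c[gid] = a[gid] + b[gid];
}

/* Emulates a 1-D index space of global_size work-items with a serial loop.
 * On a real device the work-items may run in parallel and in any order. */
void run_1d_range(size_t global_size, const int *a, const int *b, int *c)
{
    for (size_t gid = 0; gid < global_size; ++gid) {
        vec_add_work_item(gid, a, b, c);
    }
}
```

Calling run_1d_range(16, a, b, c) on three 16-element arrays corresponds to the 16-element example: each index is handled by exactly one emulated work-item.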

The thread structure in OpenCL is scalable. Each running instance of a kernel is called a work-item (that is, a thread). Work-items are organized into work-groups, and in OpenCL work-groups are independent of one another.

A work-item can be identified either by its global ID (which is unique in the index space) or by its work-group ID together with its local ID within the work-group.
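The relationship between the two ways of identifying a work-item can be sketched in plain C for a single dimension. This is a minimal illustration, assuming a zero global offset and a global size that is an exact multiple of the work-group size; the type work_item_pos and the functions split_global_id and join_global_id are hypothetical names, not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: the position of one work-item in one dimension. */
typedef struct {
    size_t group_id;  /* what get_group_id(dim) would return */
    size_t local_id;  /* what get_local_id(dim) would return */
} work_item_pos;

/* Splits a global ID into (group ID, local ID), assuming no global offset. */
work_item_pos split_global_id(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    work_item_pos pos;
    pos.group_id = global_id / local_size;
    pos.local_id = global_id % local_size;
    return pos;
}

/* Rebuilds the global ID from the (group ID, local ID) pair. */
size_t join_global_id(work_item_pos pos, size_t local_size)
{
    return pos.group_id * local_size + pos.local_id;
}
```

For example, with a work-group size of 4, the work-item with global ID 10 sits in work-group 2 at local ID 2, and joining that pair gives 10 again.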

In the kernel function, we can call built-in functions to obtain the global ID and related information:

get_global_id(dim)

get_global_size(dim)

These two functions return the work-item's global ID and the total number of work-items in the given dimension.

get_group_id(dim)

get_num_groups(dim)

get_local_id(dim)

get_local_size(dim)

These functions return the work-group ID, the number of work-groups, the local ID within the work-group, and the work-group size in the given dimension.

In a two-dimensional index space, a common convention is: get_global_id(0) = column, get_global_id(1) = row

get_num_groups(0) * get_local_size(0) = get_global_size(0)
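These two relations can be checked with a small plain C sketch. It assumes a row-major buffer whose width equals the global size in dimension 0, and a global size that is an exact multiple of the work-group size; the function names count_groups and flat_index are hypothetical and only emulate the arithmetic, they are not OpenCL calls.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups in one dimension, i.e. what get_num_groups(dim)
 * would return, assuming global_size is a multiple of local_size. */
size_t count_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    assert(global_size % local_size == 0);
    return global_size / local_size;
}

/* Linear index into a row-major 2-D buffer:
 * col stands for get_global_id(0), row for get_global_id(1),
 * width for get_global_size(0). */
size_t flat_index(size_t col, size_t row, size_t width)
{
    assert(col < width);
    return row * width + col;
}
```

For an 8 x 4 index space split into 4 x 2 work-groups, count_groups(8, 4) and count_groups(4, 2) both give 2, and the work-item at column 3, row 2 addresses element 2 * 8 + 3 = 19.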

 
