Kernel Object:
A kernel is a function in the program source that can be executed on an OpenCL device. A kernel object wraps the kernel function together with the argument values it will be run with.
A kernel object is created from a program object and the name of the kernel function (via clCreateKernel). Note: a function with that name must exist in the source code of the program.
Compile at runtime:
Compiling programs and creating kernel objects at runtime incurs time overhead, but it is flexible and lets the same code adapt to different OpenCL hardware platforms. The program needs to be compiled only once, and a kernel object can be run repeatedly after it has been created.
After creating a kernel, we also need to set its arguments before running it. Once the kernel has run, we can set new arguments and run it again.
In clSetKernelArg, arg_index specifies the position of the argument in the kernel function's parameter list (the first parameter is 0, the second is 1, and so on). Both memory objects and single values can be passed as kernel arguments. The following are two examples of setting a kernel argument:
clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&d_iimage);
clSetKernelArg(kernel, 1, sizeof(int), (void *)& );
Before running the kernel, let's first look at the thread structure in OpenCL:
In a massively parallel program, each thread usually processes one part of the problem. For example, in vector addition we add the corresponding elements of two vectors, so each thread can perform one addition.
Consider a vector addition of 16 elements: two input buffers A and B, and one output buffer C.
In this case, we can create a one-dimensional thread structure to match this problem.
Each thread uses its own thread ID as an index and adds the corresponding elements.
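The one-dimensional mapping described above can be sketched in plain C by emulating the index space on the host. This is only an illustration under stated assumptions: vec_add_work_item and run_vec_add are hypothetical names, and the sequential loop stands in for what an OpenCL device would execute in parallel.

```c
#include <assert.h>
#include <stddef.h>

/* For reference, the equivalent OpenCL C kernel would look roughly like:
 *
 *   __kernel void vec_add(__global const int *a,
 *                         __global const int *b,
 *                         __global int *c)
 *   {
 *       size_t i = get_global_id(0);
 *       c[i] = a[i] + b[i];
 *   }
 */

/* Body of one work-item: it uses its global ID as the index.
 * (Hypothetical helper, not part of the OpenCL API.) */
void vec_add_work_item(size_t global_id,
                       const int *a, const int *b, int *c)
{
    c[global_id] = a[global_id] + b[global_id];
}

/* Host-side emulation of a 1-D index space with n work-items.
 * A real device runs the work-items in parallel; here they run in turn. */
void run_vec_add(size_t n, const int *a, const int *b, int *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < n; ++gid) {
        vec_add_work_item(gid, a, b, c);
    }
}
```

With 16 elements, calling run_vec_add(16, a, b, c) corresponds to launching 16 work-items, one per element of C.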
The thread structure in OpenCL is scalable. Each running instance of a kernel is called a work-item (that is, a thread). Work-items are organized into work-groups, and in OpenCL the work-groups are independent of one another.
A work-item can be identified either by its global ID (which is unique in the index space) or by the combination of its work-group ID and its local ID within that work-group.
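The relationship between these two ways of identifying a work-item can be written out as simple arithmetic. The sketch below assumes a one-dimensional index space with no global offset and a global size that is an exact multiple of the work-group size; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair describing a work-item relative to its work-group. */
typedef struct {
    size_t group_id; /* which work-group the work-item belongs to */
    size_t local_id; /* position of the work-item inside that work-group */
} work_item_pos;

/* Split a global ID into (work-group ID, local ID). */
work_item_pos split_global_id(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    work_item_pos pos;
    pos.group_id = global_id / local_size;
    pos.local_id = global_id % local_size;
    return pos;
}

/* Rebuild the global ID from (work-group ID, local ID). */
size_t join_global_id(work_item_pos pos, size_t local_size)
{
    assert(pos.local_id < local_size);
    return pos.group_id * local_size + pos.local_id;
}
```

For example, with 16 work-items in work-groups of 4, global ID 13 is local ID 1 in work-group 3.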
Inside the kernel function, we can call built-in functions to obtain the global ID and other information:
get_global_id(dim)
get_global_size(dim)
These two functions return the work-item's global ID and the total number of work-items in the given dimension.
get_group_id(dim)
get_num_groups(dim)
get_local_id(dim)
get_local_size(dim)
These functions return the work-group ID, the number of work-groups, the local ID within the work-group, and the work-group size in the given dimension.
get_global_id(0) = column, get_global_id(1) = row
get_num_groups(0) * get_local_size(0) = get_global_size(0)
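For a two-dimensional index space, the two relations above can be illustrated in plain C. This sketch assumes a row-major buffer, a zero global offset, and a global size that divides evenly into work-groups; flat_index and num_groups are hypothetical helpers that mimic on the host what the built-in functions would report inside a kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major index of the element handled by one work-item.
 * column plays the role of get_global_id(0), row of get_global_id(1),
 * and width the role of get_global_size(0). */
size_t flat_index(size_t column, size_t row, size_t width)
{
    assert(column < width);
    return row * width + column;
}

/* Number of work-groups along one dimension, i.e. what get_num_groups(dim)
 * would report when global_size is an exact multiple of local_size. */
size_t num_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}
```

For instance, in an index space 8 work-items wide, the work-item at column 3, row 2 handles element 2 * 8 + 3 = 19, and 16 work-items split into work-groups of 4 give 4 work-groups along that dimension.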