GPGPU OpenCL Programming steps and simple examples

Source: Internet
Author: User

1. OpenCL concept

OpenCL is a framework for writing programs that run on heterogeneous platforms, which may be composed of CPUs, GPUs, or other types of processors. OpenCL consists of a language for writing kernels (functions that run on OpenCL devices), based on C99, and a set of APIs for defining and controlling the platform.

OpenCL provides two kinds of parallel mechanisms: task parallelism and data parallelism.
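To make the data-parallel model concrete, here is a minimal sketch in plain C that emulates it on the CPU. It is not OpenCL code: the names `vec_add_item` and `run_data_parallel` are hypothetical, and the loop only stands in for a runtime that launches one work-item per index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "kernel" body: the work done by ONE work-item.
 * In an OpenCL kernel the index would come from get_global_id(0). */
void vec_add_item(size_t gid, const float *a, const float *b, float *c)
{
    c[gid] = a[gid] + b[gid];
}

/* CPU stand-in for a data-parallel launch: the same function is applied
 * to every index. On a real device these iterations may run concurrently,
 * which is safe here because no iteration depends on another. */
void run_data_parallel(size_t n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < n; gid++)
        vec_add_item(gid, a, b, c);
}
```

Task parallelism, by contrast, would run different functions side by side rather than one function over many indices.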

2. The difference between OpenCL and CUDA

Differences: OpenCL is a general-purpose programming framework for heterogeneous platforms. Because it has to accommodate many different kinds of devices, it is more cumbersome to use.

CUDA is NVIDIA's framework for GPGPU programming. It is simpler to use and a good starting point for beginners.

Similarity: both are based on task parallelism and data parallelism.

3. OpenCL Programming Steps

(1) Discover and initialize the platforms

Call the clGetPlatformIDs function twice: the first call gets the number of available platforms, and the second call gets an available platform.

(2) Discover and initialize the devices

Call the clGetDeviceIDs function twice: the first call gets the number of available devices, and the second call gets an available device.

(3) Create a context (call the clCreateContext function)

A context can manage multiple devices.

(4) Create a command queue (call the clCreateCommandQueue function)

Each command queue corresponds to a single device.

Within the context, commands are submitted to a device's command queue, and the device executes the commands in that queue.

(5) Create device buffers (call the clCreateBuffer function)

Buffers hold the data objects, that is, the data that the device program needs during execution.

Buffers are created by the context, so the devices managed by that context share the data in them.

(6) Write host data to device buffers (call the clEnqueueWriteBuffer function)

(7) Create and compile the program

Create a program object from your program's source code (clCreateProgramWithSource) or binary data (clCreateProgramWithBinary), then compile it with clBuildProgram.

(8) Create the kernel (call the clCreateKernel function)

From the program object, create a kernel object, which represents the entry point of the device program.

(9) Set the kernel arguments (call the clSetKernelArg function)

(10) Configure the work-item structure (set the work size)

Configure the work-items: the number of dimensions, the global size, the work-group size, and so on.
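The arithmetic behind a one-dimensional work-size configuration can be sketched in plain C. This is only an illustration under stated assumptions: the helper names `round_up_global`, `num_groups`, and `global_id` are hypothetical, no global offset is used, and the global size is padded to a multiple of the work-group size, which an explicitly chosen work-group size requires in OpenCL 1.x.

```c
#include <assert.h>
#include <stddef.h>

/* Pad the number of items up to the next multiple of the work-group
 * size, so the index space divides evenly into work-groups. */
size_t round_up_global(size_t n_items, size_t local_size)
{
    assert(local_size > 0);
    return ((n_items + local_size - 1) / local_size) * local_size;
}

/* Number of work-groups in one dimension. */
size_t num_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Global ID of a work-item, given its group ID and its local ID
 * inside that group (assuming a global offset of zero). */
size_t global_id(size_t group_id, size_t local_size, size_t local_id)
{
    assert(local_id < local_size);
    return group_id * local_size + local_id;
}
```

For 1000 items and a work-group size of 64, the padded global size is 1024, split into 16 groups. Because the padded size can exceed the data size, a kernel launched this way typically guards its body with a bounds check on the global ID.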

(11) Enqueue the kernel for execution (call the clEnqueueNDRangeKernel function)

Put the kernel object, together with the work-item configuration, into the command queue for execution.

(12) Read the output buffer back to the host (call the clEnqueueReadBuffer function)

(13) Release the OpenCL resources (this concludes the entire run process)
