Start learning OpenCL ...
Because Old Wolf's video card is an AMD 5xx Redwood, we will first cover installing the AMD APP (Accelerated Parallel Processing) SDK.
Download: http://developer.amd.com/tools/hc/AMDAPPSDK/downloads/Pages/default.aspx
Installation notes: http://developer.amd.com/tools/hc/AMDAPPSDK/assets/AMD_APP_SDK_Installation_Notes.pdf
Use
1. OpenCL Architecture
OpenCL can implement parallel computing on hybrid devices, including CPUs, GPUs, and other processors such as Cell processors and DSPs. With OpenCL programming, you can write portable parallel acceleration code. [However, because hardware performance differs between OpenCL devices, specific
threads 0, 2, 4 ....
Predicate = false for threads 1, 3, 5 ....
The following is an example of control flow divergence.
In Case 1, all odd-numbered threads execute dosomework2() and all even-numbered threads execute dosomework(). As a result, every wave must issue the instructions for both the if branch and the else branch.
In Case 2, the first wave executes the if branch and the other waves execute the else branch. In this case, each wave issues only one of the two branches.
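The difference between the two cases can be sketched in plain C by counting how many branch bodies each wave has to issue. This is only an illustrative simulation, not GPU code: the wavefront width of 64, the two predicate functions, and the name issued_bodies are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed wavefront width for this sketch. */
enum { WAVE_SIZE = 64 };

/* Case 1: the predicate alternates between neighbouring threads. */
bool pred_case1(int tid) { return tid % 2 == 0; }

/* Case 2: the predicate is uniform inside a wave (true only for wave 0). */
bool pred_case2(int tid) { return tid / WAVE_SIZE == 0; }

/* Sum, over all waves, of the branch bodies (if and/or else) that must be
   issued. A wave issues a body when at least one of its threads takes it. */
int issued_bodies(int num_threads, bool (*pred)(int))
{
    assert(num_threads > 0 && num_threads % WAVE_SIZE == 0);
    int issued = 0;
    for (int base = 0; base < num_threads; base += WAVE_SIZE) {
        bool any_if = false;
        bool any_else = false;
        for (int lane = 0; lane < WAVE_SIZE; ++lane) {
            if (pred(base + lane))
                any_if = true;
            else
                any_else = true;
        }
        issued += (int)any_if + (int)any_else;
    }
    return issued;
}
```

With 256 threads (four waves), the Case 1 predicate makes every wave issue both bodies, while the Case 2 predicate makes every wave issue exactly one.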
With predication, the instruction execution time i
Setting up an OpenCL environment on Ubuntu with AMD graphics
1. Install the video driver
1) Download the driver from http://support.amd.com/zh-cn/download/linux and be sure to note the version.
2) Install fglrx-core_15.302-0ubuntu1_amd64_ub_14.01.deb first. If the dependencies libc6-i386, lib32gcc1, and dkms are missing, run:
sudo apt-get autoremove
sudo apt-get autoclean
sudo apt-get -f update
sudo apt-get -f install libc6-i386 l
GPU Architecture
The content includes:
1. Relationship between the OpenCL spec and multi-core hardware
AMD GPU architecture
NVIDIA GPU architecture
Cell Broadband Engine
2. Some special topics about OpenCL
OpenCL compilation system
Installable client driver
First of all, we may have doubts: since OpenCL is platform
The AMD OpenCL university course is a very good entry-level OpenCL tutorial. By reading the slides in the tutorial, we can quickly learn about OpenCL's mechanisms and programming methods: http://developer.amd.com/zones/OpenCLZone/universities/Pages/default.aspx
The English in the tutorial is very simple. I believe that
results on AMD and NV platforms are as follows:
AMD GPU = 5870, Stream SDK 2.2
NVIDIA GPU = GTX 480, CUDA 3.1
In addition, we also tried unrolling the loop in the program. Unrolling the inner loop reduces the number of branch instructions the GPU executes. In my test, unrolling by a factor of four made the FPS 30% higher than before unrolling. (AMD 5670
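As a rough illustration of what unrolling by four means, here is a plain-C sketch on a simple summation loop. The function names and the summation workload are assumptions chosen for this example; the loop in the original program is not shown in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: one loop-condition branch per element. */
float sum_rolled(const float *v, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += v[i];
    return acc;
}

/* Unrolled by four: one loop-condition branch per four elements,
   followed by a short remainder loop for the leftover elements. */
float sum_unrolled4(const float *v, size_t n)
{
    assert(v != NULL || n == 0);
    float acc = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc += v[i];
        acc += v[i + 1];
        acc += v[i + 2];
        acc += v[i + 3];
    }
    for (; i < n; ++i)
        acc += v[i];
    return acc;
}
```

Both versions add the elements in the same order, so they return the same value; only the number of loop-control branches changes. How much this helps on a given GPU depends on the kernel and the compiler.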
Since Apple officially submitted OpenCL to the Khronos Group, an open standards organization, it has received support from major companies such as AMD, NVIDIA, and Intel. OpenCL can make full use of the GPU's capacity for data-intensive, large-scale computation, so many multimedia applications and even scientific computing workloads can gain a large performance improvement.
Here we
Kernel object:
A kernel is a function in the program code that can be executed on an OpenCL device. A kernel object is the kernel function together with its associated input arguments.
A kernel object is created from a program object and a specified function name. Note: the named function must exist in the program's source code.
Compiling at runtime:
Compiling programs and creating kernel objects at runtime has a time overhead, but this is flexible an
heterogeneous platforms. The heterogeneous platforms supported by OpenCL can be composed of multi-core CPUs, GPUs, or other types of processors. OpenCL consists of two parts: the first is the language used to write kernels (code that runs on OpenCL devices); the second is the API used to define and control the platform. OpenCL provides
three buffers in the host memory
// Requires <stdlib.h> and <time.h>
float *buf1 = 0;
float *buf2 = 0;
float *buf = 0;
buf1 = (float *)malloc(BUFSIZE * sizeof(float));
buf2 = (float *)malloc(BUFSIZE * sizeof(float));
buf = (float *)malloc(BUFSIZE * sizeof(float));
// Initialize buf1 and buf2 with some random values
int i;
srand((unsigned)time(NULL));
for (i = 0; i < BUFSIZE; i++)
    buf1[i] = rand() % 65535;
srand((unsigned)time(NULL) + 1000);
for (i = 0; i < BUFSIZE; i++)
    buf2[i] = rand() % 65535;
// Compute the sum of buf1 and buf2 on the CPU
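The snippet is cut off before the CPU computation. Assuming the comment refers to an element-wise sum used as a reference result, a minimal sketch could look like the following. The helper names cpu_add and buffers_match are hypothetical and do not come from the original article.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element-wise out[i] = a[i] + b[i] on the CPU. */
void cpu_add(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Hypothetical helper: returns 1 if every pair of elements differs by at
   most tol, otherwise 0. Useful for comparing a device result with the
   CPU reference. */
int buffers_match(const float *x, const float *y, size_t n, float tol)
{
    for (size_t i = 0; i < n; ++i) {
        float d = x[i] - y[i];
        if (d < 0.0f)
            d = -d;
        if (d > tol)
            return 0;
    }
    return 1;
}
```

With the buffers from the snippet, the call would be cpu_add(buf1, buf2, buf, BUFSIZE), and buffers_match could then compare buf against a result read back from the device.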
, there are two OpenCL platforms on my system: Intel and AMD. Using the code above may cause errors, because it picks up Intel's OpenCL platform, which only supports the CPU, while our subsequent operations target the GPU. We can use the following code to obtain AMD's OpenCL platform instead.
cl_uint numPlatforms;
);
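The original code is truncated here. The vendor-matching step it describes can be sketched in plain C as below. This is not the author's code: find_platform_by_vendor is a hypothetical helper, and the strings in vendors stand in for what clGetPlatformInfo would return for CL_PLATFORM_VENDOR on each platform reported by clGetPlatformIDs. The AMD vendor string is assumed to contain "Advanced Micro Devices".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. vendors[i] stands in for the CL_PLATFORM_VENDOR
   string of platform i. Returns the index of the first platform whose
   vendor string contains `wanted`, or -1 if no platform matches. */
int find_platform_by_vendor(const char *vendors[], size_t count,
                            const char *wanted)
{
    assert(wanted != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (vendors[i] != NULL && strstr(vendors[i], wanted) != NULL)
            return (int)i;
    }
    return -1;
}
```

The returned index would then select the matching cl_platform_id from the array filled by clGetPlatformIDs. A result of -1 means the requested vendor is not installed and the program should fall back or report an error.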
Note: If more than one OpenCL platform is installed on the system (on my machine there are two, Intel and AMD), this line of code may cause errors, because it picks up Intel's OpenCL platform, which only supports CPUs, while our later operations target the GPU. We can use the following co
OpenCL programming guide: interoperability with Direct3D
This article describes the interoperability between OpenCL and D3D 10.
1. Initialize the OpenCL context for Direct3D interoperability
OpenCL sharing is enabled by the pragma cl_khr_d3d10_sharing:
institution, middleware provider, and so forth. Apple initially proposed the standard, and the Khronos Group then established a working group to coordinate these companies in maintaining this common computing language. The Khronos Group may sound familiar: OpenGL, the well-known hardware and software interface API specification in the field of image rendering, is also maintained by this organization. In fact, it also maintains many specifications in the multimedia domain, and may also be similar to
clearer.
Figure 1: CPU-GPU heterogeneous computing advocated by the OpenCL standard
We know that CPUs and GPUs each have their strengths. The CPU is generally adept at handling irregular data structures and unpredictable access patterns, as well as recursive algorithms, branch-intensive code, and single-threaded programs. Such tasks involve complex instruction scheduling, looping, branching, logical judgments, and execution steps. For example, system so
, software developers, academic institutions, middleware providers, and other companies. It was initially proposed by Apple, and the Khronos Group then established a working group to coordinate these companies in jointly maintaining this general-purpose computing language. The Khronos Group may sound familiar: OpenGL, a well-known software and hardware interface API specification in the image rendering field, is also maintained by this organization. In fact, it maintains many specifications in the multim
Reposting is welcome; please credit the source:
http://blog.csdn.net/leonwei/article/details/8893796
1 Hello OpenCL
Write a simple example program to demonstrate the basic usage of OpenCL:
1. You can download an OpenCL SDK from the developer website of NVIDIA, AMD, Intel, or all opencl mem
Recently, Khronos announced the first test version of OpenCL (Open Computing Language), and its release made waves in the general-purpose computing field! OpenCL is an open, free standard for general-purpose parallel programming of heterogeneous systems, initiated by Apple and jointly developed by many well-known vendors in the industry. It is also a unified programming environment. It facilita
industry-standard OpenGL and OpenAL, which are used for three-dimensional graphics and computer audio, respectively. OpenCL extends the use of the GPU beyond graphics generation. OpenCL is managed by the Khronos Group, a nonprofit technology organization. In addition, OpenCL is a cross-platform standard and not a cross-platform t