GPU: a parallel computing tool

Source: Internet
Author: User

1 What is a GPU

In Figure 1, the PC differs from an ordinary PC in that it is fitted with seven graphics cards; the lower left corner shows a graphics card, and the middle shows the GPU chip. The processor on a graphics card is called the graphics processing unit (GPU). It is the "heart" of the graphics card and is similar to the CPU, except that the GPU is designed specifically to perform complex mathematical and geometric calculations.

GPU computing power is very strong. For example, the floating-point performance of a current mainstream i7 processor is about 1/12 that of a mainstream NVIDIA GPU.

Figure 1 Graphics card and GPU

2 Why is GPU computing power so powerful?

Figure 2 compares the logical architecture of the CPU and the GPU. Control is the controller, ALU is the arithmetic logic unit, Cache is the on-chip cache, and DRAM is memory. You can see that GPU designers spend most of the transistors on execution units, whereas a CPU spends them on complex control units and cache. In fact, only about 5% of the CPU chip area is ALU, while about 40% of the GPU chip area is ALU. This is why GPU computing power is so strong.

Figure 2 Comparison of the logical structure of CPU and GPU hardware

So why is the CPU not designed like the GPU, so that it has the same computing power?

The reason is that the CPU has to be very general-purpose. It must support both parallel and serial operations, handle a wide variety of data types, and cope with complex, general logic, which introduces a large number of branch jumps and interrupt handling. All of this makes the internal structure of the CPU very complicated and reduces the share of the chip given to computational units. The GPU, by contrast, faces large volumes of highly uniform, mutually independent data and a pure computing environment that does not need to be interrupted. So GPU chips are much simpler than CPU chips.

An analogy: suppose there is a pile of identical subtraction exercises to get through. That work can be handed to a group of (dozens of) elementary school pupils, who play the role of the GPU's computational units. Tasks that demand more complex logical reasoning, such as deriving formulas or writing scientific articles, are clearly unsuitable for pupils and better suited to a university professor, who plays the role of the CPU's computing unit. The professor can of course do subtraction too, perhaps as fast as a pupil or even faster, but at an obviously much higher cost.

3 GPU Programming Library

Because GPU computing power is so strong, it is widely used: mining (Bitcoin), graphics and image processing, numerical simulation, machine learning training, and so on. How do we put that computing power to work? By programming.

How do we program a GPU? GPUs are made by several vendors, such as NVIDIA, AMD, and Intel. The most popular are NVIDIA GPUs, for which NVIDIA released the CUDA parallel programming library. If every vendor ships its own programming library, however, the learning cost clearly goes up, so Apple proposed the OpenCL standard (now maintained by the Khronos Group). The idea is that every vendor supports the same standard, so a single set of OpenCL programming libraries can be used across different GPU chips. CUDA and OpenCL are the two most mainstream GPU programming libraries today.

From the programming language point of view, CUDA and OpenCL natively support C/C++. Access from other languages takes more effort; Java, for example, has to go through JNI to reach CUDA or OpenCL. Several Java GPU programming libraries built on JNI now exist, such as JCuda. Another approach is to keep writing the code in Java and use a tool that translates the Java into C.

Figure 3 GPU Programming Library

LWJGL (http://www.lwjgl.org/)
JOCL (http://www.jocl.org/)
JCuda (http://www.jcuda.de/)
Aparapi (http://code.google.com/p/aparapi/)
JavaCL (http://code.google.com/p/javacl/)

4 CUDA program flow

Figure 4 CUDA program flow
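As a rough illustration, the host side of a typical CUDA program goes through five steps: allocate device memory, copy the input to the device, launch the kernel, copy the result back, and free the device memory. The sketch below follows that common pattern and may differ in detail from the figure. The kernel `doubleElements`, the helper `doubleOnGpu`, the `CUDA_CHECK` macro, and the block size of 256 are hypothetical names and choices made for this example only.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails (hypothetical helper macro).
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Kernel: each thread doubles one array element (placeholder workload).
__global__ void doubleElements(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partly filled block
        data[i] *= 2.0f;
    }
}

// Walks through the five host-side steps and returns the processed copy.
std::vector<float> doubleOnGpu(const std::vector<float> &input) {
    int n = static_cast<int>(input.size());
    std::vector<float> output(input.size());
    if (n == 0) return output;

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *d_data = nullptr;

    // 1. Allocate device (GPU) memory.
    CUDA_CHECK(cudaMalloc(&d_data, bytes));
    // 2. Copy the input from host memory to device memory.
    CUDA_CHECK(cudaMemcpy(d_data, input.data(), bytes, cudaMemcpyHostToDevice));
    // 3. Launch the kernel with enough blocks to cover all n elements.
    int threads = 256;  // assumed block size
    int blocks = (n + threads - 1) / threads;
    doubleElements<<<blocks, threads>>>(d_data, n);
    CUDA_CHECK(cudaGetLastError());
    // 4. Copy the result back (this copy waits for the kernel to finish).
    CUDA_CHECK(cudaMemcpy(output.data(), d_data, bytes, cudaMemcpyDeviceToHost));
    // 5. Release the device memory.
    CUDA_CHECK(cudaFree(d_data));
    return output;
}

int main() {
    std::vector<float> in = {1.0f, 2.0f, 3.0f};
    std::vector<float> out = doubleOnGpu(in);
    for (float v : out) std::printf("%.1f\n", v);
    return 0;
}
```

Assuming the file is saved as `flow.cu` (a hypothetical name), it could be built with `nvcc flow.cu -o flow` on a machine with an NVIDIA GPU and the CUDA toolkit installed.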

5 Example: image transformation

Suppose we have an image processing task: add 1 to every pixel value. Parallelizing it is simple: assign one GPU thread to each pixel, and let each thread perform the add-1 operation on its own pixel.

Figure 5 Example

Figure 6 Kernel functions

Figure 7 Main Flow function
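A minimal sketch of what the kernel and main flow for this example could look like is given below; it is not the code from the figures. It assumes an 8-bit grayscale image stored row by row, a 16x16 thread block, and saturation at 255 so that a bright pixel does not wrap around to 0. The names `addOne` and `addOneOnGpu` and those choices are assumptions made for this illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: one thread per pixel. Each thread finds its own (x, y) position
// from its block and thread indices and adds 1 to that pixel.
__global__ void addOne(unsigned char *img, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // row
    if (x < width && y < height) {                  // skip threads outside the image
        int idx = y * width + x;                    // row-major layout (assumption)
        if (img[idx] < 255) {                       // saturate at 255 (assumption)
            img[idx] += 1;
        }
    }
}

// Host wrapper: processes the image in place, returns false on any CUDA failure.
bool addOneOnGpu(std::vector<unsigned char> &img, int width, int height) {
    if (width <= 0 || height <= 0) return false;
    size_t bytes = static_cast<size_t>(width) * static_cast<size_t>(height);
    if (img.size() != bytes) return false;

    unsigned char *d_img = nullptr;
    if (cudaMalloc(&d_img, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(d_img, img.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        dim3 block(16, 16);  // 256 threads per block (assumed size)
        dim3 grid((width + block.x - 1) / block.x,
                  (height + block.y - 1) / block.y);
        addOne<<<grid, block>>>(d_img, width, height);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(img.data(), d_img, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_img);
    return ok;
}

int main() {
    const int width = 4, height = 3;
    std::vector<unsigned char> img(width * height, 10);  // tiny test image
    img[0] = 255;                                        // exercises the saturation branch
    if (!addOneOnGpu(img, width, height)) {
        std::fprintf(stderr, "GPU processing failed\n");
        return 1;
    }
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) std::printf("%4d", img[y * width + x]);
        std::printf("\n");
    }
    return 0;
}
```

The two-dimensional grid is a design choice that keeps the thread indices aligned with image coordinates; a one-dimensional grid over `width * height` elements would work equally well for this operation.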

6 GPU Acceleration Effects

Figure 8 shows the effect of GPU acceleration on my P&D DEM image preprocessing algorithm. The GeForce GT 330 is an ordinary desktop graphics card that now costs about 500 yuan, and it delivers a 20x speedup. The Tesla M2075 is a more professional card that costs about 10,000, and it delivers a speedup of nearly 100x: the program takes 2 hours to run as a single process with a single thread on an i7 CPU, while on this GPU it finishes in just over a minute.

Figure 8 Acceleration effect of the P&D DEM image preprocessing algorithm
