1. History and Status Quo of GPU Programming Technology

Source: Internet
Author: User

Preface

This article traces the development of GPU programming technology, giving you a preliminary understanding of GPU programming and an entry point into the field.

The von Neumann architecture bottleneck

Almost all processors in use today are based on the von Neumann computer architecture.

In simple terms, this architecture means that the processor continually fetches instructions from memory, decodes them, and executes them.
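The fetch-decode-execute cycle can be sketched as a toy machine in which program and data share one memory, so every step of the loop touches memory. The instruction set here is entirely hypothetical and exists only for illustration:

```python
# A toy von Neumann machine: program and data live in one shared memory,
# and the processor repeatedly fetches, decodes, and executes.
def run(memory):
    acc, pc = 0, 0                     # accumulator and program counter
    while True:
        opcode, operand = memory[pc]   # fetch: every cycle reads memory
        pc += 1
        if opcode == "LOAD":           # decode + execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory

# Addresses 0-3 hold instructions; addresses 10-12 hold data.
mem = {0: ("LOAD", 10), 1: ("ADD", 11), 2: ("STORE", 12), 3: ("HALT", 0),
       10: 2, 11: 3, 12: 0}
print(run(mem)[12])  # 5
```

Note how even this tiny program makes a memory access on every single step, which is exactly why memory speed becomes the limiting factor.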

But this architecture has now hit a bottleneck: memory read/write speed cannot keep up with the CPU clock frequency. Systems with this characteristic are called memory-bound systems, and most current computer systems fall into this category.

To address this problem, the traditional remedy is caching: placing a multi-level cache between the CPU and main memory greatly reduces the pressure on the storage system:

[Figure: a multi-level CPU cache between the processor and main memory]

However, as cache capacity grows, the marginal benefit of a larger cache falls off rapidly, which means we may need to look for another way forward.

Several developments that inspired GPU programming technology

1. In the late 1970s, the Cray series of supercomputers was developed successfully (the Cray-1 cost about US $8 million at the time).

These machines used a shared-memory design: several banks of memory could be connected to multiple processors, a structure that evolved into today's symmetric multiprocessor (SMP) systems. The Cray-2 was a vector machine, in which a single instruction processes multiple operands, and the vector processor is also the core of today's GPU devices.

2. In the early 1980s, Thinking Machines Corporation designed and built a computer system called the Connection Machine.

The system had 16 CPU cores and adopted standard single-instruction, multiple-data (SIMD) parallelism. This design eliminates redundant instruction fetches from memory: one instruction drives 16 data elements, cutting the instruction read cycle to 1/16.
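The "one instruction, many data" idea can be sketched in plain Python by counting instruction fetches. The 16-lane width mirrors the Connection Machine example above and is purely illustrative:

```python
# Scalar (SISD) style: one "ADD" instruction is fetched per data element.
def scalar_add(a, b):
    fetches = 0
    out = []
    for x, y in zip(a, b):
        fetches += 1          # one instruction fetch per element
        out.append(x + y)
    return out, fetches

# SIMD style: one "ADD" instruction is fetched per 16-wide lane group,
# so the fetch/decode cost is paid once for every 16 elements.
def simd_add(a, b, lanes=16):
    fetches = 0
    out = []
    for i in range(0, len(a), lanes):
        fetches += 1          # one fetch drives all 16 lanes at once
        out.extend(x + y for x, y in zip(a[i:i+lanes], b[i:i+lanes]))
    return out, fetches

a, b = list(range(64)), list(range(64))
r1, f1 = scalar_add(a, b)
r2, f2 = simd_add(a, b)
print(f1, f2)  # 64 4: same result, 1/16 the instruction fetches
```

The results are identical; only the number of instruction fetches changes, which is the saving the Connection Machine design exploited.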

3. The invention of the Cell processor

This processor is very interesting, and its architecture is roughly as follows:

[Figure: Cell architecture, with one PPC control core connected to multiple SPE stream processors]

In this structure, a PPC core acts as the supervisory processor and is connected to a number of SPE stream processors, forming a processing pipeline.

In a graphics-processing flow, one SPE can fetch the data, another performs the conversion, and another stores the result. In this way a complete pipeline is formed, greatly improving processing speed.
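This three-stage flow (fetch, transform, store) can be sketched with Python generators, where each stage stands in for one SPE's job. The doubling "conversion" step is a hypothetical placeholder for real SPE work:

```python
# Three pipeline stages, each standing in for one SPE.
def fetch(items):              # SPE 1: pull raw data in
    for item in items:
        yield item

def transform(stream):         # SPE 2: convert each element
    for item in stream:
        yield item * 2         # placeholder "conversion" step

def store(stream, sink):       # SPE 3: write results out
    for item in stream:
        sink.append(item)

# Chaining the stages: because generators are lazy, elements move
# through the stages one at a time, like items on an assembly line.
sink = []
store(transform(fetch([1, 2, 3, 4])), sink)
print(sink)  # [2, 4, 6, 8]
```

On real hardware the three stages run concurrently on separate SPEs, so while one element is being stored, the next is already being transformed and a third is being fetched.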

Incidentally, in 2010 the world's third-fastest supercomputer was built on this design concept, covering 560 square meters and costing about US $125 million.

Multi-Point Computing Model

Cluster computing means composing multiple computers of ordinary performance into a single computing network to achieve high-performance computing. It is a typical multi-point computing model.

The GPU is, in essence, also a multi-point computing model. Compared with cluster computing, the "point" changes from a single computer to a single SM (streaming multiprocessor), and the network interconnect is replaced by communication through video memory (communication between points is always an important design issue in any multi-point computing model).

GPU Solution

As the CPU "power wall" problem emerged, the GPU solution formally took the stage.

GPUs are especially suited to data-parallel floating-point computation. The figure below shows the gap between GPU and CPU computing throughput on such workloads:

[Figure: GPU vs. CPU peak floating-point throughput over time]

However, this does not mean that GPUs are simply better than CPUs, or that the CPU should be retired. The comparison above was measured on workloads that can be fully parallelized.

For more flexible and complex serial logic, GPU execution is far less efficient than the CPU (GPUs lack advanced mechanisms such as branch prediction).
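The contrast can be sketched as follows (the function names are illustrative, and the sketch itself runs on the CPU): in the first loop every output element depends only on its own inputs, so on a GPU each element could be computed by its own thread; in the second, each step depends on the previous result, so this naive form cannot be spread across threads.

```python
# Fully parallel: each output depends only on its own inputs,
# so a GPU could assign one thread per element.
def saxpy(a, x, y):
    return [a * xi + yi for xi, yi in zip(x, y)]   # parallelizable

# Inherently serial as written: each step needs the previous result,
# forming a dependency chain that forces sequential execution.
def running_sum(x):
    out, acc = [], 0.0
    for xi in x:
        acc += xi
        out.append(acc)
    return out

print(saxpy(2.0, [1.0, 2.0], [10.0, 20.0]))  # [12.0, 24.0]
print(running_sum([1.0, 2.0, 3.0]))          # [1.0, 3.0, 6.0]
```

Workloads shaped like the first function are where the GPU's throughput advantage shows; workloads shaped like the second stay faster on a CPU.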

In addition, GPU applications are no longer limited to graphics. In fact, NVIDIA's current high-end Tesla boards are built specifically for scientific computing; they do not even have VGA outputs.

Several recent graphics cards and their configurations (NVIDIA cards only, as covered in this column)

1. The specific meanings of these parameters will be analyzed in detail in later articles.

2. The parameters of your own video card can be obtained through a debugging tool (method omitted here).

[Table: specifications of several recent NVIDIA graphics cards]

Mainstream GPU programming interfaces

1. CUDA

CUDA is the GPU programming interface NVIDIA launched specifically for its own cards. Its documentation is thorough, and it applies to almost all NVIDIA cards.

The GPU programming technology described in this column is based on this interface.

2. OpenCL

An open GPU programming interface with the widest applicability: it works on almost all graphics cards.

However, it is harder to master than CUDA. It is recommended to learn CUDA first; on that basis, OpenCL is easy to pick up.

3. DirectCompute

A GPU programming interface developed by Microsoft. It is powerful and easy to learn, but it only runs on Windows, so it is unavailable on the many high-end servers running UNIX systems.

In summary, each interface has its own strengths and weaknesses and should be chosen according to the situation at hand. Their usage is very similar, however, so after mastering one it is easy to learn the other two.

The significance of learning GPU programming

1. It teaches not only how to solve problems with a GPU, but also deepens our understanding of parallel programming, laying a foundation for a comprehensive grasp of other parallel technologies later on.

2. Research and development in parallel computing will inevitably become a hot spot in the IT field.

    
