DirectX11 tutorial (14): D3D11 pipeline (2)


Next we will learn some GPU memory basics. The main reference is: http://fgiesen.wordpress.com/2011/07/02/a-trip-through-the-graphics-pipeline-2011-part-2

Currently, the video memory commonly used on GPUs is GDDR5. Compared with the DDR3 memory commonly used on the host, it features higher bandwidth but also higher latency. The following is a comparison between the system memory of a Core i7 2600 and the video memory of a GTX 480:

            Core i7 2600     GTX 480
Bandwidth   19 GB/s          180 GB/s
Latency     140 clocks       400-800 clocks

DRAM chips are usually organized as a two-dimensional grid. Each cell at an intersection consists of a transistor and a capacitor, and each cell stores one bit. For example, 1 GB of GDDR5 memory is organized into 32 blocks:

Each 32 MB block consists of four bank groups, and each bank group consists of 16 (or 32) banks.

Each bank is a two-dimensional grid of DRAM cells:

The row address is A0-A11, for a total of 4K rows; the column address is A0-A5, for a total of 64 columns; so a bank contains 256K cells.
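As a quick sanity check on these numbers, here is a minimal sketch that derives the cell count per bank from the row and column address widths (the constants are just the illustrative values above, not a description of any particular chip):

    #include <cstdio>

    int main()
    {
        // Row address A0-A11 -> 12 bits, column address A0-A5 -> 6 bits
        // (values taken from the description above; illustrative only).
        const unsigned rowBits = 12;
        const unsigned colBits = 6;

        const unsigned rows         = 1u << rowBits;     // 4096 (4K rows)
        const unsigned columns      = 1u << colBits;     // 64 columns
        const unsigned cellsPerBank = rows * columns;    // 262144 (256K cells)

        printf("rows = %u, columns = %u, cells per bank = %uK\n",
               rows, columns, cellsPerBank / 1024);
        return 0;
    }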

Generally, DRAM reads and writes data by row. To improve read/write efficiency, it is best to read a whole row of data at a time. The page size of GDDR5 is usually 2 KB.

[Note: there are some mistakes in my understanding of DRAM here; I am still learning about DRAM. For details, see another post. old wolf, 2012-11-13: http://www.cnblogs.com/mikewolf2002/archive/2012/11/13/2768804.html]

Next, let's take a look at how the video memory is connected to the GPU and the host, to understand how video memory works:

Some fast clients in the GPU, such as the depth block, color block, and texture block, are connected directly to the MC (memory controller), while blocks with less data traffic, such as the command processor (CP), must go through the hub first and then reach the corresponding MC.

In the hub there may be a VM L2, which performs some page-table lookups before the requests are routed to the corresponding MC. The MC mainly consists of the client interface, VM L1, ARB, and other modules. The client interface deals with the different clients and passes their requests to VM L1 for a page-table lookup; the requests then go to ARB for arbitration before entering the corresponding GDDR channel.

A GPU memory controller channel is usually 32 bits wide, while a DDR3 memory controller channel is usually 64 bits wide. We can use the following formula to calculate GPU memory bandwidth: mclk * data rate * channel width * number of channels / 8 / 1000, which for GDDR5 (data rate 4, 32-bit channels) simplifies to mclk * 4 * 32 * number of channels / 8 / 1000, giving GB/s when mclk is in MHz. If the video card has 12 memory channels, the memory bandwidth is 1375 * 4 * 12 * 32 / 8 / 1000 = 264 GB/s.
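To make that arithmetic concrete, here is a minimal sketch of the bandwidth calculation in C++; the clock and channel count are just the example values from the text, not a query of real hardware:

    #include <cstdio>

    // Effective memory bandwidth in GB/s:
    //   mclk (MHz) * data rate * channel width (bits) * channels / 8 / 1000
    double MemoryBandwidthGBps(double mclkMHz, double dataRate,
                               double channelWidthBits, double channelCount)
    {
        return mclkMHz * dataRate * channelWidthBits * channelCount / 8.0 / 1000.0;
    }

    int main()
    {
        // Example values from the text: GDDR5 (data rate 4), 32-bit channels,
        // 12 channels, 1375 MHz memory clock.
        double bw = MemoryBandwidthGBps(1375.0, 4.0, 32.0, 12.0);
        printf("Memory bandwidth: %.0f GB/s\n", bw);   // prints 264 GB/s
        return 0;
    }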

Other PCIe devices and the host are connected through the PCIe bus to the MMU (Memory Management Unit) and then to the hub. Here MMU is a general term; in different implementations the MMU may include many blocks.

The GPU interacts with the host and other devices through the PCIe bus. The GPU and the host usually use PCIe 2.0 x16 (the latest video cards use PCIe 3.0), which reaches 8 GB/s in each direction; other, slower devices, such as the display, may require only 4 lanes.
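As a rough cross-check of that 8 GB/s figure, here is a minimal sketch of the per-direction PCIe 2.0 bandwidth calculation, assuming 5 GT/s per lane with 8b/10b line encoding (the lane counts are just the examples from the text):

    #include <cstdio>

    // Per-direction PCIe 2.0 bandwidth in GB/s:
    // 5 GT/s per lane, 8b/10b encoding (8 payload bits per 10 bits on the wire).
    double Pcie2BandwidthGBps(int lanes)
    {
        const double gtPerSecond = 5.0;          // GT/s per lane (PCIe 2.0)
        const double encoding    = 8.0 / 10.0;   // 8b/10b line encoding
        return lanes * gtPerSecond * encoding / 8.0;   // bits -> bytes
    }

    int main()
    {
        printf("x16: %.1f GB/s per direction\n", Pcie2BandwidthGBps(16)); // 8.0
        printf("x4 : %.1f GB/s per direction\n", Pcie2BandwidthGBps(4));  // 2.0
        return 0;
    }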

For details about PCIe, see: http://www.cnblogs.com/mikewolf2002/archive/2012/03/20/2408389.html
