Memory Management Mechanisms in CUDA Programming

Source: Internet
Author: User

Main categories and features of GPU device-side memory:

Size:

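Global and texture memory: limited by the amount of device memory (DRAM) on the card.

Local memory: up to 16 KB per thread.

Shared memory: up to 16 KB per multiprocessor (SM).

Constant memory: 64 KB in total.

Registers: each SM has 8192 or 16384 32-bit registers, depending on the device.

These figures are for early devices (compute capability 1.x); later architectures raise most of them.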

Speed:

From slowest to fastest: global, local, texture < constant < shared, register

Data Alignment:

The device can read a 4-byte, 8-byte, or 16-byte word from global memory into registers in a single operation, provided the word is naturally aligned, that is, its address is a multiple of its size. Reading an 8-byte or 16-byte word that is not aligned this way may return incorrect results.
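The alignment requirement can be illustrated with a minimal host-side sketch in plain C++ (it also compiles inside a .cu file). The names `Vec4`, `Packed3`, and `is_naturally_aligned` are illustrative and not part of the CUDA API. In device code, the built-in vector types such as `float4` or the `__align__(16)` specifier play the role that `alignas(16)` plays here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 16 bytes of data forced to 16-byte alignment: every element of an
// array of Vec4 starts at a multiple of 16, so it can be fetched as
// one 16-byte word.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// 12 bytes with only float alignment: elements of an array of Packed3
// start at offsets 0, 12, 24, ..., so they cannot be fetched as one
// aligned 16-byte word.
struct Packed3 {
    float x, y, z;
};

// Hypothetical helper: true if p is aligned to `bytes` (a power of two).
inline bool is_naturally_aligned(const void* p, std::size_t bytes) {
    assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (bytes - 1)) == 0;
}
```

Checking `is_naturally_aligned(&v[i], 16)` for each element of a `Vec4 v[N]` array holds for every `i`, because both the size and the alignment of `Vec4` are 16 bytes.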

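How to use coalesced access to improve access efficiency:

1. Use a Structure of Arrays (SoA) instead of an Array of Structures (AoS), so that neighboring threads read neighboring addresses.

2. Use shared memory to stage the data, so that the global memory accesses themselves are coalesced.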
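The difference between the two layouts in point 1 can be sketched on the host by computing which byte offsets consecutive threads would touch. This is only a model of the address arithmetic, not a kernel; the types `PointAoS` and `PointsSoA`, the constant `kCount`, and the helper functions are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kCount = 256;  // illustrative number of points

// Array of Structures (AoS): x, y, z of one point sit next to each other.
struct PointAoS {
    float x, y, z;
};

// Structure of Arrays (SoA): all x values are contiguous, then all y, then all z.
struct PointsSoA {
    float x[kCount];
    float y[kCount];
    float z[kCount];
};

// Byte offset of the x value that thread i reads under each layout.
inline std::size_t aos_x_offset(std::size_t i) {
    return i * sizeof(PointAoS) + offsetof(PointAoS, x);
}
inline std::size_t soa_x_offset(std::size_t i) {
    return offsetof(PointsSoA, x) + i * sizeof(float);
}

// Bytes spanned by `threads` consecutive threads, each reading one x value.
inline std::size_t aos_span(std::size_t threads) {
    assert(threads > 0 && threads <= kCount);
    return aos_x_offset(threads - 1) + sizeof(float) - aos_x_offset(0);
}
inline std::size_t soa_span(std::size_t threads) {
    assert(threads > 0 && threads <= kCount);
    return soa_x_offset(threads - 1) + sizeof(float) - soa_x_offset(0);
}
```

Assuming a 4-byte `float`, 16 threads reading `x` span 184 bytes in the AoS layout but only 64 contiguous bytes in the SoA layout, which is why SoA is the layout that coalesces.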

 

Memory padding:

Common access pattern: two-dimensional arrays

When a thread with index (tx, ty) accesses a two-dimensional array with width N and base address BaseAddress, it uses the address BaseAddress + N * ty + tx, measured in elements. Access is coalesced when the following holds:

blockDim.x = 16 * k and N = 16 * m, that is, both the block width and the array width are multiples of 16 (the half-warp size on these devices).
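The address formula and the multiple-of-16 condition can be checked with a small host-side model. The function names and the constant `kHalfWarp` are assumptions made for this sketch, and the check only looks at where each row starts, which is the part of the condition that depends on N.

```cpp
#include <cassert>
#include <cstddef>

// Threads served together on the devices described here (an assumption
// taken from the multiple-of-16 rule in the text).
constexpr std::size_t kHalfWarp = 16;

// Element offset from the base address read by thread (tx, ty)
// in a row-major array that is n elements wide.
constexpr std::size_t element_offset(std::size_t n, std::size_t tx, std::size_t ty) {
    return n * ty + tx;
}

// True if each of the first `rows` rows starts on a 16-element boundary,
// so that threads tx = 0..15 of any row fall into one aligned segment.
inline bool all_rows_aligned(std::size_t n, std::size_t rows) {
    assert(n > 0);
    for (std::size_t ty = 0; ty < rows; ++ty) {
        if (element_offset(n, 0, ty) % kHalfWarp != 0) return false;
    }
    return true;
}
```

With N = 32 every row starts on a boundary. With N = 20, row 1 starts at element 20, which is 4 elements past a boundary, so the accesses for that row are misaligned.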

We can control blockDim.x, but the array width is not always a multiple of 16. Padding solves this: allocate an array whose width is a multiple of 16 and fill the unused part of each row with 0. This introduces a concept: the leading dimension of an array, also called the pitch (lda for short). Because C/C++ is row-major, the leading dimension is the row width, that is, the number of elements in a row, padding included. CUDA provides a corresponding API, cudaMallocPitch(), to allocate 2D arrays this way, and cudaMalloc3D() does the same for 3D arrays.
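A minimal host-side sketch of padding, assuming a leading dimension rounded up to a multiple of 16 elements as described above. `padded_width` and `PaddedMatrix` are hypothetical names for this illustration. Note that cudaMallocPitch() chooses the pitch itself and returns it in bytes, so real device code should use the returned value rather than compute its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round `width` up to the next multiple of `multiple` (16 in the text).
inline std::size_t padded_width(std::size_t width, std::size_t multiple = 16) {
    assert(multiple > 0);
    return (width + multiple - 1) / multiple * multiple;
}

// A row-major 2D array stored with a leading dimension lda >= cols.
struct PaddedMatrix {
    std::size_t rows, cols, lda;
    std::vector<float> data;  // rows * lda elements, padding set to 0

    PaddedMatrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), lda(padded_width(c)), data(r * lda, 0.0f) {}

    // Element (row, col) lives at row * lda + col, not row * cols + col.
    float& at(std::size_t row, std::size_t col) {
        assert(row < rows && col < cols);
        return data[row * lda + col];
    }
};
```

For a 3 x 20 array this gives lda = 32 and 96 stored elements, and every row starts at a multiple of 16 elements from the base.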
