Memory Management Mechanism in CUDA Programming

Last Update: 2018-12-03 Source: Internet

Author: User

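Main categories and characteristics of GPU device-side memory:

Size:

Global and texture memory: limited by the amount of RAM on the device.

Local memory: up to 16 KB per thread.

Shared memory: up to 16 KB per block.

Constant memory: 64 KB in total.

Registers: each SM has 8192 or 16384 32-bit registers.

These figures apply to early CUDA devices (compute capability 1.x); newer devices generally have larger limits.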
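Because these limits depend on the hardware generation, it can help to query them at run time rather than hard-code them. The following is a minimal sketch using the CUDA runtime call cudaGetDeviceProperties(); the helper name print_memory_limits is hypothetical and is not from the original article.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints the memory limits that the runtime reports
// for device `dev`. Returns false if the query fails (for example, when
// no CUDA-capable GPU is present).
bool print_memory_limits(int dev) {
    assert(dev >= 0);  // device ordinals are non-negative
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
        return false;
    }
    std::printf("Device %d: %s\n", dev, prop.name);
    std::printf("  global memory:           %zu bytes\n", prop.totalGlobalMem);
    std::printf("  shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
    std::printf("  constant memory:         %zu bytes\n", prop.totalConstMem);
    std::printf("  32-bit registers/block:  %d\n", prop.regsPerBlock);
    std::printf("  maximum pitch:           %zu bytes\n", prop.memPitch);
    return true;
}
```

Calling print_memory_limits(0) from main() and compiling with nvcc prints the limits of the first device.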

Speed:

From slowest to fastest: global, local, and texture memory < constant memory < shared memory and registers.
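To make the memory spaces concrete, here is a small sketch showing how each one is declared in a kernel. The names kScale, scale_kernel, run_scale_demo, and the tile size of 256 are illustrative assumptions. Staging through shared memory brings no speedup in this particular kernel; it is there only to show the declaration and the required synchronization.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

#define TILE 256  // threads per block (illustrative choice)

// Constant memory: small, read-only from kernels, and cached.
__constant__ float kScale[1];

// `in` and `out` point to global memory.
__global__ void scale_kernel(const float* in, float* out, int n) {
    __shared__ float tile[TILE];                    // shared memory, one copy per block
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // scalar locals normally live in registers
    if (i < n) tile[threadIdx.x] = in[i];           // global -> shared
    __syncthreads();                                // every thread in the block reaches this
    if (i < n) out[i] = tile[threadIdx.x] * kScale[0];  // shared and constant -> global
}

// Hypothetical host helper: scales n copies of 1.0f by `scale` and returns
// the first result, or -1.0f if a CUDA call fails.
float run_scale_demo(int n, float scale) {
    assert(n > 0);
    std::vector<float> h(static_cast<size_t>(n), 1.0f);
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return -1.0f;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return -1.0f; }
    cudaMemcpy(d_in, h.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(kScale, &scale, sizeof(float));  // fill constant memory
    scale_kernel<<<(n + TILE - 1) / TILE, TILE>>>(d_in, d_out, n);
    cudaError_t err = cudaMemcpy(h.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess ? h[0] : -1.0f;
}
```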

Data Alignment:

The device can read a 4-byte, 8-byte, or 16-byte word from global memory into a register in a single operation. The data must be aligned to its size: reading an 8-byte or 16-byte word that is not properly aligned may return incorrect results.
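As an illustration of the alignment requirement, the sketch below uses the __align__ specifier so that a 16-byte structure is eligible for a single 16-byte load. The struct and function names (Rgba, RgbaAligned, brighten, is_aligned_to) are made up for this example; whether the compiler actually emits one wide load depends on the toolchain and target.

```cuda
#include <cassert>
#include <cstdint>
#include <cuda_runtime.h>

// Without an alignment specifier this struct is only 4-byte aligned,
// so it cannot be assumed safe to load as one 16-byte word.
struct Rgba { float r, g, b, a; };

// __align__(16) forces 16-byte alignment, matching the 16-byte access size.
struct __align__(16) RgbaAligned { float r, g, b, a; };

static_assert(alignof(Rgba) == 4, "plain struct of floats is 4-byte aligned");
static_assert(alignof(RgbaAligned) == 16, "aligned struct is 16-byte aligned");
static_assert(sizeof(RgbaAligned) == 16, "aligned struct occupies one 16-byte word");

// Each thread reads and writes one whole 16-byte pixel.
__global__ void brighten(RgbaAligned* px, int n, float gain) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        RgbaAligned p = px[i];  // eligible for a single 16-byte load
        p.r *= gain;
        p.g *= gain;
        p.b *= gain;
        px[i] = p;              // eligible for a single 16-byte store
    }
}

// Hypothetical host-side check: true if `p` is a multiple of `alignment` bytes.
bool is_aligned_to(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Pointers returned by cudaMalloc() are aligned generously enough for these types, so misalignment usually arises from manual pointer arithmetic or from packing data at arbitrary byte offsets.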

How to use coalesced access to improve memory access efficiency:

1. Use a structure of arrays (SoA) in place of an array of structures (AoS).

2. Use shared memory to achieve coalesced access.
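The first point can be sketched as follows. With an array of structures, consecutive threads touch addresses that are one whole structure apart; with a structure of arrays, consecutive threads touch consecutive words. All type and function names here (PointAoS, PointsSoA, scale_x_aos, scale_x_soa, run_soa_demo) are hypothetical and chosen only for this example.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Array of structures: thread i accesses pts[i].x, so neighbouring threads
// access addresses 12 bytes apart (strided access).
struct PointAoS { float x, y, z; };

__global__ void scale_x_aos(PointAoS* pts, int n, float s) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) pts[i].x *= s;
}

// Structure of arrays: thread i accesses x[i], so neighbouring threads
// access neighbouring 4-byte words (contiguous access).
struct PointsSoA { float* x; float* y; float* z; };

__global__ void scale_x_soa(PointsSoA pts, int n, float s) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) pts.x[i] *= s;
}

// Hypothetical host helper: stores n points with every coordinate 2.0f in
// SoA layout, scales x by s, and returns x[n-1], or -1.0f on a CUDA error.
float run_soa_demo(int n, float s) {
    assert(n > 0);
    std::vector<float> h(static_cast<size_t>(3) * n, 2.0f);
    size_t bytes = h.size() * sizeof(float);
    float* buf = nullptr;
    if (cudaMalloc(&buf, bytes) != cudaSuccess) return -1.0f;
    cudaMemcpy(buf, h.data(), bytes, cudaMemcpyHostToDevice);
    PointsSoA pts = { buf, buf + n, buf + 2 * n };  // x, y, z stored back to back
    scale_x_soa<<<(n + 255) / 256, 256>>>(pts, n, s);
    cudaError_t err = cudaMemcpy(h.data(), buf, bytes, cudaMemcpyDeviceToHost);
    cudaFree(buf);
    return err == cudaSuccess ? h[n - 1] : -1.0f;
}
```

The AoS kernel is included only for comparison; both kernels compute the same result, and the difference lies in the access pattern each one presents to global memory.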

Memory padding:

Common access pattern: two-dimensional arrays

When a thread with index (tx, ty) accesses a two-dimensional array with a width of N and a base address of BaseAddress, it uses the following address: BaseAddress + N * ty + tx. In this case, how can we ensure coalesced access?

Both blockDim.x and N must be multiples of 16.

We can control blockDim.x, but the array width is not always a multiple of 16. Memory padding means allocating an array whose width is a multiple of 16 and filling the unused part with 0. This introduces the concept of the leading dimension of array A (lda for short), also called the pitch. Because C/C++ is row-major, the leading dimension is the row width (that is, the number of elements in a row). CUDA provides a corresponding API, cudaMallocPitch(), to allocate 2D arrays; a similar function, cudaMalloc3D(), exists for 3D arrays.
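A minimal sketch of both approaches follows: a manual rounding helper, and a 2D allocation through cudaMallocPitch(). Note that cudaMallocPitch() reports the pitch in bytes rather than in elements, so the kernel steps between rows with byte arithmetic. The names padded_width, add_one, and run_pitch_demo are assumptions made for this example.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Manual padding: rounds a row width (in elements) up to a multiple of 16.
int padded_width(int n) {
    return (n + 15) / 16 * 16;
}

// Adds 1 to every element of a pitched 2D array.
// `pitch` is the distance between the starts of two rows, in bytes.
__global__ void add_one(float* base, size_t pitch, int width, int height) {
    int tx = blockIdx.x * blockDim.x + threadIdx.x;  // column
    int ty = blockIdx.y * blockDim.y + threadIdx.y;  // row
    if (tx < width && ty < height) {
        // Move to row ty using the byte pitch, then index the column.
        float* row = reinterpret_cast<float*>(reinterpret_cast<char*>(base) + ty * pitch);
        row[tx] += 1.0f;
    }
}

// Hypothetical host helper: allocates a width x height float array with
// cudaMallocPitch, runs the kernel, and returns the pitch in bytes chosen
// by the runtime. Returns 0 if a CUDA call fails or the result is wrong.
size_t run_pitch_demo(int width, int height) {
    assert(width > 0 && height > 0);
    float* d = nullptr;
    size_t pitch = 0;
    size_t row_bytes = static_cast<size_t>(width) * sizeof(float);
    if (cudaMallocPitch(&d, &pitch, row_bytes, height) != cudaSuccess) return 0;
    cudaMemset2D(d, pitch, 0, row_bytes, height);  // zero the used part of each row
    dim3 block(16, 16);                             // blockDim.x is a multiple of 16
    dim3 grid((width + 15) / 16, (height + 15) / 16);
    add_one<<<grid, block>>>(d, pitch, width, height);
    std::vector<float> h(static_cast<size_t>(width) * height);
    // Copy back row by row, dropping the padding at the end of each device row.
    cudaError_t err = cudaMemcpy2D(h.data(), row_bytes, d, pitch,
                                   row_bytes, height, cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess || h.back() != 1.0f) return 0;
    return pitch;
}
```

For a width of 100 floats, padded_width(100) gives 112 elements, while the pitch returned by cudaMallocPitch() is chosen by the runtime and is at least 400 bytes.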

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.