Thread Grid (GRID)

Last Update: 2015-09-04 Source: Internet

Author: User

In parallel programs, a sensible thread grid layout can yield a noticeably higher speedup. This article looks at how to arrange the thread grid so that a parallel program runs more efficiently.

A thread grid consists of a number of thread blocks arranged in two dimensions, along an x-axis and a y-axis. With T threads per block, you can therefore launch up to Y*X*T threads at a time. Let's work through an example. For simplicity, we start by restricting the y-axis to a single row of threads.

Suppose we are processing a standard HD image with a resolution of 1920 x 1080. The number of threads in a thread block should be an integer multiple of the warp size, that is, a multiple of 32. The device schedules whole warps, so if the block size is not a multiple of 32, some of the threads in the last warp do no useful work. We would then also need a condition to stop threads from processing elements beyond the end of the x-axis, and as later sections show, that costs performance.

To avoid poorly coalesced memory access, try to arrange the data in memory so that it maps one-to-one onto the threads; failing to do so can reduce performance substantially.

Avoid small thread blocks as far as possible, because they do not make full use of the hardware. In this example we launch 192 threads per thread block; 192 is typically the smallest block size worth considering. With 192 threads per block, processing one row of the image takes exactly 10 thread blocks (1920 / 192 = 10).
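The sizing arithmetic above can be checked with a small host-side sketch. It is plain C++ rather than CUDA, it assumes the 1920-pixel row width implied by 10 blocks of 192 threads, and the names `kWarpSize`, `blocks_needed`, and `idle_threads_in_last_warp` are illustrative, not part of any CUDA API.

```cpp
#include <cassert>

// Warp size of 32, as stated in the text.
constexpr int kWarpSize = 32;

// Number of thread blocks needed to cover `elements` items when each block
// holds `threads_per_block` threads (rounds up).
int blocks_needed(int elements, int threads_per_block) {
    assert(elements >= 0 && threads_per_block > 0);
    return (elements + threads_per_block - 1) / threads_per_block;
}

// Threads left without work in the last warp of a block whose size is not
// a multiple of the warp size.
int idle_threads_in_last_warp(int threads_per_block) {
    assert(threads_per_block > 0);
    int remainder = threads_per_block % kWarpSize;
    return remainder == 0 ? 0 : kWarpSize - remainder;
}
```

Calling `blocks_needed(1920, 192)` gives the 10 blocks per row used in the example, and `idle_threads_in_last_warp(192)` is 0 because 192 is six full warps. A block size such as 100 would leave 28 threads of its last warp without work.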

We chose 192 because the amount of data processed along the x-axis is an integer multiple of it, and it is itself an integer multiple of the warp size, which makes programming more convenient. In practice, try to choose block sizes this way as well.
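As a rough illustration of that selection rule, the sketch below lists every block size that is both a multiple of the warp size and an exact divisor of the row width. It is host-side C++, the function name `candidate_block_sizes` is hypothetical, and the upper limit passed as `max_threads` is an assumption the caller must supply for the target device.

```cpp
#include <cassert>
#include <vector>

// Block sizes that are multiples of `warp_size`, divide `row_width` evenly,
// and do not exceed `max_threads` (an assumed per-block limit).
std::vector<int> candidate_block_sizes(int row_width, int warp_size, int max_threads) {
    assert(row_width > 0 && warp_size > 0);
    std::vector<int> sizes;
    for (int t = warp_size; t <= max_threads; t += warp_size) {
        if (row_width % t == 0) {
            sizes.push_back(t);  // every thread in every warp gets one element
        }
    }
    return sizes;
}
```

For a 1920-element row, a warp size of 32, and an assumed limit of 1024 threads per block, the candidates are 32, 64, 96, 128, 160, 192, 320, 384, 480, 640, and 960. The value 192 is one of them, consistent with the choice made in the text.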

The thread index runs along the top, on the x-axis, and the row number runs along the y-axis. Each row of blocks handles exactly one row of pixels and contains 10 thread blocks, so processing the whole picture takes 1080 rows of blocks, for a total of 1080 * 10 = 10,800 thread blocks. With one thread per pixel and 192 threads per block, a single launch schedules just over two million threads (10,800 * 192 = 2,073,600). This layout is very useful when each pixel or data item is processed independently, or when an operation works on data within the same row.

On current Fermi hardware, an SM can handle 8 thread blocks at a time, so in principle this program would need 1350 SMs (10,800 thread blocks in total divided by the 8 blocks each SM can schedule) to run fully in parallel. Current Fermi hardware, however, has only 16 SMs (GTX 580), so each SM is assigned 675 thread blocks to process.
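The mapping from a thread to its pixel, and the block and thread counts quoted above, can be sketched on the host as follows. This is plain C++ that mimics the values a CUDA kernel would read from its built-in `blockIdx`, `blockDim`, and `threadIdx` variables; the struct `LaunchPoint`, the constants, and the row-major pixel layout are assumptions made for illustration.

```cpp
#include <cassert>

constexpr int kThreadsPerBlock = 192;  // block size chosen in the text
constexpr int kWidth = 1920;           // pixels per row (10 blocks * 192)
constexpr int kHeight = 1080;          // rows in the image

// Stand-in for the indices one thread would see inside a kernel.
struct LaunchPoint {
    int block_x;   // block index within the row, 0..9
    int block_y;   // row index, 0..1079
    int thread_x;  // thread index within the block, 0..191
};

// Linear offset of the pixel handled by one thread (row-major, assumed).
int pixel_offset(const LaunchPoint& p) {
    int x = p.block_x * kThreadsPerBlock + p.thread_x;
    int y = p.block_y;
    assert(x >= 0 && x < kWidth && y >= 0 && y < kHeight);
    return y * kWidth + x;
}

// Blocks and threads scheduled by a single launch over the whole image.
int total_blocks() { return (kWidth / kThreadsPerBlock) * kHeight; }
long long total_threads() {
    return static_cast<long long>(total_blocks()) * kThreadsPerBlock;
}

// Rounded-up division, used for the SM arithmetic.
int ceil_div(int a, int b) {
    assert(b > 0);
    return (a + b - 1) / b;
}
```

Here `total_blocks()` is 10,800 and `total_threads()` is 2,073,600. `ceil_div(total_blocks(), 8)` gives the 1350 SMs needed for full parallelism, and `ceil_div(total_blocks(), 16)` gives the 675 blocks that each of 16 SMs would process.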

The example above is simple and the data is neatly aligned, so a good layout is easy to find. But what if the data is not row-based? As with arrays, data is not always one-dimensional. In that case we can use a two-dimensional thread block. Many image algorithms, for example, process pixels in 8x8 thread blocks.
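A two-dimensional layout can be sketched the same way. The host-side C++ below computes how many 8x8 blocks cover an image and which pixel a given thread would handle; `Dim2`, `grid_for_image`, and `pixel_for_thread` are hypothetical names, and the 1920 x 1080 image size is carried over from the earlier example as an assumption.

```cpp
#include <cassert>

// Two-component size or index, standing in for the x and y fields of
// CUDA's built-in index variables.
struct Dim2 {
    int x;
    int y;
};

// Number of blocks needed in each direction to cover the image
// (rounds up so partial tiles at the edges are still covered).
Dim2 grid_for_image(int width, int height, Dim2 block) {
    assert(block.x > 0 && block.y > 0);
    return Dim2{(width + block.x - 1) / block.x,
                (height + block.y - 1) / block.y};
}

// Pixel coordinates handled by one thread of a 2D block.
Dim2 pixel_for_thread(Dim2 block_idx, Dim2 block_dim, Dim2 thread_idx) {
    return Dim2{block_idx.x * block_dim.x + thread_idx.x,
                block_idx.y * block_dim.y + thread_idx.y};
}
```

With 8x8 blocks, `grid_for_image(1920, 1080, Dim2{8, 8})` gives a grid of 240 x 135 blocks. Each block holds 64 threads, which is two full warps, so the warp-size rule is still satisfied.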

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
