Testing the GPU's texture fill rate

Source: Internet
Author: User

The most important optimization in volume rendering is reducing the number of GPU texture samples. Measuring the GPU's texture fill rate can guide that work. Want to know why the GPU reaches only 12 FPS in an 800*600 window? It comes down to how many samples the GPU can take per second.

I wrote a simple OSG program to measure the number of samples per second.

The program is simple and has only a few steps: create a window, generate and set up the texture, load the shader, and render the texture. The details are in the program itself, so I will not post them here.

Now for the test results. According to the official specifications for my 8800 GTS (G80), the texture fill rate can reach 24 billion texels/sec. The official specifications give a core clock of 500 MHz, a shader clock of 1200 MHz, and a memory clock of 800 MHz. I lowered my card's clocks to match these figures.

Test environment: 800*600 window; 256*256*256 3D texture; luminance_alpha data, 2 bytes per texel. The shader samples the 3D texture 512 times per pixel.

The test ran at 11.98 FPS. Calculation: 800*600*512*11.98 = 2,944,204,800 samples/sec. Because the texture is 3D, each trilinearly filtered sample reads 8 texels. The final fill rate is therefore 23,553,638,400 texels/sec, which is very close to 24 billion/sec.
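The arithmetic above can be reproduced with a short script. This is only a sketch of the calculation, not part of the original test program; the function and constant names are made up for illustration, and the FPS is stored in hundredths so the integer math stays exact.

```python
# Workload of the test described above.
WIDTH, HEIGHT = 800, 600      # window size in pixels
SAMPLES_PER_PIXEL = 512       # 3D texture fetches issued by the shader per pixel
TEXELS_PER_SAMPLE = 8         # a trilinear fetch reads 8 texels
FPS_HUNDREDTHS = 1198         # measured 11.98 FPS, kept as an integer

SPEC_FILL_RATE = 24_000_000_000  # official figure quoted for the 8800 GTS (G80)


def samples_per_second():
    """Texture fetches per second issued by the shader."""
    return WIDTH * HEIGHT * SAMPLES_PER_PIXEL * FPS_HUNDREDTHS // 100


def texels_per_second():
    """Texels read per second once filtering is taken into account."""
    return samples_per_second() * TEXELS_PER_SAMPLE


if __name__ == "__main__":
    print(f"{samples_per_second():,} samples/sec")   # 2,944,204,800
    print(f"{texels_per_second():,} texels/sec")     # 23,553,638,400
    print(f"{texels_per_second() / SPEC_FILL_RATE:.1%} of the quoted fill rate")
```

The measured figure works out to roughly 98% of the quoted 24 billion/sec.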

Using a 2D texture gives a similar fill rate, but the FPS is twice as high. The reason is that trilinear sampling does twice the work of bilinear sampling (8 texels per sample instead of 4), so the FPS naturally doubles.

So how can this be optimized? Here are some tests:

    1. Reduce the size of the 3D texture so that as much of it as possible fits in the cache. Even at 1*1*1, the performance is the same.
    2. Change the internal format of the 3D texture to rgba: the performance is the same.
    3. Overclocking: overclocking the core improves performance by almost the same percentage as the overclock. Overclocking the shader: almost no change. Overclocking the memory: almost no change.
    4. Underclocking: underclocking the core reduces performance by almost the same percentage as the underclock. The shader clock could hardly be lowered at all. Underclocking the memory: lowered as far as [...] MHz, with no change.
    5. Reduce the number of samples per pixel: at 256 samples, performance doubles, as expected.

The conclusion is clear. 3D texture sampling is the bottleneck of the whole system, and it has reached the limit of the graphics card's texture units. The shader processors sit mostly idle because there is so little computation to do. The texture units are busy sampling and filtering and do not need much memory bandwidth, so the memory clock has almost no effect.
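These observations fit a very simple model in which the frame rate is limited only by texel throughput. The sketch below assumes exactly that, and further assumes that the fill rate scales linearly with the core clock, which matches the overclocking tests but is not guaranteed on other hardware. The function name and the 10% overclock are illustrative choices, not values from the original test.

```python
def predicted_fps(fill_rate, width, height, samples_per_pixel, texels_per_sample):
    """Frame rate if texel throughput is the only limit (assumption of this sketch)."""
    texels_per_frame = width * height * samples_per_pixel * texels_per_sample
    return fill_rate / texels_per_frame


MEASURED_FILL_RATE = 23_553_638_400  # texels/sec, from the test above

# Baseline: 512 trilinear samples per pixel at 800*600.
base = predicted_fps(MEASURED_FILL_RATE, 800, 600, 512, 8)

# Halving the sample count should double the frame rate (test 5).
half_samples = predicted_fps(MEASURED_FILL_RATE, 800, 600, 256, 8)

# A 2D texture with bilinear filtering reads 4 texels per sample, not 8.
bilinear_2d = predicted_fps(MEASURED_FILL_RATE, 800, 600, 512, 4)

# Hypothetical 10% core overclock, assuming fill rate follows the core clock.
core_plus_10 = predicted_fps(MEASURED_FILL_RATE * 1.10, 800, 600, 512, 8)

if __name__ == "__main__":
    print(f"baseline:        {base:.2f} FPS")
    print(f"256 samples:     {half_samples:.2f} FPS")
    print(f"2D bilinear:     {bilinear_2d:.2f} FPS")
    print(f"core clock +10%: {core_plus_10:.2f} FPS")
```

Note that the texture size, the texel format, and the shader and memory clocks do not appear in the model at all, which is consistent with the tests where changing them made no difference.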

Optimization: the only options are to reduce the number of samples as far as possible, or to find a faster card. At present, it seems that only the G92-based 9800 GTX and 8800 GTS reach 43.2 billion/sec or more; according to official figures, the GTX 280 reaches only 48.2 billion/sec and the GTX 260 36.9 billion/sec. The 9800 GX2 can reach 76.8 billion/sec, although I do not know whether its real-world SLI performance would be enough. In any case, there is now both a theoretical and a practical basis for choosing a suitable card for volume rendering.
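As a rough way to use these figures when choosing a card, the quoted fill rates can be converted into an estimated frame rate for the same test workload. This is a back-of-the-envelope sketch: it assumes each card actually reaches its quoted fill rate, that the workload stays purely texture-bound, and, for the 9800 GX2, that SLI scales perfectly, which is precisely the open question. The helper names and the 20 FPS target are illustrative, and 43.2 billion/sec is treated as the lower bound for the G92 cards.

```python
# Fill rates in texels/sec as quoted in the text above.
QUOTED_FILL_RATES = {
    "8800 GTS (G80)": 24.0e9,
    "GTX 260": 36.9e9,
    "9800 GTX / 8800 GTS (G92)": 43.2e9,
    "GTX 280": 48.2e9,
    "9800 GX2": 76.8e9,  # assumes ideal SLI scaling
}

# Same workload as the test: 800*600 pixels, 512 trilinear samples, 8 texels each.
TEXELS_PER_FRAME = 800 * 600 * 512 * 8


def estimated_fps(fill_rate):
    """Upper-bound frame rate if the card sustains its quoted fill rate."""
    return fill_rate / TEXELS_PER_FRAME


def cards_reaching(target_fps):
    """Names of the cards whose estimated frame rate meets the target."""
    return [name for name, rate in QUOTED_FILL_RATES.items()
            if estimated_fps(rate) >= target_fps]


if __name__ == "__main__":
    for name, rate in QUOTED_FILL_RATES.items():
        print(f"{name:28s} {estimated_fps(rate):6.2f} FPS (estimate)")
    print("Reach 20 FPS:", cards_reaching(20))
```

These are upper bounds; the measured 8800 GTS (G80) result of 11.98 FPS sits slightly below its estimate here, so real numbers for the other cards would likely land a little lower as well.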
