Discussion on rasterization and MSAA

Source: Internet
Author: User

Q: For example, I have heard people say that when the primitives a GPU rasterizes are smaller than about 5 pixels (I think that was the number), rasterization efficiency becomes very low. Why? I have also never been clear about how multisample anti-aliasing (MSAA) works internally. It is said that a render target (RT) with MSAA is completely different from a normal one. Could anyone help explain this?

A: Why five? The level-of-detail (LOD) calculation works at quad granularity, so pixel shader (PS) launch in the GPU is also at quad granularity. In screen space, a quad is a fixed 2x2 block of pixels aligned to even coordinates. If a primitive covers only one pixel of a quad, the other three pixel slots are wasted, so rasterization efficiency drops to 1/4 of the original. Some naive designs also launch the PS only at quad granularity, which causes the same degradation in PS execution efficiency. But that is drifting off topic; here we are talking about rasterization efficiency.
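As a rough illustration of the 1/4 figure above, here is a minimal Python sketch that counts how many of the launched pixel slots are actually covered. The function name `quad_utilization` is hypothetical, and the model assumes only what the post states: quads are 2x2 blocks aligned to even pixel coordinates.

```python
def quad_utilization(covered_pixels):
    """Fraction of launched pixel slots that are actually covered.

    covered_pixels: iterable of (x, y) integer pixel coordinates.
    Assumes quads are 2x2 blocks aligned to even coordinates.
    """
    pixels = set(covered_pixels)
    if not pixels:
        return 0.0
    # Each pixel belongs to the quad whose index is (x // 2, y // 2).
    quads = {(x // 2, y // 2) for x, y in pixels}
    # Every touched quad costs four slots, covered or not.
    return len(pixels) / (4 * len(quads))


# One pixel in one quad: 1 of 4 slots is useful.
single = quad_utilization([(5, 3)])
# Four pixels that straddle four different quads: still only 1/4.
straddle = quad_utilization([(1, 1), (2, 1), (1, 2), (2, 2)])
# Four pixels that exactly fill one aligned quad: fully used.
aligned = quad_utilization([(2, 2), (3, 2), (2, 3), (3, 3)])
print(single, straddle, aligned)  # 0.25 0.25 1.0
```

The `straddle` case shows that even a 2x2 pixel footprint can waste three quarters of the slots when it is not aligned to the quad grid, which is why very small triangles are expensive regardless of any specific pixel threshold.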

In this case, MSAA can recover part of the loss. MSAA further subdivides the pixels on primitive boundaries into subsamples, which lets the rasterizer do more useful work per quad.

The MSAA render target is indeed different from an ordinary one, and DX10 reflects this. DX10 provides dedicated APIs that let you access each subsample individually, which makes effects such as a K-buffer possible. How the GPU implements the MSAA render target is generally transparent to programmers, and there are many implementation approaches. The general principle is to save as much bandwidth as possible; as long as an implementation keeps to that direction, it is free to do what it likes.

1 # effulgent
This is a good description of MSAA:
Antialiasing today
http://www.nvworld.ru/docs/fsaa2-e.html
For render target implementation details, you can refer to the MSAA sections of the OpenGL spec, or to the ARB_multisample extension spec.

The DX10 rasterization rules that I posted say: "When multisampling, pixel shaders always run using a minimum 2x2 pixel area to support derivative calculations", so for small triangles MSAA can still gain some extra efficiency. They also say: "This means that shader invocations occur more than is shown to fill out the minimum 2x2 quanta (which is independent of multisampling). The shader result is written out for each covered sample that passes the per-sample depth-stencil test." So both PS efficiency and rasterization efficiency are tied to this. But I do not understand where the figure of five pixels comes from.

Q: I cannot remember the exact pixel count, but I am fairly sure it applies at least to NV cards. I do not know much about the hardware, and I do not know which algorithm the GPU (GT200) uses for rasterization (scan line or tile-based?). If it is tile-based, the moderator just mentioned that the minimum tile granularity is 2x2 pixels. I have looked at Larrabee's tile-based software rasterizer: because its rasterization is recursive, a triangle that is too small does cause deep recursion while contributing very few pixels, which loses efficiency. The specific pixel count must depend on the hardware implementation, though, and I do not know it. The DX documentation also mentions that, because centroid sampling under MSAA needs derivatives, the PS operates on 2x2 pixels. Is normal rendering the same?
Sorry for the rough questions about multisampling. The multisample algorithm is described as running only one pixel shader calculation per covered pixel. I assume that means once per triangle: if a pixel is completely covered by a single triangle, the PS should run only once. If a pixel is partly covered by N different triangles, does the PS need to run N times? I think anti-aliasing works on the complete frame rather than on a single draw call, so one pixel may involve several different PS invocations. Is that the case? If the number of PS executions for a pixel depends on the number N of triangles covering it, then dynamic memory allocation would seem to be involved; otherwise the card could only keep a limited number of results. Does the GT200 have a limit on the number of colors it stores?

A: About centroid sampling: in DX10, centroid interpolation is independent of MSAA. You only need to declare the interpolation mode on the PS input.
About 2x2: it has been this way for a very long time. It has been 2x2 since SGI began designing the OpenGL specification 20 years ago, and DX has always inherited this. It is not coupled to MSAA.
About PS invocations under MSAA: a pixel on a primitive edge may contain fragments from several primitives, and of course each fragment runs its own PS. Otherwise, how would the soft edge be blended? So you are right, but there is no dynamic allocation of storage. Everything is allocated up front at the maximum size. You have probably done render-to-texture before, so this should be familiar.
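The behavior described above (one PS run per fragment, results stored in a fixed number of sample slots) can be sketched in Python as follows. This is a simplified model, not how any particular GPU does it: the class `MsaaPixel`, the 4x sample count, the depth values, and the box-filter resolve are all assumptions made for illustration, and real hardware may run the depth test before shading.

```python
NUM_SAMPLES = 4  # assumed 4x MSAA


class MsaaPixel:
    """Fixed-size per-pixel storage: one color and one depth per sample."""

    def __init__(self, clear_color=(0.0, 0.0, 0.0), clear_depth=1.0):
        self.colors = [clear_color] * NUM_SAMPLES
        self.depths = [clear_depth] * NUM_SAMPLES
        self.shader_runs = 0

    def draw_fragment(self, coverage_mask, depth, shade):
        """Shade once, then write to covered samples that pass the depth test."""
        if coverage_mask == 0:
            return
        color = shade()  # one PS invocation per fragment, not per sample
        self.shader_runs += 1
        for s in range(NUM_SAMPLES):
            covered = (coverage_mask >> s) & 1
            if covered and depth < self.depths[s]:
                self.colors[s] = color
                self.depths[s] = depth

    def resolve(self):
        """Average the samples (a simple box-filter resolve)."""
        return tuple(sum(c[i] for c in self.colors) / NUM_SAMPLES for i in range(3))


pixel = MsaaPixel()
# A red triangle covers samples 0 and 1.
pixel.draw_fragment(0b0011, 0.5, lambda: (1.0, 0.0, 0.0))
# A blue triangle, farther away, covers samples 1, 2 and 3.
pixel.draw_fragment(0b1110, 0.7, lambda: (0.0, 0.0, 1.0))
resolved = pixel.resolve()
print(pixel.shader_runs, len(pixel.colors), resolved)  # 2 4 (0.5, 0.0, 0.5)
```

Two fragments cause two shader runs, yet the storage stays at four colors and four depths. The soft edge comes from the resolve step, which mixes the red and blue samples.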

A: I will answer these as far as I know.
"If a pixel is covered by N different triangles, does the PS need to run N times?"
Yes.
"I think anti-aliasing works on the complete frame rather than on a single draw call, so one pixel may involve several different PS invocations. Is that the case?"
Yes.
"If the number of PS executions for a pixel depends on the number N of triangles covering it, then dynamic memory allocation would seem to be involved; otherwise the card could only keep a limited number of results."
With MSAA enabled, the size of the allocated memory becomes:
Vid_mem = sizeof(Front_buffer) + sizeof(Back_buffer) + num_samples * (sizeof(Front_buffer) + sizeof(ZS_buffer))
On every draw, the samples that a fragment covers are updated in the multisample buffer according to the fragment's coverage value.
When the result needs to be displayed, the multisample buffer is downsampled and written to the real draw buffer.
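To get a feel for the numbers, the allocation formula above can be evaluated for a concrete case. The sketch below is only arithmetic on that formula; the function name `msaa_video_memory`, the 1920x1080 resolution, and the 4 bytes per pixel for both color and depth-stencil are assumptions chosen for illustration.

```python
def msaa_video_memory(width, height, num_samples, color_bytes=4, zs_bytes=4):
    """Evaluate the formula from the post, in bytes.

    color_bytes and zs_bytes are assumed per-pixel sizes
    (for example a 32-bit color format and a 32-bit depth-stencil format).
    """
    front_buffer = width * height * color_bytes
    back_buffer = width * height * color_bytes
    zs_buffer = width * height * zs_bytes
    # The multisample surface holds num_samples color and depth-stencil
    # values for every pixel.
    multisample = num_samples * (front_buffer + zs_buffer)
    return front_buffer + back_buffer + multisample


total = msaa_video_memory(1920, 1080, 4)
print(total)  # 82944000
```

The result depends only on resolution, formats, and sample count. The number of triangles drawn does not appear anywhere, which matches the statement that storage is fixed up front.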

Centroid interpolation does seem to be meaningful in the MSAA case. The following thread can be used as a reference:
Http://www.opengpu.org/bbs/viewt... & extra = page % 3D1

The so-called centroid is nothing more than an adjustment of the location at which attributes are evaluated, so that the blended color looks smoother when it is displayed after MSAA. So without multisampling or supersampling, centroid sampling is meaningless. That is a little too absolute, though. In fact, centroid sampling may bring better results when rendering a transparent object, even though multisampling or supersampling may not be in use at that point.

Centroid is tied to multisampling: it avoids sampling attributes outside the triangle. Supersampling does not need centroid.
Also, what do you mean by "In fact, Centroid Sampling may bring better results when rendering a Transparent object"?
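The point about avoiding attribute evaluation outside the triangle can be sketched in Python as follows. This is one plausible rule, not a statement of what any specific GPU does: the sample positions are an assumed 4x rotated-grid pattern, the function name `interpolation_point` is hypothetical, and real hardware may pick a different point inside the covered area.

```python
# Assumed 4x sample positions inside a unit pixel (a rotated-grid pattern).
SAMPLE_POSITIONS = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]
PIXEL_CENTER = (0.5, 0.5)


def interpolation_point(coverage_mask, center_covered):
    """Pick where attributes are evaluated for one pixel of one triangle."""
    if center_covered:
        # The pixel center is inside the triangle, so it is safe to use.
        return PIXEL_CENTER
    covered = [p for i, p in enumerate(SAMPLE_POSITIONS) if (coverage_mask >> i) & 1]
    if not covered:
        return None  # the triangle does not touch this pixel at all
    # The average of points inside a triangle is also inside it,
    # because a triangle is convex.
    n = len(covered)
    return (sum(x for x, _ in covered) / n, sum(y for _, y in covered) / n)


# An edge pixel: only samples 1 and 3 are covered, the center is not.
edge_point = interpolation_point(0b1010, center_covered=False)
# An interior pixel: everything is covered, so the center is used.
inner_point = interpolation_point(0b1111, center_covered=True)
print(edge_point, inner_point)  # (0.75, 0.625) (0.5, 0.5)
```

Without multisampling a pixel is only shaded when its center is covered, so the first branch is always taken and centroid makes no difference.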

"But the storage space will not be dynamically allocated. All of them come up and are allocated according to the maximum size"
-----------------------------------------------------
Does the MSAA render target use a compression algorithm? Many patents seem to describe one, so it should not need to be allocated at the maximum size.

Well, that is mainly compression of the Z buffer and the color buffer, and it is done per tile. But does it save space? As I understand it, this kind of compression saves memory traffic, but it cannot reduce the memory footprint of the render target. Accesses have to stay aligned to the address layout that the tile compression scheme expects; otherwise dynamic memory allocation would be needed, and how would that storage be managed?
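The distinction between traffic and footprint can be illustrated with a toy model in Python. This is not a real hardware scheme: it simply assumes that a pixel whose samples are all identical transfers one color while any other pixel transfers all of its samples, and it ignores the metadata a real design would need. The function name `tile_traffic_and_footprint` and the 16-pixel tile are assumptions.

```python
def tile_traffic_and_footprint(tile, bytes_per_color=4):
    """Return (bytes_transferred, bytes_allocated) for one tile.

    tile: list of pixels, each a list of per-sample colors.
    Toy rule: a pixel with identical samples costs one color of traffic.
    """
    num_samples = len(tile[0])
    # The footprint is fixed by tile size and sample count.
    allocated = len(tile) * num_samples * bytes_per_color
    transferred = 0
    for samples in tile:
        uniform = all(s == samples[0] for s in samples)
        transferred += (1 if uniform else num_samples) * bytes_per_color
    return transferred, allocated


red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
interior = [red] * 4               # fully covered by one triangle
edge = [red, red, blue, blue]      # a primitive edge crosses this pixel
tile = [interior] * 13 + [edge] * 3
traffic, footprint = tile_traffic_and_footprint(tile)
print(traffic, footprint)  # 100 256
```

In this model the traffic falls well below the footprint because most pixels are interior, but the allocation cannot shrink: any pixel may become an edge pixel on the next draw, and its samples need a fixed address.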

So compression is applied even when MSAA is not enabled. I am not sure whether my understanding is correct.

Actually there is a potential problem here; maybe I just have not figured it out. We all say that the buffer is allocated at its maximum size, but that assumes each sample point in a pixel holds only one color and ignores alpha. For example, if a pixel is covered by eight semi-transparent triangles and MSAA handles transparency, the number of fragments at that pixel may exceed four. Is that a problem?
Does the moderator have any material on hardware rasterization? MA said rasterization is the root of all evil.
By the way, in the Rasterization Rules (Direct3D 10) posted by the IC expert, I see many pixels whose centers fall completely inside the triangle, yet some of them are drawn in light gray (for example, 3 pixels). Why?

From the programming model's point of view, the assumption is large triangles and a high-speed rasterizer. That is why GPUs have developed so quickly over the past 15 years; otherwise, where would the performance have come from?

As for hardware implementation, doesn't the ACM have many papers on this? The term "rasterizer" is vague. It should cover three modules: primitive setup, rasterization, and interpolation. Looking at any one of them in isolation will not show you the key points. Broadly, the algorithms fall into two categories: barycentric coordinate algorithms and edge functions. The attached article covers only the basics, and it is in fact very rough. For details, refer to other papers; there are a great many on this subject.
Attachment: L4 [Edge functions and interpolation].pdf (1.63 MB)
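For readers who have not seen the edge-function approach mentioned above, here is a minimal Python sketch. It is a brute-force software version written for clarity, not a description of any hardware unit: the names `edge` and `rasterize` are hypothetical, pixels are tested at their centers, and no fill rule is applied, so a pixel exactly on a shared edge would be drawn by both neighboring triangles.

```python
def edge(a, b, p):
    """Edge function: twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2, width, height):
    """Yield (x, y, (w0, w1, w2)) for pixel centers inside the triangle.

    Only triangles with positive signed area under edge() are handled.
    (w0, w1, w2) are the barycentric weights of v0, v1 and v2.
    """
    area = edge(v0, v1, v2)
    if area <= 0:
        return  # degenerate or opposite winding; not handled in this sketch
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            # Inside when the point is on the same side of all three edges.
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                yield x, y, (w0 / area, w1 / area, w2 / area)


pixels = list(rasterize((0.0, 0.0), (4.0, 0.0), (0.0, 4.0), 4, 4))
print(len(pixels), pixels[0])  # 10 (0, 0, (0.75, 0.125, 0.125))
```

The same three edge values serve both purposes: their signs give the inside test, and their ratios to the total area give the barycentric weights used to interpolate vertex attributes. That shared work is one reason setup, rasterization, and interpolation are hard to study separately.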
For the render target size, you can check the D3D manual. With alpha, the color is blended into the corresponding pixel immediately; the individual fragments are not kept. Therefore, with 4x MSAA only four colors are stored per pixel.
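A small Python sketch of that last point: eight semi-transparent fragments land on one pixel, yet only four colors are ever stored, because each fragment is blended into the existing samples as soon as it arrives. The "source over" blend, the 0.5 alpha, and the full coverage mask are assumptions chosen to keep the example short.

```python
def blend_over(dst, src, alpha):
    """Standard 'source over' blend of one RGB color onto another."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))


samples = [(0.0, 0.0, 0.0)] * 4  # 4x MSAA: four stored colors per pixel
white = (1.0, 1.0, 1.0)

for _ in range(8):               # eight translucent triangles cover the pixel
    coverage_mask = 0b1111       # assume each one covers every sample
    samples = [
        blend_over(c, white, 0.5) if (coverage_mask >> i) & 1 else c
        for i, c in enumerate(samples)
    ]

print(len(samples), samples[0][0])  # 4 0.99609375
```

The storage never grows past four entries. What is lost is the ability to reorder the fragments afterwards, so the result depends on draw order.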

Rasterization hardware for micropolygons is totally different from rasterization hardware for traditional polygons. So far, this thread has only been about rasterizing traditional polygons.
 
Attachment: Efficient Design of Graphic Rasterization moduleization (167.48 KB)
This article describes one rasterizer unit implementation. Rasterizer design generally also has to take texture address locality into account, but the article does not cover that.
