This article covers a lot of ground. Some of the ideas are interesting, but their practicality in real-time game rendering is limited. Used in offline tools, however, they should be useful for model processing.
Multi-fragment effects
The representative case is order-independent transparency (OIT), which requires interaction between multiple fragments of the same pixel.
For the moment, the best algorithm is the DirectCompute approach that uses a per-pixel fragment list.
Depth peeling
This is a classic algorithm for order-independent transparency.
For related articles, see:
- File: // F:/drive/SkyDrive/hardcore/render/transparent/order_independent_transparency.pdf
- File: // F:/drive/SkyDrive/hardcore/render/transparent/dualdepthpeeling.pdf
Use multiple depth buffers and multiple render targets, and render the scene several times. The first pass builds the depth buffer and render target for the layer closest to the camera. The scene is then drawn again, and among the fragments whose depth is greater than that of the first layer, the one with the smallest depth is kept, and so on for each further layer. Later, hardware became more capable. For dual depth peeling, for example, NVIDIA used the then-latest hardware (the G80 architecture), which supports floating-point MRT and MAX blending, to improve the algorithm.
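To make the pass structure concrete, here is a minimal CPU sketch of classic depth peeling for a single pixel. It is only an illustration under my own assumptions: the function name `peel_layers`, the `(depth, label)` fragment tuples and the `max_layers` limit are hypothetical and do not come from the referenced papers, and fragments at exactly the same depth are not handled.

```python
def peel_layers(fragments, max_layers=8):
    """Peel one pixel's fragments front to back, one layer per pass.

    fragments: list of (depth, label) tuples in arbitrary submission order.
    Returns the peeled layers, nearest first.
    """
    layers = []
    prev_depth = float("-inf")  # depth of the layer peeled by the previous pass
    for _ in range(max_layers):
        nearest = None
        # One "geometry pass": every fragment is rasterized again.
        for depth, label in fragments:
            if depth <= prev_depth:
                continue  # already peeled: rejected against the previous layer
            if nearest is None or depth < nearest[0]:
                nearest = (depth, label)  # ordinary LESS depth test
        if nearest is None:
            break  # nothing left behind the last peeled layer
        layers.append(nearest)
        prev_depth = nearest[0]
    return layers
```

Each iteration of the outer loop stands for one full render of the scene, which is why the cost grows with the number of transparent layers.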
Dual depth peeling
Peeling is performed in two directions at the same time. One RGBAF32 render target is used as the depth peeling buffer, together with two RGBA8 render targets: one records colors front to back, the other back to front. There is a problem here: we cannot compare a fragment's depth against the value stored in the depth render target and then write that render target when the condition is met, because this causes a read-modify-write hazard. The solution is to use MAX blending.
The comparison is still performed in the pixel shader, and MAX blending is applied when the data is output. This is where floating point shows its power. Because one direction runs front to back and the other back to front, strictly speaking one should use MAX and the other MIN. Instead, we negate the second value and output (depth0, -depth1), so that MAX works for both.
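The negation trick can be tried out on the CPU. The following is a rough single-pixel simulation, not the actual shader: `max_blend` stands in for fixed-function MAX blending, and the names, the `(depth, label)` tuples, the distinct-depth requirement and the exact channel order `(d, -d)` are my assumptions for illustration.

```python
NEG_INF = float("-inf")

def max_blend(dst, src):
    """Fixed-function MAX blending, applied per channel."""
    return (max(dst[0], src[0]), max(dst[1], src[1]))

def dual_depth_peel(fragments):
    """Return one pixel's fragments ordered nearest first.

    fragments: list of (depth, label) in arbitrary order, depths distinct.
    """
    # Initialization pass: every fragment writes (d, -d) under MAX blending,
    # so channel 0 ends up as the farthest depth and channel 1 as -(nearest).
    buf = (NEG_INF, NEG_INF)
    for d, _ in fragments:
        buf = max_blend(buf, (d, -d))

    front, back = [], []
    while buf != (NEG_INF, NEG_INF):
        far, near = buf[0], -buf[1]       # read-only result of the previous pass
        nxt = (NEG_INF, NEG_INF)          # cleared target for this pass
        for d, label in fragments:        # one geometry pass
            if d < near or d > far:
                continue                  # peeled in an earlier pass
            if d == near:
                front.append((d, label))  # current front layer
            elif d == far:
                back.append((d, label))   # current back layer
            else:
                # Still inside the interval: bid for the next layers.
                # The shader never reads `nxt`; MAX blending picks the winners.
                nxt = max_blend(nxt, (d, -d))
        buf = nxt
    return front + back[::-1]
```

With four layers this takes the initialization pass plus two peeling passes, since each peeling pass removes one layer from the front and one from the back.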
Bucket sort
The bucket sort here follows the same idea as the standard algorithm: http://en.wikipedia.org/wiki/Bucket_sort The core idea is partitioning. Newer hardware offers a larger number of MRTs (eight are used for the sort, which is punishingly heavy). The depth range is divided into N equal intervals, each fragment is written to the bucket its depth falls into, and everything is built in a single pass. A final composite pass combines the buckets.
In many cases, however, the depth distribution is not that uniform. One approach is to spend one or more extra passes subdividing the dense regions. Another is to build a buffer called a depth histogram. One RGBAF32 render target has 32x4 bits per pixel, and eight of them give 1024 bits, so the depth range can be divided into 1024 intervals, each marked by whether a fragment falls into it. Anything finer than that is ignored. One geometry pass builds the depth histogram, and the histogram is then used to construct an adaptive subdivision (instead of the equal subdivision above).
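The difference between equal and adaptive subdivision is easy to see in a small CPU sketch for one pixel. Everything here is hypothetical scaffolding of mine (function names, the `[0, 1]` depth range, the lookup table, and the choice to keep the nearest fragment when two collide in a bucket); the article's actual GPU implementation may differ.

```python
def uniform_bucket(depth, n_buckets, z_near=0.0, z_far=1.0):
    """Index of the equal-width depth interval that `depth` falls into."""
    t = (depth - z_near) / (z_far - z_near)
    return min(n_buckets - 1, max(0, int(t * n_buckets)))

def build_histogram(depths, n_bins=1024):
    """One geometry pass: mark every depth interval that holds a fragment."""
    bits = [False] * n_bins
    for d in depths:
        bits[uniform_bucket(d, n_bins)] = True
    return bits

def adaptive_table(bits, n_buckets):
    """Map each histogram bin to a bucket so occupied bins are spread evenly."""
    occupied = sum(bits)
    per_bucket = max(1, -(-occupied // n_buckets))  # ceiling division
    table, seen = [], 0
    for bit in bits:
        table.append(min(n_buckets - 1, seen // per_bucket))
        seen += bit  # the mapping stays monotonic, so depth order is preserved
    return table

def bucket_pass(fragments, n_buckets, table=None):
    """Scatter fragments into buckets in one pass; return (buckets, lost)."""
    buckets, lost = [None] * n_buckets, 0
    for d, label in fragments:
        if table is None:
            i = uniform_bucket(d, n_buckets)         # equal subdivision
        else:
            i = table[uniform_bucket(d, len(table))]  # adaptive subdivision
        if buckets[i] is None:
            buckets[i] = (d, label)
        else:
            lost += 1  # collision: two fragments in one bucket
            buckets[i] = min(buckets[i], (d, label))  # assumption: keep nearest
    return buckets, lost

frags = [(0.90, "d"), (0.11, "b"), (0.10, "a"), (0.12, "c")]
_, lost_uniform = bucket_pass(frags, 8)
table = adaptive_table(build_histogram([d for d, _ in frags]), 8)
buckets, lost_adaptive = bucket_pass(frags, 8, table)
```

With three fragments packed between 0.10 and 0.12, the equal subdivision loses two of them to collisions in the first bucket, while the histogram-driven table gives each fragment its own bucket.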
In short, the ideas are a little unorthodox. Multiple methods can be combined and applied flexibly. Although they are barely practical, they are enlightening, and I like them.