A few days ago I wanted to look at AMD's earlier technical demos and came across the Leo demo. Unfortunately I don't have an AMD graphics card, so I couldn't see the actual effect. I searched online for a long time without finding any material, and then yesterday I found a PPT sitting on my desktop: "Technology Behind AMD's 'Leo Demo'". The slides were presented at GDC last year and explain the lighting used in the Leo demo.
The Leo demo does not use deferred rendering. There are three reasons: the materials are too complex, there are many kinds of lights, and translucent surfaces need lighting too. Instead it uses light culling: the lights, which are defined in world space, are sorted into small tiles in screen space, and each tile stores the list of lights that affect it. This is similar to the method in Battlefield 3, except that Battlefield 3 uses deferred rendering.
Then, when shading a pixel, the shader finds the tile the pixel belongs to, reads that tile's light list, performs the lighting calculation, and outputs the result. One important point is that the light lists are stored through UAVs (unordered access views); the technique cannot be implemented without this feature. It is actually quite simple once you look at the code.
The code is as follows:
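    // 1. prepare: build the tile frustum and find the tile's depth range
    float4 frustum[4];
    float minZ, maxZ;
    {
        ConstructFrustum( frustum );
        // reduce depth per thread, then across the work group through LDS
        minZ = thread_REDUCE(MIN, depth );
        maxZ = thread_REDUCE(MAX, depth );
        ldsMinZ = SIMD_REDUCE(MIN, minZ );
        ldsMaxZ = SIMD_REDUCE(MAX, maxZ );
        minZ = ldsMinZ;
        maxZ = ldsMaxZ;
    }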
    __local u32 ldsNLights = 0;
    __local u32 ldsLightBuffer[MAX];
    // 2. overlap check, accumulate in LDS
    // each thread tests every WG_SIZE-th light against the tile
    for(int i=threadIdx; i<nLights; i+=WG_SIZE)
    {
        Light light = fetchAndTransform( lightBuffer[ i ] );
        if( overlaps( light, frustum ) && overlaps ( light, minZ, maxZ ) )
        {
            // store the index of the overlapping light in the tile's local list
            AtomicAppend( ldsLightBuffer, i );
        }
    }
    // 3. export to global
    __local u32 ldsOffset;
    if( threadIdx == 0 )
    {
        // pseudocode: presumably reserves ldsNLights slots in the global index buffer
        ldsOffset = AtomAdd( ldsNLights );
        globalLightStart[tileIdx] = ldsOffset;
        globalLightEnd[tileIdx] = ldsOffset + ldsNLights;
    }
    // copy the tile's local light list into its reserved range
    for(int i=threadIdx; i< ldsNLights; i+=WG_SIZE)
    {
        int dstIdx = ldsOffset + i;
        globalLightIndexBuffer[dstIdx] = ldsLightBuffer[i];
    }
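To check my understanding of the three steps, here is a minimal CPU sketch in Python. It is not the demo's code. It assumes each light has already been projected to a screen-space circle with a view-space depth range, and that each tile's (minZ, maxZ) is already known. The names `Light`, `overlaps_tile`, `overlaps_depth` and `cull_lights` are hypothetical. A serial loop stands in for the work group, so the atomic append and the atomic offset reservation become plain list appends.

```python
from dataclasses import dataclass


@dataclass
class Light:
    # Hypothetical, simplified light: a screen-space circle plus a depth range.
    x: float       # circle center in pixels
    y: float
    radius: float  # circle radius in pixels
    z_near: float  # nearest view-space depth the light reaches
    z_far: float   # farthest view-space depth the light reaches


def overlaps_tile(light, x0, y0, x1, y1):
    """Circle vs. axis-aligned tile rectangle (stands in for the frustum test)."""
    cx = min(max(light.x, x0), x1)  # closest point of the tile to the center
    cy = min(max(light.y, y0), y1)
    dx, dy = light.x - cx, light.y - cy
    return dx * dx + dy * dy <= light.radius * light.radius


def overlaps_depth(light, min_z, max_z):
    """Interval overlap between the light and the tile's depth range."""
    return light.z_near <= max_z and light.z_far >= min_z


def cull_lights(lights, width, height, tile_res, tile_depth):
    """tile_depth[tile] is (min_z, max_z) for that tile.

    Returns (light_start, light_end, light_index_buffer), mirroring
    globalLightStart, globalLightEnd and globalLightIndexBuffer.
    """
    cells_x = (width + tile_res - 1) // tile_res
    cells_y = (height + tile_res - 1) // tile_res
    light_start, light_end, index_buffer = [], [], []
    for ty in range(cells_y):
        for tx in range(cells_x):
            min_z, max_z = tile_depth[ty * cells_x + tx]
            x0, y0 = tx * tile_res, ty * tile_res
            x1, y1 = x0 + tile_res, y0 + tile_res
            light_start.append(len(index_buffer))  # start of this tile's range
            for i, light in enumerate(lights):
                if overlaps_tile(light, x0, y0, x1, y1) and \
                        overlaps_depth(light, min_z, max_z):
                    index_buffer.append(i)
            light_end.append(len(index_buffer))    # one past the last entry
    return light_start, light_end, index_buffer
```

Each tile ends up owning one contiguous range of the index buffer. That is why the pixel shader only needs a start and an end value per tile.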
    // BaseLighting.inc
    // THIS INC FILE IS ALL THE COMMON LIGHTING CODE
    StructuredBuffer<float4> LightParams : register(u0);
    StructuredBuffer<uint> LowerBoundLights : register(u1);
    StructuredBuffer<uint> UpperBoundLights : register(u2);
    StructuredBuffer<int2> LightIndexBuffer : register(u3);

    uint GetTileIndex(float2 screenPos)
    {
        float tileRes = (float)m_tileRes;
        uint numCellsX = (m_width + m_tileRes - 1)/m_tileRes;
        uint tileIdx = floor(screenPos.x/tileRes)+floor(screenPos.y/tileRes)*numCellsX;
        return tileIdx;
    }
    StartHLSL BaseLightLoopBegin
    // THIS IS A MACRO, INCLUDED IN MATERIAL SHADERS
    uint tileIdx = GetTileIndex( pixelScreenPos );
    uint startIdx = LowerBoundLights[tileIdx];
    uint endIdx = UpperBoundLights[tileIdx];
    [loop]
    for ( uint lightListIdx = startIdx; lightListIdx < endIdx; lightListIdx++ )
    {
        int lightIdx = LightIndexBuffer[lightListIdx];
        // Set common light parameters
        float ndotl = max(0, dot(normal, lightVec));
        float3 directLight = 0;
        float3 indirectLight = 0;
        if( lightIdx >= numDirectLightsThisFrame )
        {
            CalculateIndirectLight(lightIdx , indirectLight);
        }
        else
        {
            if( IsConeLight( lightIdx ) )
            {
                // <<== Can add more light types here
                CalculateDirectSpotlight(lightIdx , directLight);
            }
            else
            {
                CalculateDirectSpherelight(lightIdx , directLight);
            }
        }
        float3 incomingLight = (directLight + indirectLight)*ndotl;
        float shadowTerm = CalcShadow();
    EndHLSL
    StartHLSL BaseLightLoopEnd
    }
    EndHLSL
    #include "BaseLighting.inc"
    float4 PS ( PSInput i ) : SV_TARGET
    {
        float3 totalDiffuse = 0;
        float3 totalSpec = GetEnvLighting();
    $include BaseLightLoopBegin
        // unique material code goes here!! Light accumulation on the pixel for a given light
        // we have total incoming light and direct/indirect light components as well as material params and shadow term
        // use these building blocks to integrate lighting terms
        totalDiffuse += GetDiffuse(incomingLight);
        totalSpec += CalcPhong(incomingLight);
    $include BaseLightLoopEnd
        float3 finalColor = totalDiffuse + totalSpec;
        return float4( finalColor, 1 );
    }
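The shading side can be sketched on the CPU as well. The Python below is only an illustration, not the demo's shader. `get_tile_index` repeats the arithmetic of `GetTileIndex`. `shade_pixel` and its `light_intensity` list are hypothetical stand-ins for the material code: each light simply adds a scalar, so the loop over the tile's range is easy to follow.

```python
def get_tile_index(px, py, width, tile_res):
    """Same arithmetic as GetTileIndex: column + row * tiles per row."""
    cells_x = (width + tile_res - 1) // tile_res  # round up to whole tiles
    return int(px // tile_res) + int(py // tile_res) * cells_x


def shade_pixel(px, py, width, tile_res, lower_bound, upper_bound,
                light_index_buffer, light_intensity, env=0.0):
    """Accumulate only the lights listed for the pixel's tile.

    light_intensity[i] is a hypothetical per-light scalar contribution;
    a real material would evaluate diffuse and specular terms here.
    """
    tile = get_tile_index(px, py, width, tile_res)
    total = env
    for k in range(lower_bound[tile], upper_bound[tile]):
        light_idx = light_index_buffer[k]  # indirection into the light array
        total += light_intensity[light_idx]
    return total
```

For example, at a width of 1280 with 16-pixel tiles there are 80 tiles per row, so the pixel at (100, 50) lies in column 6 of row 3 and gets tile index 6 + 3 * 80 = 246.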