Author: dovelemon
Date: 2014/9/5
Source: csdn blog
Topic: Per-Pixel Lighting, Omni light, early-Z, multi-pass, assembly shader
Introduction
In a first-person shooter you often need light sources to perform lighting calculations. This article therefore shows how to create a light source and how to design the engine to support rendering with multiple lights.
Omni light Introduction
The so-called omni light is simply a point light source. Note, however, that when we compute the illumination from this point light we do not use the vertex normal. An omni light has two attributes: its position in world space and its illumination radius. With these two attributes we can compute the lighting at each vertex.
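For illustration, these two attributes could be grouped in a small structure like the one below. This is only a sketch; the field names (and the color member, which the pixel shader constant c1 will carry later) are my own, not the engine's actual declaration.

// Hypothetical container for the omni light attributes described above.
struct OmniLight
{
    float position[3]; // position in world space
    float radius;      // illumination radius; beyond it the light contributes nothing
    float color[4];    // RGBA light color, later bound to pixel shader constant c1
};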
Below I will write the shaders in the assembly shader style, and we need to derive an attenuation value for each vertex from these two omni light attributes. The attenuation value expresses that the closer a vertex is to the light source, the brighter it is; the farther away, the darker; and beyond the illumination radius specified by the omni light it receives no lighting at all.
Therefore, we need a matrix that maps a vertex position to its attenuation value relative to the omni light.
First, for convenience, I assume that the world transformation matrix of the model is the identity matrix. In other words, the position of the omni light and the vertex positions of the model can be treated as living in the same space with no further coordinate transformation. (Note: if the model's world matrix is not the identity, that transformation must be applied first.)
Then I translate the omni light position to the origin of world space and apply the same translation to the vertices, so that each vertex's coordinates now represent its distance from the light source along each axis.
Once we have these distances, we need a further transformation to bring the attenuation value into the range 0 to 1, so that it can conveniently be combined with the light color. In a pixel shader all input data must lie between 0 and 1; values outside this range are forcibly clamped to it.
The code for this transformation is as follows:
<span style="font-family:Microsoft YaHei;">ZFXMatrix CalcOmniAttenuMatrix(ZFXVector pos, float radious){ZFXMatrix mT ;mT.identity();mT.translate(-pos.x, -pos.y, -pos.z);ZFXMatrix mS ;mS.identity();float invRadious = 0.5f / radious ;mS._11 = mS._22 = mS._33 = invRadious ;ZFXMatrix mB ;mB.identity();mB.translate(0.5f, 0.5f, 0.5f);mT = mT * mS ;mT = mT * mB ;ZFXMatrix mResult ;mResult.transposeOf(mT);return mResult ;}// end for CalcOmniAttenuMatrix</span>
The code above first performs the translation, the mT step, which moves the vertex relative to the light, and then the scaling step mS. The scaling factor is 0.5 / radious, where radious is the illumination radius of the light source. The scaling is applied to all three components of the vertex, for example x / radious * 0.5. In other words, if the vertex lies within the radius along x, the scaled value ends up between -0.5 and 0.5. The final translation by 0.5 then shifts the range to 0 to 1, which satisfies the input requirements of the pixel shader.
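To make the mapping concrete, here is a small standalone check in plain C++ (independent of the engine types) that applies the same translate, scale, and bias steps to a single axis:

#include <cstdio>

// Applies the per-axis mapping built by CalcOmniAttenuMatrix:
// translate by -lightPos, scale by 0.5 / radius, then bias by +0.5.
float MapAxis(float vertex, float lightPos, float radius)
{
    return (vertex - lightPos) * (0.5f / radius) + 0.5f;
}

int main()
{
    const float lightX = 10.0f, radius = 4.0f;
    std::printf("%f\n", MapAxis(10.0f, lightX, radius)); // at the light: 0.5
    std::printf("%f\n", MapAxis(14.0f, lightX, radius)); // at +radius:  1.0
    std::printf("%f\n", MapAxis(6.0f,  lightX, radius)); // at -radius:  0.0
    return 0;
}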
Note: at the end I transpose the computed matrix. This is because when the assembly shader multiplies a vector by a matrix, it multiplies the vector with the rows of the matrix rather than with the columns as in the usual mathematical convention. The reason is that row-wise addressing is simpler: multiplying by columns would require gathering values across registers, which costs extra. Readers may wonder: isn't c20 a single vector register? How can it hold an entire matrix? This is the GPU's handling mechanism: if c20 cannot hold all of the data, c21, c22 and c23 are used as well. In other words, the four registers c20 through c23 together store the matrix.
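The article does not show how the matrix reaches c20 through c23. On a Direct3D 9 device it would look roughly like the following; this is a sketch under the assumption that the engine uploads shader constants with SetVertexShaderConstantF and SetPixelShaderConstantF, and the variable names (lightPos, lightRadius, lightColor) are illustrative:

// One 4x4 matrix occupies four constant registers, so we upload
// 4 vectors starting at register 20 (assuming ZFXMatrix is 16 contiguous floats).
ZFXMatrix mAttenuation = CalcOmniAttenuMatrix(lightPos, lightRadius);
m_pDevice->SetVertexShaderConstantF(20, (float*)&mAttenuation, 4);

// The light color goes to pixel shader constant c1, which the ps.1.1
// code below reads when computing the final pixel color.
float lightColor[4] = { 1.0f, 0.9f, 0.8f, 1.0f };
m_pDevice->SetPixelShaderConstantF(1, lightColor, 1);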
The vertex shader and pixel shader codes of the light source are as follows:
<span style="font-family:Microsoft YaHei;">vs.1.1dcl_position0 v0dcl_normal0 v3dcl_texcoord0 v6m4x4 oPos, v0, c0mov oT0, v6m4x4 oT1, v0, c20</span>
<span style="font-family:Microsoft YaHei;">ps.1.1tex t0texcoord t1dp3_sat r0, t1_bx2, t1_bx2mul r0, c1, 1-r0mul r0, r0, t0</span>
The previous article describes how to use various registers in the GPU. I will not go into details here.
In addition to the basic transformation of the vertex position, the vertex shader above also transforms the vertex with the matrix computed earlier and writes the result into the oT1 register for use by the pixel shader.
In the pixel shader, the texcoord instruction is used to read the data in t1, rather than the tex instruction. The tex instruction uses the coordinates in t0 to sample the texture and stores the sampled texel back in t0. With the value in t1, that is, the per-axis distance of the pixel from the omni light, the following instruction is used:
dp3_sat r0, t1_bx2, t1_bx2
In this instruction, the _bx2 modifier subtracts 0.5 from the value in t1 and multiplies the result by 2. As mentioned above, values in pixel shader input registers can only lie between 0 and 1, so after this transformation the range becomes -1 to 1. The dp3 instruction then computes the squared distance of the coordinate from the origin (that is, from the omni light), and the _sat suffix clamps the result of dp3 to the range 0 to 1.
This gives us a squared attenuation value. Linear attenuation is rarely used; squaring it produces a better falloff. With the attenuation value in hand, 1-r0 gives the illumination intensity of the light: the smaller the attenuation, the higher the intensity. c1 holds the color of the light source; scaling it by this intensity and modulating the result with the diffuse texture yields the final pixel value, which is written to r0 as the output.
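For clarity, here is the same per-pixel math restated in plain C++; this is purely illustrative, since on the GPU it all happens inside the ps.1.1 code above:

#include <algorithm>

// CPU-side restatement of the ps.1.1 attenuation math for one pixel.
// attenCoord is the interpolated value from oT1 (each component in [0,1]),
// lightColor corresponds to c1, diffuse to the texel sampled into t0.
void ShadePixel(const float attenCoord[3], const float lightColor[3],
                const float diffuse[3], float out[3])
{
    // _bx2 modifier: (x - 0.5) * 2, mapping [0,1] back to [-1,1]
    float d[3];
    for (int i = 0; i < 3; ++i)
        d[i] = (attenCoord[i] - 0.5f) * 2.0f;

    // dp3_sat: squared distance to the light, clamped to [0,1]
    float att = d[0] * d[0] + d[1] * d[1] + d[2] * d[2];
    att = std::min(std::max(att, 0.0f), 1.0f);

    // mul r0, c1, 1-r0   followed by   mul r0, r0, t0
    for (int i = 0; i < 3; ++i)
        out[i] = lightColor[i] * (1.0f - att) * diffuse[i];
}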
Early-Z Introduction
Early-Z is a technique that allows the pixel shader to execute complex operations without dragging down efficiency. The early-Z test runs before the pixel shader and rejects fragments that cannot pass the depth test, saving the cost of shading them. The traditional Z test runs after the pixel shader; if that were the only option there would be no way to discard fragments beforehand, and the pixel shader would waste resources computing values that the subsequent Z test throws away. Early-Z therefore improves pixel shader efficiency. However, early-Z is an implicit hardware feature: the Direct3D API has no function to turn it on or off. It is enabled by default as long as alpha blending is not enabled, because early-Z conflicts with the alpha blending technique.
Multiple lights and multi-pass rendering
There are many ways to implement multi-light rendering. Here we use a multi-pass rendering technique: the scene is rendered once per light source, and each rendering result is alpha-blended with the previous one to produce the combined multi-light result.
An m_bAdditive flag in the engine controls whether this additive multi-pass rendering is enabled. To render this way, the flag must be set to true; during the final rendering the engine configures alpha blending based on this flag.
The following is the set function:
<span style="font-family:Microsoft YaHei;">void ZFXD3D::useAdditiveBlending(bool b){if(m_bAdditive == b)return ;//clear all vertex cachem_pVertexMan->forcedFlushAll();m_pVertexMan->invalidateStates();m_bAdditive = b ;if(!m_bAdditive){m_pDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, FALSE);m_pDevice->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_ZERO);m_pDevice->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_ONE);}}// end for useAdditiveBlendingbool ZFXD3D::useAdditiveBlending(){ return m_bAdditive ;}// end for useAdditiveBlending</span>
When additive blending is switched off, alpha blending is disabled immediately. When it is switched on, nothing needs to be done right away, because the final rendering code contains the following:
<span style="font-family:Microsoft YaHei;">//should we use additive blendingif(m_pZFXD3D->useAdditiveBlending()){m_pDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, TRUE);m_pDevice->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_ONE);m_pDevice->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_ONE);}</span>
With this approach, the scene can be rendered several times, once per light source, to achieve multiple lights.
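Putting the pieces together, a per-light render loop might look like the sketch below. RenderSceneForLight and the lights container are hypothetical placeholders for however the application submits its geometry and light data; only useAdditiveBlending comes from the engine code shown above.

// First pass: normal rendering with blending off.
// Every further pass: additive blending so each light adds its contribution.
for (size_t i = 0; i < lights.size(); ++i)
{
    m_pZFXD3D->useAdditiveBlending(i > 0);

    // Upload this light's attenuation matrix (c20-c23) and color (c1),
    // then draw the whole scene once for this light.
    RenderSceneForLight(lights[i]);
}
m_pZFXD3D->useAdditiveBlending(false);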
Today's note is over!
ZFXEngine Development Notes - Omni Light