Inside Geometry Instancing (Part 2)

I hold the copyright to this translation. It is for personal study only: please do not repost it or use it for commercial purposes. For commercial use, contact me.
My skills are limited, so errors are inevitable. If anything is unclear, please refer to the original text. Comments and discussion are also welcome.
Some of the images are taken from the Internet; they are the same as the illustrations in the original book.
Special thanks to MTT for redrawing the flowcharts in the article ^_^
Translation: Clayman
Blog: http://blog.csdn.net/soilwork
Clayman_joe@yahoo.com.cn
 

3.3.3 Vertex Constants Instancing

In the vertex constants instancing method, we use vertex constants to store instance attributes. Vertex constants batches are very fast to render and support instances that move from frame to frame, but these advantages come at the expense of flexibility.

The main limitations of this method are:

• The number of instances per batch is limited by the number of available constants. Usually no more than 50 to 100 instances can be rendered with a single draw call. Even so, this is enough to considerably reduce the CPU load of issuing draw calls.

• Skinning is not supported, because the vertex constants are already used to store instance attributes.

• The hardware must support vertex shaders.

First, prepare a static vertex buffer (and an index buffer) that stores multiple copies of the same geometry packet. Each copy is stored in model space and corresponds to one instance in the batch.

 

The original vertex format must be extended with an integer index for each vertex. The value is the same for all vertices of one copy and identifies the instance that this copy of the geometry packet belongs to. This is similar to palette skinning, where each vertex contains indices that point to the one or more bones affecting it.

The updated vertex format is as follows:

struct InstanceVertex
{
    D3DVECTOR3 mPosition;
    // Other properties ......
    WORD mInstanceIndex[4]; // Direct3D requires SHORT4
};

After all instances have been added to the geometry batch, the Commit() method prepares the vertex buffer with the correct layout.
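As a rough illustration of what Commit() has to do for this method, the sketch below replicates one geometry packet into a single vertex array and stamps every copy with its instance index. It is only a sketch: PacketVertex, BuildReplicatedVertices, BuildReplicatedIndices and kConstantsPerInstance are hypothetical names, not part of the original code, and the index is assumed to be premultiplied by the number of constants per instance, as the vertex shader later expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the InstanceVertex layout described above.
struct PacketVertex
{
    float         position[3];
    std::uint16_t instanceIndex[4]; // only element 0 is used
};

// Constants consumed by one instance (4 matrix rows + 1 color); assumption.
const unsigned kConstantsPerInstance = 5;

// Copies the packet once per instance. Every copy stays in model space and
// differs only in the instance index stored in its vertices.
std::vector<PacketVertex> BuildReplicatedVertices(
    const std::vector<PacketVertex>& packet, unsigned instanceCount)
{
    std::vector<PacketVertex> out;
    out.reserve(packet.size() * instanceCount);
    for (unsigned i = 0; i < instanceCount; ++i)
    {
        for (PacketVertex v : packet) // 'v' is a copy we are free to modify
        {
            // Premultiplied so the shader can address its constants directly.
            v.instanceIndex[0] =
                static_cast<std::uint16_t>(i * kConstantsPerInstance);
            out.push_back(v);
        }
    }
    return out;
}

// Replicates the index list, offsetting each copy to its own vertices.
std::vector<std::uint16_t> BuildReplicatedIndices(
    const std::vector<std::uint16_t>& packetIndices,
    std::size_t packetVertexCount, unsigned instanceCount)
{
    // 16-bit indices can address at most 65536 vertices.
    assert(packetVertexCount * instanceCount <= 65536);
    std::vector<std::uint16_t> out;
    out.reserve(packetIndices.size() * instanceCount);
    for (unsigned i = 0; i < instanceCount; ++i)
        for (std::uint16_t index : packetIndices)
            out.push_back(
                static_cast<std::uint16_t>(index + i * packetVertexCount));
    return out;
}
```

Both arrays are built once and never change afterwards, which is why a static vertex buffer is sufficient for this method.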

Next, the attributes of every instance to be rendered are uploaded. We assume that the attributes consist only of the model matrix, which describes the position and orientation of the instance, and the instance color.

GPUs of the DirectX 9 generation provide up to 256 vertex constants; we use 200 of them to store instance attributes. In our example, each instance needs four constants for the model matrix and one constant for the color. That makes five constants per instance, so each batch can contain at most 40 instances.
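The arithmetic behind this limit can be written down as a small helper. This is only a sketch under the numbers quoted in the text (200 constants reserved for instance data); the function names are hypothetical.

```cpp
#include <cassert>

// Vertex constants reserved for instance attributes (figure from the text).
const unsigned kInstanceConstants = 200;

// Largest number of instances that fit into one batch.
unsigned MaxInstancesPerBatch(unsigned constantsPerInstance)
{
    assert(constantsPerInstance > 0 &&
           constantsPerInstance <= kInstanceConstants);
    return kInstanceConstants / constantsPerInstance;
}

// Number of draw calls needed to render 'totalInstances' objects.
unsigned DrawCallsNeeded(unsigned totalInstances, unsigned constantsPerInstance)
{
    const unsigned perBatch = MaxInstancesPerBatch(constantsPerInstance);
    return (totalInstances + perBatch - 1) / perBatch; // round up
}
```

With five constants per instance, MaxInstancesPerBatch(5) gives 40, so 1000 objects need 25 draw calls instead of 1000.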

The following is the Update() method. The actual instancing is performed in the vertex shader.

D3DXVECTOR4 instancesData[MAX_NUMBER_OF_CONSTANTS];
unsigned int count = 0;
for (unsigned int i = 0; i < GetInstancesCount(); ++i)
{
    // Write the model matrix, one row per constant
    instancesData[count++] = *(D3DXVECTOR4*)&mInstances[i].mModelMatrix.m11;
    instancesData[count++] = *(D3DXVECTOR4*)&mInstances[i].mModelMatrix.m21;
    instancesData[count++] = *(D3DXVECTOR4*)&mInstances[i].mModelMatrix.m31;
    instancesData[count++] = *(D3DXVECTOR4*)&mInstances[i].mModelMatrix.m41;
    // Write the instance color
    instancesData[count++] = ConvertColorToVec4(mInstances[i].mColor);
}
// Upload the constants of all instances with a single call
lpDevice->SetVertexConstants(INSTANCES_DATA_FIRST_CONSTANT, instancesData, count);

Below is the vertex shader:

// Vertex input declaration
struct vsInput
{
    float4 position : POSITION;
    float3 normal : NORMAL;
    // Other vertex data
    int4 instance_index : BLENDINDICES;
};

vsOutput VertexConstantsInstancingVS(in vsInput input)
{
    vsOutput output;
    // Get the instance index; the index is premultiplied by 5 to take
    // account of the number of constants used by each instance
    int instanceIndex = ((int[4])(input.instance_index))[0];
    // Access each row of the instance model matrix
    float4 m0 = InstanceData[instanceIndex + 0];
    float4 m1 = InstanceData[instanceIndex + 1];
    float4 m2 = InstanceData[instanceIndex + 2];
    float4 m3 = InstanceData[instanceIndex + 3];
    // Construct the model matrix
    float4x4 modelMatrix = {m0, m1, m2, m3};
    // Get the instance color
    float4 instanceColor = InstanceData[instanceIndex + 4];
    // Transform input position and normal to world space with the instance model matrix
    // (the float3x3 casts apply only the upper-left 3x3 part to the normal)
    float4 worldPosition = mul(input.position, modelMatrix);
    float3 worldNormal = mul(input.normal, (float3x3)modelMatrix);
    // Output position, normal and color
    output.position = mul(worldPosition, ViewProjectionMatrix);
    output.normal = mul(worldNormal, (float3x3)ViewProjectionMatrix);
    output.color = instanceColor;
    // Output other vertex data
    return output;
}

The Render() method sets the view and projection matrices, and then calls DrawIndexedPrimitive() once to submit all instances.

In production code, the rotation part of the model matrix can be stored as a quaternion, which saves two constants per instance and raises the maximum number of instances per batch to around 70. The matrix is then reconstructed in the vertex shader, at the cost of more complex shader code and a longer execution time.
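To make the quaternion variant concrete, here is a CPU-side sketch of the reconstruction the vertex shader would have to perform. It assumes a layout of three constants per instance (unit quaternion, translation, color), which is one possible reading of "saving two constants"; with 200 constants that gives 200 / 3 = 66 instances, in line with the "around 70" above. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Quaternion { float x, y, z, w; }; // assumed to be unit length
struct Float3     { float x, y, z; };
struct Matrix3x3  { float m[3][3]; };

// Rebuilds the rotation part of the model matrix from a unit quaternion,
// using the row-vector convention (v' = v * M) of the shaders in this article.
Matrix3x3 RotationFromQuaternion(const Quaternion& q)
{
    const float lengthSq = q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w;
    assert(std::fabs(lengthSq - 1.0f) < 1e-3f); // formula needs a unit quaternion

    Matrix3x3 r;
    r.m[0][0] = 1.0f - 2.0f * (q.y * q.y + q.z * q.z);
    r.m[0][1] = 2.0f * (q.x * q.y + q.z * q.w);
    r.m[0][2] = 2.0f * (q.x * q.z - q.y * q.w);
    r.m[1][0] = 2.0f * (q.x * q.y - q.z * q.w);
    r.m[1][1] = 1.0f - 2.0f * (q.x * q.x + q.z * q.z);
    r.m[1][2] = 2.0f * (q.y * q.z + q.x * q.w);
    r.m[2][0] = 2.0f * (q.x * q.z + q.y * q.w);
    r.m[2][1] = 2.0f * (q.y * q.z - q.x * q.w);
    r.m[2][2] = 1.0f - 2.0f * (q.x * q.x + q.y * q.y);
    return r;
}

// Rotates the point, then translates it: the two constants that replace
// the four matrix rows in this sketch.
Float3 TransformPoint(const Float3& p, const Quaternion& rotation,
                      const Float3& translation)
{
    const Matrix3x3 r = RotationFromQuaternion(rotation);
    return {
        p.x * r.m[0][0] + p.y * r.m[1][0] + p.z * r.m[2][0] + translation.x,
        p.x * r.m[0][1] + p.y * r.m[1][1] + p.z * r.m[2][1] + translation.y,
        p.x * r.m[0][2] + p.y * r.m[1][2] + p.z * r.m[2][2] + translation.z
    };
}
```

The extra multiplications in RotationFromQuaternion are the "longer execution time" mentioned above: they are paid for every vertex in exchange for fitting more instances into a batch.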

 

3.3.4 Batching with the Geometry Instancing API

The last method presented here is batching with the Geometry Instancing API, which was introduced in DirectX 9 and is fully implemented in hardware on GeForce 6 Series GPUs. As more hardware comes to support the Geometry Instancing API, this technique will become more attractive: it needs only a very small amount of memory and requires little CPU intervention. Its only drawback is that it can only handle instances of the same geometry packet.

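DirectX 9 provides the following function to access the Geometry Instancing API:

HRESULT SetStreamSourceFreq(UINT StreamNumber, UINT FrequencyParameter);

StreamNumber is the index of the target data stream. FrequencyParameter combines a flag with a count: for the geometry stream it gives the number of instances to draw, and for the instance data stream it gives the number of instances that each element of the stream applies to.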
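The way the frequency parameter packs a flag and a count into one UINT can be sketched without any Direct3D headers. The two constants below mirror the values that the Direct3D 9 headers define for D3DSTREAMSOURCE_INDEXEDDATA and D3DSTREAMSOURCE_INSTANCEDATA; the helper functions themselves are hypothetical and only illustrate the encoding.

```cpp
#include <cassert>
#include <cstdint>

// Local mirrors of the flag values from the Direct3D 9 headers.
const std::uint32_t kIndexedData  = 1u << 30; // D3DSTREAMSOURCE_INDEXEDDATA
const std::uint32_t kInstanceData = 2u << 30; // D3DSTREAMSOURCE_INSTANCEDATA

// Frequency parameter for the geometry stream: draw 'instanceCount' copies.
std::uint32_t GeometryStreamFreq(std::uint32_t instanceCount)
{
    // The count must not spill into the two flag bits.
    assert(instanceCount > 0 && instanceCount < kIndexedData);
    return kIndexedData | instanceCount;
}

// Frequency parameter for the instance stream: advance to the next element
// after every 'instancesPerElement' instances (1 in this article).
std::uint32_t InstanceStreamFreq(std::uint32_t instancesPerElement = 1)
{
    assert(instancesPerElement > 0 && instancesPerElement < kIndexedData);
    return kInstanceData | instancesPerElement;
}
```

For example, GeometryStreamFreq(100) yields 0x40000064 and InstanceStreamFreq() yields 0x80000001; these are the values that would be passed as FrequencyParameter for stream 0 and stream 1.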

We first create two vertex buffers: a static buffer that stores the single geometry packet to be instanced multiple times, and a dynamic buffer that stores the instance data. The figure below shows the two data streams:

 

Commit() must ensure that all instances use the same geometry packet, and it copies the geometry into the static buffer.

Update() simply copies the attributes of all instances into the dynamic buffer. Although it resembles the Update() method of dynamic batching, it minimizes both CPU intervention and graphics bus (AGP or PCI-E) bandwidth. In addition, we can allocate a vertex buffer large enough to hold the attributes of all instances without worrying about memory consumption, because the attributes of one instance take only a fraction of the memory of a whole geometry packet.
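A minimal sketch of such an Update() step is shown below. It assumes a per-instance record of four matrix rows plus a packed 32-bit color, and it uses a plain memory block in place of the locked dynamic vertex buffer; InstanceAttributes and CopyInstanceAttributes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical per-instance record: four float4 matrix rows and a packed
// 32-bit color.
struct InstanceAttributes
{
    float         modelMatrix[4][4];
    std::uint32_t color;
};

// Copies the attributes of all instances into 'dst', which stands in for the
// memory of the locked dynamic vertex buffer. Returns the bytes written.
std::size_t CopyInstanceAttributes(
    const std::vector<InstanceAttributes>& instances,
    void* dst, std::size_t dstCapacityBytes)
{
    const std::size_t bytes = instances.size() * sizeof(InstanceAttributes);
    assert(bytes <= dstCapacityBytes); // the buffer must be large enough
    if (bytes != 0)
        std::memcpy(dst, instances.data(), bytes);
    return bytes;
}
```

The amount of data copied per frame depends only on the number of instances, not on the number of vertices in the geometry packet, which is why this update is so much cheaper than rebuilding transformed geometry on the CPU.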

The Render() method sets up the two streams with the proper stream frequencies, and then calls DrawIndexedPrimitive() to render all instances in a single batch. The code is as follows:

unsigned int instancesCount = GetInstancesCount();
// Set up stream source frequency for the first stream to render instancesCount instances
// D3DSTREAMSOURCE_INDEXEDDATA tells Direct3D we'll use indexed geometry for instancing
lpDevice->SetStreamSourceFreq(0, D3DSTREAMSOURCE_INDEXEDDATA | instancesCount);
// Set up first stream source with the vertex buffer containing geometry for the geometry packet
lpDevice->SetStreamSource(0, mGeometryInstancingVB[0], 0, mGeometryPacketDecl);
// Set up stream source frequency for the second stream; each set of instance attributes describes one instance to be rendered
lpDevice->SetStreamSourceFreq(1, D3DSTREAMSOURCE_INSTANCEDATA | 1);
// Set up second stream source with the vertex buffer containing all instances' attributes
lpDevice->SetStreamSource(1, mGeometryInstancingVB[1], 0, mInstancesDataVertexDecl);

The GPU virtually duplicates the vertices of the first stream and packs them together with the data of the second stream. The vertex shader therefore receives the vertex position in model space together with the instance attributes used to transform it to world space. The code is as follows:

// Vertex input declaration
struct vsInput
{
    // Stream 0
    float4 position : POSITION;
    float3 normal : NORMAL;
    // Stream 1
    float4 model_matrix0 : TEXCOORD0;
    float4 model_matrix1 : TEXCOORD1;
    float4 model_matrix2 : TEXCOORD2;
    float4 model_matrix3 : TEXCOORD3;

    float4 instance_color : COLOR0; // stored as D3DCOLOR in the vertex declaration
};

vsOutput GeometryInstancingVS(in vsInput input)
{
    vsOutput output;
    // Construct the model matrix
    float4x4 modelMatrix =
    {
        input.model_matrix0,
        input.model_matrix1,
        input.model_matrix2,
        input.model_matrix3
    };
    // Transform input position and normal to world space with the instance model matrix
    // (the float3x3 casts apply only the upper-left 3x3 part to the normal)
    float4 worldPosition = mul(input.position, modelMatrix);
    float3 worldNormal = mul(input.normal, (float3x3)modelMatrix);
    // Output position, normal, and color
    output.position = mul(worldPosition, ViewProjectionMatrix);
    output.normal = mul(worldNormal, (float3x3)ViewProjectionMatrix);
    output.color = input.instance_color;
    // Output other vertex data .....
    return output;
}
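The pairing of the two streams can be emulated on the CPU to show what the shader sees. This sketch is not how the hardware is implemented; it only reproduces the visible result, with stream 0 replayed once per element of stream 1. Vec3, InstanceData, ToWorld and EmulateInstancedDraw are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// One element of stream 1: the four rows of the model matrix
// (model_matrix0..3 in the shader input).
struct InstanceData { float row[4][4]; };

// Row-vector transform of a point (w = 1), as mul(position, modelMatrix) does.
Vec3 ToWorld(const Vec3& p, const InstanceData& inst)
{
    const float (&m)[4][4] = inst.row;
    return {
        p.x * m[0][0] + p.y * m[1][0] + p.z * m[2][0] + m[3][0],
        p.x * m[0][1] + p.y * m[1][1] + p.z * m[2][1] + m[3][1],
        p.x * m[0][2] + p.y * m[1][2] + p.z * m[2][2] + m[3][2]
    };
}

// Every vertex of stream 0 is combined with each element of stream 1,
// just as if the geometry had been duplicated once per instance.
std::vector<Vec3> EmulateInstancedDraw(const std::vector<Vec3>& stream0,
                                       const std::vector<InstanceData>& stream1)
{
    std::vector<Vec3> worldPositions;
    worldPositions.reserve(stream0.size() * stream1.size());
    for (const InstanceData& inst : stream1) // advances once per instance
        for (const Vec3& v : stream0)        // replayed for every instance
            worldPositions.push_back(ToWorld(v, inst));
    assert(worldPositions.size() == stream0.size() * stream1.size());
    return worldPositions;
}
```

Only stream0.size() vertices and stream1.size() attribute records are ever stored; the product of the two exists only as shader invocations, which is where the memory saving of this method comes from.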

Because CPU load and memory usage are both minimized, this technique can efficiently render a large number of copies of the same geometry, which makes it an ideal solution for many game scenarios. Its disadvantages are that it requires hardware support and that skinning is not easy to implement.

To implement skinning, the bone data of all instances can be stored in a texture, and the correct bones are then fetched for each instance. This requires vertex texture fetch, a Shader Model 3.0 feature. The performance cost of the vertex texture accesses is hard to predict with this approach and should be measured.
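One possible way to address such a bone texture is sketched below. The layout is purely an assumption made for illustration (each bone matrix occupies four consecutive texels, the bones of one instance are stored contiguously, and the linear address wraps at the texture width); the original text does not prescribe any layout, and the names are hypothetical.

```cpp
#include <cassert>

struct TexelCoord { unsigned x, y; };

// Assumed layout: one float4 texel per matrix row, four rows per bone.
const unsigned kRowsPerBone = 4;

// Returns the texel that holds row 'row' of bone 'bone' of instance 'instance'.
TexelCoord BoneRowTexel(unsigned instance, unsigned bone, unsigned row,
                        unsigned bonesPerInstance, unsigned textureWidth)
{
    assert(bone < bonesPerInstance);
    assert(row < kRowsPerBone);
    assert(textureWidth > 0);
    const unsigned linear =
        (instance * bonesPerInstance + bone) * kRowsPerBone + row;
    return { linear % textureWidth, linear / textureWidth };
}
```

In a shader, the instance number would come from the per-instance stream and the bone number from the vertex, so every vertex performs several texture fetches; this is the cost that the text recommends measuring.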

 

3.4 Conclusion

This article described the concept of geometry instancing and four different techniques for efficiently rendering the same geometry many times. Each technique has its own advantages and disadvantages, and no single solution fits every situation that can arise in a game scene. Choose the method that suits the application and the kind of objects being rendered.

The following methods are recommended for some typical scenarios:

• Indoor scenes that contain many static instances of the same geometry, which rarely or never move: static batching is the best choice.

• Outdoor scenes that contain many animated instances, such as a real-time strategy game with hundreds of soldiers: dynamic batching may be the best choice.

• Outdoor scenes with a lot of vegetation and trees whose attributes change frequently (for example, to sway in the wind), as well as particle systems: the Geometry Instancing API may be the best choice.

An application often needs two or more of these methods. In that case, an abstract geometry batch interface that hides the specific implementation makes the engine more modular and easier to maintain. It also greatly reduces the effort of implementing geometry instancing across the whole program.
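Such an interface might look like the following sketch. The method names Commit(), Update() and Render() are taken from the text; everything else (the class names, the logging stand-ins for real GPU work, and RenderFrame) is hypothetical and only shows how the engine can treat all batch types uniformly.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Abstract geometry batch: the engine only ever talks to this interface.
class GeometryBatch
{
public:
    virtual ~GeometryBatch() = default;
    virtual void Commit() = 0; // build the GPU buffers once
    virtual void Update() = 0; // refresh per-frame instance data
    virtual void Render() = 0; // submit the batch
};

// Stand-in for static batching: nothing changes after Commit().
class StaticBatch : public GeometryBatch
{
public:
    explicit StaticBatch(std::vector<std::string>& log) : mLog(log) {}
    void Commit() override { mLog.push_back("static: commit"); }
    void Update() override {} // static instances never move
    void Render() override { mLog.push_back("static: render"); }
private:
    std::vector<std::string>& mLog;
};

// Stand-in for batching with the Geometry Instancing API.
class InstancingApiBatch : public GeometryBatch
{
public:
    explicit InstancingApiBatch(std::vector<std::string>& log) : mLog(log) {}
    void Commit() override { mLog.push_back("api: commit"); }
    void Update() override { mLog.push_back("api: update"); }
    void Render() override { mLog.push_back("api: render"); }
private:
    std::vector<std::string>& mLog;
};

// Per-frame loop: identical for every batching technique.
void RenderFrame(const std::vector<std::unique_ptr<GeometryBatch>>& batches)
{
    for (const std::unique_ptr<GeometryBatch>& batch : batches)
    {
        assert(batch != nullptr);
        batch->Update();
        batch->Render();
    }
}
```

With this arrangement the buildings of a scene can live in a StaticBatch and the trees in an InstancingApiBatch, while the render loop never needs to know which technique is behind each batch.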

 

 

(In the figure, the static buildings use static batching, while the trees use the Geometry Instancing API.)

For a complete demo, see the Instancing sample in the NVIDIA SDK or the Instancing sample in the DirectX SDK.

 

This article comes from the CSDN blog. When reposting, please cite the source: http://blog.csdn.net/soilwork/archive/2006/04/09/655858.aspx
