The beauty of DirectX 11 (4): append/consume, byte address and indirect argument buffers

Author: Clayman

For personal use only, do not reprint, do not use for any commercial purposes.

 

Append/consume Buffers

Append and consume buffers are both variants of the structured buffer. In essence they are resources accessed through a UAV, but in HLSL they expose a stack-like access pattern: the Append() function pushes an element into the buffer, and the Consume() function pulls an element out. For both kinds of buffer, the order in which elements are added does not matter, but the number of elements does. The UAV keeps an internal count of the elements added to or removed from the buffer. GPU threads that operate on such a buffer concurrently do not need explicit synchronization, which greatly improves efficiency.

 

A simple example of how these buffers are used is a GPU-based particle system. The system stores particle data in two buffers: one holds the current state of the particles, the other receives the updated particles. At runtime, every GPU thread of the compute shader calls Consume() to read one current particle, performs the update calculation, and then calls Append() to write the result to the other buffer. Because the state of each particle is independent, the order of the particles in the buffer does not matter, as long as the count is correct.
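
The counting behavior described above can be sketched on the CPU. This is only a model in portable C++, not Direct3D code: `CountedBuffer` and `updateOne` are hypothetical names, and the counter that the UAV maintains internally is stood in for by a `std::atomic`. Each call to `updateOne` plays the role of one GPU thread; on real hardware the order in which threads consume and append is not guaranteed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same layout as the Particle struct used in the article's HLSL.
struct Particle {
    float position[3];
    float velocity[3];
    float time;
};

// Hypothetical stand-in for an append/consume buffer:
// fixed storage plus the element counter that the UAV hides on the GPU.
struct CountedBuffer {
    std::vector<Particle> storage;
    std::atomic<uint32_t> count{0};

    explicit CountedBuffer(std::size_t capacity) : storage(capacity) {}

    // Models Append(): bumping the counter reserves a unique slot.
    void append(const Particle& p) {
        uint32_t slot = count.fetch_add(1);
        assert(slot < storage.size());
        storage[slot] = p;
    }

    // Models Consume(): decrementing the counter hands out the top element.
    Particle consume() {
        uint32_t previous = count.fetch_sub(1);
        assert(previous > 0);
        return storage[previous - 1];
    }
};

// One "thread" of the update pass: consume, integrate, append.
void updateOne(CountedBuffer& current, CountedBuffer& next, float dt) {
    Particle p = current.consume();
    for (int i = 0; i < 3; ++i) {
        p.position[i] += p.velocity[i] * dt;
    }
    p.time += dt;
    next.append(p);
}
```

After one `updateOne` call per live particle, `current.count` is zero and `next.count` equals the original particle count; the two buffers then swap roles for the next frame.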

 

Creating an append/consume buffer is subject to strict constraints. Because the buffer is both read and written, it must use D3D11_USAGE_DEFAULT, and its bind flags must include D3D11_BIND_UNORDERED_ACCESS (optionally together with D3D11_BIND_SHADER_RESOURCE). When the corresponding resource view is created, its format is always DXGI_FORMAT_UNKNOWN, and whether the buffer is used for append or for consume, the UAV needs the D3D11_BUFFER_UAV_FLAG_APPEND flag. When B/S buffers were introduced in the previous section, I mentioned that such a buffer can be divided into several subresources, each used through a different resource view. With a read/write UAV of this kind, however, the resource can only be used as a whole.

The following HLSL code declares append/consume buffers:

struct Particle
{
    float3 position;
    float3 velocity;
    float  time;
};

AppendStructuredBuffer<Particle>  NewSimulationState     : register(u0);
ConsumeStructuredBuffer<Particle> CurrentSimulationState : register(u1);

 

Byte address Buffers

The byte address buffer (BAB) gives HLSL a relatively low-level way to access a block of video memory. Instead of accessing elements by index, a BAB accesses them by byte address: the value at byte address N is the 4-byte, 32-bit unsigned integer located N bytes from the start of the resource. Note that N must be a multiple of 4. The 32-bit unsigned integer that is read can also be reinterpreted as a value of another type. This kind of buffer gives HLSL powerful data management capabilities and can be used to implement almost any data structure.
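
The addressing rule can be illustrated with a small CPU-side model in portable C++. It is a sketch under assumptions, not Direct3D code: `RawBuffer`, `load`, `store`, `asFloat` and `asUint` are hypothetical helpers that loosely mirror the HLSL `Load`/`Store` methods and the `asfloat`/`asuint` intrinsics, and the bit pattern shown assumes IEEE-754 floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// CPU-side model of a raw buffer: nothing but bytes.
using RawBuffer = std::vector<uint8_t>;

// Models a raw load: read the 32-bit unsigned integer
// that starts `address` bytes from the beginning of the buffer.
uint32_t load(const RawBuffer& buf, uint32_t address) {
    assert(address % 4 == 0);              // N must be a multiple of 4
    assert(address + 4 <= buf.size());
    uint32_t value;
    std::memcpy(&value, buf.data() + address, sizeof value);
    return value;
}

// Models a raw store of one 32-bit value at a byte address.
void store(RawBuffer& buf, uint32_t address, uint32_t value) {
    assert(address % 4 == 0);
    assert(address + 4 <= buf.size());
    std::memcpy(buf.data() + address, &value, sizeof value);
}

// Reinterpret the same 32 bits as another type, without numeric conversion.
float asFloat(uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

uint32_t asUint(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For instance, `store(buf, 8, asUint(1.5f))` followed by `asFloat(load(buf, 8))` recovers 1.5f, while `load(buf, 8)` on its own yields the raw bit pattern 0x3FC00000.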

 

For example, consider a linked list that stores 32-bit color values, where each node consists of a color value and an index pointing to the next element. When the first element is added, its color value is written at offset 0 and its next index is set to -1. When the second element is added, the program first reads the two 32-bit values at offset 0, then follows the indices one by one until it finds the end of the list, and finally appends the new node there. With BABs, many algorithms that were traditionally computed on the CPU can be ported to the GPU! Of course, because of the GPU's inherent parallelism, the real situation is somewhat more complicated; it will be covered in detail later.
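
A single-threaded sketch of that linked list, again as a CPU-side model in portable C++ rather than shader code. The names `readU32`, `writeU32`, `pushBack` and `collect` are hypothetical, each node is assumed to occupy 8 bytes (color, then next index), and -1 is stored as 0xFFFFFFFF. The caller tracks the node count here; on the GPU, many threads appending at once would need something like an atomic counter, which this sketch deliberately leaves out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr uint32_t kEnd = 0xFFFFFFFFu;  // "-1": no next node
constexpr uint32_t kNodeSize = 8;       // 4 bytes color + 4 bytes next index

uint32_t readU32(const std::vector<uint8_t>& buf, uint32_t address) {
    assert(address % 4 == 0 && address + 4 <= buf.size());
    uint32_t value;
    std::memcpy(&value, buf.data() + address, sizeof value);
    return value;
}

void writeU32(std::vector<uint8_t>& buf, uint32_t address, uint32_t value) {
    assert(address % 4 == 0 && address + 4 <= buf.size());
    std::memcpy(buf.data() + address, &value, sizeof value);
}

// Appends a color to the list whose head sits at offset 0.
// nodeCount is the number of nodes already stored; returns the new count.
uint32_t pushBack(std::vector<uint8_t>& buf, uint32_t nodeCount, uint32_t color) {
    uint32_t newAddress = nodeCount * kNodeSize;
    writeU32(buf, newAddress, color);
    writeU32(buf, newAddress + 4, kEnd);
    if (nodeCount > 0) {
        // Walk the chain from the head until the tail is found.
        uint32_t node = 0;
        uint32_t next = readU32(buf, node * kNodeSize + 4);
        while (next != kEnd) {
            node = next;
            next = readU32(buf, node * kNodeSize + 4);
        }
        writeU32(buf, node * kNodeSize + 4, nodeCount);  // link tail to new node
    }
    return nodeCount + 1;
}

// Follows the links from the head and returns the colors in list order.
std::vector<uint32_t> collect(const std::vector<uint8_t>& buf, uint32_t nodeCount) {
    std::vector<uint32_t> colors;
    if (nodeCount == 0) return colors;
    uint32_t node = 0;
    while (node != kEnd) {
        colors.push_back(readU32(buf, node * kNodeSize));
        node = readU32(buf, node * kNodeSize + 4);
    }
    return colors;
}
```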

 

A BAB is created in basically the same way as the resource types described earlier. The only difference is that MiscFlags must specify D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS. A read-only BAB can be bound to any shader stage through a shader resource view; a read/write BAB, bound through a UAV, can only be bound to the pixel shader and compute shader stages. In either case the resource view must use the DXGI_FORMAT_R32_TYPELESS format, and the UAV must additionally specify the D3D11_BUFFER_UAV_FLAG_RAW flag.

 

In HLSL, BAB resources are declared with the following code:

ByteAddressBuffer   RawBuffer0;
RWByteAddressBuffer RawBuffer1;

 

Indirect argument Buffers

The buffers introduced so far are mainly used to pass data from the CPU to the GPU. The indirect argument buffer (IAB), by contrast, is designed to let the GPU fill in the data itself and then use it in subsequent work. IABs are mainly used with the following three functions:

void DrawInstancedIndirect(ID3D11Buffer *pBufferForArgs, UINT AlignedByteOffsetForArgs);
void DrawIndexedInstancedIndirect(ID3D11Buffer *pBufferForArgs, UINT AlignedByteOffsetForArgs);
void DispatchIndirect(ID3D11Buffer *pBufferForArgs, UINT AlignedByteOffsetForArgs);

 

All three functions end with the Indirect suffix. They are essentially the same as the methods without the suffix; only the way the parameters are passed differs. For example, DrawInstanced() accepts four parameters: VertexCountPerInstance, InstanceCount, StartVertexLocation and StartInstanceLocation. The Indirect version needs the same four values, but they are stored in the buffer passed as the first parameter, pBufferForArgs, and located inside it by the offset. We must therefore make sure that four valid UINT values exist at the specified offset in pBufferForArgs, and that the offset is 4-byte aligned. pBufferForArgs may hold several sets of argument values at different locations, with the offset selecting the set to use.

The advantage is that the buffer can be filled by the GPU, so the CPU does not need to know how many elements will be drawn! Results computed on either the GPU or the CPU can be used to update the data in an IAB. For example, a compute shader stores its results in an AppendStructuredBuffer, ID3D11DeviceContext::CopyStructureCount copies the element count of that AppendStructuredBuffer into the IAB, and finally DrawInstancedIndirect() renders the data computed by the CS directly. The whole process needs very little CPU involvement, and the CPU never has to know how many objects are rendered.
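
The argument layout can be modeled on the CPU as follows. This is a sketch in portable C++ and makes no Direct3D calls: `DrawInstancedArgs`, `writeArgs`, `readArgs` and `copyCount` are hypothetical names, with `copyCount` standing in for what CopyStructureCount does to the destination buffer. Writing the count into the first field assumes one vertex is drawn per appended element, which is an assumption of this example rather than something the article states.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// The four values DrawInstanced() takes, in parameter order.
struct DrawInstancedArgs {
    uint32_t vertexCountPerInstance;
    uint32_t instanceCount;
    uint32_t startVertexLocation;
    uint32_t startInstanceLocation;
};
static_assert(sizeof(DrawInstancedArgs) == 16, "four tightly packed UINTs");

// Stores one set of arguments at a 4-byte aligned offset.
void writeArgs(std::vector<uint8_t>& buf, uint32_t alignedByteOffset,
               const DrawInstancedArgs& args) {
    assert(alignedByteOffset % 4 == 0);
    assert(alignedByteOffset + sizeof args <= buf.size());
    std::memcpy(buf.data() + alignedByteOffset, &args, sizeof args);
}

// What an indirect draw would read: the argument set selected by the offset.
DrawInstancedArgs readArgs(const std::vector<uint8_t>& buf, uint32_t alignedByteOffset) {
    assert(alignedByteOffset % 4 == 0);
    assert(alignedByteOffset + sizeof(DrawInstancedArgs) <= buf.size());
    DrawInstancedArgs args;
    std::memcpy(&args, buf.data() + alignedByteOffset, sizeof args);
    return args;
}

// Models the effect of copying a buffer's element count into the
// argument buffer: a single UINT lands at the destination offset.
void copyCount(std::vector<uint8_t>& buf, uint32_t dstAlignedByteOffset, uint32_t count) {
    assert(dstAlignedByteOffset % 4 == 0);
    assert(dstAlignedByteOffset + 4 <= buf.size());
    std::memcpy(buf.data() + dstAlignedByteOffset, &count, sizeof count);
}
```

A 32-byte buffer can hold two argument sets, at offsets 0 and 16. Copying a count to offset 0 changes only the first field of the first set, and the CPU code that issues the draw never needs to inspect that value.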

 

An IAB is created in much the same way as the buffers described earlier. In general, if the GPU is to fill in the data, be sure to use default usage. In addition, MiscFlags must include D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS, and ByteWidth must be a multiple of 4 bytes.

When the GPU fills the IAB, for example through stream output, a render target or a UAV, the buffer must be paired with the corresponding resource view. When the data is filled in from the CPU side, use the UpdateSubresource or CopyStructureCount method.

 

So far, we have introduced all the buffer resources in DirectX 11. The next part will discuss texture resources.

 

------------------------------ To be continued ------------------------------

 

A reminder: the four articles published so far give a fairly abstract introduction to DX11. If you find them hard to follow, start with some DX11 tutorials that come with demos. This series focuses on introducing and organizing the principles of the DX11 API, and is better suited as a reference than as a first tutorial. After texture resources have been covered, future content will include the following parts:

Rendering pipeline: describes every stage of the pipeline

Tessellation pipeline

Compute pipeline: together with tessellation, one of the two most important features in DX11 and a main focus of the series

HLSL 5.0

Multithreaded Rendering

 

 

 

 
