Direct3D 10 System (IV)

Source: Internet
Author: User

5 Core API and Runtime

We divide the API and runtime into two independent but complementary parts: the core API/runtime and the shading language/state management system. This section introduces several important parts of the new runtime and how they differ from the current system. The core API and runtime act as a thin, low-overhead abstraction layer over the hardware. The move to a programmable pipeline and the removal of redundant fixed-function features dramatically simplify the API and runtime. The API provides services for the following operations: allocating and modifying resources; creating views and binding them to different parts of the pipeline; creating shaders and binding them to the pipeline; controlling the state of the non-programmable parts of the pipeline; initiating rendering operations; and querying information from the pipeline by retrieving statistics or resource contents.

5.1 State Management

One of the main problems we need to solve is reducing the end-to-end overhead of commands sent from the application to the hardware. Commands fall into two types: resource allocation or release, and pipeline state changes. We are particularly concerned with the latter because they occur frequently in applications.

We use a simple model to pass these commands to the pipeline. The runtime allocates a memory buffer for the application's commands. The driver converts each API call into a hardware-specific command and stores it in the buffer. When the buffer is full, or when another operation needs to synchronize with the rendering state (e.g., reading the contents of a render target), the entire buffer is submitted to the hardware. On the PC, this runtime model has hardly changed over the past 10 years. Our goal is to append commands to the buffer with no additional processing. In the past, this was clearly impractical.
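The command-buffer model described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Direct3D 10 runtime: `HwCommand`, `FakeDevice`, and `CommandBuffer` are hypothetical names, and the "hardware" is a stub that merely counts what it receives.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical hardware command: an opcode plus one argument.
struct HwCommand {
    std::uint32_t opcode;
    std::uint32_t arg;
};

// Hypothetical stand-in for the hardware queue; it only records submissions.
struct FakeDevice {
    std::size_t submissions = 0;       // number of buffers handed over
    std::size_t commandsExecuted = 0;  // total commands received
    void submit(const std::vector<HwCommand>& batch) {
        ++submissions;
        commandsExecuted += batch.size();
    }
};

class CommandBuffer {
public:
    CommandBuffer(FakeDevice& device, std::size_t capacity)
        : device_(device), capacity_(capacity) {
        assert(capacity > 0);
        pending_.reserve(capacity);
    }

    // Each API call appends one already-translated command, with no
    // deferred bookkeeping; a full buffer is submitted immediately.
    void append(HwCommand cmd) {
        pending_.push_back(cmd);
        if (pending_.size() == capacity_) flush();
    }

    // Called when the application must synchronize with rendering state,
    // e.g. before reading back the contents of a render target.
    void synchronize() { flush(); }

    std::size_t pending() const { return pending_.size(); }

private:
    void flush() {
        if (pending_.empty()) return;
        device_.submit(pending_);
        pending_.clear();
    }

    FakeDevice& device_;
    std::size_t capacity_;
    std::vector<HwCommand> pending_;
};
```

With a capacity of 4, five `append` calls trigger one submission when the buffer fills and leave one command pending; a later `synchronize` submits the remainder.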
We therefore worked to understand why appending commands without extra processing had been impractical, and to revise the design to bring our model closer to that goal. We found several causes of additional processing in the runtime and driver:

• mismatch between the API and the hardware
• a deferred (lazy) processing style
• incorrect application requests being passed through

Of these three causes, the third is the easiest to address: once application developers, the runtime, drivers, and hardware vendors agree on a contract, the situation improves greatly (e.g., an improvement of tens of times rather than 10%).

The second cause stems from a traditional implementation strategy: state changes are accumulated and only resolved when a primitive is submitted to the pipeline. The advantage is that a group of state changes can be processed as a batch, and interdependent (non-orthogonal) states can be processed together rather than once per individual change. It also allows redundant changes to be discarded. However, these features cost extra CPU cycles to track changes and perform global processing. One catastrophic example of non-orthogonal processing is a change in texture binding that forces the shader to be recompiled to adapt to the new texture format. We advocate minimizing state dependencies in hardware implementations and moving the filtering of redundant state changes into an optional layer of the runtime.

The first cause involves several kinds of mismatch. One is orthogonality mismatch, as exemplified by the shader recompile example. Another relates to the granularity of state changes. Both OpenGL and earlier versions of Direct3D define state changes at a fine granularity, e.g., changing a single blend factor or a single sampling mode.
Many attempts have been made to group state changes for efficiency, such as display lists in OpenGL or state blocks in Direct3D 9. Although these solutions can work well, we chose a simpler method. Dropping the redundant fixed-function features had already greatly reduced the total number of state types, and our analysis found that the existing fine-grained division offered no advantage. We therefore organize the scattered states into larger, related, immutable aggregates called state objects. This gives a clear model of which states should be independent and which should not, and it reduces the number of API calls needed to completely reconfigure the pipeline. We found that applications using the new API achieve a closer match between API and hardware.

Direct3D 10 defines five state objects: InputLayout (vertex buffer layout), Sampler, Rasterizer, DepthStencil, and Blend. This division reflects the logical relationship between states. Where an application needs to change an individual state frequently and independently, the division can be refined further. When a state object is created, the driver creates a hardware representation of the state (e.g., a set of register values). When the object is bound to the pipeline, the corresponding commands are copied into the command buffer. Some hardware implementations may also cache the state representation in hardware, reducing the cost of converting API commands into hardware commands.

Section 4 described the problems that can occur when updating pipeline constants. This is a common type of pipeline hazard: a value is about to be overwritten while the previous value is still in use. The usual solution is to store the new value in additional storage and redirect references to the new buffer. A second hazard arises when data is read from a resource that has just been written.
A typical read-after-write case is a previous render target that is then used as a texture. Before such a switch, the rendering commands must have finished executing and all data must have been written to the render target so that it can be read. Unlike the update hazard described earlier, the read-after-write hazard is difficult or impossible to resolve in the API and runtime. To avoid stalling the pipeline in this case, applications should be structured to avoid rendering operations that immediately read data from the previous render target.

5.2 Validation and Error Handling

Some APIs are designed to avoid errors, or to perform error checks in operations that are infrequent but costly, such as object creation, rather than each time an object is used. Given our performance requirements, we cannot afford extensive error checking in the API for deployed applications. Our error detection and reporting policy therefore divides errors into two categories: critical (fatal) and non-critical (non-fatal). Critical errors are detected and reported in every version of the runtime. Non-critical errors are detected by a separate validation layer that is transparent to the runtime. This layer is intended primarily for application development; developers usually disable it when an application is deployed. Besides detecting errors, it also looks for and reports non-ideal usage of the API. The layer can be configured to specify which errors it detects and reports.

The way errors are divided is admittedly somewhat ambiguous: which errors are always detected, and which are not? Undefined error behavior may later turn into unintended but relied-upon (de facto) behavior. Furthermore, if a specific error is not detected by the runtime, the driver must do its best to prevent catastrophic hardware errors regardless of the performance cost. We try to identify such errors and detect them in the runtime.
In addition, we perform no error checks in the middle of rendering; for example, error detection is deferred until a draw command has completed. Critical errors include mismatched depth buffer and render target sizes, binding a resource for simultaneous reading and writing, and so on. Non-critical errors include mismatched shader linkage signatures and mismatched data format declarations.

Under the current shader compilation model, catching errors while a shader program is running is very costly. We therefore define the behavior completely; for example, an out-of-bounds array access returns 0, so that behavior is consistent. In the long run, hardware will support an exception mechanism.
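The two-tier error policy of this section, together with the defined out-of-bounds behavior, can be sketched as below. This is a simplified illustration rather than the real Direct3D 10 runtime or debug layer: `DrawState`, `CoreRuntime`, `ValidationLayer`, and `loadOrZero` are hypothetical names, and the signature strings are made-up placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical draw-time state, reduced to the fields the checks need.
struct DrawState {
    int renderTargetWidth, renderTargetHeight;
    int depthBufferWidth, depthBufferHeight;
    std::string vertexShaderOutputSignature;
    std::string pixelShaderInputSignature;
};

enum class Result { Ok, FatalError };

// Core runtime: checks only critical (fatal) errors, in every build.
class CoreRuntime {
public:
    Result draw(const DrawState& s) {
        if (s.renderTargetWidth != s.depthBufferWidth ||
            s.renderTargetHeight != s.depthBufferHeight) {
            return Result::FatalError;  // size mismatch is always rejected
        }
        ++drawsIssued;
        return Result::Ok;
    }
    int drawsIssued = 0;
};

// Optional layer: wraps the core runtime and reports non-fatal problems.
// A deployed application would call CoreRuntime directly instead.
class ValidationLayer {
public:
    explicit ValidationLayer(CoreRuntime& core) : core_(core) {}
    Result draw(const DrawState& s) {
        if (s.vertexShaderOutputSignature != s.pixelShaderInputSignature) {
            warnings.push_back("shader linkage signature mismatch");
        }
        return core_.draw(s);  // the call is still forwarded unchanged
    }
    std::vector<std::string> warnings;

private:
    CoreRuntime& core_;
};

// Defined out-of-bounds behavior: reads outside the array return 0.
float loadOrZero(const std::vector<float>& data, long index) {
    if (index < 0 || static_cast<std::size_t>(index) >= data.size()) {
        return 0.0f;
    }
    return data[static_cast<std::size_t>(index)];
}

// Usage sketch: a draw with mismatched signatures succeeds but is reported.
inline void example() {
    CoreRuntime core;
    ValidationLayer debug(core);
    DrawState s{640, 480, 640, 480, "POSITION;COLOR", "POSITION;TEXCOORD"};
    assert(debug.draw(s) == Result::Ok);
    assert(debug.warnings.size() == 1);
}
```

Keeping the non-fatal checks in a wrapper means the core path pays only for the checks that are always required, which is the trade-off the section argues for.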
