Transferred from: http://blog.csdn.net/blues1021/article/details/41099705
Compiled from the following references:
http://zh.wikipedia.org/zh/Direct3D
http://blog.csdn.net/weili_2007/article/details/1907066
http://msdn.microsoft.com/en-us/library/windows/desktop/bb219679(v=vs.85).aspx#direct3d_system_integration
The concepts covered are IDirect3D, adapter, device, swap chain (surfaces: back buffer, front buffer, and depth/stencil buffer), and resources (the resource type D3DRESOURCETYPE, the format D3DFORMAT, the memory pool in which the resource is stored, and the usage flags that describe how the resource is used). Together, these concepts form the Direct3D object interface model.
I. Graphics model basics
1) Physical composition of the graphics card:
(1) Video BIOS: initializes and drives the graphics card.
(2) GPU.
(3) Video memory (RAM).
(4) RAMDAC: digital-to-analog converter.
(5) Interfaces: PCI/AGP bus and TV output.
2) Graphics data transmission model:
After the video BIOS has initialized the card, data read from disk or generated by the program (CPU and RAM) flows as follows:
(1) It enters the card through the motherboard's PCI/AGP (Accelerated Graphics Port) bus.
(2) The GPU performs the complex mathematical and geometric operations of the rendering pipeline.
(3) Video memory holds the data the GPU needs and the results it produces (pixels and their positions). Low-end cards have 1 GB or 2 GB of video memory; high-end cards have 4 GB or 6 GB.
(4) The result is sent to the monitor through VGA (CRT), VIVO, or another output port.
Direct3D graphics pipeline rendering process:
The Direct3D API defines how vertices, textures, buffers, and state are turned into an image on the screen. This process is called the rendering pipeline, and it has several stages. The stages of the Direct3D 10 rendering pipeline are:
- Input Assembler: reads the vertex data supplied by the application (loaded by the CPU from asset files on disk into system RAM) and feeds it into the pipeline.
- Vertex Shader: processes one vertex at a time, performing operations such as transformation, texturing, and lighting.
- Geometry Shader: introduced with Shader Model 4.0. It uses shader resources to perform geometric transformations on points, lines, and triangles, processes up to six vertices at a time, and can quickly combine similar vertices of a model. This stage requires no CPU involvement.
- Stream Output: writes the results of the preceding vertex or geometry shader stage to buffers in memory, where the application can read them or feed them back into the pipeline.
- Rasterizer: converts the processed primitives into pixels and passes them to the pixel shader. Other work can be done here, such as clipping whatever lies outside the view frustum or interpolating vertex data into per-pixel data.
- Pixel Shader: determines the final pixel color to be written to the render target, and can also compute a depth value to be written to the depth buffer.
- Output Merger: receives fragments from the pixel shader, performs the traditional stencil and depth tests, and merges the various outputs (color and position) to produce the final result.
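The depth test performed in the last stage can be sketched in a few lines. This is a minimal CPU-side illustration, not Direct3D API code: the `Fragment` and `Framebuffer` types and the `mergeFragment` function are hypothetical names, and a less-than depth comparison is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; not part of the Direct3D API.
struct Fragment {
    int x, y;            // pixel position
    float depth;         // depth in [0, 1], smaller is closer
    std::uint32_t color; // packed color
};

struct Framebuffer {
    int width, height;
    std::vector<std::uint32_t> color;
    std::vector<float> depth;

    Framebuffer(int w, int h)
        : width(w), height(h),
          color(static_cast<std::size_t>(w) * h, 0u),
          depth(static_cast<std::size_t>(w) * h, 1.0f) {} // 1.0 = farthest
};

// Writes the fragment only if it passes a less-than depth test.
// Returns true when the fragment was written.
bool mergeFragment(Framebuffer& fb, const Fragment& f) {
    assert(f.x >= 0 && f.x < fb.width && f.y >= 0 && f.y < fb.height);
    const std::size_t i = static_cast<std::size_t>(f.y) * fb.width + f.x;
    if (f.depth < fb.depth[i]) {
        fb.depth[i] = f.depth;
        fb.color[i] = f.color;
        return true;
    }
    return false;
}
```

A fragment that is farther away than the stored depth is rejected and leaves both buffers unchanged; a closer fragment overwrites both color and depth.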
Dividing the work between CPU and GPU, and between system memory and video memory, improves computational efficiency and memory usage and therefore program performance:
In the fixed-function pipeline, CPU code handles everything that is not rendering itself: reading graphics files, detecting hardware capabilities, creating the device from the chosen display adapter, device type, and back buffer surface parameters, creating resources through the device (textures, animation, lighting), setting device state, and setting up transforms. In short, the non-rendering calculations and the render state settings of Direct3D/OpenGL are performed in CPU code.
After the CPU submits the rendering work, the GPU performs the actual computation: clipping, back-face culling, transformation, lighting, texturing, depth testing, stencil testing, blending, and rasterization. Video memory stores the data that the graphics card renders.
So the CPU side controls the amount of resources and computation, while on the GPU side the individual computations are switched on or off as needed. The video memory hit rate should be kept as high as possible so that the GPU does not have to request data from the CPU frequently; this gives better rendering performance and lower memory overhead.
Note: Shader programming moves more of the computation to the GPU, which handles large amounts of computation efficiently, so the program achieves better performance.
3) Coordinate transformations in the render pipeline:
Local coordinate system -> world coordinate system -> view coordinate system -> back-face culling, lighting, and clipping -> projection -> viewport coordinate system -> rasterization.
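The mapping into the viewport coordinate system is simple enough to show directly. The sketch below assumes the Direct3D convention that x and y lie in [-1, 1] after projection, with y pointing up, while screen y points down; the `Viewport` and `Point3` structs and the `toViewport` function are illustrative names, not API types.

```cpp
#include <cassert>

// Illustrative types; the fields mirror a viewport rectangle plus depth range.
struct Viewport { float x, y, width, height, minZ, maxZ; };
struct Point3 { float x, y, z; };

// Maps a point from normalized device coordinates to viewport coordinates.
Point3 toViewport(const Viewport& vp, const Point3& ndc) {
    assert(vp.width > 0.0f && vp.height > 0.0f);
    return Point3{
        vp.x + (ndc.x + 1.0f) * vp.width * 0.5f,  // [-1, 1] -> [x, x + width]
        vp.y + (1.0f - ndc.y) * vp.height * 0.5f, // flip y: up becomes down
        vp.minZ + ndc.z * (vp.maxZ - vp.minZ)     // [0, 1] -> [minZ, maxZ]
    };
}
```

With an 800 x 600 viewport at the origin, the center of normalized space lands on pixel (400, 300) and the top-left corner (-1, 1) lands on (0, 0).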
II. IDirect3D unified interface
The IDirect3D object is a unified interface to the Direct3D graphics device drivers and can be used to query the hardware and software capabilities of the system's adapters (graphics cards).
IDirect3D provides a model of the graphics hardware. In this model, an adapter (identified by an unsigned integer) can create one or more devices, and one adapter can drive one or more monitors.
Note: An adapter is not exactly equivalent to a graphics card. In recent years some graphics cards have supported two adapters, a configuration called "dual head" display; IDirect3D9 treats them as different adapters.
III. Adapter: display adapter identifiers and enumeration functions
An adapter is represented by an unsigned integer identifier. It is a software abstraction of the display adapter (AGP/PCI interface, GPU, video memory, monitor). By enumerating adapters you can obtain device objects of a specified type (one adapter can create multiple devices) and query the characteristics of the monitor.
Each monitor is driven by an adapter that supports a variety of display modes and refresh rates. The adapter ID (used with the IDirect3D object) lets you determine what the user's hardware supports before creating a specific device, for example whether a render target format, a resource format, or multisampling is supported.
The IDirect3D object provides these adapter enumeration functions:
GetAdapterCount
GetAdapterDisplayMode
GetAdapterIdentifier // Gets the adapter description
CheckDeviceType // Determines whether a device type on the specified adapter supports hardware acceleration
GetDeviceCaps // Retrieves the capabilities of the device, mainly to determine whether hardware vertex processing (T&L) is supported
GetAdapterModeCount // Gets the number of display modes available on the adapter for the specified format
EnumAdapterModes // Enumerates all display modes
CheckDeviceFormat
CheckDeviceMultiSampleType
Notes on creating a device:
A Direct3D device has two modes of operation: windowed and exclusive.
In windowed mode, rendering is performed in the client area of a desktop window. Direct3D cooperates with GDI, using the StretchBlt method to present the back buffer in the window's client area.
In exclusive mode, Direct3D calls the graphics driver directly, bypassing GDI. While an exclusive-mode application is running, no other application can access the graphics card.
Usage:
1) CheckDeviceFormat:
Each adapter has multiple display modes. Each display mode comprises screen dimensions, a refresh rate, and a pixel format; Direct3D describes it with the D3DDISPLAYMODE struct.
The back buffer surface format must be compatible with the display mode format; use CheckDeviceFormat to discover compatible formats. In general, the back buffer format has the same pixel depth and color layout as the display format.
An XRGB display format can be used with an ARGB back buffer of the same depth.
2) GetDeviceCaps:
Once a suitable device type is found, we can check its rendering capabilities with GetDeviceCaps.
3) CheckDeviceType:
Tells us whether a display format and a back buffer format are valid for a specified device type.
4) CheckDeviceFormat:
Next, we can use CheckDeviceFormat to validate the formats of all resources (back buffer surfaces, depth/stencil surfaces, texture surfaces, and volume textures).
Then, if the application needs depth-based visibility testing, it should use CheckDepthStencilMatch to find a compatible depth buffer format.
Finally, an application that needs multisampling should check for support with CheckDeviceMultiSampleType.
Note: CheckDeviceType checks whether the adapter's display format is compatible with a back buffer format supported by one of its device types.
CheckDeviceFormat checks whether a device on the adapter supports a given resource format.
5) GetAdapterIdentifier:
Used to identify the make of an adapter. GetAdapterIdentifier fills in a D3DADAPTER_IDENTIFIER9 structure.
Driver and Description are intended for a user interface in which the user selects a device. DriverVersion gives the version number of the Direct3D driver.
VendorId, DeviceId, SubSysId, and Revision distinguish between different hardware chips. WHQLLevel is the driver's WHQL (Windows Hardware Quality Labs) certification information.
A value of 0 means the driver is not certified; a value of 1 means it is certified but no date information is available. Determining the WHQL level is a time-consuming operation and is generally best avoided.
You can avoid it by setting the Flags parameter of GetAdapterIdentifier to 0.
6) CreateDevice:
Some graphics cards provide multiple video outputs on a single card. D3DCREATE_ADAPTERGROUP_DEVICE lets an application drive both video outputs through a single device interface, so that resources can be shared between the outputs. D3DCREATE_DISABLE_DRIVER_MANAGEMENT and D3DCREATE_DISABLE_DRIVER_MANAGEMENT_EX disable the driver's resource management, forcing all resource management to be done by the Direct3D runtime. The EX form returns an error when a resource is created with insufficient memory; the non-EX form does not.
Direct3D uses single-precision floating-point calculations. If an application needs higher FPU precision, there are two options: either the application ensures that the FPU is in single-precision mode when it calls into Direct3D, or the application's FPU precision is saved before Direct3D performs floating-point operations and restored when it returns.
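Why an application might care about FPU precision is easy to demonstrate in standard C++, with no Direct3D calls involved: above 2^24, single precision can no longer represent every integer, while double precision still can. The function names below are illustrative.

```cpp
#include <cassert>

// Returns true if value + 1 differs from value at the precision of T.
template <typename T>
bool incrementIsVisible(T value) {
    volatile T sum = value + T(1); // volatile forces the result to be stored as T
    return sum != value;
}

// 16777216 is 2^24, where float runs out of integer precision.
void checkPrecisionLimit() {
    assert(incrementIsVisible(1000.0f));      // small floats: +1 is exact
    assert(!incrementIsVisible(16777216.0f)); // float: 2^24 + 1 rounds back to 2^24
    assert(incrementIsVisible(16777216.0));   // double: still exact
}
```

An application that relies on double-precision results for its own calculations is affected if the FPU is left in single-precision mode, which is why the precision either has to be agreed on or preserved across calls.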
The pure device flag minimizes the internal state tracking done by the runtime, which can improve application performance.
The presentation parameters describe how the device presents rendered output on the monitor. Each member describes one aspect of the device's presentation behavior. CreateDevice may modify these values before it returns, so the structure must be writable.
D3DPRESENTFLAG_DEVICECLIP restricts the result of a present operation to the client area in windowed mode; it is supported on Windows XP and Windows 2000. D3DPRESENTFLAG_VIDEO hints that the back buffer contains video. D3DPRESENTFLAG_DISCARD_DEPTHSTENCIL discards the contents of the depth/stencil surface after Present is called. This makes the depth/stencil surface a writable surface. If the depth/stencil surface format is D3DFMT_D16_LOCKABLE or D3DFMT_D32_LOCKABLE, setting this flag returns an error.
In windowed mode, hDeviceWindow specifies the handle of the window to render into. If hDeviceWindow is NULL, the focus window is used as the render window. FullScreen_RefreshRateInHz must be 0.
In exclusive mode, hDeviceWindow specifies the top-level window that the application uses. If there are multiple devices in the system, only one device can use the focus window as its hDeviceWindow.
BackBufferWidth, BackBufferHeight, and BackBufferFormat must equal the corresponding members of one of this adapter's D3DDISPLAYMODE entries.
FullScreen_PresentationInterval specifies the desired relationship between the presentation rate and the screen refresh rate.
FullScreen_RefreshRateInHz is either a valid refresh rate or the default value D3DPRESENT_RATE_DEFAULT.
D3DPRESENT_RATE_DEFAULT tells the runtime to choose a suitable refresh rate in exclusive mode and to use the current refresh rate in windowed mode.
7) GetAdapterMonitor:
On multi-monitor systems, the virtual desktop is the bounding rectangle that contains all the adapters taking part in the Windows desktop.
Other adapters that do not take part can also be attached to the system. All adapters on the desktop share at least one pixel of boundary.
An application may want a full-screen display on one particular monitor; GetAdapterMonitor returns the HMONITOR handle of an adapter.
With this handle, you can determine which part of the virtual desktop that monitor occupies.
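The bounding-rectangle idea can be sketched without any Windows calls. In the example below the monitor rectangles are hard-coded; in a real application they would be queried from the system for each HMONITOR. The `Rect` struct and `virtualDesktop` function are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative rectangle in desktop coordinates; right and bottom are exclusive.
struct Rect { int left, top, right, bottom; };

// Returns the smallest rectangle that contains every monitor rectangle.
Rect virtualDesktop(const std::vector<Rect>& monitors) {
    assert(!monitors.empty());
    Rect bounds = monitors.front();
    for (const Rect& m : monitors) {
        bounds.left   = std::min(bounds.left, m.left);
        bounds.top    = std::min(bounds.top, m.top);
        bounds.right  = std::max(bounds.right, m.right);
        bounds.bottom = std::max(bounds.bottom, m.bottom);
    }
    return bounds;
}
```

Note that a monitor placed to the left of the primary one has negative x coordinates, so the virtual desktop does not necessarily start at (0, 0).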
8) NumberOfAdaptersInGroup:
A multihead adapter can provide several different video outputs from a single card. When you create a device with D3DCREATE_ADAPTERGROUP_DEVICE, you must supply an array of D3DPRESENT_PARAMETERS.
The number of elements in this array must not be less than the NumberOfAdaptersInGroup member of D3DCAPS9. Only one D3DPRESENT_PARAMETERS can use the focus window as its device window;
the rest must use their own top-level windows as device windows. Regardless of how many swap chains are created, only one depth/stencil surface is created.
IV. Device object
A device is specified by the IDirect3D object, an adapter, a device type (D3DDEVTYPE), and surface properties (D3DPRESENT_PARAMETERS), so it is more specific than an adapter. The adapter is closer to the hardware, while the device is the more software-oriented "real device object". Direct3D provides methods for enumerating and creating devices. All other objects are created through the device.
The device creates the various resources (in CPU RAM, AGP memory, or video memory) and handles the state settings and operations applied to them (transformation, lighting, rasterization). In the newer D3D11, a separate context is responsible for resource operations.
The device is the core of D3D. It wraps the entire graphics pipeline, including transformation, lighting, and rasterization (shading). The pipeline differs between D3D versions; for example, D3D10 adds the new geometry shader (GS) stage. The device creates and manages swap chains, surfaces, and resources, applies the various transform settings, and is responsible for rendering the data.
A device contains at least one swap chain and a collection of resources used for rendering.
The HAL (Hardware Abstraction Layer) device uses hardware-accelerated rendering, so it is the fastest device type.
The reference device is available only with an installed SDK and contains a software implementation of the entire graphics pipeline. Although this device is very slow, it is useful for debugging graphics applications.
The null reference device does nothing; all rendering produces just a black screen. When the SDK is not installed but the application requests a reference device, a null reference device is returned.
Pluggable software devices are provided through the RegisterSoftwareDevice method; Direct3D 9.0c does not include a pluggable software device.
Architecturally, Direct3D devices contain a transformation module, a lighting module, and a rasterizing module, as the following diagram shows.
V. Swap chains and surfaces (buffers)
Swap chains and surfaces are also resources. They are specified in D3DPRESENT_PARAMETERS and created together with the device by CreateDevice. You can also create multiple back buffers to form a swap chain, which rotates like a queue on each Present.
A swap chain contains one or more back buffer surfaces. A surface is a resource that holds a rectangular array of pixel data, such as color, alpha, depth/stencil, or texture information.
Every back buffer is a valid render target, but not every render target is a back buffer. A texture surface can also serve as the render target for dynamic rendering.
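The queue-like rotation of buffers on each present can be modeled with a single index. This is a toy model under the assumption of a flip-style swap chain, not the Direct3D API; `SwapChainModel` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of a flip-style swap chain: buffers are identified by index.
class SwapChainModel {
public:
    // bufferCount counts the front buffer plus all back buffers (at least 2).
    explicit SwapChainModel(std::size_t bufferCount)
        : count_(bufferCount), front_(0) {
        assert(bufferCount >= 2);
    }

    // Index of the buffer currently shown on screen.
    std::size_t frontBuffer() const { return front_; }

    // Index of the i-th back buffer; back buffer 0 is shown next.
    std::size_t backBuffer(std::size_t i) const {
        assert(i + 1 < count_);
        return (front_ + 1 + i) % count_;
    }

    // The first back buffer becomes the front buffer, and the old front
    // buffer moves to the end of the queue.
    void present() { front_ = (front_ + 1) % count_; }

private:
    std::size_t count_;
    std::size_t front_;
};
```

With three buffers, the front buffer index cycles 0, 1, 2, 0, and the buffer that was just on screen is always the last back buffer in the queue.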
Further notes:
DX11:
1. The device acts as the "real object": it manages the storage of the various data, communicates with the hardware, and is used to create and manipulate resources.
2. The context is more of an "interface for operations": the device delegates many data operations to the context interface, which is typically used to manipulate the various buffers and to tell the device how to draw.
3. The swap chain implements one or more surfaces that store rendered data before it is output to the output device. The data is buffered in the back buffers, and most of the swap chain's functions concern the state of these buffers.
VI. Runtime (device)
Internally, the device is managed by the runtime, which handles the device's various operations. The runtime has an important structure, the command buffer (presumably in system RAM), which it uses to dispatch commands.
VII. Driver (GPU)
The runtime is implemented on top of the adapter (GPU) driver. Drivers are divided into the HAL driver and the REF driver; a driver consists of the predefined routines and graphics operation instructions for the graphics hardware or for its software reference implementation.
The runtime passes the device commands to the driver, which performs the physical operations. The driver also has its own command structure, the driver buffer (presumably in video memory).
VIII. Resources
Resources are stored in or near the device hardware and supply the data needed for rendering.
The resources provided by Direct3D comprise scene geometry (vertices and indices) and appearance data (images, textures, and volumes).
Each resource has Type, Pool, Format, and Usage properties, which are specified once when the resource is created.
1) Type:
The Type property specifies the kind of resource, as defined in the D3DRESOURCETYPE enumeration.
typedef enum _D3DRESOURCETYPE
{
    D3DRTYPE_SURFACE = 1,
    D3DRTYPE_VOLUME = 2,
    D3DRTYPE_TEXTURE = 3,
    D3DRTYPE_VOLUMETEXTURE = 4,
    D3DRTYPE_CUBETEXTURE = 5,
    D3DRTYPE_VERTEXBUFFER = 6,
    D3DRTYPE_INDEXBUFFER = 7
} D3DRESOURCETYPE;
2) Pool:
The Pool property describes how Direct3D manages the resource's memory, as defined in the D3DPOOL enumeration.
When a device is lost, the resources in its default pool are lost as well.
typedef enum _D3DPOOL
{
    D3DPOOL_DEFAULT = 0, // Resources are placed in device memory by default.
    D3DPOOL_MANAGED = 1, // Resources live in system memory and are copied into device memory when they are needed.
    D3DPOOL_SYSTEMMEM = 2, // Resources exist only in system memory (CPU RAM).
    D3DPOOL_SCRATCH = 3 // Resources exist only in system memory and are not restricted by device format limitations.
} D3DPOOL;
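The pool rules above can be condensed into two small predicates. This is only a summary sketch: the `Pool` enum mirrors the numeric values of D3DPOOL, while the function names are hypothetical and not part of Direct3D.

```cpp
// Mirrors the D3DPOOL values listed above; names are illustrative.
enum class Pool { Default = 0, Managed = 1, SystemMem = 2, Scratch = 3 };

// True if resources in this pool are lost together with the device,
// so the application has to recreate them.
constexpr bool lostWithDevice(Pool pool) {
    return pool == Pool::Default;
}

// True if the resource data is kept in system memory.
constexpr bool keptInSystemMemory(Pool pool) {
    return pool != Pool::Default;
}

static_assert(lostWithDevice(Pool::Default), "default pool is lost with the device");
static_assert(!lostWithDevice(Pool::Managed), "managed pool survives device loss");
```

In practice this is why code that handles a lost device usually keeps a list of its default-pool resources: those are the only ones it has to release and recreate.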
Note:
The amount of memory available to the application depends on the current display mode. If the video mode requested for the device differs from the desktop's video mode,
creating an exclusive-mode device causes a display mode change. To avoid this glitch when the application starts, it is best to measure the available memory at installation time and save the result.
This is safe because the amount of memory available in a particular display mode does not change unless the hardware is replaced.
3) Format:
The Format property describes the layout of the resource's data in memory, as defined in the D3DFORMAT enumeration. Every resource has a format, but most of the enumerators describe pixel data.
typedef enum _D3DFORMAT
{
    D3DFMT_UNKNOWN = 0,
    D3DFMT_INDEX16 = 101,
    D3DFMT_INDEX32 = 102,
    D3DFMT_VERTEXDATA = 100,
    D3DFMT_A4L4 = 52,
    D3DFMT_A8 = 28,
    ...
} D3DFORMAT;
In format names, the An, Ln, Pn, Rn, Gn, and Bn channels are unsigned, while Un, Vn, Wn, and Qn are signed. Dn and Sn denote the depth and stencil data of a depth/stencil surface.
The MAKEFOURCC macro generates a four-character code (FOURCC). Additional vendor-specific formats can be defined with MAKEFOURCC. The DXTn formats are compressed texture formats.
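A four-character code packs four bytes into a 32-bit value, with the first character in the lowest byte. The sketch below re-implements that packing as a `constexpr` function so it can be compiled without the Windows headers; `makeFourCC` is an illustrative stand-in for the macro, not the macro itself.

```cpp
#include <cstdint>

// Packs four characters into a 32-bit code, first character in the low byte.
constexpr std::uint32_t makeFourCC(char c0, char c1, char c2, char c3) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(c0)) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c1)) << 8) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c2)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c3)) << 24);
}

// 'D' = 0x44, 'X' = 0x58, 'T' = 0x54, '1' = 0x31
static_assert(makeFourCC('D', 'X', 'T', '1') == 0x31545844u, "DXT1 code");
```

Read as bytes in memory on a little-endian machine, the value spells out the four characters in order, which is what makes such codes easy to recognize in a hex dump.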
The D3DCOLOR pixel format is D3DFMT_A8R8G8B8, whereas PALETTEENTRY and COLORREF have R and B swapped.
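The difference in channel order is easiest to see in code. D3DCOLOR is laid out as 0xAARRGGBB, while a GDI COLORREF is laid out as 0x00BBGGRR, so converting between them exchanges the red and blue bytes. The helper names below are illustrative, and plain 32-bit integers stand in for the Windows typedefs.

```cpp
#include <cstdint>

// Packs channels in D3DCOLOR / D3DFMT_A8R8G8B8 order: 0xAARRGGBB.
constexpr std::uint32_t packArgb(std::uint32_t a, std::uint32_t r,
                                 std::uint32_t g, std::uint32_t b) {
    return ((a & 0xFFu) << 24) | ((r & 0xFFu) << 16) |
           ((g & 0xFFu) << 8) | (b & 0xFFu);
}

// Converts 0xAARRGGBB to COLORREF order 0x00BBGGRR (alpha is dropped).
constexpr std::uint32_t colorrefFromArgb(std::uint32_t argb) {
    return ((argb >> 16) & 0xFFu) |  // red moves to the low byte
           (argb & 0x0000FF00u) |    // green stays in place
           ((argb & 0xFFu) << 16);   // blue moves up
}

static_assert(colorrefFromArgb(packArgb(0xFF, 0x11, 0x22, 0x33)) == 0x00332211u,
              "red and blue are exchanged");
```

Forgetting this conversion is a common cause of images that appear with red and blue tints exchanged when pixels are passed between GDI and Direct3D.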
4) Usage:
The Usage property is a set of flags that describe how the resource will be used, for example as a dynamic resource, a render target, or a depth/stencil surface. Which flags are valid depends on the resource type and on the memory the resource is stored in, so each flag is restricted to certain resource types or certain kinds of memory.
The flags include D3DUSAGE_AUTOGENMIPMAP, D3DUSAGE_DEPTHSTENCIL, D3DUSAGE_DMAP, D3DUSAGE_DONOTCLIP, D3DUSAGE_DYNAMIC, D3DUSAGE_NPATCHES, D3DUSAGE_POINTS, D3DUSAGE_RENDERTARGET, D3DUSAGE_RTPATCHES, D3DUSAGE_SOFTWAREPROCESSING, and D3DUSAGE_WRITEONLY.
Direct3D basic concepts and model summary