D3D9 GPU Hacks (reprint)

Source: Internet
Author: User

I've been trying to catch up on what hacks GPU vendors have exposed in Direct3D 9, and it turns out there are a lot of them!

If you know more hacks or more details, please let me know in the comments!

Most hacks are exposed as custom ("FOURCC") formats, so to check for one, call CheckDeviceFormat. Here's the list (Usage column codes: DS=DepthStencil, RT=RenderTarget; Resource column codes: tex=texture, surf=surface). An empty vendor cell means no support is listed for that vendor.

| Format | Usage | Resource | Description | NVIDIA GeForce | ATI Radeon | Intel |
|---|---|---|---|---|---|---|
| Shadow Mapping | | | | | | |
| D3DFMT_D16 | DS | tex | Sample depth buffer directly as shadow map. | 3+ | HD 2xxx+ | 965+ |
| D3DFMT_D24X8 | DS | tex | (as above) | 3+ | HD 2xxx+ | 965+ |
| Depth Buffer as Texture | | | | | | |
| DF16 | DS | tex | Read depth buffer as texture. | | 9500+ | G45+ |
| DF24 | DS | tex | (as above) | | X1300+ | SB+ |
| INTZ | DS | tex | (as above) | 8+ | HD 4xxx+ | G45+ |
| RAWZ | DS | tex | (as above) | 6 & 7 | | |
| Anti-aliasing related | | | | | | |
| RESZ | RT | surf | Resolve MSAA'd depth stencil surface into non-MSAA'd depth texture. | | HD 4xxx+ | G45+ |
| ATOC | 0 | surf | Transparency anti-aliasing. | 7+ | | SB+ |
| SSAA | 0 | surf | (as above) | 7+ | | |
| All ATI SM2.0+ hardware | | | (as above) | | 9500+ | |
| n/a | | | Coverage Sampled Anti-aliasing [6] | 8+ | | |
| Texturing | | | | | | |
| ATI1 | 0 | tex | ATI1n & ATI2n texture compression formats. | 8+ | X1300+ | G45+ |
| ATI2 | 0 | tex | (as above) | 6+ | 9500+ | G45+ |
| DF24 | DS | tex | Fetch 4: when sampling a 1 channel texture, return four touched texel values [1]. Check for DF24 support. | | X1300+ | SB+ |
| Misc | | | | | | |
| NULL | RT | surf | Dummy render target surface that does not consume video memory. | 6+ | HD 4xxx+ | HD+ |
| NVDB | 0 | surf | Depth Bounds Test. | 6+ | | |
| R2VB | 0 | surf | Render into vertex buffer. | 6 & 7 | 9500+ | |
| INST | 0 | surf | Geometry instancing on pre-SM3.0 hardware. | | 9500+ | |
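
The format names in the list above are four-character codes packed into a 32-bit value. Here is a minimal sketch of that packing, written without the Windows headers so it compiles anywhere. The helper make_fourcc and the constant names are hypothetical (kFourccINST just echoes the name used in the instancing section). The byte order, with the first character in the lowest byte, is the same as the MAKEFOURCC macro in the Windows SDK.

```cpp
#include <cstdint>

// Packs four ASCII characters into a 32-bit code, first character in the
// lowest byte. Hypothetical stand-in for the Windows SDK MAKEFOURCC macro.
constexpr std::uint32_t make_fourcc(char a, char b, char c, char d) {
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(a))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24);
}

// Illustrative constants for a few of the formats listed above.
constexpr std::uint32_t kFourccINTZ = make_fourcc('I', 'N', 'T', 'Z');
constexpr std::uint32_t kFourccNULL = make_fourcc('N', 'U', 'L', 'L');
constexpr std::uint32_t kFourccINST = make_fourcc('I', 'N', 'S', 'T');

// 'I' (0x49) lands in the lowest byte, 'Z' (0x5A) in the highest.
static_assert(kFourccINTZ == 0x5A544E49u, "unexpected FOURCC packing");
```

In a real D3D9 application the resulting value would be cast to D3DFORMAT and passed as the format to check in IDirect3D9::CheckDeviceFormat, together with the usage and resource type from the matching row; a successful return suggests the driver exposes the hack.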

Native Shadow Mapping

Native support for shadow map sampling & filtering was introduced ages ago (GeForce 3) by NVIDIA. Turns out ATI also implemented the same feature for its DX10 level cards. Intel also supports it on Intel 965 (aka GMA X3100, the Shader Model 3 card) and later (G45/X4500/HD) cards.

The usage is quite simple: just create a texture with a regular depth/stencil format and render into it. When reading from the texture, one extra component in the texture coordinates is the depth to compare with. The compared & filtered result is returned.
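
To make "compared & filtered" a bit more concrete, here is a CPU-side sketch of what such a lookup conceptually computes: each of the four neighbouring depth texels is compared against the reference depth, and the four pass/fail results are then bilinearly blended. This is only an illustration, not the hardware's actual implementation; the function names and the comparison direction (lit when the reference depth is not greater than the stored depth) are assumptions.

```cpp
// 1.0 when the fragment is not farther than the stored depth (lit),
// 0.0 otherwise. The comparison direction is an assumption of this sketch.
constexpr float depth_test(float stored_depth, float ref_depth) {
    return ref_depth <= stored_depth ? 1.0f : 0.0f;
}

// Linear blend between a and b by t in [0, 1].
constexpr float mix(float a, float b, float t) {
    return a * (1.0f - t) + b * t;
}

// Compares the four neighbouring texels (d00 top-left, d10 top-right,
// d01 bottom-left, d11 bottom-right) against ref_depth, then bilinearly
// blends the comparison results. fx and fy are the fractional position of
// the sample between the texels, in [0, 1].
constexpr float compare_and_filter(float d00, float d10, float d01, float d11,
                                   float fx, float fy, float ref_depth) {
    return mix(mix(depth_test(d00, ref_depth), depth_test(d10, ref_depth), fx),
               mix(depth_test(d01, ref_depth), depth_test(d11, ref_depth), fx),
               fy);
}
```

For a sample exactly between four texels where only one of them is closer than the reference depth, this gives 0.75, i.e. a quarter of the footprint is in shadow.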

Also useful:

    • Creating NULL color surface to keep D3D runtime happy and save on video memory.

Depth Buffer as Texture

For some rendering schemes (anything with "deferred") or some effects (SSAO, depth of field, volumetric fog, ...) having access to a depth buffer is needed. If the native depth buffer can be read as a texture, this saves both memory and a rendering pass or an extra output for MRTs.

Depending on hardware, this can be achieved via INTZ, RAWZ, DF16 or DF24 formats:

    • INTZ is for recent (DX10+) hardware. With recent drivers, all three major IHVs expose this. According to ATI [1], it also allows using the stencil buffer while rendering. It also allows reading from the depth texture while it's still being used for depth testing (but not depth writing). Looks like this applies to NV & Intel parts as well.
    • RAWZ is for GeForce 6 & 7 series only. Depth is specially encoded into four channels of the returned value.
    • DF16 and DF24 are for ATI and Intel cards, including older cards that don't support INTZ. Unlike INTZ, this does not allow using a stencil buffer or using the surface for both sampling & depth testing at the same time.

Also useful when using depth textures:

    • Creating NULL color surface to keep D3D runtime happy and save on video memory.
    • RESZ allows resolving multisampled depth surfaces into non-multisampled depth textures (the result will be sample zero for each pixel).

Caveats:

    • Using INTZ for both depth/stencil testing and sampling at the same time seems to have performance problems on ATI cards (checked Radeon HD 3xxx to 5xxx with Catalyst 9.10 to 10.5). A workaround is to render to INTZ depth/stencil first, then use RESZ to "blit" it into another surface. Then do sampling from one surface, and depth testing on another.

Depth Bounds Test

Direct equivalent of the GL_EXT_depth_bounds_test OpenGL extension. See [3] for more information.

Transparency anti-aliasing

NVIDIA exposes two controls: transparency multisampling (ATOC) and transparency supersampling (SSAA) [5]. ATI says that all Radeons since 9500 support "alpha to coverage" [1]. Intel supports ATOC with Sandy Bridge (GMA HD 2000/3000) GPUs.

Render into Vertex Buffer

Similar to "stream out" or "memexport" in other APIs/platforms. See [2] for more information. Apparently some NVIDIA GPUs (or drivers?) support this as well.

Geometry instancing

Instancing is supported on all Shader Model 3.0 hardware by Direct3D 9.0c, so there are no extra hacks necessary there. ATI has exposed a capability to enable instancing on their Shader Model 2.0 hardware as well. Check for "INST" support, and do dev->SetRenderState (D3DRS_POINTSIZE, kFourccINST); at startup to enable instancing.

I can't find any document on instancing from AMD now. Other references: [7] and [8].

ATI1n & ATI2n Compressed Texture Formats

Compressed texture formats. ATI1n is known as the BC4 format in DirectX 10 land; ATI2n as BC5 or 3Dc. Since they are just DX10 formats, support for this is quite widespread, with NVIDIA exposing it a while ago and Intel exposing it recently (drivers 15.17 or higher).

Thing to keep in mind: when DX9 allocates the mip chain, it checks if the format is a known compressed format and allocates the appropriate space for the smallest mip levels. For example, a 1x1 DXT1 compressed level actually takes up 8 bytes, as the block size is fixed at 4x4 texels. This is true for all block compressed formats. Now when using the hacked formats, DX9 doesn't know it's a block compression format and will only allocate the number of bytes the mip would have taken if it weren't compressed. For example, a 1x1 ATI1n level would only have 1 byte allocated. What you need to do is stop the mip chain before the size of either dimension shrinks below the block dimensions, otherwise you risk memory corruption.
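
The "stop the mip chain early" rule can be sketched as a small helper that counts how many levels are safe to request. This is a minimal illustration under the assumption of 4x4 blocks and mip dimensions that halve (rounding down) at each level; the function names are made up for this example.

```cpp
#include <cstdint>

// Block-compressed formats such as ATI1n/ATI2n store 4x4 texel blocks.
constexpr std::uint32_t kBlockDim = 4;

// Number of mip levels, including the base level, that can be created
// before either dimension shrinks below the block size. Returns 0 when even
// the base level is smaller than one block.
constexpr std::uint32_t safe_mip_levels(std::uint32_t width, std::uint32_t height) {
    return (width < kBlockDim || height < kBlockDim)
        ? 0u
        : 1u + safe_mip_levels(width / 2, height / 2);
}

// Bytes a block-compressed level really needs: dimensions are rounded up to
// whole blocks. bytes_per_block is 8 for DXT1 and ATI1n (BC4), 16 for
// ATI2n (BC5).
constexpr std::uint32_t block_compressed_level_bytes(std::uint32_t width,
                                                     std::uint32_t height,
                                                     std::uint32_t bytes_per_block) {
    return ((width + kBlockDim - 1) / kBlockDim)
         * ((height + kBlockDim - 1) / kBlockDim)
         * bytes_per_block;
}
```

For a 256x256 texture this yields 7 levels (256 down to 4), and a 1x1 level of an 8-byte-per-block format needs 8 bytes, which is where the 1 byte allocation described above falls short. The level count would typically go into the Levels argument of CreateTexture instead of 0 (full chain).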

Another thing to keep in mind: on the Vista+ (WDDM) driver model, textures in these formats will still consume application address space. Most regular textures like DXT5 don't take up additional address space in WDDM. For some reason ATI1n and ATI2n textures on D3D9 are deemed lockable.

References

All this information was gathered mostly from:

    1. Advanced DX9 Capabilities for ATI Radeon Cards (PDF)
    2. ATI R2VB Programming (PDF)
    3. NVIDIA GPU Programming Guide (PDF)
    4. ATI Tessellation
    5. NVIDIA Transparency AA
    6. NVIDIA Coverage Sampled AA
    7. Humus' Instancing Demo
    8. Arseny's article on particles

Changelog
    • 11: One more note on ATI1n/ATI2n format virtual address space issue (thanks jseb!).
    • 09: Turns out that for some time Intel has had DF24 and Fetch4 on Sandy Bridge and later.
    • 09: Intel implemented ATOC for Sandy Bridge, and NULL for GMA HD and later.
    • 25: Intel implemented DF16, INTZ, RESZ for G45+ GPUs!
    • 25: Added note on INTZ performance issue with ATI cards.
    • 19: Intel implemented ATI1n/ATI2n support for G45+ GPUs in the latest drivers!
    • 08: Added note on ATI1n/ATI2n texture formats, with a caveat pointed out by Henning Semler (thanks!)
    • 06: Hey, shadow map hacks are also supported on Intel 965!
    • 09: Shadow map hacks are supported on Intel G45!
    • 21: Added instancing on SM2.0 hardware.
    • 20: Added Fetch-4, CSAA.
    • 20: Initial version.

Original link: http://aras-p.info/texts/D3D9GPUHacks.html
