Since the launch of the NVIDIA GTX 480, many comparative reviews have appeared online. Most of them, however, rely on broad benchmark suites such as 3DMark and only produce an overall score. My plan here is to use the samples from the DirectX SDK (February 2010) to evaluate different parts of the GPU separately. Results like these are more meaningful to graphics programmers.
The evaluation is conducted in three rounds, covering the traditional graphics pipeline, DirectCompute, and tessellation. The test machines are two Dell T5400 workstations (quad-core Xeon E5440, 4 GB memory), one with a GTX 480 and the other with an HD 5870. Both graphics cards are reference designs. The OS is 64-bit Windows 7. The classic theme is used instead of Aero to eliminate the pixel shader overhead of the Aero interface. The graphics drivers are ForceWare 197.41 and Catalyst 10.3.
First round: traditional graphics pipeline
The first test measures the graphics performance of the two cards in common game scenarios. The samples are CascadedShadowDepthMap, Contact Hardening Shadows, VarianceShadows11, and DynamicShaderLinkage11. The test results are as follows (higher is better):
| Sample | NVIDIA GeForce GTX 480 | ATI Radeon HD 5870 |
|---|---|---|
| CascadedShadowDepthMap | 402.7 | 366.4 |
| Contact Hardening Shadows | 716.8 | 740.5 |
| VarianceShadows11 | 391.2 | 346.5 |
| DynamicShaderLinkage11 | 1151.2 | 998 |
The results show that the GTX 480 has some advantage in the traditional graphics pipeline. In particular, CSM and VSM are both used in many games, and there the GTX 480 is 10% and 13% faster than the HD 5870 respectively. DynamicShaderLinkage11 is worth noting. Dynamic shader linkage is a new D3D11 feature that lets shaders use something like virtual functions. The calling code only needs to go through an interface, and the implementation can be linked in dynamically. This is useful for dealing with the combinatorial explosion of shader variants. In this sample the GTX 480 is 15% faster. Contact Hardening Shadows is the only sample in which the GTX 480 loses (perhaps because it is a demo from AMD). Its shader uses new D3D11 functions such as GatherCmpRed (ATI has supported Gather since D3D10.1) and involves a lot of vector computation, which suits the HD 5870 architecture better.
The first round goes to the GTX 480 by a small margin.
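To make the combinatorial-explosion point concrete, here is a small Python sketch. It is neither HLSL nor code from the SDK sample, and the feature names are hypothetical. It only counts how many statically compiled shader variants a feature set would need, compared with how many implementations have to be written when options are bound through interfaces at run time, which is roughly what dynamic shader linkage makes possible.

```python
from itertools import product

# Hypothetical feature axes for an uber-shader; the names are illustrative
# and are not taken from the DynamicShaderLinkage11 sample.
FEATURES = {
    "light": ["ambient", "directional", "point", "spot"],
    "material": ["plastic", "rough", "textured"],
    "shadow": ["none", "hard", "soft"],
}


def static_permutations(features):
    """Without dynamic linkage, every combination is its own compiled shader."""
    return list(product(*features.values()))


def linked_implementations(features):
    """With run-time binding, each option is written once behind an
    interface and the combination is chosen when the shader is bound."""
    return [(axis, option)
            for axis, options in features.items()
            for option in options]


if __name__ == "__main__":
    print(len(static_permutations(FEATURES)))     # 4 * 3 * 3 = 36
    print(len(linked_implementations(FEATURES)))  # 4 + 3 + 3 = 10
```

The variant count grows as a product of the options per axis, while the number of interface implementations grows only as their sum.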
Second round: DirectCompute
DirectCompute is the headline feature of D3D11. For the first time, it brings GPGPU into a mainstream graphics API. So how does DirectCompute perform on these two cards? The samples chosen here are NBodyGravityCS11, AdaptiveTessellationCS40, and HDRToneMappingCS11. Note that AdaptiveTessellationCS40 implements tessellation in a compute shader rather than with the hardware tessellator. The test results are as follows (higher is better):
| Sample | NVIDIA GeForce GTX 480 | ATI Radeon HD 5870 |
|---|---|---|
| NBodyGravityCS11 | 275.6 | 364.7 |
| AdaptiveTessellationCS40 | 554.4 | 837.4 |
| HDRToneMappingCS11 | 1710.9 | 3393.3 |
Clearly the HD 5870 wins, leading by 32%, 51%, and 98% respectively. This shows that ATI supports DirectCompute quite well. HDRToneMappingCS11, for example, can switch between PS and CS for its post-processing. On the HD 5870, CS is a little faster than PS, whereas on the GTX 480, CS runs at only half the speed of PS. It is worth noting that, ever since G80, NVIDIA cards have had a non-negligible overhead when switching from GPGPU back to the graphics pipeline. A program that needs to switch back and forth therefore takes a large performance hit. This is true not only for DirectCompute but also for CUDA.
Sharp-eyed readers will notice that the OIT11 sample also uses DirectCompute. Why was it left out? Because the sample produces incorrect output on the GTX 480, so its result cannot be counted.
The second round goes to the HD 5870 decisively.
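The percentages quoted for this round can be reproduced with a few lines of Python. This is a minimal sketch that assumes the table figures are frame rates where higher is better, which is how the text treats them. The function names are my own.

```python
# Figures copied from the second-round table: (GTX 480, HD 5870).
# Assumption: higher is better.
RESULTS = {
    "NBodyGravityCS11": (275.6, 364.7),
    "AdaptiveTessellationCS40": (554.4, 837.4),
    "HDRToneMappingCS11": (1710.9, 3393.3),
}


def lead_percent(winner, loser):
    """Relative lead of `winner` over `loser`, in percent."""
    return (winner / loser - 1.0) * 100.0


def hd5870_leads(results):
    """Map each sample name to the HD 5870's lead over the GTX 480."""
    return {name: lead_percent(hd, gtx) for name, (gtx, hd) in results.items()}


if __name__ == "__main__":
    for name, lead in hd5870_leads(RESULTS).items():
        print(f"{name}: HD 5870 leads by {lead:.0f}%")
```

The lead is taken relative to the slower card, so a 98% lead means nearly twice the throughput.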
Third round: tessellation
D3D11 adds tessellation stages to the graphics pipeline, so that subdivision surfaces and other fine detail can be rendered easily in real time. NVIDIA placed great emphasis on tessellation in its marketing and claims to be up to eight times faster than the HD 5870. The test results are as follows (higher is better):
| Sample | NVIDIA GeForce GTX 480 | ATI Radeon HD 5870 |
|---|---|---|
| DetailTessellation11 | 1065.1 | 743.2 |
| PN-Triangles | 731.4 | 715.6 |
| SimpleBezier11 | 1217.9 | 1165.1 |
| SubD11 | 471.7 | 360.9 |
The results are a surprise. The GTX 480 leads by at most 43% in these samples, and the two simplest samples show an advantage of less than 5%. This may be because these samples are too simple to play to the GTX 480's strengths.
The third round goes to the GTX 480 by a small margin.
The results of these three rounds show that DirectCompute and tessellation are the strengths of ATI and NVIDIA respectively, though neither card is in a league of its own. If you are doing pure graphics rendering, the GTX 480 is a good choice. For GPGPU combined with rendering, the HD 5870 is much better.