From buffer to display
Recently I was looking at the d3d architecture, and in the process I came to understand frame rate better. I used to think the frame rate was simply a matter of how long the graphics card takes to render a frame. It turns out to be far from that simple.
To really understand the problem, we need to start with how data gets from the graphics card to the screen.
Video memory holds a front buffer, which contains the final pixels seen on the screen, and a back buffer, which is used for drawing. Once a frame has been drawn into the back buffer, the buffers are usually swapped so that the new frame ends up in the front buffer. Meanwhile, the monitor constantly reads data from the front buffer.
Swap rate and refresh rate
What we usually care about is rendering efficiency, that is, how fast frames are drawn and swapped. This is F1 in the figure, the swap rate. F1 determines how many frames the graphics card can draw in one second. I used to think this was the frame rate the user sees. It is not, because other factors are involved.
The display also reads the front buffer at its own frequency, F2. F2 is also called the refresh rate: the frequency at which the contents of the front buffer are drawn to the display, regardless of whether that data is new or old. So the frame rate the end user sees is the combined result of F1 and F2. The actual frame rate F should be expressed as the number of buffer swaps the monitor can actually show, that is, the number of buffered frames transmitted to the monitor within one second. In this case F = min(F1, F2), limited by both rates. For example, if you render 60 frames per second but the display is refreshed only 30 times per second, the frame rate is only 30. Conversely, if rendering is slow because the model is very large, say 10 frames per second with 60 refreshes per second, the user sees only 10. So it seems that the frame rate is determined by these two rates, but that is still not the whole story.
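The two examples above can be checked with a few lines of Python. This is only a sketch of the F = min(F1, F2) rule; the function name displayed_frame_rate is made up for illustration.

```python
def displayed_frame_rate(swap_rate_f1, refresh_rate_f2):
    """Upper bound on the frames per second that can reach the screen.

    swap_rate_f1: how many frames per second the graphics card draws and swaps (F1).
    refresh_rate_f2: how many times per second the display reads the front buffer (F2).
    """
    return min(swap_rate_f1, refresh_rate_f2)


# Fast rendering, slow refresh: the display is the limit.
print(displayed_frame_rate(60, 30))  # 30
# Slow rendering, fast refresh: the renderer is the limit.
print(displayed_frame_rate(10, 60))  # 10
```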
Vertical synchronization and frame rate
Notice that the front buffer is written by buffer swaps and read by the monitor at the same time, which can easily cause read/write conflicts. The display is refreshed line by line from top to bottom. In a typical case, the next frame is written into the front buffer before the display has finished reading the current one, so the upper and lower parts of the screen show different frames. This is the well-known "image tearing" phenomenon: the buffer swap happens too quickly, without waiting for the display to finish reading.
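To make tearing concrete, here is a toy model of a single scan-out pass. It assumes the front buffer initially holds frame 0 and is replaced by frame 1 just before a given row is read; scan_out and is_torn are hypothetical helpers, not part of any graphics API.

```python
from typing import List, Optional


def scan_out(rows: int, swap_at_row: Optional[int]) -> List[int]:
    """Return, for each scanline, the id of the frame it was read from.

    The front buffer starts out holding frame 0. If swap_at_row is given,
    the front buffer is replaced by frame 1 just before that row is read.
    """
    front_frame = 0
    shown = []
    for row in range(rows):
        if swap_at_row is not None and row == swap_at_row:
            front_frame = 1  # the swap lands in the middle of the scan
        shown.append(front_frame)
    return shown


def is_torn(shown: List[int]) -> bool:
    """A displayed image is torn if its scanlines come from more than one frame."""
    return len(set(shown)) > 1


print(scan_out(6, 3))              # [0, 0, 0, 1, 1, 1]
print(is_torn(scan_out(6, 3)))     # True
print(is_torn(scan_out(6, None)))  # False
```

The top three rows come from the old frame and the bottom three from the new one, which is exactly the split picture described above.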
Vertical synchronization was introduced to solve this problem. It is built around the pass in which the monitor draws one complete frame from top to bottom. During that pass, the graphics card guarantees that the front buffer does not change: if a new frame finishes drawing in the meantime, the buffers are not swapped until the display has finished reading. This guarantees that the picture is never torn, but it brings another problem, a throttled frame rate. Because normal buffer swapping is held up by the display's vertical synchronization, the swap rate drops sharply and the final frame rate drops with it.
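How much this costs can be estimated with a simplified model. The sketch below assumes plain double buffering, where a finished frame waits for the next refresh and the renderer cannot start the next frame until the swap has happened, so every frame occupies a whole number of refresh intervals. The function name and the millisecond input are illustrative assumptions, not d3d calls.

```python
import math
from fractions import Fraction


def vsynced_frame_rate(render_ms, refresh_hz):
    """Frame rate under the simplified double-buffered model described above.

    render_ms: time to draw one frame, in milliseconds (int or Fraction).
    refresh_hz: display refresh rate in Hz (int).
    """
    # Number of refresh intervals one frame occupies, computed exactly
    # with fractions to avoid floating-point rounding at the boundaries.
    intervals = math.ceil(Fraction(render_ms) * refresh_hz / 1000)
    intervals = max(1, intervals)
    return refresh_hz / intervals


print(vsynced_frame_rate(10, 60))  # 60.0
print(vsynced_frame_rate(20, 60))  # 30.0
print(vsynced_frame_rate(34, 60))  # 20.0
```

Under these assumptions, a renderer that needs 20 ms per frame could swap 50 times per second on its own, but it just misses every other refresh and ends up at 30.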
Full vertical synchronization and no vertical synchronization are two extremes: they give the highest picture integrity and the highest frame rate, respectively. In practice there are compromises between them, which allow the graphics card to hold up a buffer swap only once within at most N frames. The larger N is, the closer the behavior is to no vertical synchronization and the higher the frame rate; the smaller N is, the closer it is to full vertical synchronization and the lower the probability of tearing. In d3d9 these policies correspond to the presentation parameters D3DPRESENT_DONOTWAIT, D3DPRESENT_INTERVAL_IMMEDIATE, and D3DPRESENT_INTERVAL_ONE (through D3DPRESENT_INTERVAL_FOUR). The final frame rate F should then be close to min{F1 - min{F1, F2}/(1 + N), F2}, where F2 is usually large enough.
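The estimate above is easy to evaluate for a few values. The function below only restates the formula from the text; it is a rough approximation from this article, not a documented property of d3d9, and the name estimated_frame_rate is hypothetical.

```python
def estimated_frame_rate(f1, f2, n):
    """Evaluate the article's estimate F = min(F1 - min(F1, F2) / (1 + N), F2).

    f1: swap rate (frames drawn and swapped per second).
    f2: display refresh rate.
    n:  the N of the compromise policy (a swap is held up once in at most N frames).
    """
    return min(f1 - min(f1, f2) / (1 + n), f2)


print(estimated_frame_rate(60, 60, 1))   # 30.0
print(estimated_frame_rate(60, 60, 3))   # 45.0
print(estimated_frame_rate(120, 60, 1))  # 60
```

With N = 1 and F2 at least as large as F1, the estimate reduces to F1/2, and it grows toward F1 as N increases.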
So the frame rate depends not only on the swap rate and the refresh rate but also on the vertical synchronization policy. That is why some players find that a game stutters when they turn vertical synchronization off, while others find that turning it on lowers the frame rate.
Video memory
Of course, vertical synchronization limits the frame rate in this picture only because there is a single front buffer, which is subject to read/write conflicts. So one might try to arrange video memory so that this state never occurs. That requires a lot of video memory. The back buffer is the producer and the display is the consumer. If the front buffer were large enough (divided into N blocks), the producer could pile new frames into it without restraint. In practice conflicts cannot be ruled out entirely: video memory is always limited, and as long as the buffer has a size limit, producer and consumer may collide. When they do, either the producer waits for the consumer (production drops, that is, the frame rate falls) or the consumer reads things out of order (that is, tearing). Still, the more video memory there is, the less likely such conflicts are and the easier the problem is to avoid.
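The producer/consumer argument can be tried out with a small simulation. The sketch assumes that one buffer is always on screen, that the display takes one queued frame at every refresh, and that the renderer blocks only when no free buffer is left; producer_stall and the abstract time units are assumptions chosen for illustration.

```python
def producer_stall(render_times, refresh_period, buffers):
    """Total time the renderer spends blocked waiting for a free buffer.

    render_times: time needed to draw each frame, in arbitrary time units.
    refresh_period: time between two display refreshes, same units (> 0).
    buffers: total number of buffers, front buffer included (>= 2).
    """
    if buffers < 2:
        raise ValueError("need a front buffer and at least one back buffer")
    now = 0
    queued = 0  # finished frames waiting to be shown
    next_refresh = refresh_period
    stalled = 0
    for render_time in render_times:
        # One buffer is on screen and `queued` buffers hold finished frames.
        # Block until a refresh frees a buffer to draw into.
        while 1 + queued >= buffers:
            stalled += next_refresh - now
            now = next_refresh
            queued -= 1
            next_refresh += refresh_period
        finish = now + render_time
        # Refreshes that happen while drawing each consume one queued frame.
        while next_refresh <= finish:
            if queued > 0:
                queued -= 1
            next_refresh += refresh_period
        now = finish
        queued += 1
    return stalled


frames = [6, 6, 14, 6, 6, 6]
print(producer_stall(frames, 10, 2))  # 22
print(producer_stall(frames, 10, 3))  # 2
print(producer_stall(frames, 10, 4))  # 0
```

In this run the renderer loses 22 time units with two buffers, 2 with three, and none with four. A longer run of fast frames would eventually fill any fixed number of buffers, which is the point made above: more memory makes the conflict rarer but does not remove it.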
This explains why a machine with a lot of video memory can disable vertical synchronization completely (that is, run at the theoretical maximum frame rate) without much tearing, or enable it fully and still get a high frame rate. Video memory not only helps with drawing but also helps resolve front-buffer conflicts and raise the frame rate.
Based on these ideas, we must fully consider the vertical synchronization policy when choosing a rendering strategy, taking into account the hardware of potential users, the rendering efficiency of the game, and the refresh rate, since the frame rate is determined by the rendering efficiency, the refresh rate, and the vertical synchronization policy together. In d3d9, the policy recommended by Microsoft is D3DPRESENT_INTERVAL_ONE, the compromise closest to full vertical synchronization: a buffer swap is held up at most once per display refresh. The actual frame rate should then be close to F1/2. For example, if F2 is 60 and the end user is to see 30 frames per second, F1 must be 60 or more.