Hidden surface removal (HSR) algorithms exist because of a simple fact: in a game scene, you do not have to render every object to the screen in every frame. HSR serves two purposes: it reduces the GPU's screen fill load, and it reduces the amount of data transferred between the CPU and the GPU. On current hardware, GPU fill rate is no longer a problem; the real constraint is the AGP bus, whose limited bandwidth has become the bottleneck of graphics rendering. For this reason most current HSR algorithms run on the CPU, so as to minimize the amount of data transmitted per frame. The extra CPU load is not a concern, because current CPUs can easily handle it.
Once a scene has been organized with a suitable spatial partitioning scheme such as a BSP tree or an octree, an appropriate HSR algorithm is used at render time to discard useless data in each frame. HSR algorithms currently fall into four types: backface culling, frustum culling, portal culling, and occlusion culling. Backface culling removes the triangles on the back side of a mesh; frustum culling removes objects that fall outside the screen; portal culling removes objects that cannot be seen through a portal; and occlusion culling removes objects that are hidden behind other objects in the scene.
The first two are the most commonly used. Backface culling is already implemented in GPU hardware, but to save transmission bandwidth you can also do it on the CPU where it pays off, mainly for large meshes such as terrain. How to apply it depends on how the terrain data is organized: when rendering terrain with triangle strips, try culling block by block; when rendering with triangle lists, you can try culling triangle by triangle. You need to experiment with the method you actually use. Frustum culling is so common that it is not explained here.
Portal culling would need an article of its own to cover in detail, so here is only a brief introduction. Generally, there are two types of portal culling. One is the PVS algorithm used with BSP. Its advantage is that visibility information is precomputed, so it is very fast at run time, and the portals in the scene can be generated by a program without being specified by hand. Its disadvantages are equally obvious: it is basically usable only for indoor scenes, and because a PVS is a potentially visible set, it cannot remove everything that is invisible. The other type can be called real-time portal culling. Its advantages are that, if the scene is laid out sensibly, it can cull invisible objects completely, and it works in any kind of scene: indoor, city, or terrain. Its disadvantages are that the portals must be specified manually and, because the culling is computed every frame, the number of portals that appear on screen in each frame must be limited. Most importantly, depending on the scene, and especially outdoors, it cannot cull completely either. A simple example shows why:
(Thanks to siney for providing illustrations)
The figure above can be seen as a house. B is the door of the room, that is, a portal, and A is an object in the scene. An NPC standing at point D is visible. When it stands at point C it should be invisible, because object A is in front of it; portal culling, however, does not remove it, so the NPC is still sent to the GPU for rendering.
This is where occlusion culling (OC) comes in to make up for the defect, so a portal engine should also use OC. Here is an outdoor example: for a building standing outdoors, we can use its doors and windows as portals to cull the inside of the building, and use the building itself as an occluder to perform OC on the rest of the scene. According to figures in the relevant literature, FPS is 30% to 70% higher with OC than without it. So, where possible, you should add OC to your engine. OC is not limited to portal engines; it can be used in an engine built on any technique.
There are many OC algorithms. HZB (hierarchical Z-buffer) and HOM (hierarchical occlusion maps) are relatively powerful, but when run on the CPU they are still too slow to meet real-time requirements, and current GPUs do not implement them in hardware, so they are not the focus of this article. The OC algorithm introduced here is called the interval scan-line Z-buffer algorithm. It can cull completely and is very efficient. Before introducing it, we must clarify a few terms. In OC, an object that blocks others is called an occluder, and an object that is blocked is called an occludee; the OC algorithm checks whether an occludee is blocked by the occluders. Large objects in a scene, such as houses and castles, can be used as occluders. The polygons of an occluder that lie in the same plane are called a surface, and each surface is bounded by edges joined end to end into a closed loop.
Before performing OC, we first need to obtain all occluders in the scene. This is a very important step and is usually done during scene preprocessing: an occluder list is generated from the scene and saved to a file.
Next we discuss how to obtain occluders. Occluders usually have to be specified manually, because an algorithm that finds them automatically is very complicated to develop; the number of occluders in any one frame must also be limited, for the sake of OC performance. Consider two cases. In a standard BSP engine the whole scene is built from brushes, so obtaining occluders is relatively simple: during scene modeling you mark which brushes serve as occluders, then process those brushes separately during BSP splitting, using the same method as for the rest of the scene to extract their outer surfaces as occluders. For engines built on other techniques it is also relatively simple. Take the terrain-based engines widely used in MMOs: the objects that can serve as occluders are independent models, so the occluder is simply the convex hull of the model. There are many tools for computing a convex hull, such as qhull. Computing the hull programmatically, however, places some restrictions on the shape of the model. An alternative is to have the artist specify a hull for the object during modeling and save it to the file, which leaves the shape of the model unrestricted. So in the second case, you designate suitable objects as occluders when editing the scene, and then save each object's convex hull to the occluder list.
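To make the idea of a convex hull concrete, here is a small Python sketch. It is only a 2D stand-in: it computes the hull of a model's ground-plane footprint with Andrew's monotone chain, whereas a real occluder needs a 3D hull from a tool such as qhull or from the artist. The function names and the sample footprint are assumptions made for illustration.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_2d(points):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last point while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Hypothetical footprint of a house model: four corners plus interior and
# collinear points that the hull should discard.
footprint = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2), (2, 0)]
hull = convex_hull_2d(footprint)
```

Only the four outer corners survive in hull, which is the property that makes a hull a cheap, conservative stand-in for a detailed model.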
Once we have the occluders in the scene, we can perform OC. First, let's look at the principle of the interval scan-line Z-buffer algorithm.
(Thanks to siney for providing illustrations)
Let N be the occluder and A, B, and C the occludees, and assume that each is projected onto the screen as a rectangle. The lines through the top and bottom edges of N, labeled a and b, are called scan lines. If object A is behind N, then A can be blocked by N, and whether it is behind is decided by comparing the Z values of the surfaces of N and A; this is where the "Z-buffer" in the algorithm's name comes from. Now assume that A, B, and C are all behind N. In the figure, A is completely blocked, B is partially blocked, and C is not blocked. How do we determine this?
First consider A. N and A together have four scan lines, in ascending order of Y: b, d, c, a. The interval between scan lines b and d contains no edge of A, so we skip it and check the interval between scan lines d and c. The edges lying between two scan lines are called active edges and are stored in an active edge list. Each active edge stores two surface pointers, a left surface and a right surface. For the left edge of N, for example, there is no surface on its left, so its left surface points to the screen background, that is, a surface with the largest Z value, and its right surface is N. The other edges are handled similarly. Now we examine the active edge list between scan lines d and c. First we take the left edge of A, the one with the smallest X. Then we look in N's active edge list for the first edge whose X is greater than that of A's left edge; this is the right edge of N. We compare the surfaces on the two edges: the Z of A's surface is greater than the Z of N, which is the left surface of N's right edge, so the left edge of A is blocked by N. Checking the right edge of A in the same way shows that A is completely blocked by N.
B is checked the same way. In the interval between scan lines f and b, the active edge list contains no edge of N, so we can conclude that B is not completely blocked by N. Now check C. In the active edge list between scan lines h and g, every edge of N has an X smaller than that of the left edge of C, so C is not blocked by N. For more complex situations, such as several occluders, you can verify for yourself that the algorithm correctly determines whether an object is completely blocked in every case.
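The check described above can be sketched in a few lines of Python. This is a simplified illustration rather than the full algorithm: every object is a screen-space rectangle at a single depth, as in the figure, so depth is compared per rectangle instead of per edge surface. The names Rect, span_covered and is_occluded, and the coordinates chosen for N, A, B and C, are assumptions made for the example. A smaller z means nearer to the camera.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Screen-space rectangle at constant depth z (smaller z is nearer)."""
    x0: float
    y0: float
    x1: float
    y1: float
    z: float


def span_covered(spans, lo, hi):
    """True if the union of the (start, end) spans covers [lo, hi]."""
    reach = lo
    for start, end in sorted(spans):
        if start > reach:
            break  # a gap is left uncovered
        reach = max(reach, end)
        if reach >= hi:
            return True
    return reach >= hi


def is_occluded(occludee, occluders):
    """Check every scan-line interval that the occludee overlaps."""
    # Scan lines pass through the top and bottom of every rectangle.
    ys = sorted({y for r in [occludee, *occluders] for y in (r.y0, r.y1)})
    for y_lo, y_hi in zip(ys, ys[1:]):
        if y_hi <= occludee.y0 or y_lo >= occludee.y1:
            continue  # the occludee has no active edge in this interval
        # Active occluders: they span the interval and lie in front.
        spans = [(r.x0, r.x1) for r in occluders
                 if r.y0 <= y_lo and r.y1 >= y_hi and r.z < occludee.z]
        if not span_covered(spans, occludee.x0, occludee.x1):
            return False
    return True


N = Rect(2, 2, 8, 8, z=0.3)
A = Rect(3, 3, 5, 5, z=0.6)   # inside N's outline, behind it
B = Rect(4, 1, 6, 4, z=0.6)   # sticks out below N
C = Rect(9, 5, 11, 7, z=0.6)  # entirely to the right of N
results = [is_occluded(obj, [N]) for obj in (A, B, C)]
```

Because the coverage test takes the union of all active occluder spans, two adjacent occluders can jointly hide an occludee that neither hides alone.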
The introduction above should have made the basic principle of the interval scan-line Z-buffer algorithm clear. The following describes how to implement it. The steps are as follows:
1. Obtain all occluders in the scene and remove every surface on them that faces away from the camera.
2. Clip all surfaces against the near clip plane, removing the surfaces that lie entirely outside it. If a surface intersects the near clip plane, clip it so that the parts of its edges outside the near clip plane are removed.
3. Transform all vertices to projection space. Note the nature of projection space: the projection transform turns the frustum into a box. In OpenGL this is a cube whose minimum corner is (-1, -1, -1) and whose maximum corner is (1, 1, 1); in DirectX the minimum corner is (-1, -1, 0) and the maximum corner is (1, 1, 1). An object in projection space has the same shape as it has on the screen, which is the reason for transforming to this space. Calculate the projected area of each surface on the screen, remove surfaces whose area is too small, and remove surfaces outside the frustum; this prevents errors caused by limited computational precision. The plane of each surface is also transformed to projection space, so that the Z value at any given point on it is easy to calculate.
4. Save the edges formed by the vertices in projection space to the edge list. Each edge is stored so that the Y value of its start point is smaller than the Y value of its end point, and the Y value of the start point is taken as the Y of the edge. Sort all edges in the list in ascending order of edge Y; if Y is equal, sort by X; if X is also equal, sort by the slope of the edge. Because the edge list is sorted entirely by Y, we can also call it the Y bucket. Edges parallel to the X axis do not need to be added to the edge list.
5. Find the scan lines and calculate the active edge list between each pair of adjacent scan lines. The introduction above already gives a general idea of how scan lines are obtained: a scan line must pass through every edge endpoint, and if two edges intersect, a scan line must pass through the intersection as well. So we traverse the Y bucket to find all scan lines between y = -1 and y = 1. Once all scan lines are found, calculate the active edge list for each scan-line interval. Note that each edge in the list is clipped to the interval, using its intersections with the two scan lines as its endpoints. The active edge list is sorted by X, so it is also called the X bucket.
6. For each scene object that needs to be tested for occlusion, pass in its AABB as the occludee. First check whether the occludee's projected area on the screen is large enough; if it is too small, there is no need to perform OC on it. Otherwise, convert the AABB into a box and compute its surfaces and edges, transform them to projection space, and find the scan lines and build the active edge lists as described above. Then, starting from the scan line with the smallest Y among the occluders and the occludee, compare the active edge lists interval by interval to check whether the occludee is completely blocked.
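Steps 4 and 5 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a complete implementation: polygons are lists of 2D points already in projection space, edges are assumed not to cross each other (so no scan lines are generated at intersections), surface pointers are left out, and the "slope" used as the final sort key is dx/dy so that vertical edges need no special case. All function names are invented for the example.

```python
def make_edge(p, q):
    """Orient an edge so its start has the smaller y; None if horizontal."""
    if p[1] == q[1]:
        return None  # edges parallel to the X axis are not stored
    return (p, q) if p[1] < q[1] else (q, p)


def inverse_slope(edge):
    """dx/dy of an edge; well defined because horizontal edges are dropped."""
    (x0, y0), (x1, y1) = edge
    return (x1 - x0) / (y1 - y0)


def build_y_bucket(polygons):
    """Collect all non-horizontal edges, sorted by y, then x, then slope."""
    edges = []
    for poly in polygons:
        for p, q in zip(poly, poly[1:] + poly[:1]):
            edge = make_edge(p, q)
            if edge is not None:
                edges.append(edge)
    edges.sort(key=lambda e: (e[0][1], e[0][0], inverse_slope(e)))
    return edges


def scan_lines(edges):
    """One scan line through every edge endpoint, in ascending y."""
    return sorted({y for e in edges for y in (e[0][1], e[1][1])})


def x_at(edge, y):
    """X coordinate where the edge crosses the horizontal line at y."""
    (x0, y0), _ = edge
    return x0 + (y - y0) * inverse_slope(edge)


def active_edges(edges, y_lo, y_hi):
    """Edges spanning [y_lo, y_hi], clipped to it and sorted by x."""
    active = [((x_at(e, y_lo), y_lo), (x_at(e, y_hi), y_hi))
              for e in edges if e[0][1] <= y_lo and e[1][1] >= y_hi]
    active.sort(key=lambda e: (e[0][0], e[1][0]))
    return active


# Hypothetical projected surfaces: a triangle and a small square.
triangle = [(0.0, -0.5), (0.5, 0.5), (-0.5, 0.5)]
square = [(-1.0, -0.75), (-0.75, -0.75), (-0.75, 0.0), (-1.0, 0.0)]
y_bucket = build_y_bucket([triangle, square])
lines = scan_lines(y_bucket)
x_bucket = active_edges(y_bucket, -0.5, 0.0)
```

Here x_bucket holds four clipped edges for the interval between y = -0.5 and y = 0: the two vertical sides of the square followed by the two slanted sides of the triangle. A full implementation would also attach the left and right surface of each edge so that Z values can be compared during the occlusion test.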
In this algorithm, the active edge lists need to be rebuilt almost every frame, so this is where most of the computation goes. Once the active edge lists are built, however, the occlusion test for each occludee is simple and fast. The algorithm is well suited to OC on large numbers of dynamic objects, especially character models with high rendering costs.