Preface
This article summarizes and expands on a series of tutorials from Digital Tutors. Digital Tutors is a very good tutorial site with a wide range of multimedia material. A tutorial from Unity Cookie is also referenced, and many other references appear in the links below.
This article briefly describes the most common optimization strategies. Each of them has been explained in much greater depth elsewhere, so readers who need the details can look up the relevant material on their own.
There are also several articles that I think are very good references:
- Performance Optimization for Mobile Devices
- 4 Ways To Increase Performance of your Unity Game
- Unite 2013 Optimizing Unity Games for Mobile Platforms
- Unity Optimization Tips
Factors that affect performance
First, we need to understand which factors affect a game's performance so that we can apply the right remedy. A game relies on two main computing resources: the CPU and the GPU. They work together so that the game runs at the desired frame rate and resolution. Roughly speaking, the CPU largely determines the frame rate, while the GPU mainly handles resolution-dependent work.
To summarize, the main performance bottlenecks are:
- CPU
  - Too many draw calls
  - Complex scripts or physics simulation
- Vertex processing
  - Too many vertices
  - Too much per-vertex computation
- Pixel (fragment) processing
  - Too many fragments (overdraw)
  - Too much per-pixel computation
- Bandwidth
  - Large, uncompressed textures
  - High-resolution framebuffer
For the CPU, the main limitation is the number of draw calls in the game. So what is a draw call? If you have learned OpenGL, you will remember that before each draw we prepare the vertex data (position, normal, color, texture coordinates, and so on), call a series of APIs to place it where the GPU can access it, and finally call a glDraw* command to tell the GPU, "Hey, I've got everything ready, you lazy guy, get to work." Each call to a glDraw* command is a draw call.
Why are draw calls a performance bottleneck, and a CPU bottleneck in particular? As mentioned, every time we want to draw something we must issue a draw call. For example, in a scene with water and a tree, we render the water with one material and shader, but the tree needs a completely different material and shader, so the CPU has to prepare the vertex data again and reset the shader, and this work is actually very time-consuming. If every object in the scene uses a different material and different textures, there will be too many draw calls, the frame rate suffers, and game performance degrades. This is a very simplified description; please search online for more detail. Other CPU bottlenecks include physics, cloth simulation, particle simulation, and similar computationally heavy operations.
The GPU is responsible for the entire rendering pipeline. It starts from the model data passed over by the CPU, runs the vertex shader, the fragment shader, and so on, and finally outputs every pixel on the screen. Its bottlenecks are therefore related to the number of vertices to process, the screen resolution, video memory, and similar factors. Overall they fall into vertex-side and pixel-side bottlenecks. One of the most common pixel-side bottlenecks is overdraw, which means that the same pixel on the screen is drawn more than once.
With these basics covered, the optimization techniques discussed in the rest of this article are:
- Vertex optimization
  - Using LOD (level of detail) technology
  - Using occlusion culling technology
- Pixel optimization
  - Control drawing order
  - Beware of transparent objects
  - Reduce real-time lighting
- CPU optimization
- Bandwidth optimization
  - Reduce texture size
  - Leverage scaling
We start with vertex optimization.
Vertex optimization
Optimize geometry
This step targets the "vertex processing" bottleneck. Geometry here means the meshes that make up the objects in the scene. 3D game production begins with modeling, and there are rules to keep in mind while modeling.
The first is to minimize the number of triangles in the model: vertices that do not affect the model's shape, or whose effect is very hard to perceive with the naked eye, should be removed as far as possible. For example, a cube modeled with many unnecessary interior vertices still carries all of them after being imported into Unity, and the statistics in the Game view show how many triangles and vertices the scene contains. A simple cube producing that many vertices is not something we want to see.
The second is to reuse vertices as much as possible. Many 3D modeling packages have optimization options that clean up the mesh automatically. After optimization, a cube may have only 8 vertices in the modeling tool.
But wait: in Unity the vertex count for that cube is 24, not 8. Why? Artists often run into the problem that the modeling software shows a different vertex count than Unity, and Unity's number is usually much larger. Who is right? Both count from a different point of view and both have their reasons, but the number we should really care about is Unity's. Briefly: modeling software counts vertices the way a human sees them, so one visible point is one vertex. Unity counts vertices from the GPU's point of view, and the GPU may have to treat a single point as several separate vertices, which produces the extra ones.
There are two main reasons for such splits: UV splits and smoothing splits. Both come from the same rule: for the GPU, each vertex can carry exactly one value of every vertex attribute. A UV split arises when one vertex has several UV coordinates in the model. In the cube above, faces share vertices, so the same vertex may have different UV coordinates on different faces. The GPU cannot represent this, so the vertex is split into two (or more) vertices, each with its own fixed UV coordinates. A smoothing split arises in a similar way, except that this time one vertex corresponds to several normals or tangents. This usually depends on whether we decide that an edge is a hard edge or a smooth edge.
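The difference between the modeling tool's count and the GPU's count can be reproduced with a small calculation. The following is a minimal sketch in plain Python, not Unity code; the function names are made up for illustration, and normals stand in for any per-vertex attribute (UVs would behave the same way). It treats a GPU vertex as a unique combination of position and attribute values.

```python
from itertools import product

def cube_face_corners():
    """Return (position, face_normal) for every corner of every face of a cube."""
    corners = []
    for axis in range(3):                # x, y, z
        for sign in (-1, 1):             # the two faces perpendicular to this axis
            normal = tuple(sign if i == axis else 0 for i in range(3))
            u, v = [i for i in range(3) if i != axis]
            for a, b in product((-1, 1), repeat=2):
                pos = [0, 0, 0]
                pos[axis], pos[u], pos[v] = sign, a, b
                corners.append((tuple(pos), normal))
    return corners

def modeling_vertex_count(corners):
    """Count the way a modeling tool does: one vertex per distinct position."""
    return len({pos for pos, _ in corners})

def gpu_vertex_count(corners):
    """Count the way the GPU does: one vertex per distinct (position, normal)."""
    return len(set(corners))

hard = cube_face_corners()                  # every edge is a hard edge
smooth = [(pos, pos) for pos, _ in hard]    # one shared (unnormalized) normal per position

print(modeling_vertex_count(hard))  # 8
print(gpu_vertex_count(hard))       # 24
print(gpu_vertex_count(smooth))     # 8
```

With hard edges each corner needs three different normals, one per adjacent face, which is where the 24 comes from; with fully smooth normals the count falls back to 8.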
A hard edge typically shows up as a visible crease in the surface. If you look at the vertex normals, you will find that each vertex along the crease carries two different normals. The GPU cannot represent that either, so it splits each such vertex in two. Along a smooth edge, by contrast, each vertex has a single normal and no split is needed. In the end, the GPU only cares about how many vertices it has to process, so the vertex count that Unity reports is what we really need to minimize. This leads to the last suggestion for optimizing geometry: remove unnecessary hard edges and texture seams, that is, avoid smoothing splits and UV splits.
Using LOD (level of detail) technology
LOD is somewhat similar to mipmapping, except that it builds a pyramid of models: a model of different precision is chosen depending on the distance between the camera and the object. The advantage is that the number of vertices to draw can be reduced when appropriate. The drawbacks are that it needs more memory and that, if the distances are not tuned well, the switch between models can cause visible popping. In Unity this is done with an LOD Group: in the LOD Group panel we select the models to control and set the distances. A typical example is an oil barrel that goes from its full mesh to a simplified mesh and is finally culled completely.
Using occlusion culling technology
Occlusion culling eliminates objects that are hidden behind other objects, so that no resources are wasted on processing vertices that cannot be seen, which improves performance. Unity Taiwan has a series of articles on occlusion culling (a VPN may be needed to reach them):
- Unity 4.3 occlusion culling: basics
- Unity 4.3 occlusion culling: best practices
- Unity 4.3 occlusion culling: troubleshooting
You can look up the specifics yourself.
Now let's talk about pixel optimization.
Pixel optimization
Pixel optimization focuses on reducing overdraw. As mentioned earlier, overdraw means that a pixel is drawn several times. The key is to control the drawing order. Unity provides a view for inspecting overdraw: in the Scene view, choose Render Mode -> Overdraw. This view only shows how many layers of objects cover each other, not the overdraw of the final rendered frame. In other words, it shows the overdraw you would get if no depth test were used.
The view works by rendering every object as a semi-transparent silhouette and judging occlusion from how much the transparent color accumulates. The deeper the red, the more severe the overdraw and the more transparent objects are involved, which means performance will suffer badly.
Control the drawing order
The main reason to control the drawing order is to avoid overdraw, that is, pixels at the same position being drawn many times. On PC, where resources are comparatively plentiful, opaque objects may be drawn from back to front to get the most accurate result, followed by the transparent objects, which are blended on top. On mobile platforms, an approach that causes this much overdraw is clearly unsuitable, and we should draw from front to back as far as possible. Drawing front to back reduces overdraw thanks to the depth test.
In Unity, objects whose shader is set to the "Geometry" queue are always drawn front to back, while objects in the other fixed queues (such as "Transparent" and "Overlay") are drawn back to front. This means we should put objects in the "Geometry" queue whenever possible. We can also use Unity's queues to control the drawing order ourselves. A skybox, for example, covers almost every pixel and we know it is always behind everything else, so its queue can be set to "Geometry+1". That guarantees it does not cause overdraw.
Beware of transparent objects
Because of how transparency works (see the earlier article on alpha test and alpha blending), transparent objects must be rendered back to front to look correct (the details are not discussed here), and they cannot benefit from the depth test. This means transparent objects almost always cause overdraw. If we are not careful about this, performance can drop severely on some devices.
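To make the point about drawing order and the depth test concrete, here is a minimal sketch in plain Python. It is not how Unity or any GPU is implemented: it assumes a tiny 4x4 screen, axis-aligned opaque quads, and an idealized depth test that rejects a hidden fragment before it is shaded. The function name and the tuple layout are made up for this illustration.

```python
def shaded_fragments(quads, width, height):
    """Count fragments that pass a less-than depth test, in submission order.

    quads: iterable of (x0, y0, x1, y1, depth); a smaller depth is closer.
    """
    depth_buffer = [[float("inf")] * width for _ in range(height)]
    shaded = 0
    for x0, y0, x1, y1, depth in quads:
        for y in range(y0, y1):
            for x in range(x0, x1):
                if depth < depth_buffer[y][x]:   # fragment is visible so far
                    depth_buffer[y][x] = depth
                    shaded += 1                  # this fragment gets shaded
    return shaded

# Three opaque full-screen layers at different depths.
layers = [(0, 0, 4, 4, 1.0), (0, 0, 4, 4, 2.0), (0, 0, 4, 4, 3.0)]

front_to_back = shaded_fragments(sorted(layers, key=lambda q: q[4]), 4, 4)
back_to_front = shaded_fragments(sorted(layers, key=lambda q: q[4], reverse=True), 4, 4)

print(front_to_back)  # 16
print(back_to_front)  # 48
```

Drawn front to back, the two farther layers fail the depth test everywhere, so each pixel is shaded once. Drawn back to front, every layer passes and each pixel is shaded three times, which is the overdraw that blended transparent objects cannot avoid.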
GUI objects are a typical case: most of them are translucent, and if the GUI covers a large part of the screen while the main camera still renders the whole screen, the GUI causes a lot of overdraw. Likewise, large transparent objects in the scene, many stacked layers of transparent objects (even if each one is small), or transparent particle effects also cause a lot of overdraw on mobile devices. All of this should be avoided as far as possible.
For the GUI case, we can first minimize the screen area the GUI occupies. If that is not an option, we can render the GUI and the 3D scene with different cameras, and keep the view of the camera that renders the 3D scene from overlapping the GUI as far as possible. For the other cases, all that can be said is: use as little as possible. This does affect how good the game looks, so we can detect the device's performance in code, for example by first turning off all expensive effects and then enabling some of them if the device turns out to perform well.
Reduce real-time lighting
Real-time lighting is very expensive on mobile platforms. A single directional light is fine, but if the scene contains many light sources and uses multi-pass shaders, performance is likely to drop, and on some devices the shaders may even fail to work. For example, if a scene contains three per-pixel point lights and uses per-pixel shaders, the number of draw calls is likely to triple and overdraw increases as well. This is because every object lit by these lights is rendered again for each per-pixel light.
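The growth in draw calls caused by per-pixel lights can be estimated with simple arithmetic. The sketch below is plain Python and rests on a simplifying assumption: each object costs one base pass plus one additional pass for every extra per-pixel light that reaches it, and nothing is batched. The object names and light counts are hypothetical.

```python
def forward_draw_calls(extra_lights_per_object):
    """Estimate draw calls when every extra per-pixel light adds one pass.

    extra_lights_per_object: mapping of object name -> number of additional
    per-pixel lights that reach the object (an assumed, simplified model).
    """
    return sum(1 + extra for extra in extra_lights_per_object.values())

# Hypothetical scene: four objects, each reached by three per-pixel lights.
lit_scene = {"crate": 3, "barrel": 3, "wall": 3, "floor": 3}
same_scene_without_lights = {name: 0 for name in lit_scene}

print(forward_draw_calls(same_scene_without_lights))  # 4
print(forward_draw_calls(lit_scene))                  # 16
```

The exact multiplier in a real project depends on which lights are treated as per-pixel and which objects each light actually reaches, so the numbers here only illustrate the trend.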
Worse, neither dynamic batching nor static batching can batch these per-pixel passes (the documentation only mentions the effect on dynamic batching, and I do not know why static batching did not help in my test either); in other words, the passes break batching. For example, take a scene in which four objects are marked "Static" and all use the Bumped Diffuse shader, and all point lights are marked "Important", that is, per-pixel. After running, the draw call count is 23, not 3. Static batching only applies to the "Forward Base" pass (dynamic batching is disabled entirely because of the multiple passes), which saves draw calls there, while every render in the subsequent "Forward Add" passes is a separate draw call (the Tris and Verts counts also increase).
The documentation says: The draw calls for "additional per-pixel lights" won't be batched. I am not sure about the reason. There is a discussion of this, but it implies that static batching is not affected, which differs from my result. If you know the cause, please leave me a message; thank you very much. I have also asked the question on the Unity forum.
Many successful mobile games look as if they contain a lot of lights, but this is an illusion.
Using lightmaps
Lightmaps are a very common optimization strategy, used mainly for the overall lighting of a scene. The technique stores the scene's lighting information in a lighting texture ahead of time, so that at run time only a texture sample is needed to obtain the lighting. The companion technique is light probes. Feng Yuchong has a series of articles on this; they are rather old by now, but I believe there are many tutorials online.
Using God Rays
Many small light-source effects in scenes are simulated this way. They are usually not produced by real light sources; in many cases they are faked with transparent textures. See the previous article for details.
CPU optimization
Reducing draw calls
Batching
Tutorials on this kind of optimization are probably the most numerous, and the most common technique is batching. As the name suggests, it means processing several objects together. So which objects can be processed together? The answer is objects that use the same material. For objects that share a material, the only difference between them is their vertex data, that is, the mesh they use. We can merge their vertex data and send it to the GPU together, which completes one batch.
Unity has two kinds of batching: dynamic batching and static batching. With dynamic batching, the good news is that everything is automatic, we do not have to do anything, and the objects can still move; the bad news is that there are many restrictions, and we can easily break the mechanism by accident so that Unity cannot batch objects that use the same material. With static batching, the good news is a high degree of freedom and few restrictions; the bad news is that it may use more memory, and objects can no longer move after being statically batched.
First, dynamic batching.
Unity can dynamically batch objects when they use the same material and meet certain other conditions. Unity performs dynamic batching for us all the time without our noticing. For example, take a scene with 4 objects, two of which use the same material. Its Draw Calls count is 3 and Saved by batching is 1, meaning that Unity saved 1 draw call by batching. Now change the size of one of the boxes: Draw Calls becomes 4 and Saved by batching becomes 0. Why? The two boxes still use a single material. The reason lies in the other conditions mentioned above. Dynamic batching is automatic, but it places many requirements on the models:
- The limit on vertex attributes is 900, and this may change in the future, so do not rely on the number.
- Generally, all objects must use the same scale. It can be (1, 1, 1), (1, 2, 3), (1.5, 1.4, 1.3), and so on, but it must be the same for all of them. The exception is non-uniform scaling (a different scale factor per axis, for example (1, 2, 1)): if all the objects use different non-uniform scales, they can still be batched. This is a strange requirement; why would batching depend on scale? It has to do with the technology behind Unity, and if you are interested you can search for discussions of it.
- Objects that use lightmaps are not batched.
- Multi-pass shaders break batching.
- Objects that receive real-time shadows are not batched.
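The interaction of these conditions can be sketched as a small counting model. This is plain Python, not Unity's actual batching code, and it makes several assumptions: only the material, scale, and vertex attribute conditions are modeled, objects batch only when material and scale match exactly (the exception for differing non-uniform scales is left out), and the limit of 900 is applied to vertices multiplied by attributes per vertex. All names and mesh sizes are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, replace

MAX_VERTEX_ATTRIBUTES = 900  # limit quoted in the list above; the exact boundary is assumed

@dataclass(frozen=True)
class Renderable:
    name: str
    material: str
    vertices: int
    attributes_per_vertex: int   # e.g. position + normal + UV = 3
    scale: tuple

def can_batch(obj):
    """An object is eligible only if its total vertex attributes fit the limit."""
    return obj.vertices * obj.attributes_per_vertex <= MAX_VERTEX_ATTRIBUTES

def count_draw_calls(objects):
    """One draw call per (material, scale) group, plus one per ineligible object."""
    groups = defaultdict(int)
    unbatched = 0
    for obj in objects:
        if can_batch(obj):
            groups[(obj.material, obj.scale)] += 1
        else:
            unbatched += 1
    return unbatched + len(groups)

scene = [
    Renderable("box_a", "wood", 24, 3, (1, 1, 1)),
    Renderable("box_b", "wood", 24, 3, (1, 1, 1)),
    Renderable("ball", "metal", 100, 3, (1, 1, 1)),
    Renderable("ground", "grass", 4, 3, (1, 1, 1)),
]
rescaled = [scene[0], replace(scene[1], scale=(2, 2, 2))] + scene[2:]
with_heavy_box = rescaled + [Renderable("raw_box", "wood", 474, 3, (1, 1, 1))]

print(count_draw_calls(scene))           # 3
print(count_draw_calls(rescaled))        # 4
print(count_draw_calls(with_heavy_box))  # 5
```

In this model, rescaling one box splits the "wood" group, and a 474-vertex mesh with three attributes per vertex (1422 attributes in total) is too large to batch at all.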
Besides scaling, which is the most common way to break batching, there is the vertex attribute limit. For example, add the unoptimized box model from earlier to the scene above: Draw Calls suddenly becomes 5. The newly added box contains 474 vertices, and its vertex attributes include position, UV coordinates, and normals, so the total exceeds 900.
Dynamic batching has so many conditions that it is easy to break by accident, so Unity offers another method: static batching. Continuing the example, we keep the modified scale but tick the "Static" flag on all four objects. Clicking the triangle drop-down next to Static shows that this flag actually sets many things; the one we want here is just "Batching Static". Looking at Draw Calls now, nothing has changed yet. But click Play and the change appears: Draw Calls goes back to 3 and Saved by batching is 1. That is the benefit of static batching. If we inspect the models' meshes at run time, we find that they have all become something called Combined Mesh (root: scene). This mesh is the result of Unity merging all objects marked "Static", in our case the four objects.
You may ask: these four objects clearly do not all use the same material, so how can they be merged into one? Look closely and you will see "4 submeshes" in it: the merged mesh actually contains 4 sub-meshes, one for each of our objects. Within the merged mesh, Unity determines which sub-meshes use the same material and batches those.
Looking even more carefully, our two boxes originally used the same mesh, but after merging there are two copies. Comparing "VBO total" in the Stats window before and after, it grows from 241.6KB to 286.2KB. Remember the drawback of static batching? It may consume more memory.
The documentation puts it this way: "Using static batching will require additional memory for storing the combined geometry. If several objects shared the same geometry before static batching, then a copy of geometry will be created for each object, either in the Editor or at runtime. This might not always be a good idea - sometimes you will have to sacrifice rendering performance by avoiding static batching for some objects to keep a smaller memory footprint. For example, marking trees as static in a dense forest level can have serious memory impact."
In other words, if several objects shared the same mesh before static batching (the two boxes here, for example), each object gets its own copy of the mesh, so one mesh becomes several meshes sent to the GPU. In the example above this visibly increased the VBO size. If many objects share the same mesh, this becomes a problem, and we may need to avoid static batching, which means sacrificing some rendering performance. For example, using static batching in a forest made of 1000 copies of one tree model takes 1000 times the memory for that mesh, which can have a serious memory impact. In such cases the options are to accept trading memory for performance, to skip static batching and rely on dynamic batching (provided all objects use the same scale, or all use different non-uniform scales), or to write our own batching. I think dynamic batching is the best way to solve this case.
Here are a few tips:
- Prefer static batching whenever possible, but always watch the memory consumption.
- If static batching is not possible and you want dynamic batching, mind the conditions mentioned above. For example:
  - Keep such objects as small as possible, with as few vertex attributes as possible.
  - Use the same scale for all of them, or use different non-uniform scales for all of them.
- For small items in the game, such as coins that can be picked up, use dynamic batching.
- Objects that contain animation cannot be statically batched as a whole, but parts that never move can be marked "Static".
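The memory cost of static batching in the forest example comes down to a multiplication. The sketch below is plain Python under stated assumptions: a vertex layout of 32 bytes (position, normal, and one UV set as 32-bit floats) and a hypothetical tree mesh of 500 vertices. Index buffers and any engine overhead are ignored.

```python
BYTES_PER_VERTEX = 32  # assumed layout: position (12) + normal (12) + UV (8)

def shared_mesh_bytes(vertices_per_mesh):
    """Vertex memory when all instances reference one shared mesh."""
    return vertices_per_mesh * BYTES_PER_VERTEX

def static_batch_bytes(vertices_per_mesh, instances):
    """Vertex memory when the combined mesh holds one copy per instance."""
    return vertices_per_mesh * BYTES_PER_VERTEX * instances

tree_vertices = 500   # hypothetical tree model
tree_count = 1000     # forest size used in the example above

before = shared_mesh_bytes(tree_vertices)
after = static_batch_bytes(tree_vertices, tree_count)

print(before)           # 16000
print(after)            # 16000000
print(after // before)  # 1000
```

The ratio equals the number of instances regardless of the assumed vertex size, which is why a mesh that is cheap once becomes expensive when it is duplicated across a whole forest.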
Some discussions:
- How static batching works
- Static batching use a ton of memory?
- Unity3D Draw Call Optimization
Merging textures (atlas)
Although batching is a good technique, its rules are easy to break. For example, the objects in a scene may all use a Diffuse material but with different textures. It is therefore a good idea to merge multiple small textures into one larger texture (an atlas) whenever possible.
Using the mesh's vertex data
Sometimes, besides textures, different objects need small variations in the material, such as different colors or a few floating-point parameters. But the iron rule is that both dynamic and static batching require the objects to use the same material. "The same" here means identical, not merely alike: the materials they reference must be one and the same instance. Consequently, whenever we adjust a parameter, every object using that material is affected. So what if we want small per-object adjustments? Because Unity's rules are rigid, we have to resort to some tricks, and one of them is to use the mesh's vertex data (most commonly the vertex color). As mentioned earlier, batched objects are combined into one VBO that is sent to the GPU, and the data in the VBO is passed to the vertex shader as input, so by controlling the data in the VBO we can achieve different effects. Take the forest again: all trees use the same material and we want them to be dynamically batched, but different trees should have different colors. We can use the mesh's vertex data to make the adjustment. The specific method will be covered in a later article.
The drawback of this approach is that more memory is needed to store the vertex data used for the adjustments. There is no way around it; no method is ever perfect.
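The idea of carrying per-object variation in the vertex data can be illustrated without any engine. The sketch below is plain Python with made-up names and is only a model of the data layout, not the method the later article describes: each tree's tint is written into its own vertices, so a single combined buffer and a single material are enough, at the cost of three extra floats per vertex.

```python
def translated(positions, dx):
    """Return a copy of the positions shifted along the x axis."""
    return [(x + dx, y, z) for x, y, z in positions]

def combine_with_vertex_colors(instances):
    """Merge instances into one vertex list, storing each tint per vertex.

    instances: list of (positions, rgb_tint) pairs.
    """
    combined = []
    for positions, tint in instances:
        for position in positions:
            combined.append({"position": position, "color": tint})
    return combined

# A single triangle stands in for a tree mesh in this illustration.
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
forest = [
    (translated(triangle, 0.0), (0.1, 0.6, 0.1)),  # green tree
    (translated(triangle, 2.0), (0.6, 0.5, 0.1)),  # yellowish tree
]

vertex_buffer = combine_with_vertex_colors(forest)
extra_floats = 3 * len(vertex_buffer)   # one RGB triple per vertex

print(len(vertex_buffer))  # 6
print(extra_floats)        # 18
```

A vertex shader reading the per-vertex color can then tint each tree differently even though all of them share one material.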
Bandwidth optimization
Reduce texture size
As mentioned earlier, texture atlases help reduce draw calls, and the size of these textures also needs consideration. One point first: textures should preferably be square, with a side length that is an integer power of 2, because many optimization strategies only reach their full effect under these conditions.
In Unity, texture parameters can be viewed in the texture's panel and adjusted through its Advanced panel. The documentation describes each parameter. The ones that matter most for optimization are "Generate Mip Maps", "Max Size", and "Format".
"Generate Mip Maps" creates a texture pyramid by generating smaller versions of the same texture at several sizes. At run time, the version to use is chosen dynamically according to the distance of the object. When an object is far away, even a very detailed texture cannot be resolved by the eye, so a smaller, blurrier texture can be used instead, which reduces the number of texels that have to be accessed. The downside is that building an image pyramid for each texture takes more memory. In my example, memory usage is 0.5M before checking "Generate Mip Maps" and 0.7M after. Besides the memory cost, there are cases where we do not want mipmaps at all, such as GUI textures. The generated mip maps can be previewed in the panel. Unity also provides a Mipmaps view for objects in the scene; more precisely, it shows the ideal texture size for each object: red means the object could use a smaller texture, and blue means it should use a larger one.
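The memory increase from 0.5M to 0.7M is what a full mip chain would be expected to cost. The sketch below is plain Python and assumes a square power-of-two texture whose side is halved at every level down to 1x1; the 512 texel size is only an example, not the texture used in the article.

```python
def mip_chain(size):
    """Side lengths of every mip level of a square power-of-two texture."""
    levels = []
    while size >= 1:
        levels.append(size)
        size //= 2
    return levels

def texel_count(size, with_mips):
    """Total number of texels stored, with or without the mip chain."""
    levels = mip_chain(size) if with_mips else [size]
    return sum(edge * edge for edge in levels)

base = texel_count(512, with_mips=False)
full = texel_count(512, with_mips=True)
ratio = full / base

print(mip_chain(512))         # [512, 256, 128, 64, 32, 16, 8, 4, 2, 1]
print(base, full)             # 262144 349525
print(round(ratio, 3))        # 1.333
print(round(0.5 * ratio, 1))  # 0.7
```

Each level holds a quarter of the texels of the previous one, so the whole chain adds about one third on top of the base texture, which matches the step from 0.5M to roughly 0.7M.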
"Max Size" determines the maximum side length of a texture; if a texture exceeds it, Unity shrinks the texture to fit. Again, textures should preferably be square with a power-of-2 side length, because many optimization strategies only reach their full effect under these conditions.
"Format" determines the compression mode used for the texture. The automatic mode is usually chosen, and Unity then picks a suitable compression mode for each platform. For GUI textures, we can decide whether to compress based on the required image quality; see the previous article on image quality.
We can also choose textures of different resolutions for different devices, so that the game can run on older devices as well.
Leverage scaling
Resolution is often a cause of poor performance as well, especially on the many knock-off (shanzhai) phones on the domestic market, which pair a high-resolution screen with otherwise poor hardware. That hits exactly two of the game's bottlenecks: a very large screen resolution plus a weak GPU. We may therefore need to lower the resolution on specific devices. This degrades image quality, but performance versus image quality is always a trade-off.
To set the screen resolution in Unity, call Screen.SetResolution directly. You may run into some issues in practice; Yusong MOMO has an article on this technique that you can read.
Closing remarks
This article is a summary by nature, so no technique is explained in great detail. I strongly recommend reading the links given at the beginning of the article; they are very well written.
Original address: http://blog.csdn.net/candycat1992/article/details/42127811
"Unity tricks" optimization techniques in Unity