Unity + NGUI performance optimization: a summary of methods


There are nine techniques in total.

 

1. Package and load resources separately

 

Many parts of a game use the same resources. For example, some interfaces share the same font and atlas, some scenes share the same texture, and some monsters use the same Animator. You can separate these shared resources from the others and package them on their own when building the game installation package. For example, if resource A and resource B both reference resource C, split C out and build it into its own bundle. At runtime, to load A, load C first. If B is loaded later, the C instance is already in memory, so B can be loaded directly and pointed at the existing C. If C is not separated from A and B during packaging, A's package contains C and B's package also contains C, and the redundant copy of C bloats the installation package. In addition, if A and B are both loaded at runtime, there will be two instances of C in memory, which increases memory usage.

 

Packaging and loading shared resources separately is the most effective way to reduce both the size of the installation package and runtime memory usage. Generally, the finer the packaging granularity, the smaller these two figures become. In addition, when two adjacent draw calls in the render queue use the same texture, material, and shader instances, they can be merged. However, finer granularity is not always better. If a large number of small bundles must be loaded at the same time at runtime, loading becomes very slow, because time is wasted on coroutine scheduling and on many small batches of I/O. Furthermore, merging draw calls does not necessarily improve performance and sometimes reduces it, as discussed in point 6. The packaging granularity therefore has to be controlled by policy. Generally, only large shared resources such as fonts and textures are split out.

 

You can use AssetDatabase.GetDependencies to find out which other resources a resource depends on.
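
The selection of shared resources can be sketched outside Unity. The following is a minimal Python sketch, assuming the dependency lists have already been exported (for example from AssetDatabase.GetDependencies) into a plain dictionary. The asset names and the function find_shared_assets are hypothetical and only illustrate the idea of splitting out what more than one root references.

```python
from collections import Counter

def find_shared_assets(dependencies):
    """Return assets referenced by more than one root asset, sorted by name.

    `dependencies` maps each root asset to the list of assets it references.
    """
    counts = Counter()
    for deps in dependencies.values():
        # Count each dependency once per root, even if it is listed twice.
        counts.update(set(deps))
    return sorted(name for name, n in counts.items() if n > 1)

# Hypothetical data: A and B both reference C, as in the example above.
deps = {
    "A.prefab": ["C.png", "a_only.mat"],
    "B.prefab": ["C.png", "b_only.mat"],
}
shared = find_shared_assets(deps)
print(shared)
```

Only the assets returned here are candidates for their own bundle. Whether a candidate is large enough to be worth splitting out is still a policy decision, as described above.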

 

2. Separate the alpha channel of transparent textures and set the compression format to ETC/PVRTC

 

We initially used DXT5 as the texture compression format, hoping to reduce the memory used by textures, but soon found that GPUs on mobile platforms do not support hardware decompression of DXT5. For a 1024x1024 RGBA32 texture, DXT5 compresses it from 4 MB to 1 MB, but before the system sends it to the GPU, the CPU first decompresses it in main memory into the 4 MB RGBA32 format (software decompression) and then uploads those 4 MB to video memory. During this period the texture therefore occupies 5 MB of main memory and 4 MB of video memory. Mobile platforms often have no dedicated video memory and carve it out of main memory, so a texture that we expected to occupy only 1 MB actually occupied 9 MB.
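
The arithmetic behind the 9 MB figure can be checked with a few lines of Python. This is only a sketch: it assumes a 1024x1024 texture (the size that gives 4 MB at 32 bits per pixel) and the standard DXT5 rate of 8 bits per pixel, and the helper texture_bytes is a hypothetical name.

```python
MB = 1024 * 1024

def texture_bytes(width, height, bits_per_pixel):
    """Size in bytes of an uncompressed or fixed-rate compressed texture."""
    return width * height * bits_per_pixel // 8

width = height = 1024  # assumed size: 1024 x 1024 x 4 bytes = 4 MB
rgba32 = texture_bytes(width, height, 32)
dxt5 = texture_bytes(width, height, 8)  # DXT5 stores 8 bits per pixel

main_memory = dxt5 + rgba32  # compressed data plus the software-decompressed copy
video_memory = rgba32        # decompressed copy uploaded for rendering
peak_total = main_memory + video_memory

print(rgba32 / MB, dxt5 / MB, peak_total / MB)  # prints 4.0 1.0 9.0
```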

 

Every compression format that lacks hardware decompression support has this problem. After some research, we found that the format with the widest hardware support on Android is ETC, and on Apple devices it is PVRTC. However, these formats, as we used them, do not store a transparency (alpha) channel. We therefore split the alpha channel out of each original texture and write it into the red channel of a second texture. Both textures are compressed with ETC/PVRTC, and both are sent to video memory for rendering. We also modified the NGUI shader so that, during rendering, the red channel of the second texture is used as the alpha of the first texture, restoring the original color:

 

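fixed4 frag (v2f i) : COLOR
{
    fixed4 col;
    // Color comes from the main (RGB-only) texture.
    col.rgb = tex2D(_MainTex, i.texcoord).rgb;
    // Alpha is read from the red channel of the separate alpha texture.
    col.a = tex2D(_AlphaTex, i.texcoord).r;
    return col * i.color;
}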

 

In this way, a 4 MB 1024x1024 RGBA32 source texture is split and compressed into two 0.5 MB ETC/PVRTC textures (we use the 4-bit ETC/PVRTC formats). Their memory usage during rendering is 2 x 0.5 MB in main memory plus 2 x 0.5 MB in video memory, or 2 MB in total.
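
To make the channel separation concrete, here is a minimal Python sketch of the split that would run at import time and of the recombination that the shader performs. It works on plain lists of pixel tuples, and the function names split_alpha and recombine are hypothetical. Compression and the multiplication by the vertex color are not modeled, so the round trip here is lossless, which real ETC/PVRTC compression is not.

```python
def split_alpha(pixels):
    """Split RGBA pixels into an RGB-only list and an alpha-in-red list."""
    color = [(r, g, b) for r, g, b, a in pixels]
    alpha = [(a, 0, 0) for r, g, b, a in pixels]
    return color, alpha

def recombine(color, alpha):
    """Mirror the shader: rgb from the main texture, a from the red channel."""
    return [(r, g, b, mask[0]) for (r, g, b), mask in zip(color, alpha)]

# Three sample pixels: opaque red, half-transparent green, transparent blue.
pixels = [(255, 0, 0, 255), (0, 255, 0, 128), (0, 0, 255, 0)]
color_tex, alpha_tex = split_alpha(pixels)
restored = recombine(color_tex, alpha_tex)
```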

 

3. Disable texture read/write.

 

Every texture imported into Unity has a Read/Write Enabled switch; the corresponding script property is TextureImporter.isReadable. After selecting a texture, you can see this switch on the Import Settings tab. Only when it is enabled can you use functions such as Texture2D.GetPixel to read or modify the pixels of the texture, but this requires the system to keep a copy of the texture in main memory for CPU access. This is generally not needed while the game is running. We therefore disable it for all textures and enable it only while a texture is being post-processed in the editor after import (for example, when separating the alpha channel from the original texture). In this way, the 2 MB runtime memory usage of the 1024x1024 texture mentioned above is halved to 1 MB.

 

4. Reduce the number of GameObjects in a scene

 

On one occasion, we removed nearly 20,000 GameObjects from a scene, and the game's memory usage on the iPhone 3 S immediately dropped by 20 MB. Although these GameObjects were mostly hidden (activeInHierarchy is false), they still occupied a lot of memory. Many scripts were attached to them, and every script on every GameObject has to be instantiated, which costs a considerable amount of memory. We therefore later stipulated that the number of GameObjects in a scene must not exceed 10,000, and made the GameObject count a performance metric monitored for each weekly build.
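
A monitoring check of this kind can be sketched as follows. This is not Unity code: the Node class is a hypothetical stand-in for a GameObject in the scene hierarchy, and the only point it illustrates is that inactive objects are counted too, since they still occupy memory.

```python
class Node:
    """Hypothetical stand-in for a GameObject in the scene hierarchy."""

    def __init__(self, name, active=True, children=None):
        self.name = name
        self.active = active
        self.children = children or []

def count_objects(node):
    """Count a node and all of its descendants, including inactive ones."""
    return 1 + sum(count_objects(child) for child in node.children)

BUDGET = 10_000  # the per-scene limit quoted in the text

root = Node("Scene", children=[
    Node("HUD", children=[Node("HpBar"), Node("Hidden", active=False)]),
    Node("Pool", active=False,
         children=[Node(f"Bullet{i}", active=False) for i in range(5)]),
])
total = count_objects(root)
over_budget = total > BUDGET
print(total, over_budget)
```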

 

5. Organize the atlases

 

The main purpose of organizing atlases is to save runtime memory (although it can also help merge draw calls). From this point of view, the smaller the total size of the atlases sent to video memory, the better. The following methods help achieve this:

 

1) In interface design, ask the artists to keep controls as simple as possible so that they can be drawn with the Sliced UISprite type. The artists then only need to cut out a small image, and we stretch it in Unity. Of course, the number of vertices of a single control increases from 4 to at least 16 (and more if the center cell uses the Tiled type), so building a draw call becomes more expensive (see point 6). Generally this is not a problem as long as the draw calls are arranged sensibly (also see point 6).

 

2) Likewise, ask the artists to make designs symmetric where possible. The artists then only need to cut out part of the image, and we assemble the complete pattern in Unity. For a circular pattern, the artists can cut out only a quarter; for a face, only half. However, as with 1), this method has its own performance cost: the number of vertices and the number of GameObjects for one pattern both increase. As mentioned in point 4, more GameObjects can sometimes take up significantly more memory. This method is therefore generally used only for larger patterns.

 

3) Make sure that unnecessary textures do not stay resident in memory, and that unrelated textures are not sent to video memory during rendering. To this end, atlases should be divided by interface. Generally, an atlas contains the sprites of only one interface, and a UISprite in one interface should not use the atlas of another interface. Suppose the same small gold coin icon appears on both interface A and interface B. Do not take the shortcut of letting a UISprite in interface A reference the gold coin sprite in B's atlas. Otherwise, whenever interface A is displayed, B's atlas is also sent to video memory, and as long as A remains in memory, B's atlas stays in memory too. Instead, place an identical gold coin icon in both atlas A and atlas B, so that UISprites in A use only atlas A and UISprites in B use only atlas B.

 

However, if two interfaces have a lot of sprites in common, they can share one atlas, which reduces the total memory usage across all interfaces. The right choice depends on the art design. Generally, the more sprites the interfaces share, the lighter the memory burden on the program, but if too much is shared between interfaces, the art may look monotonous. This is another point where art and programming need to find a balance.

 

In addition, large sets of icons (such as item icons) should not be put into an atlas; use UITexture for them instead.

 

4) Reduce empty space in atlases. Fully transparent pixels in an atlas occupy as much memory as opaque pixels, so for a given set of sprites we should minimize the gaps in the atlas. Sometimes the sprites occupy less than half of the area of an atlas; in that case, consider splitting it into two 512x512 atlases. (Some may ask why the atlas is not simply made non-square, with one side of 512. The reason is that the iOS platform seems to require textures sent to video memory to be square.) Of course, draw calls that use two different atlases cannot be merged, but this is not a problem (see point 6).
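
The check described in 4) can be sketched in Python. The sprite sizes, the 1024x1024 starting atlas, and the names fill_ratio and atlas_bytes are assumptions for illustration. The sketch only compares areas and byte counts at 4 bits per pixel; it does not verify that the sprites can actually be packed into the smaller atlases.

```python
def fill_ratio(atlas_width, atlas_height, sprites):
    """Fraction of the atlas area covered by sprites given as (width, height)."""
    used = sum(w * h for w, h in sprites)
    return used / (atlas_width * atlas_height)

def atlas_bytes(width, height, bits_per_pixel=4):
    """Memory taken by an atlas at a fixed compression rate."""
    return width * height * bits_per_pixel // 8

# Hypothetical sprite sizes for one interface.
sprites = [(256, 256), (512, 128), (128, 128), (300, 200)]
ratio = fill_ratio(1024, 1024, sprites)

one_big = atlas_bytes(1024, 1024)
two_small = 2 * atlas_bytes(512, 512)
should_consider_split = ratio < 0.5
print(round(ratio, 3), one_big, two_small, should_consider_split)
```

In this example the sprites fill well under half of the large atlas, and two 512x512 atlases take half the memory of one 1024x1024 atlas.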

 

It should be said that there is no fixed standard for organizing atlases. We often have to weigh the pros and cons to decide how to organize them, because every measure has a performance cost.

 

6. Place UI controls on separate Panels according to how they change, to split draw calls

 

At one point we found that NGUI's UIPanel.LateUpdate function had a high CPU cost. Careful investigation showed that too many draw calls had been merged, in particular that UI controls which change at runtime shared draw calls with static UI controls. When the position, size, color, or other attributes of a UI widget change, UIPanel has to rebuild the draw call used by that widget, and in some cases all draw calls on the Panel must be rebuilt. Rebuilding a draw call can be CPU-intensive: the vertex data of all controls on the draw call, including vertex position, UV, and color, must be recalculated. If many controls are concentrated on one draw call, the vertices of all of them must be traversed whenever any single control changes even slightly. Because our UI makes heavy use of nine-slice (Sliced) stretching, which increases the vertex count of each control, the cost of rebuilding a draw call is even higher.

 

We therefore group the UI controls. Controls that change over a given period, such as the health bars and damage text above monsters' heads, are placed on one Panel that contains only those controls, and the remaining unchanging controls are placed on another Panel. The two kinds of controls thus end up in different draw calls. When a control changes and its draw call is rebuilt, the unchanging controls no longer need to be traversed. Because in a typical design only a few controls change at any given time, the optimization is very effective, and the CPU usage can be reduced to 25%.
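
The effect of the grouping can be illustrated with a deliberately simplified cost model in Python. It is not NGUI's actual algorithm: it assumes that a change to one widget forces every widget on the same Panel to be re-traversed, that every widget is a Sliced sprite with 16 vertices, and that other Panels are untouched. The widget names and the function rebuild_cost are hypothetical.

```python
VERTS_PER_SLICED_SPRITE = 16  # a Sliced sprite has at least 16 vertices

def rebuild_cost(panels, changed_widget):
    """Vertices re-traversed when `changed_widget` changes.

    Simplified model: every widget on the same panel as the changed widget
    is rebuilt; widgets on other panels are not touched.
    """
    for widgets in panels.values():
        if changed_widget in widgets:
            return len(widgets) * VERTS_PER_SLICED_SPRITE
    raise KeyError(changed_widget)

static_widgets = [f"frame{i}" for i in range(200)]
dynamic_widgets = ["hp_bar", "damage_text"]

one_panel = {"ui": static_widgets + dynamic_widgets}
two_panels = {"static": static_widgets, "dynamic": dynamic_widgets}

cost_before = rebuild_cost(one_panel, "hp_bar")
cost_after = rebuild_cost(two_panels, "hp_bar")
print(cost_before, cost_after)  # prints 3232 32
```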

 

This method adds some draw calls, but that has little practical effect. Early in the project we put too much emphasis on reducing the number of draw calls, but adding a few turned out not to be so terrible. Our lead programmer once tested this with Cocos2d-x: even with 500 draw calls, animation still ran very smoothly. By comparison, texture size has a much greater effect on smoothness.

 

7. Optimize the internal logic of the anchor so that it is updated only when necessary.

 

After optimizing the Panel's draw call rebuilding, we found that NGUI's anchor update logic also consumed a lot of CPU time. Even when a control is static, its anchor is updated every frame (see the UIWidget.OnUpdate function), and the update is recursive, which drives CPU usage even higher. We therefore modified NGUI's internal code so that anchors are updated only when necessary, generally only when a control is initialized and when the screen size changes. The price of this optimization is that when the vertex positions of a control change (for example, when the control moves or is resized), the upper-layer logic is responsible for updating the anchor.
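
The "update only when necessary" idea is a standard dirty-check pattern, sketched below in Python. This is not the modified NGUI code, which the text does not show. The AnchoredWidget class, its right-edge anchoring rule, and the force parameter are assumptions made for illustration.

```python
class AnchoredWidget:
    """Hypothetical widget anchored to the right edge of the screen."""

    def __init__(self, margin):
        self.margin = margin
        self.x = None
        self._last_screen_width = None
        self.recalculations = 0

    def update(self, screen_width, force=False):
        """Recompute the anchor only on first use, on resize, or when forced."""
        if not force and screen_width == self._last_screen_width:
            return  # nothing changed: skip the per-frame work
        self._last_screen_width = screen_width
        self.x = screen_width - self.margin
        self.recalculations += 1

widget = AnchoredWidget(margin=20)
for _ in range(100):             # 100 frames with an unchanged screen
    widget.update(1280)
widget.update(1920)              # the screen size changed
widget.update(1920, force=True)  # upper-layer logic requests an update
print(widget.x, widget.recalculations)  # prints 1900 3
```

The force flag stands in for the responsibility that moves to the upper-layer logic: when a control moves or is resized, the caller has to ask for the anchor update explicitly.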

 

8. Reduce texture resolution

 

Put simply, this means reducing the size of the texture assets. For example, on import into Unity we can halve the width and height of the source image, ending up with a 50x40 texture. The game then uses the reduced texture. However, this inevitably and noticeably lowers the quality of the art, and the artists will immediately notice that the picture is blurry, so it is generally used only when the program cannot cope otherwise.

 

9. Lazy loading and timed unloading of interfaces (not implemented yet)

 

If some interfaces are of low importance and rarely used, their resources can be loaded from the bundle only when the interface needs to be opened and displayed, and unloaded from memory when it is closed, or after waiting for a while. However, this approach has two costs. First, it affects the user experience: when the player asks to open the interface, its display is delayed. Second, it makes bugs more likely: the upper-layer logic has to be written with asynchronous loading in mind, because the interface may not be in memory when a programmer wants to access it. For these reasons we have not implemented this scheme so far. Currently, we only unload, on entering a new scene, the interfaces that were used in the previous scene but are not used in the new one.
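
Since the scheme was not implemented, the following Python sketch is purely illustrative of how lazy loading with a delayed unload could be organized. The InterfaceCache class, the loader callback, and the 30-second delay are all hypothetical. Loading is synchronous here, which sidesteps exactly the asynchronous complications the text warns about.

```python
class InterfaceCache:
    """Load interfaces on first open; unload them some time after closing."""

    def __init__(self, loader, unload_delay):
        self.loader = loader              # callable: name -> resource
        self.unload_delay = unload_delay  # seconds to keep a closed interface
        self.loaded = {}                  # name -> resource
        self.closed_at = {}               # name -> time the interface closed

    def open(self, name):
        self.closed_at.pop(name, None)    # cancel any pending unload
        if name not in self.loaded:
            self.loaded[name] = self.loader(name)
        return self.loaded[name]

    def close(self, name, now):
        if name in self.loaded:
            self.closed_at[name] = now

    def tick(self, now):
        """Unload interfaces closed for at least `unload_delay` seconds."""
        expired = [n for n, t in self.closed_at.items()
                   if now - t >= self.unload_delay]
        for name in expired:
            del self.loaded[name]
            del self.closed_at[name]

loads = []

def fake_loader(name):
    loads.append(name)  # record every real load for inspection
    return f"<{name} resources>"

cache = InterfaceCache(fake_loader, unload_delay=30)
cache.open("shop")
cache.close("shop", now=5)
cache.open("shop")           # reopened before the delay: no reload
cache.close("shop", now=12)
cache.tick(now=50)           # 38 s after closing: unloaded
```

Reopening the interface after the final tick would call the loader again, which is the display delay the text describes as the first cost.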

 

Of the nine methods above, 4, 5, and 6 have to be considered to some extent from the perspective of game design and art, and they require continuous monitoring to stay optimized (because the design always brings new requirements or changes to old interfaces). The other methods are one-off: once the implementation is stable, no further effort is needed. However, 2 and 8 both reduce the quality of the art, especially 8. If the art team cannot accept the loss of quality, these two methods may not be permitted.
