MOBA mobile game "Xiaomi Super God" case study

Original link: https://blog.uwa4d.com/archives/2130.html

Today we bring you an analysis of the UWA assessment report for "Xiaomi Super God", a MOBA mobile game developed by Fuzhou Rose Finch Network. The game adapts very well to mobile devices of different configurations, in terms of both visual quality and performance overhead. In this article we analyze the game's performance data in depth, in the hope of giving readers a deeper understanding of how efficiently each module of a mobile game runs and of helping with their own projects.

First, CPU Performance

The game performs very well in terms of CPU usage. The following figure shows its performance data during a 5v5 battle on a Redmi Note 2 device.

Of the 19876 frames recorded on the Redmi Note 2, 13.9% took more than 33 ms and 1.7% took more than 50 ms. As the figure shows, the high CPU times are concentrated mainly in the resource loading phase of the 5v5 scene. In-battle performance is therefore very good, and the game runs smoothly most of the time.
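The two ratios above are simple threshold counts over the recorded frame times. A minimal sketch of that calculation, assuming the per-frame CPU times are available as a list of milliseconds (the function name and the sample data are hypothetical, not taken from the UWA report):

```python
from typing import Sequence


def slow_frame_ratio(frame_times_ms: Sequence[float], threshold_ms: float) -> float:
    """Return the fraction of frames whose CPU time exceeds threshold_ms."""
    if not frame_times_ms:
        return 0.0
    slow = sum(1 for t in frame_times_ms if t > threshold_ms)
    return slow / len(frame_times_ms)


# Hypothetical sample: mostly smooth frames plus a few loading spikes.
sample = [16.0] * 90 + [40.0] * 8 + [70.0] * 2

print(f"frames over 33 ms: {slow_frame_ratio(sample, 33.0):.1%}")
print(f"frames over 50 ms: {slow_frame_ratio(sample, 50.0):.1%}")
```

The 33 ms and 50 ms thresholds correspond roughly to 30 and 20 frames per second.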

Further statistics show that the game's CPU performance is better than that of 64% of the other games tested on the same device (Redmi Note 2), and its power consumption is lower than that of 86% of them. Because there are currently few domestic MOBA games, these rankings are across games of all types rather than within the MOBA genre. For a heavyweight MOBA mobile game, these CPU performance and power consumption rankings are quite good.

This excellent overall CPU performance is inseparable from the sensible use of each module. Below we explain the highlights of its CPU performance in detail.

1. Rendering Module

The UWA performance assessment report shows the detailed performance overhead of the game's rendering module. The CPU overhead when running on a Redmi Note 2 device is shown in the following figure. According to the statistics, rendering translucent objects takes an average of 1.7 ms of CPU time, mostly within 0.8~3.0 ms (5%~95%). Rendering opaque objects takes an average of 1.0 ms, mostly within 0.2~1.7 ms (5%~95%). Throughout the 5v5 battle, whether the player is moving, farming, ganking, team fighting, or even pushing the high ground, rendering time stays in a low range. This is due to the development team's control of scene models, skinned meshes, and the UI.

The Draw Call peak is 167, with most frames in the 45~130 range (5%~95%), and the single-frame peak for rendered triangles is 64900. Both values are within a reasonable range.
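The "(5%~95%)" ranges quoted in this report describe where the bulk of the per-frame samples fall. A minimal sketch of one way to compute such a range, assuming a nearest-rank percentile (the report does not say which percentile definition UWA uses, and the helper names are hypothetical):

```python
import math
from typing import Sequence, Tuple


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def typical_range(values: Sequence[float]) -> Tuple[float, float]:
    """Return the (5th percentile, 95th percentile) pair for the samples."""
    return percentile(values, 5), percentile(values, 95)


# Hypothetical per-frame draw call counts.
draw_calls = list(range(1, 101))
print(typical_range(draw_calls))
```

Reporting the 5%~95% range next to the mean keeps a handful of extreme frames, such as loading spikes, from dominating the summary.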

2. UI Module

The CPU overhead of the game's UI module when running on a Redmi Note 2 device is shown in the following figure. The game uses UGUI as its UI solution. According to the statistics, the UI module's overall CPU time averages 1.5 ms, mostly within 0.1~3.5 ms (5%~95%), which is a reasonable range. Heap memory allocation totals 2.4 MB over 16000 frames, an average of 155.4 B per frame, which indicates that both the construction of the UI and the scope of UI rebuilds are very reasonable. UWA currently recommends keeping the UGUI module's average heap allocation below 200 B per frame where possible.
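The per-frame figure is just the total allocation divided by the frame count. A small sketch of that arithmetic, assuming 1 MB means 1024 × 1024 bytes (an assumption; the 2.4 MB total is also rounded, so the result only approximates the 155.4 B in the report):

```python
UGUI_BUDGET_BYTES_PER_FRAME = 200  # UWA recommendation quoted in the text


def average_alloc_per_frame(total_bytes: float, frame_count: int) -> float:
    """Average heap allocation per frame, in bytes."""
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    return total_bytes / frame_count


total_bytes = 2.4 * 1024 * 1024  # 2.4 MB, assuming binary megabytes
average = average_alloc_per_frame(total_bytes, 16000)

print(f"average allocation: {average:.1f} B per frame")
print("within budget:", average < UGUI_BUDGET_BYTES_PER_FRAME)
```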

In battle scenes, the UI system's cost comes mainly from state changes of UI elements, such as the movement of HUD items like health bars and floating text. With a little carelessness, such operations lead to a high UI mesh rebuild cost. UI development therefore looks intuitive and simple, but the care taken in building it and the patience spent tuning it are a touchstone of whether a product is made with real craftsmanship. The following is a comparison chart of UI performance after several rounds of optimization in "Xiaomi Super God".

3. Animation Module

In the UWA assessment report, the CPU overhead of the game's runtime animation module is shown in the following figure. Apart from the high CPU values when entering the scene, CPU overhead during the battle is kept at a low level. Animator.Update averages 2.0 ms of CPU time, mostly within 0.1~4.3 ms. In a MOBA 5v5 scene, 90-130 objects are animating in almost every frame (heroes and creeps, plus couriers, jungle monsters, towers, wards and so on), and because players are free to view any corner of the map, the per-frame pressure on the animation system is several times that of a typical MMO game. Keeping the average at around 2.0 ms is therefore already a very good result for "Xiaomi Super God".

Further testing shows that the animation module's time is spent mainly in Animators.ProcessAnimationsJob and Animators.FireAnimationEventsAndBehaviours. The former is a continuous, accumulating cost, while the latter appears as discontinuous "spikes". The former is the time the animation system spends reading and evaluating AnimationClips. Its size depends on the number of bone nodes and animation curves evaluated in the current frame, the animation's execution state, and the specific settings of the Animator Controller. For details, see "A review of performance optimization scenarios for animation systems in Unity". The latter is mainly the cost of the project's logic code, and the development team can use the performance stack to check whether that logic code can be optimized further.

4. GC Call

The development team keeps the frequency of GC calls well under control. While the game is running, GC is called once every 1656 frames on average, better than 93% of games in the industry. In general, we recommend keeping a project's GC frequency to no more than one call per 1000 frames.
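The GC figure is an average interval: frames run divided by the number of collections. A minimal sketch, using a hypothetical frame count and collection count rather than values from the report:

```python
RECOMMENDED_MIN_FRAMES_PER_GC = 1000  # recommendation quoted in the text


def frames_per_gc(total_frames: int, gc_calls: int) -> float:
    """Average number of frames between garbage collections."""
    if gc_calls == 0:
        return float("inf")  # no collection happened during the run
    return total_frames / gc_calls


# Hypothetical run: 20000 frames with 12 collections.
interval = frames_per_gc(20000, 12)

print(f"one GC every {interval:.0f} frames")
print("meets recommendation:", interval >= RECOMMENDED_MIN_FRAMES_PER_GC)
```

A larger interval is better, because each collection can cause a visible frame-time spike.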

The game's GC frequency is this good mainly because of the development team's control over the heap memory allocated by project code. The following figure details the code's heap memory allocation over 20000 frames. The heap allocations of the top 10 functions sum to no more than 80 MB, which reflects the team's deep understanding of heap memory allocation.

The current version still has room to reduce heap memory allocation further. The stack information shows that log output still allocates a certain amount of memory, so we recommend that the development team disable non-critical logs in release builds.

Second, Memory Module

The memory performance of "Xiaomi Super God" is shown in the following figure. Total memory peaks at 297 MB and Mono heap memory peaks at 48.9 MB. Since 297 MB is a relatively high total, the development team could try to control resources further on low-memory devices to reduce memory consumption there.

1. Mono Heap Memory

The following figure shows that the game's overall Mono heap memory is well controlled. Over 20000 frames, the Mono heap peaks at 48.9 MB, which is slightly high (UWA recommends < 40 MB). As the graph shows, a sudden large heap allocation occurred during the final high-ground fight of the battle, pushing the Mono heap up by 8 MB (as shown in the red box below). The development team can use the position of this jump to pinpoint the source of the allocation.

Looking at the trend, however, the used Mono heap memory at the end of the 5v5 battle did not fall back completely to its level on entering the scene (as shown in the blue box). This deserves the development team's attention: they should confirm whether it is caused by intentional container caches, in order to rule out a hidden heap memory leak in the project.

2. Resource Memory

According to the statistics, the number of texture resources peaks at 1003 at run time, with a peak memory consumption of 54.8 MB. The development team has kept the texture memory footprint very low: it is higher than that of only 30% of industry projects. At the memory peak, 835 textures are in ETC1 or ETC2 format, 89 are in RGBA32 format, 5 are in RGB24 format, and the rest are in RGBA16 format.

Overall usage of texture resources at run time:

Texture resources can be optimized in the following ways:

(1) Use a more appropriate texture format

As the figure above shows, during a whole 5v5 battle the game uses more RGBA32 textures than 91% of industry projects. We recommend replacing them with ETC1 textures (on Android) wherever visual quality allows, which not only reduces the memory footprint but also speeds up loading. For textures that cannot use hardware compression, dithering can be used to convert them to RGBA16; see "Unity picture optimization artifact-dither algorithm advanced scheme" for reference.
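To see why the format matters, it helps to compare the size of one texture in each format. The sketch below estimates the memory of the top mip level only, using the standard bit depths (32, 24 and 16 bits per pixel for RGBA32, RGB24 and RGBA16, and 8 bytes per 4×4 block for ETC1). The function name is hypothetical, and mipmaps and engine overhead are ignored, so actual runtime numbers will differ:

```python
import math

# Bits per pixel for the uncompressed formats named in the text.
BITS_PER_PIXEL = {"RGBA32": 32, "RGB24": 24, "RGBA16": 16}


def texture_bytes(width: int, height: int, fmt: str) -> int:
    """Estimate the size in bytes of the top mip level of a texture."""
    if fmt == "ETC1":
        # ETC1 stores each 4x4 pixel block in 8 bytes (4 bits per pixel).
        blocks = math.ceil(width / 4) * math.ceil(height / 4)
        return blocks * 8
    return width * height * BITS_PER_PIXEL[fmt] // 8


for fmt in ("RGBA32", "RGB24", "RGBA16", "ETC1"):
    size_kb = texture_bytes(1024, 1024, fmt) / 1024
    print(f"{fmt:7s} 1024x1024: {size_kb:.0f} KB")
```

Under these assumptions an ETC1 texture takes one eighth of the memory of the same texture in RGBA32, and RGBA16 takes half. The same function also shows that halving both width and height cuts the size by a factor of 4.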

(2) Use more precise texture resolution

In general, to make a model look better, artists tend to choose a larger texture size in addition to refining the mesh itself. However, because rendered objects are at varying distances from the camera in the real game, the graphics layer often does not use (or need) such a high resolution for rendering, which wastes a great deal of resources. This is ubiquitous in the projects we have optimized in depth. We therefore recommend checking the texture resolutions actually used at run time, as shown in the following figure, on the Mipmap page of the GPU performance section of the UWA online assessment report.

In the in-depth optimization report we also run a larger number of tests and a cumulative analysis over billions of pixels, to pinpoint exactly which textures have too high a resolution and can be reduced by a factor of 4, 16 or more without affecting visual quality. By optimizing continuously in this way, the "Xiaomi Super God" project has reduced the texture memory of its 5v5 scene from 78 MB to the current 52 MB.

That covers texture resources. The memory usage of other resources is as follows:

Overall usage of mesh resources at run time:

Mesh resources have a high memory footprint, higher than 60% of industry projects. There are several main ways to optimize the memory of mesh resources:

(1) Control the vertex attributes used by meshes

Check carefully for color data and tangent data in mesh models, and remember to remove them when they are not needed. Otherwise they waste a lot of memory in scene models. For details, see "mobile game load performance and memory management full resolution".
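The cost of unused vertex attributes grows linearly with vertex count. The sketch below estimates it under assumed per-attribute sizes: 12 bytes for a float3 position or normal, 8 for a float2 UV, 16 for a float4 tangent, and 4 for a packed 32-bit color. These sizes are illustrative assumptions; real layouts depend on the engine and import settings.

```python
from typing import Iterable

# Assumed per-vertex sizes in bytes; actual layouts vary by engine and settings.
ATTRIBUTE_BYTES = {
    "position": 12,  # float3
    "normal": 12,    # float3
    "uv": 8,         # float2
    "tangent": 16,   # float4
    "color": 4,      # packed 32-bit color
}


def mesh_vertex_bytes(vertex_count: int, attributes: Iterable[str]) -> int:
    """Estimate vertex buffer size for a mesh carrying the given attributes."""
    return vertex_count * sum(ATTRIBUTE_BYTES[name] for name in attributes)


# Hypothetical 10000-vertex scene model, with and without tangents and colors.
full = mesh_vertex_bytes(10000, ["position", "normal", "uv", "tangent", "color"])
trimmed = mesh_vertex_bytes(10000, ["position", "normal", "uv"])

print(f"with tangent and color: {full} bytes")
print(f"without:                {trimmed} bytes")
print(f"saved:                  {full - trimmed} bytes")
```

With these assumed sizes, dropping tangents and colors removes 20 of 52 bytes per vertex, a saving of roughly 38%.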

(2) Control the number of mesh vertices

Check carefully whether the vertex count (or triangle count) of mesh models is too high. This is easy to say but hard to do, and the difficulty lies in defining when a mesh has too many vertices. One recommendation in the UWA assessment report is to keep a mesh's vertex count below 1500. In fact this is an "unscientific" rule, because no theory proves that a mesh with more than 1500 vertices is unacceptable or that one with fewer than 1500 is certainly fine. So we have spent a lot of time this year looking for a more reasonable and scientific criterion. We believe that judging whether a mesh model is acceptable should be a "dynamic" check rather than a "static" one: as with the texture resources above, we should look at how many pixels the mesh occupies in the underlying rendering and use a density statistic to judge whether its usage is reasonable. We therefore propose a metric for mesh models, "mesh model rendering density", which is the number of vertices rendered for the mesh per unit number of pixels (for example, per 10,000 pixels).

The following figure shows rendering density statistics for mesh models while the game is running. Although the first model, "mod_xyc_fz1001", has only 1168 vertices, it covers an average of only 34.91 rendered pixels per frame at run time, which means that on average about 33 mesh vertices are rendered for each pixel, which is wasteful. The "rendering density" method can thus reflect how any model is actually used at run time. In our view it is a more reasonable criterion than a fixed threshold of 1500.
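The density figure for "mod_xyc_fz1001" can be reproduced from the two numbers quoted above. The sketch below is one reading of the definition (vertices divided by average rendered pixels, scaled to a unit number of pixels). The function name is hypothetical, and UWA's exact formula may differ:

```python
def rendering_density(vertex_count: int, avg_rendered_pixels: float,
                      per_pixels: int = 10000) -> float:
    """Vertices rendered per `per_pixels` screen pixels covered by the mesh."""
    if avg_rendered_pixels <= 0:
        raise ValueError("avg_rendered_pixels must be positive")
    return vertex_count / avg_rendered_pixels * per_pixels


# Values quoted in the text for the model "mod_xyc_fz1001".
per_pixel = rendering_density(1168, 34.91, per_pixels=1)
per_10k = rendering_density(1168, 34.91)

print(f"{per_pixel:.1f} vertices per pixel")
print(f"{per_10k:.0f} vertices per 10,000 pixels")
```

A model with a modest vertex count can still score badly if it covers only a few pixels on screen, which is exactly the case a fixed vertex threshold misses.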
