This is a simplified translation of Diagnosing performance problems using the Profiler window, an official Unity tutorial in the Performance Optimization series.
Brief introduction
If the game is slow and sluggish, we know it has a performance problem. Before we try to solve the problem, we need to know its cause. Different problems require different solutions. If we rely on guesswork or on experience from other projects, we may waste a lot of time and even make the problem worse.
This is where profiling comes in. Profiling measures how every aspect of the game performs at runtime. With a profiling tool we can look beneath the surface of the running game, gather detailed information, and use that information to trace the cause of a performance problem. After we make a change, the same tool lets us measure whether the change worked and whether the performance problem is fixed.
In this article, we will:
-Use Unity's built-in profiling tool, the Profiler, to collect runtime data from a poorly performing game.
-Analyze the data and track down the cause of the performance problem.
-Share links to articles that explain how to fix these specific problems.
Making a game run quickly and smoothly is a balancing act. We may need several rounds of changes and measurements before we get the results we want. Knowing how to use profiling tools to analyze a problem means that we can identify what went wrong and understand what to do next.
Before you begin
This article will help us track down problems that make a Unity game run slowly or sluggishly. It may not help with other problems, such as crashes or graphics that do not look as expected. If the game has a problem that is not covered here, try searching the Unity Manual, the Unity Forums or Unity Answers.
If we are not familiar with the Profiler window, please read Unity performance optimization (1)-Official document translation first.
A brief introduction to game performance
Frame rate is a basic indicator of game performance. A frame in a game is like a frame in an animation: a still picture of the game drawn to the screen. Drawing a frame to the screen is called rendering a frame. The frame rate, or how fast frames are rendered, is measured in frames per second (FPS).
Most current games target 60 FPS. In general, a frame rate above 30 FPS is acceptable, especially for games that do not require fast reactions, such as puzzle or adventure games. Some projects have special requirements: a VR game needs at least 90 FPS. When the frame rate drops below 30 FPS, the player usually has a bad experience: the picture may look jerky and the controls may feel unresponsive. Speed is not the only thing that matters, however. The frame rate also needs to be stable. Players are usually sensitive to changes in frame rate, and an unstable frame rate often feels worse than a lower but stable one.
Although frame rate is the basic measure when we talk about game performance, when we try to improve performance it is more useful to think about how many milliseconds it takes to render a frame. There are two reasons for this. First, it is a more precise metric. When we optimize, every millisecond saved counts toward our goal. Second, the same change in frame rate means a different change in frame time depending on the range. Dropping from 60 to 50 FPS adds about 3.3 milliseconds to each frame, but dropping from 30 to 20 FPS adds about 16.7 milliseconds. Both are a drop of 10 FPS, yet the difference in the time taken to render a frame is significant.
It is useful to know how many milliseconds we can spend rendering a frame at a given frame rate. The formula is 1000 / target frame rate. With this formula we can see that a game running at 30 FPS must render each frame within 33.3 milliseconds, and a game running at 60 FPS must render each frame within about 16.7 milliseconds.
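The arithmetic above can be checked with a minimal sketch. The helper names frame_budget_ms and added_frame_time_ms are hypothetical and written only for this illustration; they are not part of Unity.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Milliseconds available to render one frame at the given frame rate."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def added_frame_time_ms(fps_before: float, fps_after: float) -> float:
    """Extra milliseconds per frame when the frame rate changes."""
    return frame_budget_ms(fps_after) - frame_budget_ms(fps_before)


if __name__ == "__main__":
    for fps in (30, 60, 90):
        print(f"{fps} FPS -> {frame_budget_ms(fps):.1f} ms per frame")
    # The same 10 FPS drop costs very different amounts of frame time.
    print(f"60 -> 50 FPS adds {added_frame_time_ms(60, 50):.1f} ms per frame")
    print(f"30 -> 20 FPS adds {added_frame_time_ms(30, 20):.1f} ms per frame")
```

Thinking in milliseconds also makes it easier to judge an optimization: saving 2 milliseconds matters far more against a 16.7 millisecond budget than the matching change in FPS would suggest.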
To render each frame, Unity must perform many different tasks. Simply put, Unity must update the state of the game, take a snapshot of the game, and draw that snapshot to the screen. Some tasks are performed once per frame, including reading user input, executing scripts and running lighting calculations. Other operations, such as physics calculations, may be performed several times in a frame. When all of these tasks are executed quickly enough, the game has a stable and acceptable frame rate. When they are not, rendering a frame takes too long and the frame rate drops.
Knowing which tasks take too long is critical to knowing how to solve a performance problem. Once we know which tasks are lowering the frame rate, we can try to optimize that part of the game. This is why profiling tools are so important: they tell us how long each task took in a given frame.
Recording profiling data
To investigate a performance problem, we must first record profiling data from the part of the game that performs poorly. To record accurate data, we must make a development build of the game and profile it while it runs on the target hardware.
If you are not yet familiar with how to make a development build and profile it on target hardware, see Unity Performance Optimization (1)-Official document simplified translation.
Record data from our game's development build
-Run a development build of the game on the target device.
-Start recording profiling data when the game reaches the part with the performance problem.
-Once the recorded data contains a sample of the performance problem, click anywhere in the upper part of the Profiler window to pause the game and select a frame.
-In the upper part of the window, select a frame that shows the performance problem. This may be a sudden spike where the frame rate drops, or it may be a typical frame if the frame rate is steady but lower than we want. We can use the left and right arrow keys, or the previous and next buttons in the Profiler's top control bar, to step between frames and select one more precisely.
We have now collected performance data from the poorly performing part of the game, so let's learn how to analyze it.
Analyze performance data
Before we look for the cause of the performance problem, we must first learn how to read the performance data shown in the Profiler window. We know that the frame rate drops when Unity cannot complete all the tasks required to render a frame in time. In the Profiler window we can see which tasks are executed, how long each one takes, and the order in which they run. This information helps us understand which part of the game makes a frame take too long to render.
It is better to learn the Profiler by practising with it than by following steps in a fixed order. Understanding what the data means lets us investigate on our own when we meet a new problem, or at least know what to search for on Unity Answers, which is a good start.
To learn how to analyze performance data, we will use the CPU Usage profiler as an example. It is probably the profiler we will use most often when investigating frame rate problems.
The CPU Usage profiler
At the top of the Profiler window, we can see how much CPU time was spent completing the tasks for each frame.
The time spent is colour-coded by category. Different colours represent the time spent on rendering, the time spent on physics, and so on. The legend on the left side of the profiler shows which colour represents which type of task.
In our example frame, most of the time is spent on rendering. The bottom of the window shows the total CPU time for the frame, which here is 85.95 milliseconds.
The Hierarchy view
We use the Hierarchy view to dig deeper and see exactly which tasks took the most CPU time in this frame. When the CPU Usage profiler is selected, the details of the current frame are shown in the lower part of the Profiler window. We can select the Hierarchy view from the drop-down menu at the bottom left of the Profiler, which lets us view the CPU tasks in detail.
In the Hierarchy view, we can click any column heading to sort by the values in that column. For example, clicking Time ms sorts functions by the time they took, and clicking Calls sorts them by the number of times they were called in the selected frame. In our example we sort by time, and we can see that Camera.Render takes the most CPU time.
If a function has an arrow to the left of its name, we can click the arrow to expand it and see which other functions it calls and how those functions affect performance. Self ms is the time spent in the function itself, while Time ms is the total time spent in the function and all the functions it calls.
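The relationship between Self ms and Time ms can be sketched as a small tree calculation. This is only an illustration of the idea: the Sample class is hypothetical, the timings are made up, and none of this is Unity's own API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    """One row of a call hierarchy: a function and the functions it calls."""
    name: str
    self_ms: float  # time spent in the function's own code ("Self ms")
    children: List["Sample"] = field(default_factory=list)

    def total_ms(self) -> float:
        """Own time plus the total time of every called function ("Time ms")."""
        return self.self_ms + sum(child.total_ms() for child in self.children)


# Made-up timings, for illustration only.
frame = Sample("Camera.Render", 1.5, [
    Sample("Shadows.RenderJob", 4.0, [Sample("WaitingForJob", 20.0)]),
    Sample("Culling", 2.5),
])

if __name__ == "__main__":
    print(f"{frame.name}: self {frame.self_ms} ms, total {frame.total_ms()} ms")
```

A function with a large total time but a small self time is usually not slow itself; the cost lies in something it calls, which is why expanding the hierarchy is worthwhile.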
In this example, we can see that within Camera.Render the most expensive work belongs to the Shadows.RenderJob function. Even if we do not know much about this function, we have already learned a lot about the performance problem. We know that the problem is related to rendering and that the most expensive task is processing shadows.
Another advantage of the Hierarchy view is that we can compare frames, so we can understand how performance changes over time. With the CPU Usage profiler, we can track the CPU cost of a function from frame to frame. When we click a function name in the Hierarchy view, the CPU Usage profiler highlights that function's data in the graph at the top of the Profiler window.
For example, if we click Gfx.WaitForPresent in the Hierarchy view, the rendering data that relates directly to Gfx.WaitForPresent is highlighted in the Profiler's graph.
The Timeline view
Now let's use the Timeline view to learn more about our rendering problem. The Timeline view shows two things: the order in which CPU tasks are executed, and which thread is responsible for which task. We can select the Timeline view from the drop-down menu at the bottom left of the Profiler (the same menu we used to select the Hierarchy view).
Threads allow different tasks to run at the same time. While one thread executes a task, another thread can execute a completely different task. Three types of thread are involved in Unity's rendering process: the main thread, the render thread and worker threads. It is useful to know which thread is responsible for which tasks. Once we know which thread runs the slowest task, we know to focus on optimizing the operations on that thread.
We can zoom in on the Timeline view to look at individual tasks more closely. A function that is called is displayed below the function that called it. In this example, we have zoomed in to see the separate tasks that make up Shadows.RenderJob. We can see that the functions called by Shadows.RenderJob run on the main thread, and we can also see worker threads executing shadow-related tasks. A task named WaitingForJob on the main thread indicates that the main thread is waiting for a worker thread to complete a task. From this we can infer that shadow-related rendering operations take too much time on both the main thread and the worker threads. We now know a good deal about the problem.
Other profilers
Although the CPU Usage profiler is the one we use most often when tracking frame rate problems, the other profilers are also useful. It is a good idea to become familiar with the information they provide.
Try using the other profilers in the way described above, with different views, to learn what information each one provides per frame. For example, try the Rendering profiler to see how the rendering data changes from frame to frame.
Determine the cause of a performance problem
Now that we are familiar with how to read and analyze performance data in the profiler window, we can start looking for the cause of the performance problem.
Exclude the effects of vertical synchronization
Vertical synchronization (VSync) matches the game's frame rate to the refresh rate of the screen. VSync affects the game's frame rate, and its effect is visible in the Profiler window. If we are not sure what we are looking at, the effect of VSync can look like a performance problem, so we should learn how to rule it out before we continue investigating.
Hide VSync information in the CPU Usage profiler
We can choose which categories to hide in the CPU Usage profiler, which lets us ignore information that is irrelevant to the current problem.
To hide VSync information in the CPU Usage profiler, follow these steps:
-Select the CPU Usage profiler.
-On the left side of the CPU Usage profiler, click the yellow square next to the VSync label to hide that category.
Ignore VSync information in the Hierarchy view
There is no way to hide VSync information in the Hierarchy view of the CPU Usage profiler, but we can learn how it appears so that we can ignore it.
When we see a function called WaitForTargetFPS in the Hierarchy view, it means that the game is waiting for VSync. We do not need to investigate this function and can safely ignore it.
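The idea of ignoring WaitForTargetFPS when ranking functions can be sketched as follows. The sample data and the helper busiest_functions are hypothetical, written only for this illustration; the Profiler does not export data in this form by itself.

```python
# Time spent in these functions is waiting for VSync, not real work.
IGNORED = {"WaitForTargetFPS"}


def busiest_functions(samples, ignored=IGNORED):
    """Return (name, ms) pairs sorted slowest first, skipping ignored names."""
    kept = [(name, ms) for name, ms in samples if name not in ignored]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up timings for one frame, for illustration only.
samples = [
    ("WaitForTargetFPS", 9.8),
    ("Camera.Render", 4.1),
    ("BehaviourUpdate", 1.3),
]

if __name__ == "__main__":
    for name, ms in busiest_functions(samples):
        print(f"{name}: {ms} ms")
```

Without the filter, WaitForTargetFPS would head the list and might be mistaken for the cause of the problem.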
Turn off VSync
VSync cannot be turned off on every platform: many platforms (such as iOS) force it on. If we are developing for a platform that does not force VSync, we can turn it off for the whole project while we profile. Open the Quality settings from the menu Edit > Project Settings > Quality, and select Don't Sync in the VSync Count drop-down menu.
Analyzing rendering
Rendering is a common cause of performance problems. Before we try to fix a rendering problem, it is important to establish whether the game is CPU bound or GPU bound, because the two cases are solved in different ways.
Simply put, the CPU is responsible for deciding what needs to be rendered, and the GPU is responsible for rendering it. When a rendering performance problem occurs because the CPU takes too long to render a frame, the game is CPU bound. When it occurs because the GPU takes too long to render a frame, the game is GPU bound.
Identify if the game is GPU bound
The fastest way to identify whether the game is GPU bound is to use the GPU Usage profiler. Unfortunately, not all devices and drivers support this profiler. Before we check whether the game is GPU bound, we need to check whether the GPU Usage profiler is available on our target device.
To check whether the GPU Usage profiler is available, follow these steps:
-Click Add Profiler in the upper left corner of the Profiler window.
-Select GPU from the drop-down menu.
If the target device is not supported, a message saying that GPU profiling is not supported appears on the right.
If we do not see this message, the GPU Usage profiler supports our target device. In that case, we can quickly and easily determine whether the game is GPU bound by following these steps:
-Select the GPU Usage profiler.
-Look at the CPU time and GPU time shown in the middle of the window, below the graph. They show the values for the currently selected frame.
If the GPU time is greater than the CPU time, we can be sure that the game is GPU bound in this frame.
If the GPU Usage profiler is not available, there is still a way to confirm whether the game is GPU bound, using the CPU Usage profiler. If we see the CPU waiting for the GPU to complete its work, the game is GPU bound. We can use the following steps:
-Select the CPU Usage profiler.
-Look at the details of the selected frame in the lower part of the window.
-Select the Hierarchy view.
-Click the Time ms column to sort by time.
If the function Gfx.WaitForPresent takes the most time in the CPU Usage profiler, the CPU is waiting for the GPU. This means that the game is GPU bound.
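The two checks described in this section can be combined into one small decision function. This is a minimal sketch under stated assumptions: is_gpu_bound is a hypothetical helper, the inputs are values we would read off the Profiler window by hand, and the numbers in the usage example are made up.

```python
from typing import Iterable, Optional, Tuple

WAIT_FOR_GPU = "Gfx.WaitForPresent"  # the CPU is waiting for the GPU here


def is_gpu_bound(samples: Iterable[Tuple[str, float]],
                 cpu_ms: Optional[float] = None,
                 gpu_ms: Optional[float] = None) -> bool:
    """Guess whether a frame is GPU bound.

    If both CPU and GPU timings are available, compare them directly.
    Otherwise fall back to checking whether the slowest CPU function
    is the one that waits for the GPU.
    """
    if cpu_ms is not None and gpu_ms is not None:
        return gpu_ms > cpu_ms
    rows = list(samples)
    if not rows:
        raise ValueError("need GPU timings or at least one CPU sample")
    slowest_name, _ = max(rows, key=lambda pair: pair[1])
    return slowest_name == WAIT_FOR_GPU


if __name__ == "__main__":
    # GPU Usage profiler available: compare the two timings.
    print(is_gpu_bound([], cpu_ms=12.0, gpu_ms=30.0))
    # Not available: look at the slowest function in the Hierarchy view.
    print(is_gpu_bound([("Gfx.WaitForPresent", 25.0), ("Camera.Render", 6.0)]))
```

The direct comparison is preferred when it is available, because the fallback is an inference from what the CPU is doing rather than a measurement of the GPU.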
Solving GPU bound rendering problems
If we have determined that the game is GPU bound, read this article to learn how to solve the problem.
Identify if the game is CPU bound
If we have not yet identified the cause of the performance problem, let's now investigate CPU-related rendering problems.
-Select the CPU Usage profiler.
-In the upper part of the Profiler window, look at the data in the colour that represents rendering. We can hide or show the different categories by clicking their coloured squares.
If a large portion of the time in slow frames is spent on rendering, rendering may be causing the problem. We can investigate further by following these steps:
-Select the CPU Usage profiler.
-Look at the details of the selected frame in the lower part of the window.
-Select the Hierarchy view.
-Click the Time ms column to sort the functions by the time they took.
-Click the function at the top of the list.
If the selected function is a rendering function, the CPU Usage profiler highlights it as part of the rendering data. If this is the case, rendering-related operations are causing the performance problem and the game is CPU bound in this frame. Note the name of the function and the thread on which it runs. This information will be useful when we try to fix the problem.
Solving CPU bound rendering problems
If we have determined that the game is CPU bound, read this article to learn how to solve the problem.
Analyzing garbage collection
Next, we check whether garbage collection is causing the performance problem. Garbage collection is part of Unity's automatic memory management, and it can be a slow operation.
-Select the CPU Usage profiler.
-On the left side of the CPU Usage profiler, we can click the coloured squares to show or hide each category, and we can drag the categories to reorder them. In our example, we hide every category except garbage collection and drag the garbage collection category to the top.
If a large portion of the time in a slow frame is spent on garbage collection, we may have a problem with excessive garbage collection. We can analyze the data in more depth to confirm this.
-With the CPU Usage profiler selected, look at the details of the selected frame in the lower part of the window.
-Select the Hierarchy view.
-Click the Time ms column to sort by time.
If the function GC.Collect() appears near the top and takes a lot of CPU time, we can confirm that garbage collection is the problem in our game.
Solving garbage collection problems
If we have identified garbage collection as the problem in our game, read this article to learn how to solve it.
Analyzing physics
If we have ruled out rendering and garbage collection, let's check whether complex physics calculations are the problem.
-Select the CPU Usage profiler.
-On the left side of the CPU Usage profiler, click the coloured squares to show or hide each category. Here we focus on the physics data (orange).
If a large portion of the time in a slow frame is spent on physics, physics may be causing the performance problem. We can analyze the data in more depth to confirm this.
-With the CPU Usage profiler selected, look at the details of the selected frame in the lower part of the window.
-Select the Hierarchy view.
-Click the Time ms column to sort by time.
-Select the function at the top of the list.
If it is a physics function, the physics data is highlighted in the graph at the top of the Profiler. If this is the case, the performance problem is related to physics.
Solving physics problems
If we have determined that the performance problem is caused by physics, the following resources will help us solve it:
-This page of the Unity Manual. Although it is written for iOS developers, some of its physics optimizations apply to all Unity games.
-This tutorial on optimizing physics in a Unity game, which contains some helpful tips.
-The physics section of this Unite talk on optimization, which contains a useful summary of some common physics problems.
Analyzing slow scripts
Now let's check whether slow or overly complex scripts are causing the performance problem. Here, scripts means code that is not part of the Unity engine. This is usually code we wrote ourselves, or code from plugins used in our project.
-Select the CPU Usage profiler.
-On the left side of the CPU Usage profiler, click the coloured squares to show or hide each category. Here we focus on the script data.
If a large portion of the time in a slow frame is spent in scripts, slow user scripts may be causing the performance problem. We can analyze the data in more depth to confirm this.
-With the CPU Usage profiler selected, look at the details of the selected frame in the lower part of the window.
-Select the Hierarchy view.
-Click the Time ms column to sort by time.
-Select the function at the top of the list.
If it is a function from a user script, the script data is highlighted in the graph at the top of the Profiler. If this is the case, the performance problem is related to user scripts.
Note that there is one exception. If our game contains script code related to rendering, such as an image effect script or code in an OnWillRenderObject or OnPreCull function, the rendering data may be highlighted in the graph instead of the script data.
This may be a little confusing at first, but the Hierarchy view and the Timeline view usually make it possible to trace the code that is responsible.
Solving slow script problems
If we have determined that the performance problem is caused by scripts, the following resources on script optimization are recommended:
-This page in the Unity Manual focuses on script optimization for mobile platforms, but its recommendations are useful for all developers.
-This page in the Unity Manual contains suggestions on how to avoid expensive function calls in user scripts.
-This Unite talk on optimization contains a useful summary of some common scripting problems.
Other causes of performance problems
Although we have covered the four most common causes of performance problems, our game may still have a problem that is unrelated to any of them. If so, we should follow the same approach as above: collect data, investigate it with the CPU Usage profiler, and find the function that causes the problem. Once we know the name of that function, we can look it up in the Unity Manual, the Unity Forums or Unity Answers, which may save us time.
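The first step of this approach, finding the category that dominates a slow frame, can be sketched as follows. The helper dominant_category and the timings are hypothetical and made up for this illustration; in practice we would read these values from the CPU Usage profiler by eye.

```python
def dominant_category(category_ms, ignored=("VSync",)):
    """Return (category, share) for the category using the most CPU time.

    Ignored categories, such as time spent waiting for VSync, are left
    out of both the ranking and the total.
    """
    work = {name: ms for name, ms in category_ms.items() if name not in ignored}
    total = sum(work.values())
    if total <= 0:
        raise ValueError("no CPU time recorded outside the ignored categories")
    name = max(work, key=work.get)
    return name, work[name] / total


# Made-up CPU time per category for one slow frame, in milliseconds.
frame = {
    "Rendering": 52.0,
    "Scripts": 8.0,
    "Physics": 4.0,
    "GarbageCollector": 1.0,
    "VSync": 15.0,
    "Others": 5.0,
}

if __name__ == "__main__":
    name, share = dominant_category(frame)
    print(f"{name} uses {share:.0%} of the CPU time that is not VSync")
```

Whichever category comes out on top tells us which of the sections above to follow; if none of the four stands out, the Hierarchy view sorted by time is the next place to look.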
Extended reading
Unity Manual: Execution Order
Unite 2012: Performance Optimization Tips and Tricks for Unity
Unite Europe 2016: Optimizing Mobile Applications
Unity Performance Optimization (2)-Official document Simplified translation