At the beginning of 2015, Google released a series of short videos on Android performance optimization, 16 clips of 3-5 minutes each, to help developers build faster and better Android apps. The course not only explains the underlying causes of performance problems on Android, but also shows how to use tools to identify those problems and gives recommendations for improving performance.
It covers three main areas: the Android rendering mechanism, memory and GC, and battery optimization. Below is a summary of those topics and recommendations.
0) Render Performance
The root cause of most performance problems that users perceive as lag is rendering performance. Designers want the app to use animation, images, and other visual elements to deliver a polished user experience, but the Android system may not be able to finish those complex rendering operations in time. The Android system emits a VSYNC signal every 16ms,
which triggers a render of the UI. If every frame renders in time, the display reaches the 60fps needed for smooth motion, which means most of the program's work must finish within 16ms to sustain 60fps.
If one of your operations takes, say, 24ms, the system cannot render normally when the VSYNC signal arrives, so a frame is dropped and the user sees the same frame for 32ms.
Users notice the stutter most easily when the UI is running animations or scrolling a ListView, because those operations are relatively complex and prone to dropped frames. There are many possible reasons for dropping frames: your layout may be too complex to render within 16ms, too many drawing layers may be stacked in your UI,
or too many animations may be running at once. All of these can overload the CPU or GPU.
There are tools to locate the problem: use Hierarchy Viewer to check whether an activity's layout is too complex, enable Show GPU Overdraw in the developer options of the phone settings, or use TraceView to inspect CPU usage and find performance bottlenecks more quickly.
1) Understanding Overdraw
Overdraw describes a pixel on the screen being drawn more than once within the same frame. In a multi-layered UI, if invisible UI elements are still being drawn, some pixel areas are painted multiple times, wasting a lot of CPU and GPU resources.
When chasing a more ornate visual design, it is easy to fall into the vicious circle of stacking more and more components to achieve the effect, which quickly leads to performance problems. To get the best performance, we have to minimize overdraw.
Fortunately, we can enable the Show GPU Overdraw option in the developer options of the phone settings to visualize overdraw on the UI.
Blue, light green, light red, and deep red represent four increasing levels of overdraw; our goal is to minimize overdraw and see as much blue as possible.
Overdraw is sometimes caused by heavily overlapping parts of your UI layout, and sometimes by unnecessary overlapping backgrounds. For example, an activity may have a background, its layout may set its own background, and child views may each set their own backgrounds on top of that. Simply removing the unnecessary backgrounds reduces the red overdraw areas
and increases the proportion of blue. This measure can significantly improve program performance, as in the sketch below.
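As a minimal, hedged sketch of that fix (the layout name is a hypothetical placeholder, not from the course): when the activity's layout already paints an opaque background of its own, the theme-provided window background is pure overdraw and can be dropped.

```java
import android.app.Activity;
import android.os.Bundle;

public class MainActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Hypothetical layout that already sets its own opaque background.
        setContentView(R.layout.activity_main);

        // The default window background would be drawn first and then fully
        // covered by the layout's background; removing it saves one full-screen
        // layer of drawing per frame.
        getWindow().setBackgroundDrawable(null);
    }
}
```

The same effect can be achieved in the theme by setting android:windowBackground to @null, which avoids the extra layer without any Java code.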
2) Understanding VSYNC
To understand how an app is rendered, we have to understand how the mobile hardware works, and that means understanding what VSYNC is.
Before explaining VSYNC, we need to know two related concepts:
- Refresh rate: the number of times the screen refreshes in one second, a fixed hardware parameter, for example 60Hz.
- Frame rate: the number of frames the GPU draws in one second, for example 30fps or 60fps.
The GPU renders the graphics data, and the display hardware is responsible for presenting the rendered content on the screen; the two continuously cooperate.
Unfortunately, the refresh rate and the frame rate do not always keep the same rhythm. If they are inconsistent, tearing is likely to occur (the upper and lower parts of the screen break apart, showing data from two different frames).
Understanding the double and triple buffering mechanisms used in graphics rendering is more involved; see http://source.android.com/devices/graphics/index.html and article.yeeyan.org/view/37503/304664.
In general, a frame rate higher than the refresh rate is only an ideal case. Above 60fps, the frame data the GPU produces is held back until the next VSYNC, so that every refresh has fresh data to display. The case we encounter far more often is a frame rate lower than the refresh rate.
In that case, some frames show the same content as the previous frame. Worse, when the frame rate suddenly drops from above 60fps to below it, the user perceives lag, jank, or hitching, that is, stutter and dropped frames. This is what makes the experience feel bad.
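If you want to observe dropped frames from code, one hedged option (an illustrative sketch, not something from the course) is to hook Choreographer's frame callback and log frames whose interval exceeds the roughly 16ms budget for 60fps:

```java
import android.util.Log;
import android.view.Choreographer;

// Sketch: log a warning whenever the interval between VSYNC-driven frames
// noticeably exceeds the ~16.6ms budget for 60fps.
public class FrameMonitor implements Choreographer.FrameCallback {
    private long lastFrameTimeNanos = 0;

    // Call from the main thread (Choreographer needs a Looper thread).
    public void start() {
        Choreographer.getInstance().postFrameCallback(this);
    }

    @Override
    public void doFrame(long frameTimeNanos) {
        if (lastFrameTimeNanos != 0) {
            long intervalMs = (frameTimeNanos - lastFrameTimeNanos) / 1_000_000;
            if (intervalMs > 16) {
                Log.w("FrameMonitor", "Dropped frame(s), interval = " + intervalMs + "ms");
            }
        }
        lastFrameTimeNanos = frameTimeNanos;
        Choreographer.getInstance().postFrameCallback(this); // keep listening
    }
}
```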
3) Tool: Profile GPU Rendering
Performance problems are troublesome, but fortunately there are tools to debug them. Open the developer options on your phone, select Profile GPU Rendering, and choose the On screen as bars option.
Once this is enabled, we can see detailed GPU rendering information on the phone screen for the status bar, the nav bar, and the currently active program's activity area.
As the interface refreshes, vertical bars scroll across the screen, each indicating the time required to render one frame; the taller the bar, the longer the frame took to render.
There is a green horizontal line representing 16ms; we need to keep the total time spent on each frame below that line to avoid stuttering.
Each bar has three parts: blue represents the time spent creating and updating the display list, red represents the time it takes OpenGL to render the display list, and yellow represents the time the CPU spends waiting for the GPU.
4) Why 60fps?
We usually talk about 60fps and 16ms, but do you know why reaching 60fps is the yardstick for app performance? It is because the collaboration between the human eye and brain cannot perceive screen updates beyond about 60fps.
12fps is roughly the frame rate of flipping a book by hand, and it is clearly perceived as not smooth. 24fps lets the human eye perceive continuous, linear motion, which is largely due to the effect of motion blur. 24fps is the frame rate commonly used in film, because it is enough to convey what most movie footage needs to express while minimizing cost.
But below 30fps it is impossible to smoothly present richer visual content; 60fps is needed to achieve that, and going beyond 60fps is unnecessary.
The performance goal for an app is to maintain 60fps, which means you have only 16ms (1000ms / 60 frames) per frame to handle all the work.
5) Android, UI and the GPU
Understanding how Android uses the GPU for screen rendering helps us better understand performance issues. A very practical question is: how does an activity's picture actually get onto the screen? How are complex XML layout files recognized and drawn?
Rasterization is the most basic operation for drawing components such as buttons, shapes, paths, strings, and bitmaps: it splits those components into individual pixels for display. This is a time-consuming operation, and the GPU was introduced precisely to speed up rasterization.
The CPU is responsible for turning UI components into polygons and textures and then handing them to the GPU for rasterized rendering.
However, handing data from the CPU to the GPU each time is troublesome. Fortunately, OpenGL ES can hold the textures that need to be rendered in GPU memory and reuse them directly the next time they are rendered. The flip side is that if you update the texture content held by the GPU, the previously saved state is lost.
In Android, the resources provided by the theme, such as bitmaps and drawables, are packed together into a single texture and uploaded to the GPU, which means that each time those resources are needed they are rendered directly from that texture. Of course, as UI components grow richer, ever more elaborate forms keep appearing.
For example, displaying an image requires the CPU to compute and load it into memory before passing it to the GPU for rendering. Displaying text is more complex: the CPU first rasterizes the characters into a texture and hands it to the GPU, then the CPU places the individual characters while referencing the GPU-rendered content. Animation is an even more complex process.
To keep the app smooth, all of the CPU and GPU computation, drawing, and rendering for each frame must finish within 16ms.
6) Invalidations, Layouts, and Performance
Smooth, subtle animations are one of the most important elements of app design, and they can dramatically improve the user experience. The following explains how the Android system handles update operations for UI components.
Normally, Android needs to convert an XML layout file into objects the GPU can recognize and draw. This is done with the help of a DisplayList, which holds all the data that will be handed to the GPU to draw onto the screen.
A DisplayList is created the first time a view needs to be rendered; when the view is displayed on the screen, the GPU's drawing instructions are executed to render it. If you later need to render the view again, for example because it moved, you only need to issue one additional render instruction.
However, if you modify the visible content of the view, the previous DisplayList can no longer be used; we have to go back, recreate the DisplayList, re-execute the rendering instructions, and update the screen.
Note that whenever the drawing content of a view changes, the whole sequence of creating the DisplayList, rendering it, and updating the screen is executed again. The cost of this process depends on the complexity of the view, its state changes, and the performance of the rendering pipeline. For example, suppose a button needs to become twice its current size: before the button grows, the parent view has to recalculate and re-place the other child views. Changing a view's size triggers a re-measure of the entire view hierarchy, and changing a view's position triggers the hierarchy to recompute the positions of other views. If the layout is complex, this easily leads to serious performance problems.
We need to minimize overdraw.
We can check rendering performance with the Profile GPU Rendering tool described earlier, or watch view updates via Show GPU view updates in the developer options. Finally, we can use the Hierarchy Viewer tool to inspect the layout, keep it as flat as possible, and remove UI components that are not necessary.
These measures reduce the time spent in measure and layout, as in the sketch below.
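As a hedged illustration of why this matters (a sketch with hypothetical view usage, not code from the course): a change that only affects what a view draws triggers invalidate(), which rebuilds just that view's display list, while a change that affects its size forces requestLayout(), which re-measures and re-lays-out the hierarchy above it.

```java
import android.view.View;
import android.view.ViewGroup;

// Sketch: prefer updates that only need invalidate() over updates that force
// a full requestLayout() pass when animating or refreshing a view.
public final class UpdateExamples {

    // Only the drawing content changes: the view's display list is rebuilt,
    // but no measure/layout pass is needed.
    static void recolor(View badge, int color) {
        badge.setBackgroundColor(color);   // internally triggers invalidate()
    }

    // The size changes: the parent must re-measure and re-position its
    // children, which is far more expensive on a deep, complex hierarchy.
    // (Assumes explicit pixel sizes in the LayoutParams.)
    static void grow(View button) {
        ViewGroup.LayoutParams lp = button.getLayoutParams();
        lp.width *= 2;
        lp.height *= 2;
        button.setLayoutParams(lp);        // internally triggers requestLayout()
    }
}
```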
7) Overdraw, ClipRect, QuickReject
One of the main sources of performance problems is overly complex drawing operations. We can use tools to detect and fix overdraw in standard UI components, but heavily customized UI components are harder to handle.
One trick is that we can significantly improve drawing performance by calling a few canvas API methods. As mentioned earlier, drawing UI components that are not visible causes overdraw. For example, after a nav drawer slides out over the previously visible activity, continuing to draw the UI components hidden underneath it leads to overdraw.
To solve this, the Android system minimizes overdraw by skipping components that are completely invisible; views hidden under the nav drawer will not waste resources being drawn.
Unfortunately, for overly complex custom views (those that override onDraw), the Android system cannot detect what happens inside onDraw, so it cannot monitor and optimize automatically, and overdraw cannot be avoided for us. But we can use canvas.clipRect() to help the system identify the visible area.
This method lets you specify a rectangular area; drawing happens only inside that area, and everything outside it is ignored. This API is very helpful for custom views with multiple overlapping components, letting you control which region is displayed. clipRect also helps conserve CPU and GPU resources, because drawing instructions outside the clipRect area are not executed,
while components that are partially inside the rectangle are still drawn.
Besides clipRect, we can use canvas.quickReject() to test whether a rectangle does not intersect the clip region, and skip the drawing operations for areas outside it (see the sketch below). With these optimizations in place, we can verify the result with the Show GPU Overdraw option described above.
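A hedged sketch of what that looks like in a custom view (the overlapping-card drawing and the fields are hypothetical, not from the course):

```java
import android.content.Context;
import android.graphics.Bitmap;
import android.graphics.Canvas;
import android.view.View;

// Sketch: a custom view drawing overlapping cards. Each card clips the canvas
// to its visible strip, and cards entirely outside the clip are skipped.
public class CardStackView extends View {
    private Bitmap[] cards;        // hypothetical pre-decoded card bitmaps
    private float overlapOffset;   // horizontal offset between adjacent cards

    public CardStackView(Context context) {
        super(context);
    }

    @Override
    protected void onDraw(Canvas canvas) {
        super.onDraw(canvas);
        if (cards == null) return;

        for (int i = 0; i < cards.length; i++) {
            float left = i * overlapOffset;
            boolean topCard = (i == cards.length - 1);
            float right = topCard ? left + cards[i].getWidth() : left + overlapOffset;

            // Skip cards whose visible strip lies entirely outside the clip.
            if (canvas.quickReject(left, 0, right, getHeight(), Canvas.EdgeType.BW)) {
                continue;
            }

            canvas.save();
            // Restrict drawing to the strip of this card that is actually
            // visible, so the covered portion does not contribute to overdraw.
            canvas.clipRect(left, 0, right, getHeight());
            canvas.drawBitmap(cards[i], left, 0, null);
            canvas.restore();
        }
    }
}
```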
8) Memory Churn and Performance
Although Android has automatic memory management, improper memory usage can still cause serious performance problems. Creating too many objects within a single frame deserves special attention.
The Android system uses a generational heap memory model, performing different GC operations for different kinds of data in memory. For example, recently allocated objects are placed in the Young Generation area, where objects are typically created quickly and destroyed quickly,
and GC in this area is also faster than GC in the Old Generation area.
Apart from the speed difference, while a GC operation is running, every action on every thread has to pause; the other operations resume only after the GC finishes.
In general a single GC does not take long, but a large number of back-to-back GC operations can eat noticeably into the frame interval (16ms). The more time GC takes within the frame interval, the less time is left for other work such as computation and rendering.
There are two common reasons a GC runs frequently:
- Memory churn, caused by a large number of objects being created and released in a short period of time.
- Allocating many objects in an instant, which quickly fills the Young Generation area; when the remaining space falls below the threshold, a GC is triggered. Even if each allocated object consumes little memory, piling them up increases the pressure on the heap and triggers more GC of other types. These operations can affect the frame rate and make the user notice performance problems.
There is a simple, intuitive way to detect the problem: if Memory Monitor shows that RAM usage rises and falls repeatedly within a short time, memory churn is very likely occurring.
We can also use Allocation Tracker to check whether the same kind of object is constantly allocated and released at the same call stack within a short period; that is one of the typical signals of memory churn.
Once the problem is roughly located, the fix is relatively straightforward. For example, avoid allocating objects inside for loops and move object creation outside the loop body. The onDraw method of a custom view also needs attention, because it is called every time the screen is drawn and every time an animation runs.
Avoid complex operations and object creation inside onDraw. Where object creation really cannot be avoided, consider an object pool to avoid frequent creation and destruction, but remember to release objects back into the pool manually when you are done with them. A sketch follows.
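Here is a hedged sketch (the view and its fields are hypothetical) of moving allocations out of onDraw so they happen once instead of on every frame:

```java
import android.content.Context;
import android.graphics.Canvas;
import android.graphics.Color;
import android.graphics.Paint;
import android.graphics.RectF;
import android.view.View;

// Sketch: allocate the Paint and RectF once, instead of on every frame inside
// onDraw, to avoid memory churn while the view is animating or redrawing.
public class BadgeView extends View {
    private final Paint paint = new Paint(Paint.ANTI_ALIAS_FLAG);
    private final RectF bounds = new RectF();

    public BadgeView(Context context) {
        super(context);
        paint.setColor(Color.RED);
    }

    @Override
    protected void onDraw(Canvas canvas) {
        super.onDraw(canvas);
        // BAD (churn): calling new Paint() or new RectF() here would allocate
        // on every frame. Instead, reuse the preallocated instances.
        bounds.set(0, 0, getWidth(), getHeight());
        canvas.drawOval(bounds, paint);
    }
}
```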
9) Garbage Collection in Android
The JVM's garbage collection mechanism is a great benefit to developers: instead of dealing with object allocation and deallocation all the time, you can focus on higher-level code. Compared with Java, languages such as C and C++ offer higher execution efficiency, but they require developers to manage allocation and deallocation themselves, and in a large system
it is almost inevitable that some objects are forgotten and never freed; that is a memory leak.
The original JVM GC mechanism has been substantially optimized in Android. Android uses a three-level generational memory model: the most recently allocated objects are stored in the Young Generation area; when an object stays there for a certain time it is moved to the Old Generation, and finally to the Permanent Generation area.
Each level of memory has a fixed size. As new objects keep being allocated into an area and their total size approaches that area's threshold, a GC operation is triggered to make room for other new objects.
As mentioned earlier, every GC pauses all threads. The time a GC takes also depends on which generation it runs on: Young Generation GC is the fastest, Old Generation is slower, and Permanent Generation is the slowest. The duration also depends on the number of objects in that generation:
traversing 20,000 objects is much slower than traversing 50.
While Google's engineers keep working to shorten each GC, GC-related performance problems still deserve particular attention. If you accidentally create objects inside the innermost body of a for loop, it is easy to trigger GC and cause performance problems. With Memory Monitor we can watch the app's memory footprint:
each sudden drop in memory is a GC operation, and if memory rises and falls a lot in a short period, there is likely a performance problem lurking. We can also use the Heap and Allocation Tracker tools to see which objects are being allocated in memory at that moment.
10) Performance Cost of Memory Leaks
Although Java has automatic garbage collection, that does not mean there are no memory leaks in Java, and memory leaks can easily lead to serious performance problems.
A memory leak means that an object the program no longer uses cannot be reclaimed by the GC, so it stays in memory and occupies valuable space. This also leaves each generation's memory area with less usable space, makes GC trigger more easily, and thus causes performance problems.
Finding and fixing a memory leak is tricky: you need to be familiar with the code being executed, know exactly how it behaves in a given scenario, and then troubleshoot carefully. For example, how do you find out whether an activity fully releases the memory it used when it exits?
First, use the Heap tool to capture a memory snapshot while the activity is in the foreground. Then create an essentially blank activity, which occupies almost no memory, and navigate from the original activity to it. While jumping to the blank activity, call System.gc() to make sure a GC operation is triggered.
Finally, if the previous activity's memory was properly freed, the memory snapshot taken after the blank activity starts should contain no objects from the previous activity.
If you find suspicious unreleased objects in the blank activity's memory snapshot, use the Allocation Tracker tool to look into those specific objects: start tracking from the blank activity, launch the activity under observation, then return to the blank activity and stop tracking.
After that we can examine those objects closely and find the real culprit behind the memory leak. A hedged example of a common culprit follows.
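As an illustrative, hedged example (not from the course) of a pattern that often shows up in such snapshots: a static field holding a reference to an Activity prevents the whole Activity, and everything it references, from ever being collected.

```java
import android.app.Activity;
import android.content.Context;
import android.os.Bundle;

// Sketch: a classic leak pattern, holding an Activity in a static field.
public class LeakyActivity extends Activity {
    // BAD: this static reference outlives the Activity, so the GC can never
    // reclaim the Activity or the view hierarchy it holds.
    private static Context leakedContext;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        leakedContext = this;  // leaks this Activity instance

        // Safer alternative if a static reference is truly needed:
        // leakedContext = getApplicationContext();
    }
}
```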
11) Memory Performance
Generally, Android has made many optimizations to GC; although a GC operation pauses other tasks, in most cases GC is fairly quiet and efficient. However, if our memory usage is inappropriate and causes GC to run frequently, it can cause performance problems.
To find memory performance problems, Android Studio provides tools to help developers:
- Memory Monitor: shows the whole app's RAM usage over time and when GC happens. Many GC operations in a short period are a danger signal.
- Allocation Tracker: use this tool to track memory allocations, as mentioned earlier.
- Heap tool: view the current memory snapshot to compare and analyze which objects may have leaked; see the earlier example.
12) Tool: Memory Monitor
The Memory Monitor in Android Studio is a great help for seeing how our program uses memory.
13) Battery Performance
Battery power is one of the most valuable resources on a handheld device; most devices need constant recharging to stay usable. Unfortunately, for developers, battery optimization is usually the last thing considered. But one thing is certain: you must not let your application become a major drain on the battery.
Purdue University studied the power consumption of some of the most popular applications and found that, on average, only about 30% of the power was used by the program's core functions, such as drawing pictures and arranging layouts, while the remaining roughly 70% was spent reporting data, checking location information, and fetching background advertising information.
Balancing these two kinds of consumption is very important.
There are several measures that can significantly reduce power consumption:
- Reduce the number and duration of screen wake-ups as much as possible; use WakeLock to handle wake-ups correctly, and make sure the device goes back to sleep on schedule.
- Operations that do not have to run immediately, such as uploading songs or processing pictures, can wait until the device is charging or the battery is sufficiently charged.
- Every network request keeps the radio powered for a while afterwards; we can batch scattered network requests into one operation to avoid the power cost of waking the radio too often. On the power consumed by the radio for network requests, see http://hukai.me/android-training-course-in-chinese/connectivity/efficient-downloads/efficient-network-access.html
We can find each app's power consumption statistics in the phone settings. We can also view detailed power consumption through the Battery Historian tool.
If we find that our app consumes too much power, we can use the JobScheduler API to batch and schedule tasks: for example, defer heavy operations until the phone is charging or connected to WiFi, as in the sketch below.
For more information about JobScheduler, refer to http://hukai.me/android-training-course-in-chinese/background-jobs/scheduling/index.html
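A hedged sketch of scheduling such a deferred task with JobScheduler (the job ID, service name, and constraints are illustrative assumptions; the service must also be declared in the manifest with the BIND_JOB_SERVICE permission):

```java
import android.app.job.JobInfo;
import android.app.job.JobParameters;
import android.app.job.JobScheduler;
import android.app.job.JobService;
import android.content.ComponentName;
import android.content.Context;

// Sketch: defer a heavy task until the device is charging and on an unmetered
// network (typically WiFi), letting the system pick the exact execution time.
public class UploadJobService extends JobService {

    @Override
    public boolean onStartJob(JobParameters params) {
        // For brevity the work runs inline; real work should move to a
        // background thread, return true here, and call jobFinished() when done.
        doUpload();
        return false;  // false = all work finished inside onStartJob
    }

    @Override
    public boolean onStopJob(JobParameters params) {
        return false;  // do not reschedule if the job is stopped early
    }

    private void doUpload() {
        // ... hypothetical deferred upload ...
    }

    // Schedule the job with its execution constraints.
    public static void scheduleUpload(Context context) {
        JobInfo job = new JobInfo.Builder(42 /* hypothetical job id */,
                new ComponentName(context, UploadJobService.class))
                .setRequiresCharging(true)
                .setRequiredNetworkType(JobInfo.NETWORK_TYPE_UNMETERED)
                .build();

        JobScheduler scheduler =
                (JobScheduler) context.getSystemService(Context.JOB_SCHEDULER_SERVICE);
        scheduler.schedule(job);
    }
}
```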
14) Understanding Battery Drain on Android
Computing and recording power consumption is a troublesome, somewhat contradictory task, because measuring power consumption itself consumes power. The only viable approach is to use external third-party monitoring hardware to obtain the real power consumption.
When the device is on standby, power consumption is very small: taking the Nexus 5 as an example, with airplane mode on it can stand by for nearly a month. But once the screen lights up, many hardware modules have to start working, which consumes a lot of power.
After using WakeLock or JobScheduler to wake the device for a scheduled task, make sure the device returns to its original sleeping state in time. Every time the radio is woken up for data transfer, it consumes a lot of power, more than WiFi and other operations.
For details, see http://hukai.me/android-training-course-in-chinese/connectivity/efficient-downloads/efficient-network-access.html
Fixing power consumption problems is another big topic, which we will not expand on here.
15) Battery Drain and WakeLocks
Keeping the battery running as long as possible while encouraging users to use your app, which consumes power, is a contradictory choice, but there are ways to balance the two.
Suppose your phone contains a lot of social apps; even when the phone is on standby, it is often woken up by these apps to check and sync new data. Android keeps turning off various hardware to extend standby time: first the screen gradually dims and turns off, then the CPU goes to sleep, all to save precious power.
But even in this sleep state, most applications still try to work and keep waking the phone up. The simplest way to wake the phone is the PowerManager.WakeLock API, which keeps the CPU working and prevents the screen from dimming and turning off. It allows the phone to be woken up, do its work, and then go back to sleep.
Acquiring a WakeLock is simple, but releasing it in time is just as important, and improper use of WakeLocks can lead to serious mistakes. For example, a network request whose response time is uncertain might make a task that should take 10 seconds hold the lock for an hour, wasting power the whole time.
This is why it is critical to use the WakeLock.acquire() overload that takes a timeout parameter. But just setting a timeout is not enough to solve the whole problem: how long should the timeout be? When should we retry? A sketch of the timeout form follows.
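A hedged sketch of holding a partial WakeLock with a timeout while doing short background work (the tag, timeout, and helper are illustrative assumptions; acquiring a WakeLock also requires the WAKE_LOCK permission):

```java
import android.content.Context;
import android.os.PowerManager;

// Sketch: keep the CPU awake for a short, bounded piece of work.
// Requires <uses-permission android:name="android.permission.WAKE_LOCK"/>.
public final class WakefulWork {

    public static void runBriefly(Context context, Runnable work) {
        PowerManager pm = (PowerManager) context.getSystemService(Context.POWER_SERVICE);
        PowerManager.WakeLock wakeLock =
                pm.newWakeLock(PowerManager.PARTIAL_WAKE_LOCK, "example:briefWork");

        // The timeout is a safety net: the lock auto-releases after 10 seconds
        // even if release() is never reached (e.g. a hung network call).
        wakeLock.acquire(10_000);
        try {
            work.run();
        } finally {
            if (wakeLock.isHeld()) {
                wakeLock.release();  // always release as soon as the work is done
            }
        }
    }
}
```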
To address those remaining questions, a better approach may be to use inexact timers. Normally we set an exact time for something to happen, but letting the system adjust that time dynamically may be better. For example, if another program needs to wake the device five minutes after your scheduled time, it is best to wait until then and execute the two tasks bundled together;
this is the core principle of inexact timers. We can schedule tasks ourselves, but if the system detects a better time, it may postpone your task to save power.
This is exactly what the JobScheduler API does: it picks an ideal wake-up time based on the current situation and the task, for example when the device is charging or connected to WiFi, or when tasks can be batched together. With this API we can implement many flexible scheduling strategies.
The Battery Historian tool, available since Android 5.0, can be used to see how often the program wakes the device, who woke it, and how long each wake-up lasted.
Keep an eye on your program's power consumption: users can see the big spenders in the phone's settings and may decide to uninstall them. So minimize your program's power consumption as much as possible.
Video playlist on YouTube: https://www.youtube.com/playlist?list=PLWz5rJ2EKKc9CBxr3BVjPTPoDPLdPIFCE
Article source: http://hukai.me/android-performance-patterns/
Thanks to the original author for the excellent translation.
[Android Pro] Android Performance Patterns, Season 1