When an Android project has a particularly large user base, small problems are magnified, and ANR is a typical example.
Some ANRs occur only in real-world use, for example when system resources are tight or in other special situations. These ANRs are caused by unreasonable code on our side, so we need to locate and fix them, and avoid the same mistakes in future code design.
My recent work has focused on the large number of ANR logs automatically reported by this project's users. Although there are already many ANR articles online, here is a summary of my own.
Outline
One. When does an ANR occur
Two. How the ANR mechanism is implemented
Three. How to analyze the ANR problem
Four. How to reduce the probability of ANR
One. When does an ANR occur:
ANR (Application Not Responding). In Android, an ANR is raised when the main thread (UI thread) does not finish a piece of work within the specified time.
Specifically, ANR occurs in the following cases:
- Input events (key presses and touch events) are not processed within 5s: Input dispatching timed out
- A BroadcastReceiver event (the onReceive() method) is not processed within the specified time (10s for a foreground broadcast, 60s for a background broadcast): Timeout of broadcast BroadcastRecord
07-27 19:18:47.448 1707 1766 W BroadcastQueue: Receiver during timeout: ResolveInfo{ccd831e com.example.qintong.myapplication/.Mybroadcastreciever m=0x108000}
07-27 19:18:47.502 3513 3728 I wteventcontroller:anr com.example.qintong.myapplication 7573
- A Service does not finish starting within 20s (foreground) or 200s (background): Timeout executing service
- A ContentProvider's publish does not complete within 10s: timeout publishing content providers
The Android documentation (https://developer.android.com/training/articles/perf-anr.html) describes only the first and second cases, but according to the source code and actual experiments, Service startup and ContentProvider publish can also cause ANRs.
It is important to note that the limits in the latter three cases hold only if the first kind of ANR does not fire first. Taking BroadcastReceiver as an example, onReceive() has its full 10 seconds only if no input-dispatch timeout occurs during that time (that is, there are no input events, or the pending input events have not yet waited 5s). Otherwise the input ANR is raised first, so onReceive() can trigger an ANR well before 10s have passed. For this reason, do not do time-consuming work in onReceive(). The same applies to Service.onCreate() and ContentProvider.onCreate(): they also run on the main thread, so do not do time-consuming work there either. This is elaborated at the end of this article.
Two. How the ANR mechanism is implemented:
The article at http://gityuan.com/2016/07/02/android-anr/ analyzes the implementation of the ANR mechanism in detail from the source code. For each of the four cases in the previous section we traced how it is implemented in the source, and the approximate principle is the same in each:
1. Before the operation, call Handler.sendMessageAtTime() to send an ANR message whose delay is the ANR timeout (for a foreground service, 20s after the current time).
2. Perform the operation.
3. Remove the message when the operation finishes.
If the operation is not completed within the specified time, the message is taken out of the queue and executed by the Handler, and the ANR occurs.
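The three steps can be sketched in plain Java. This is only an illustration of the pattern, not the framework code: a ScheduledExecutorService stands in for the Handler and its delayed message, and the names TimeoutWatchdog and runWithTimeout are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Plain-Java stand-in for the Handler-based timeout; all names are hypothetical. */
final class TimeoutWatchdog {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    /**
     * Runs the work on the calling thread and reports whether the
     * timeout fired before the work finished.
     */
    boolean runWithTimeout(Runnable work, long timeoutMillis) {
        AtomicBoolean timedOut = new AtomicBoolean(false);
        // Step 1: schedule the delayed "timeout message".
        ScheduledFuture<?> pending = timer.schedule(
                () -> timedOut.set(true), timeoutMillis, TimeUnit.MILLISECONDS);
        try {
            // Step 2: perform the monitored operation.
            work.run();
        } finally {
            // Step 3: remove the message; this is a no-op if it already fired.
            pending.cancel(false);
        }
        return timedOut.get();
    }

    void shutdown() {
        timer.shutdownNow();
    }
}
```

In the real system the timeout handler reports the ANR rather than setting a flag, but the ordering of schedule, work, and cancel is the same idea.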
The following uses BroadcastReceiver as an example to describe this in detail:
BroadcastQueue.processNextBroadcast():
final void processNextBroadcast(boolean fromMsg) {
    ...
    synchronized (mService) {
        ...
        do {
            if (mOrderedBroadcasts.size() == 0) {
                ...
            }
            ...
            if (mService.mProcessesReady && r.dispatchTime > 0) {
                long now = SystemClock.uptimeMillis();
                if ((numReceivers > 0) &&
                        (now > r.dispatchTime + (2 * mTimeoutPeriod * numReceivers))) {
                    // 1. Send the delayed message
                    broadcastTimeoutLocked(false); // forcibly finish this broadcast
                    forceReceive = true;
                    r.state = BroadcastRecord.IDLE;
                }
            }
            if (r.state != BroadcastRecord.IDLE) {
                if (DEBUG_BROADCAST) Slog.d(TAG,
                        "processNextBroadcast(" + mQueueName
                        + ") called when not idle (state=" + r.state + ")");
                return;
            }
            if (r.receivers == null || r.nextReceiver >= numReceivers
                    || r.resultAbort || forceReceive) {
                // No more receivers for this broadcast! Send the final
                // result if requested...
                if (r.resultTo != null) {
                    try {
                        // 2. Process the broadcast message
                        performReceiveLocked(r.callerApp, r.resultTo,
                                new Intent(r.intent), r.resultCode,
                                r.resultData, r.resultExtras, false, false, r.userId);
                        // Set this to null so that the reference
                        // (local and remote) isn't kept in the mBroadcastHistory.
                        r.resultTo = null;
                    } catch (RemoteException e) {
                        ...
                    }
                }
                // 3. Cancel the delayed message
                cancelBroadcastTimeoutLocked();
                ...
            }
        } while (r == null);
        ...
    }
}
1. Send the delayed message, broadcastTimeoutLocked(false):
final void broadcastTimeoutLocked(boolean fromMsg) {
    ...
    long now = SystemClock.uptimeMillis();
    if (fromMsg) {
        if (mService.mDidDexOpt) {
            // Delay timeouts until dexopt finishes.
            mService.mDidDexOpt = false;
            long timeoutTime = SystemClock.uptimeMillis() + mTimeoutPeriod;
            setBroadcastTimeoutLocked(timeoutTime);
            return;
        }
        if (!mService.mProcessesReady) {
            return;
        }
        long timeoutTime = r.receiverTime + mTimeoutPeriod;
        if (timeoutTime > now) {
            setBroadcastTimeoutLocked(timeoutTime);
            return;
        }
    }
    ...
}
It calls setBroadcastTimeoutLocked(long timeoutTime):
final void setBroadcastTimeoutLocked(long timeoutTime) {
    if (!mPendingBroadcastTimeoutMessage) {
        Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
        mHandler.sendMessageAtTime(msg, timeoutTime);
        mPendingBroadcastTimeoutMessage = true;
    }
}
The time passed to setBroadcastTimeoutLocked(long timeoutTime) has the form xxx + mTimeoutPeriod, where mTimeoutPeriod is how long onReceive() is allowed to run. When the BroadcastQueue instances are initialized, the foreground queue gets 10s and the background queue gets 60s:
ActivityManagerService.java:
static final int BROADCAST_FG_TIMEOUT = 10 * 1000;
static final int BROADCAST_BG_TIMEOUT = 60 * 1000;
...
public ActivityManagerService(...) {
    ...
    mFgBroadcastQueue = new BroadcastQueue(this, mHandler,
            "foreground", BROADCAST_FG_TIMEOUT, false);
    mBgBroadcastQueue = new BroadcastQueue(this, mHandler,
            "background", BROADCAST_BG_TIMEOUT, true);
    ...
}
2. performReceiveLocked() does the actual processing of the broadcast and is not expanded here.
3. cancelBroadcastTimeoutLocked():
The main job of this method is to remove the broadcast timeout message BROADCAST_TIMEOUT_MSG once the broadcast has been handled.
final void cancelBroadcastTimeoutLocked() {
    if (mPendingBroadcastTimeoutMessage) {
        mHandler.removeMessages(BROADCAST_TIMEOUT_MSG, this);
        mPendingBroadcastTimeoutMessage = false;
    }
}
Three. How to analyze the ANR problem:
As the previous section shows, an ANR happens because the main thread did not finish a task within the specified time. The reasons fall roughly into the following categories:
- The main thread is doing time-consuming work.
- The main thread is blocked waiting for a lock held by another thread.
- The CPU is occupied by other processes, and this process is not allocated enough CPU time.
The key to analyzing an ANR is to determine which of these cases applies. So, given an ANR log, how should you analyze it?
When an ANR occurs, the system collects ANR-related information for developers: first the ANR entry in the log, then the CPU usage around the time of the ANR, and finally the trace information, that is, what each thread was executing at that time. The trace file is saved to /data/anr/traces.txt. The log that the process itself printed before and after the ANR is also of some value. In general, you can analyze along the following lines:
Find the ANR entry in the log: search the log for "ANR in" or "am_anr" and you will find the entry recording the ANR, which contains the time, the process, and the kind of ANR. If it is a BroadcastReceiver ANR, suspect BroadcastReceiver.onReceive(). If it is a Service or ContentProvider ANR, suspect its onCreate().
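As a small illustration of that first step, the following sketch filters log lines for the two markers. The class and method names are hypothetical, and the assumption that the process name is the first token after "ANR in " should be checked against the logs of the Android version in question.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for locating ANR entries in a captured log. */
final class AnrLogScanner {
    private static final String MARKER = "ANR in ";

    /** Returns the lines that contain either of the two search markers. */
    static List<String> findAnrLines(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (line.contains(MARKER.trim()) || line.contains("am_anr")) {
                hits.add(line);
            }
        }
        return hits;
    }

    /**
     * Returns the first token after "ANR in ", assumed here to be the
     * process name, or null if the marker is absent.
     */
    static String processAfterMarker(String line) {
        int i = line.indexOf(MARKER);
        if (i < 0) {
            return null;
        }
        String rest = line.substring(i + MARKER.length()).trim();
        int space = rest.indexOf(' ');
        return space < 0 ? rest : rest.substring(0, space);
    }
}
```

The same filtering can of course be done with a text editor or grep. The point is only that these two markers are the entry points into the log.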
Following the ANR entry, the log contains CPU usage information for the period around the ANR (the log states the time window that was sampled). Several conclusions can be drawn from it:
(1). If some other processes have very high CPU usage and consume almost all CPU resources, while the process that hit the ANR is at 0% or very low, the CPU was probably taken by others and the process did not get enough CPU time, which led to the ANR. This can mostly be regarded as a problem of system state rather than one caused by this application.
(2). If the process that hit the ANR has very high CPU usage, such as 80% or above 90%, suspect that some code in the application is consuming CPU unreasonably, for example an infinite loop or many background threads running tasks. This needs further analysis together with the trace and the log before and after the ANR.
(3). If total CPU usage is not high and neither this process nor the others use much CPU, there is a fair chance that some main-thread operation took too long, or that the main thread was blocked on a lock.
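The three cases can be written down as a rough triage rule. This is a sketch only: the 80% threshold is taken from case (2), the 5% "very low" threshold is an assumption, the inputs are assumed to be normalized to 0-100, and the names CpuTriage and Verdict are hypothetical.

```java
/** Hypothetical first-pass triage of the CPU usage section of an ANR log. */
final class CpuTriage {
    enum Verdict {
        STARVED_BY_OTHER_PROCESSES,   // case (1)
        APP_CONSUMING_CPU,            // case (2)
        LIKELY_SLOW_CALL_OR_BLOCKED,  // case (3)
        INCONCLUSIVE
    }

    static final double HIGH_PERCENT = 80.0; // from the text above
    static final double LOW_PERCENT = 5.0;   // assumed meaning of "very low"

    /** Both arguments are percentages normalized to the range 0-100. */
    static Verdict classify(double appCpuPercent, double totalCpuPercent) {
        if (appCpuPercent >= HIGH_PERCENT) {
            return Verdict.APP_CONSUMING_CPU;
        }
        if (totalCpuPercent >= HIGH_PERCENT && appCpuPercent <= LOW_PERCENT) {
            return Verdict.STARVED_BY_OTHER_PROCESSES;
        }
        if (totalCpuPercent < HIGH_PERCENT) {
            return Verdict.LIKELY_SLOW_CALL_OR_BLOCKED;
        }
        return Verdict.INCONCLUSIVE;
    }
}
```

A verdict from a rule like this is only a starting hypothesis. Everything except the first case still has to be confirmed against the trace file.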
Except in scenario (1) above, after analyzing CPU usage we need the trace file to pin down the problem. The trace file records the stack of every thread of the process around the time of the ANR. The most valuable part for ANR analysis is the stack of the main thread, which generally falls into one of the following situations:
(1). The main thread is in the Runnable or Native state and its stack is inside a function of our application: most likely the timeout occurred while executing that function.
(2). The main thread is Blocked: the thread is clearly waiting on a lock, and the trace shows which thread holds it, so you can consider optimizing that code. If it is a deadlock, it needs to be fixed promptly.
(3). The time-consuming operation may already have finished by the time the trace is captured (ANR, then the time-consuming operation completes, then the system captures the trace). In that case the trace is of no use, and the main thread's stack looks like this:
"main" prio=5 tid=1 Native
  | group="main" sCount=1 dsCount=0 obj=0x757855c8 self=0xb4d76500
  | sysTid=3276 nice=0 cgrp=default sched=0/0 handle=0xb6ff5b34
  | state=S schedstat=( 50540218363 186568972172 209049 ) utm=3290 stm=1764 core=3 HZ=100
  | stack=0xbe307000-0xbe309000 stackSize=8MB
  | held mutexes=
  kernel: (couldn't read /proc/self/task/3276/stack)
  native: #00 pc 0004099c  /system/lib/libc.so (__epoll_pwait+20)
  native: #01 pc 00019f63  /system/lib/libc.so (epoll_pwait+26)
  native: #02 pc 00019f71  /system/lib/libc.so (epoll_wait+6)
  native: #03 pc 00012ce7  /system/lib/libutils.so (_ZN7android6Looper9pollInnerEi+102)
  native: #04 pc 00012f63  /system/lib/libutils.so (_ZN7android6Looper8pollOnceEiPiS1_PPv+130)
  native: #05 pc 00086abd  /system/lib/libandroid_runtime.so (_ZN7android18NativeMessageQueue8pollOnceEP7_JNIEnvP8_jobjecti+22)
  native: #06 pc 0000055d  /data/dalvik-cache/arm/system@framework@boot.oat (Java_android_os_MessageQueue_nativePollOnce__JI+96)
  at android.os.MessageQueue.nativePollOnce(Native method)
  at android.os.MessageQueue.next(MessageQueue.java:323)
  at android.os.Looper.loop(Looper.java:138)
  at android.app.ActivityThread.main(ActivityThread.java:5528)
  at java.lang.reflect.Method.invoke!(Native method)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:740)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:630)
Of course, this situation may also arise because other threads of the process were consuming the CPU, in which case you need to analyze the traces of the other threads and the log that the process itself printed before and after the ANR.
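When many traces have to be screened, telling these situations apart can be partly automated. The sketch below assumes the trace layout shown in this article: a header line beginning with "main" in double quotes, the thread state as the last word of that header, and thread blocks separated by a blank line. The class and method names are hypothetical.

```java
/** Hypothetical helper for a first look at the main thread in traces.txt. */
final class MainThreadTrace {

    /** Returns the main thread's block: from its header up to the next blank line. */
    static String mainThreadBlock(String traces) {
        int start = traces.indexOf("\"main\" prio=");
        if (start < 0) {
            return "";
        }
        int end = traces.indexOf("\n\n", start);
        return end < 0 ? traces.substring(start) : traces.substring(start, end);
    }

    /** True if the top Java frame is nativePollOnce, i.e. the Looper was idle at dump time. */
    static boolean looksIdle(String traces) {
        for (String line : mainThreadBlock(traces).split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("at ")) {
                return trimmed.contains("MessageQueue.nativePollOnce");
            }
        }
        return false;
    }

    /** True if the header line of the main thread ends with the state Blocked. */
    static boolean looksBlocked(String traces) {
        String block = mainThreadBlock(traces);
        int eol = block.indexOf('\n');
        String header = (eol < 0 ? block : block.substring(0, eol)).trim();
        return header.endsWith(" Blocked");
    }
}
```

An idle result corresponds to situation (3) and means the trace was captured too late to be useful. A blocked result corresponds to situation (2) and points to the lock holder as the next thing to inspect.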
Four. How to reduce the probability of ANR:
Some operations are dangerous and very prone to ANR, and must be avoided when writing code:
- Reading data on the main thread: reading data on the main thread is a bad idea on Android. Android does not allow network access on the main thread, but it does allow the main thread to fetch data from a database or other sources. This carries a large ANR risk and also causes dropped frames, which hurts the user experience.
(1). Avoid querying a provider on the main thread. The query itself is time-consuming, and if the process hosting the provider has hung or is still starting, our query will not return for a long time. Database queries, provider queries, and other data-fetching operations should run on other threads.
(2). SharedPreferences calls: there are many points to optimize with SharedPreferences, and the article http://weishu.me/2016/10/13/sharedpreference-advices/ describes in detail several things to watch for when using it. First, the commit() method of SharedPreferences is synchronous, while apply() is generally asynchronous, so do not call commit() on the main thread; use apply() instead. Second, a SharedPreferences write rewrites the whole file rather than writing incrementally, so batch modifications into a single apply() instead of calling apply() after every small change (when an Activity stops, the main thread waits for pending apply() writes to complete, so many submissions easily cause jank). The stored text should also not be too large, or reads and writes become very slow. In addition, when storing JSON or XML, adding and removing escape characters makes it slower still.
- Do not do time-consuming work in BroadcastReceiver's onReceive(). This is easily overlooked, especially when the application is in the background. One solution is to start an asynchronous thread directly, but the application may be in the background at that point with low priority, so the process is easily killed by the system. You can instead start an IntentService to do the work: even a background service raises the priority of the process and reduces the likelihood of it being killed.
- The lifecycle methods of every component should avoid time-consuming operations. This holds even for a background Service or ContentProvider: while its onCreate() runs the application is in the background and there is no user input to go unanswered, but a long execution time can still cause a Service or ContentProvider ANR.
- Try to avoid locking on the main thread. In some synchronized operations the main thread may be blocked until another thread releases the lock, which carries a certain ANR risk. In such cases the logic can sometimes be moved to an asynchronous thread. In addition, deadlocks must be avoided (a deadlocked main thread is practically equivalent to an ANR).
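The common thread in the points above is to hand slow or lock-dependent work to another thread and return immediately. A minimal plain-Java sketch of that idea follows. BackgroundLoader and load are hypothetical names, and on Android the callback would normally be posted back to the main thread (for example through a Handler), which is left out here so that the example runs on a plain JVM.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical helper that keeps slow reads off the calling thread. */
final class BackgroundLoader {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /**
     * Submits the slow read to the worker thread and returns at once.
     * The callback runs on the worker thread in this sketch.
     */
    <T> Future<Void> load(Callable<T> slowRead, Consumer<T> onResult) {
        Callable<Void> task = () -> {
            T value = slowRead.call();   // e.g. a database or provider query
            onResult.accept(value);
            return null;
        };
        return worker.submit(task);
    }

    void shutdown() {
        worker.shutdown();
    }
}
```

Because a single worker thread is used, submitted reads run one after another, which also avoids contention between them. Whether that is desirable depends on the workload.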