Android Bugs (III): A Widely Criticized Problem, Frequent Restarts

Source: Internet
Author: User

Anyone who has used Android, especially on a no-name (shanzhai) tablet, will be familiar with the restart problem. Because Android's design is complex, the system can slip into an abnormal state without warning, so Android includes a watchdog mechanism that restarts the framework automatically when it detects a problem.

Here is the problem I ran into. On a freshly brought-up Android build the restart problem was severe: after a short period of use the interface would freeze, and about a minute later the system would restart. The log looked roughly like this:

W/Watchdog(813): *** WATCHDOG KILLING SYSTEM PROCESS: com.android.server.am.ActivityManagerService
W/AudioFlinger(745): power manager service died !!!
I/ServiceManager(737): service 'input_method' died
I/ServiceManager(737): service 'textservices' died
I/ServiceManager(737): service 'uimode' died
I/ServiceManager(737): service 'vibrator' died
I/ServiceManager(737): service 'battery' died
I/ServiceManager(737): service 'permission' died
I/ServiceManager(737): service 'cpuinfo' died

From this log, the problem lies in ActivityManagerService, but what exactly is it? Following the watchdog restart mechanism, you can see that it works by checking whether the locks of the various system services can still be acquired normally (I won't go into the details here; if you want to know more, see Deng Fanping's "Deep Understanding of Android: Volume 1", which is quite a good book). When a deadlock occurs, the watchdog kills the system_server process so that the Android framework restarts and the system can keep working.
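The lock-checking idea described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Android's Watchdog source: LockWatchdog, DemoService, and checkOnce are hypothetical names, and the timeout is a parameter rather than whatever value Android uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Simplified lock-probing watchdog; names are illustrative, not Android's. */
class LockWatchdog {
    /** A monitored service only has to prove it can take its own lock. */
    interface Monitor {
        void monitor();
    }

    private final List<Monitor> monitors = new ArrayList<>();

    void addMonitor(Monitor m) {
        monitors.add(m);
    }

    /**
     * Probes every monitor on a helper thread and waits up to timeoutMs.
     * Returns true if all locks could be taken in time, false if a probe is stuck.
     */
    boolean checkOnce(long timeoutMs) {
        CountDownLatch done = new CountDownLatch(1);
        Thread probe = new Thread(() -> {
            for (Monitor m : monitors) {
                m.monitor(); // blocks here if the service's lock is never released
            }
            done.countDown();
        }, "watchdog-probe");
        probe.setDaemon(true); // a stuck probe must not keep the JVM alive
        probe.start();
        try {
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

/** Hypothetical service whose monitor() just acquires and releases its own lock. */
class DemoService implements LockWatchdog.Monitor {
    final Object lock = new Object();

    @Override
    public void monitor() {
        synchronized (lock) {
            // Getting here means the lock is not stuck.
        }
    }
}
```

In this sketch a false result from checkOnce is the point at which a real watchdog would dump stacks and kill the process; the sketch only reports the condition.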

This problem puzzled me for quite a while at first. With its complex architecture and huge code base, Android is painful to dig into. Fortunately, the debugging methods and tools Android provides are quite complete. The log shows that an ANR trace is generated before the watchdog kills the process, so let's start the analysis there.

At first glance the ANR trace offers no clue either; it is just a dump of call stacks. Look more closely, though, and the stack frames turn out to hide some very useful information. Take the following stack, for example:

----- pid 861 at 2012-02-11 14:57:50 -----
Cmd line: system_server

DALVIK THREADS:
(mutexes: tll=0 tsl=0 tscl=0 ghl=0)
"main" prio=5 tid=1 MONITOR
  | group="main" sCount=1 dsCount=0 obj=0x2ba9c460 self=0x8e820
  | sysTid=861 nice=0 sched=0/0 cgrp=[fopen-error:2] handle=716342112
  | schedstat=( 0 0 0 ) utm=464 stm=65 core=0
  at com.android.server.am.ActivityManagerService.isUserAMonkey(ActivityManagerService.java:~6546)
  - waiting to lock <0x2c1141c8> (a com.android.server.am.ActivityManagerService) held by tid=59 (Binder Thread #6)
  at android.app.ActivityManagerNative.onTransact(ActivityManagerNative.java:1273)
  at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:1545)
  at android.os.Binder.execTransact(Binder.java:338)
  at com.android.server.SystemServer.init1(Native Method)
  at com.android.server.SystemServer.main(SystemServer.java:808)
  at java.lang.reflect.Method.invokeNative(Native Method)
  at java.lang.reflect.Method.invoke(Method.java:511)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:784)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:551)
  at dalvik.system.NativeStart.main(Native Method)

What does that mean? Look at the "waiting to lock" line in the stack above: the main thread is waiting to lock object 0x2c1141c8 (typically a synchronized block; here the object is of type com.android.server.am.ActivityManagerService), but that lock is held by tid=59. Now look at the stack of tid=59:

"Binder Thread #6" prio=5 tid=59 MONITOR
  | group="main" sCount=1 dsCount=0 obj=0x2c3bd838 self=0x34c5d8
  | sysTid=1120 nice=0 sched=0/0 cgrp=[fopen-error:2] handle=3460688
  | schedstat=( 0 0 0 ) utm=168 stm=48 core=0
  at com.android.server.am.BatteryStatsService.noteStopWakelock(BatteryStatsService.java:~114)
  - waiting to lock <0x2c117d50> (a com.android.internal.os.BatteryStatsImpl) held by tid=13 (ProcessStats)
  at com.android.server.PowerManagerService.noteStopWakeLocked(PowerManagerService.java:798)
  at com.android.server.PowerManagerService.releaseWakeLockLocked(PowerManagerService.java:1015)
  at com.android.server.PowerManagerService.releaseWakeLock(PowerManagerService.java:967)
  at android.os.PowerManager$WakeLock.release(PowerManager.java:319)
  at android.os.PowerManager$WakeLock.release(PowerManager.java:300)
  at com.android.server.am.ActivityStack.activityIdleInternal(ActivityStack.java:3254)
  at com.android.server.am.ActivityManagerService.activityIdle(ActivityManagerService.java:3953)
  at android.app.ActivityManagerNative.onTransact(ActivityManagerNative.java:362)
  at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:1545)
  at android.os.Binder.execTransact(Binder.java:338)
  at dalvik.system.NativeStart.run(Native Method)

Why hasn't tid=59 released lock object 0x2c1141c8? Because it is itself waiting for lock object 0x2c117d50 (an object of type com.android.internal.os.BatteryStatsImpl). Anyone with some bug-hunting experience will recognize the pattern at once: a thread that holds one lock while requesting another is a likely deadlock.
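The hold-one-lock-while-requesting-another pattern can be reproduced on a desktop JVM with two stand-in locks. This is only a sketch: HoldAndWaitDemo and its thread names are hypothetical, the two Object locks merely stand in for the ActivityManagerService and BatteryStatsImpl monitors, and the detection uses the standard java.lang.management API rather than anything from the Android framework.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Reproduces the hold-one-lock-request-another pattern with two stand-in locks. */
class HoldAndWaitDemo {
    /** Starts a daemon thread that takes first, waits for its peer, then requests second. */
    static Thread spawn(String name, Object first, Object second, CountDownLatch bothHolding) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHolding.countDown();
                try {
                    bothHolding.await();           // make sure the peer holds its lock too
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {            // never acquired: the peer holds it
                    // unreachable once both threads are inside their first lock
                }
            }
        }, name);
        t.setDaemon(true);                         // let the JVM exit despite the deadlock
        t.start();
        return t;
    }

    /** Creates the deadlock and polls the JVM until it reports it or timeoutMs elapses. */
    static int deadlockedThreadCount(long timeoutMs) {
        Object amsLock = new Object();             // stands in for ActivityManagerService
        Object statsLock = new Object();           // stands in for BatteryStatsImpl
        CountDownLatch bothHolding = new CountDownLatch(2);
        spawn("binder-like", amsLock, statsLock, bothHolding);
        spawn("stats-like", statsLock, amsLock, bothHolding);
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null while no deadlock is visible
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }
}
```

The latch forces both threads to hold their first lock before either asks for the second, which is what makes the deadlock certain here instead of an occasional race as on a real device.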

Now let's look at tid=13, which holds the requested lock:

"ProcessStats" prio=5 tid=13 MONITOR
  | group="main" sCount=1 dsCount=0 obj=0x2c146f58 self=0x2954f0
  | sysTid=877 nice=0 sched=0/0 cgrp=[fopen-error:2] handle=2709096
  | schedstat=( 0 0 0 ) utm=6 stm=4 core=0
  at com.android.server.am.ActivityManagerService.broadcastIntent(ActivityManagerService.java:~12430)
  - waiting to lock <0x2c1141c8> (a com.android.server.am.ActivityManagerService) held by tid=59 (Binder Thread #6)
  at android.app.ContextImpl.sendBroadcast(ContextImpl.java:909)
  at com.android.server.DropBoxManagerService.add(DropBoxManagerService.java:236)
  at android.os.DropBoxManager.addText(DropBoxManager.java:272)
  at com.android.server.am.ActivityManagerService$11.run(ActivityManagerService.java:7630)
  at com.android.server.am.ActivityManagerService.addErrorToDropBox(ActivityManagerService.java:7635)
  at com.android.server.am.ActivityManagerService.handleApplicationWtf(ActivityManagerService.java:7448)
  at com.android.internal.os.RuntimeInit.wtf(RuntimeInit.java:345)
  at android.util.Log$1.onTerribleFailure(Log.java:103)
  at android.util.Log.wtf(Log.java:278)
  at com.android.internal.os.BatteryStatsImpl.getNetworkStatsDetailGroupedByUid(BatteryStatsImpl.java:5738)
  at com.android.internal.os.BatteryStatsImpl.access$100(BatteryStatsImpl.java:76)
  at com.android.internal.os.BatteryStatsImpl$Uid.computeCurrentTcpBytesReceived(BatteryStatsImpl.java:2457)
  at com.android.internal.os.BatteryStatsImpl$Uid.getTcpBytesReceived(BatteryStatsImpl.java:2446)
  at com.android.internal.os.BatteryStatsImpl.writeSummaryToParcel(BatteryStatsImpl.java:5437)
  at com.android.internal.os.BatteryStatsImpl.writeLocked(BatteryStatsImpl.java:4836)
  at com.android.internal.os.BatteryStatsImpl.writeAsyncLocked(BatteryStatsImpl.java:4818)
  at com.android.server.am.ActivityManagerService.updateCpuStatsNow(ActivityManagerService.java:1649)
  at com.android.server.am.ActivityManagerService$3.run(ActivityManagerService.java:1531)

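Here again a thread holds one lock while requesting another: tid=13 holds 0x2c117d50 and is waiting for 0x2c1141c8, which is held by tid=59, while tid=59 is waiting for 0x2c117d50. That is a deadlock between tid=59 and tid=13!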
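The manual walk through the trace (follow each "held by tid=" reference until a thread repeats) can be automated. The following is a minimal sketch, assuming the trace is in the Dalvik text format shown in this article; TraceDeadlockFinder and its methods are hypothetical helpers, not an Android tool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Finds lock cycles in Dalvik-style thread dumps (illustrative helper, not an Android tool). */
class TraceDeadlockFinder {
    // Thread header, e.g.  "main" prio=5 tid=1 MONITOR
    private static final Pattern HEADER = Pattern.compile("^\"[^\"]*\".*\\btid=(\\d+)\\b");
    // Blocked frame, e.g.  - waiting to lock <0x2c1141c8> (...) held by tid=59 (...)
    private static final Pattern WAITING =
            Pattern.compile("waiting to lock <(0x[0-9a-fA-F]+)>.*held by tid=(\\d+)");

    /** Maps each blocked thread id to the id of the thread holding the lock it wants. */
    static Map<Integer, Integer> waitsFor(List<String> lines) {
        Map<Integer, Integer> edges = new HashMap<>();
        int current = -1;
        for (String raw : lines) {
            String line = raw.trim();
            Matcher h = HEADER.matcher(line);
            if (h.find()) {
                current = Integer.parseInt(h.group(1));
                continue;
            }
            Matcher w = WAITING.matcher(line);
            if (current != -1 && w.find()) {
                edges.put(current, Integer.parseInt(w.group(2)));
            }
        }
        return edges;
    }

    /** Follows waits-for edges from start; returns the cycle reached, or an empty list. */
    static List<Integer> cycleFrom(int start, Map<Integer, Integer> edges) {
        List<Integer> path = new ArrayList<>();
        Integer tid = start;
        while (tid != null && !path.contains(tid)) {
            path.add(tid);
            tid = edges.get(tid);
        }
        if (tid == null) {
            return new ArrayList<>(); // chain ended at a thread that is not blocked
        }
        return new ArrayList<>(path.subList(path.indexOf(tid), path.size()));
    }
}
```

Fed the three thread stacks from this article, cycleFrom(1, edges) returns [59, 13]: the main thread (tid=1) is not part of the cycle itself, it is just one more victim queued behind it.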

Now that the problem has been found, how do we fix it? The cause is actually not complicated. A careful look at the failing stack shows the pattern: the error is triggered when the system uses Log.wtf() to record an error. WTF stands for "What a Terrible Failure" and indicates that the system has hit a serious error. Tracing this particular error further, the cause was that the kernel version was too old and did not support netfilter.

That alone does not expose the Android bug, however. Look more closely: Log.wtf() ends up in addErrorToDropBox, which calls com.android.server.am.ActivityManagerService.broadcastIntent, and that call needs to lock the com.android.server.am.ActivityManagerService object. So if anyone writing code (Android's own developers or those who come after them) carelessly calls Log.wtf() in the wrong place, for example when catching an exception while holding another lock, the system can restart. That is hardly what the add-to-DropBox feature was meant to do, and it suggests Android was not careful enough in designing and testing it.

Solution: either fix the underlying cause of the wtf properly, or simply comment out the addErrorToDropBox call in the handleApplicationWtf method of ActivityManagerService.java. That call does its job poorly and only generates debugging information, so it has little value in a product.
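A third option, which is neither of the two fixes above nor a description of how Android itself handles this, is to keep the error report but hand it to a worker thread, so that the reporting path never takes the service lock on the caller's thread. The sketch below illustrates the idea in plain Java; DeferredErrorReporter and all of its members are hypothetical, and serviceLock merely stands in for the ActivityManagerService monitor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical error reporter that never takes the service lock on the caller's thread. */
class DeferredErrorReporter {
    private final Object serviceLock;            // stands in for the ActivityManagerService monitor
    private final List<String> delivered = new ArrayList<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "error-reporter");
        t.setDaemon(true);
        return t;
    });

    DeferredErrorReporter(Object serviceLock) {
        this.serviceLock = serviceLock;
    }

    /** Safe to call while holding any lock: it only queues the work and returns. */
    void report(String message) {
        worker.execute(() -> {
            synchronized (serviceLock) {         // taken on the worker thread only
                delivered.add(message);
            }
        });
    }

    /** Returns a copy of the messages delivered so far. */
    List<String> delivered() {
        synchronized (serviceLock) {
            return new ArrayList<>(delivered);
        }
    }

    /** Stops accepting reports and waits up to timeoutMs for queued ones to finish. */
    boolean drain(long timeoutMs) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The trade-off in this design is that a report is delivered later than it was raised and can be lost if the process dies first, which may or may not be acceptable for debugging information.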

Note also that this is only one cause of the restart problem. Because Android is open source, everyone modifies or adds code, and a little carelessness with locking in that code is enough to produce restarts. Of the serious problems I have debugged, about 80% were caused by multi-threaded synchronization or state-machine problems.

One of the main causes of Android restarts is this: a lock monitored by the watchdog is deadlocked or held for too long. This article covers only one problem common to Android builds. In real development, customization and hardware problems (the GPU, for example) cause many more restart bugs through deadlocks or locks held too long. They can be found with the method described here and then fixed. Stability is what makes a system good, and the Android restart problem still draws bitter complaints today.


Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
