Quickly Locating Android Restarts
I. Introduction
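On Flyme, an abnormal restart is a process restart caused by an exception in system_server, surfaceflinger, or another process in the same process group. When it happens, the user sees the Flyme balloon screen (on stock Android, the android boot screen).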
1.1 Logs required
1. MTK platform: sdcard/mtklog (required)
   aee_exp (db.xx.NE, db.xx.JE, db.fatal.xx.JE)
   mobilelog
   In the mtklog settings, the mobilelog "current size" and "total size" options set the size of each individual aplog and the total size of all mobilelog data, respectively.
2. Samsung platform: android/log (required)
3. ANR traces (optional)
   adb pull data/anr [your dir]
4. dropbox (optional)
   adb shell dumpsys dropbox -p [tag]
   adb pull data/system/dropbox [dir]
5. tombstones (optional)
   adb pull data/tombstones [dir] (requires root)
1.2 Exception types
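By direct cause, Android restarts fall roughly into three types: NativeCrash, JavaCrash, and WatchDog.
dropbox additionally captures records of type anr, watchdog, crash, lowmem, wtf, and strict_mode. These cover not only the system_server process but also key application-level logs.
For more detail, see:
http://xiaocong.github.io/blog/2012/11/21/to-introduce-android-dropboxmanager-service/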
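As a rough illustration of the three-way classification, the sketch below scans log lines for the marker strings that appear in the log excerpts quoted in this article. The names `MARKERS` and `classify_restart` are hypothetical, and the check does not verify that the matching line really comes from system_server, so treat the result as a first hint only.

```python
import re

# Marker strings are copied from the log excerpts quoted in this article.
# MARKERS and classify_restart are hypothetical names used for illustration.
MARKERS = [
    ("WatchDog", re.compile(r"WATCHDOG KILLING SYSTEM PROCESS")),
    ("JavaCrash", re.compile(r"FATAL EXCEPTION IN SYSTEM PROCESS")),
    ("NativeCrash", re.compile(r"Fatal signal \d+ \(SIG[A-Z0-9]+\)")),
]


def classify_restart(log_lines):
    """Return the category of the first marker seen, or None if none appears."""
    for line in log_lines:
        for category, pattern in MARKERS:
            if pattern.search(line):
                return category
    return None
```

Because the first marker wins, feed the function lines in chronological order, ideally only from the window just before the restart.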
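II. How to locate the problem
2.1 Analyzing db.fatal.xx.xx
On the MTK platform, a db file is generated under "sdcard/mtklog/aee" when the system_server process crashes. When a serious system problem occurs, MTK's AEE compresses all the key information into this db file.
So first check whether aee_exp contains a db.fatal.xx.xx file. If it does, extract it with the GAT tool that MTK provides for AEE files, then analyze the contents: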
1 NativeCrash and JavaCrash:
Extract db.fatal.01.JE or db.fatal.01.NE and open __exp_main.txt, for example:
    Backtrace:
    Process: system_server
    Build: Meizu/mt6795/mt6795:5.1/LMY47I/1448095923:userdebug/test-keys
    java.lang.NullPointerException: Attempt to invoke virtual method 'int java.lang.Object.hashCode()' on a null object reference
        at com.android.server.job.JobServiceContext$JobServiceHandler.closeAndCleanupJobH(JobServiceContext.java:4)
        at com.android.server.job.JobServiceContext$JobServiceHandler.handleFinishedH(JobServiceContext.java:417)
        at com.android.server.job.JobServiceContext$JobServiceHandler.handleMessage(JobServiceContext.java:324)
        at android.os.Handler.dispatchMessage(Handler.java:111)
        at android.os.Looper.loop(Looper.java:194)
        at com.android.server.SystemServer.run(SystemServer.java:405)
        at com.android.server.SystemServer.main(SystemServer.java:283)
        at java.lang.reflect.Method.invoke(Native Method)
        at java.lang.reflect.Method.invoke(Method.java:372)
        at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1004)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:799)
Paste the excerpt above into the bug description as the key log, and use the NullPointerException line as the keyword in the bug title.
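Picking the title keyword out of __exp_main.txt can be sketched as follows. This is only an illustration: `exception_headline` is a hypothetical helper, and the regular expression assumes the exception line starts with a fully qualified class name ending in Exception or Error, as in the excerpt above.

```python
import re

# A fully qualified Java class name ending in Exception or Error at the start
# of a line, optionally followed by ": message". Hypothetical helper pattern.
EXCEPTION_RE = re.compile(
    r"^\s*((?:[A-Za-z_]\w*\.)+[A-Z]\w*(?:Exception|Error))(?::\s*(.*))?$"
)


def exception_headline(exp_main_text):
    """Return (exception_class, message) for the first exception line, or None."""
    for line in exp_main_text.splitlines():
        match = EXCEPTION_RE.match(line)
        if match:
            return match.group(1), match.group(2) or ""
    return None


sample = """Backtrace:
Process: system_server
java.lang.NullPointerException: Attempt to invoke virtual method 'int java.lang.Object.hashCode()' on a null object reference
    at android.os.Handler.dispatchMessage(Handler.java:111)
"""
print(exception_headline(sample)[0])  # prints java.lang.NullPointerException
```

Stack frames are skipped because they begin with "at ", and header fields such as "Process:" do not contain a dotted class name.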
2 Watchdog
Extract db.fatal.02.SWT and open __exp_main.txt:
    Backtrace:
    Process: system_server
    Subject: Blocked in handler on ui thread (android.ui), Blocked in handler on PowerManagerService (PowerManagerService)
    Build: Meizu/mt6795/mt6795:5.1/LMY47I/1448095923:userdebug/test-keys
    Debugger: Connected
This means that two threads are blocked: android.ui and PowerManagerService. If you are interested in the Android watchdog mechanism, see 《Android Watchdog机制解析》 (An Analysis of the Android Watchdog Mechanism).
Use the "Subject: Blocked xxx" line above as the keyword in the bug title.
Next, search for the android.ui and PowerManagerService threads:
    "android.ui" prio=5 Thread id=30 WAITING
        java.lang.Object.wait(Native Method)
        com.android.server.power.PowerManagerService.shutdownOrRebootInternal(PowerManagerService.java:2907)
        com.android.server.power.PowerManagerService.access$6100(PowerManagerService.java:112)
        com.android.server.power.PowerManagerService$BinderService.reboot(PowerManagerService.java:38)
        android.os.PowerManager.reboot(PowerManager.java:948)
        com.android.internal.policy.impl.MzGlobalActions$MzGlobalActionsDialog$5.onClick(MzGlobalActions.java:365)
        android.view.View.performClick(View.java:4909)
        android.view.View$PerformClick.run(View.java:20390)
        android.os.Handler.handleCallback(Handler.java:815)
        android.os.Handler.dispatchMessage(Handler.java:104)
        android.os.Looper.loop(Looper.java:194)
        android.os.HandlerThread.run(HandlerThread.java:61)
        com.android.server.ServiceThread.run(ServiceThread.java:46)

    "PowerManagerService" prio=5 Thread id=34 WAITING
        com.android.server.power.ShutdownThread.reboot(ShutdownThread.java:402)
        com.android.server.power.PowerManagerService$9.run(PowerManagerService.java:21)
        android.os.Handler.handleCallback(Handler.java:815)
        android.os.Handler.dispatchMessage(Handler.java:104)
        android.os.Looper.loop(Looper.java:194)
        android.os.HandlerThread.run(HandlerThread.java:61)
        com.android.server.ServiceThread.run(ServiceThread.java:46)
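At this point the problem is located: both threads are waiting inside the reboot path. Other situations, such as deadlocks, need more logs to analyze and are not covered in detail here.
Paste the Subject header and the stacks of the blocked threads into the bug description; this helps the developers analyze the problem faster.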
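The two steps for a Watchdog db (read the blocked thread names from the Subject line, then pull those threads' stacks out of the dump) could be sketched like this. `blocked_threads` and `thread_stacks` are hypothetical helpers; the patterns only assume the "Blocked in handler on ... (name)" and `"name" prio=N` shapes shown in the excerpts above, and that a blank line ends each thread's stack.

```python
import re

# Thread name in parentheses after "Blocked in handler on ...".
SUBJECT_RE = re.compile(r"Blocked in handler on [^()]*\(([^)]+)\)")
# Header line of one thread in the dump, for example: "android.ui" prio=5 ...
THREAD_RE = re.compile(r'^"([^"]+)"\s+prio=\d+')


def blocked_threads(subject_line):
    """Return the thread names listed in a 'Subject: Blocked in ...' line."""
    return SUBJECT_RE.findall(subject_line)


def thread_stacks(dump_text, names):
    """Map each requested thread name to the list of its stack frames."""
    stacks, current = {}, None
    for raw in dump_text.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            # Start collecting only if this is one of the blocked threads.
            current = header.group(1) if header.group(1) in names else None
            if current:
                stacks[current] = []
        elif not line:
            current = None  # a blank line ends the current thread
        elif current:
            stacks[current].append(line)
    return stacks
```

For the Subject line in the example above, `blocked_threads` returns `["android.ui", "PowerManagerService"]`, which is the list to pass as `names`.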
If the platform is not MTK, or no db file was generated, read on.
2.2 Locating the problem in main_log
1 Search every main_log in the aplog for the keyword died.
If no db.fatal was generated, open the mobilelog folder in mtklog, for example APLog_2015_0528_144050; the name gives the time at which log output started. If it contains several logs, take the most recent one.
Search main_log for died. If you find output like the following, you are in the right place:
    05-29 06:23:06.285 262 262 I ServiceManager: service 'input_method' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'accessibility' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'mount' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'lock_settings' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'device_policy' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'statusbar' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'flyme_statusbar' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'clipboard' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'network_management' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'networkmanagement_service_flyme' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'notification' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'devicestoragemonitor' died
    05-29 06:23:06.285 262 262 I ServiceManager: service 'location' died
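A burst of many "service ... died" lines with the same timestamp is the signature being searched for. A minimal sketch of that search, assuming the threadtime logcat layout shown above; `died_bursts` is a hypothetical helper and the `min_services` threshold of 5 is an arbitrary assumption, not a value from any tool:

```python
import re
from collections import defaultdict

# Threadtime-style line: date time pid tid level tag: message
DIED_RE = re.compile(
    r"^(\d\d-\d\d \d\d:\d\d:\d\d\.\d+)\s+\d+\s+\d+\s+"
    r"I ServiceManager: service '([^']+)' died"
)


def died_bursts(lines, min_services=5):
    """Group died lines by timestamp and keep timestamps with many services."""
    bursts = defaultdict(list)
    for line in lines:
        match = DIED_RE.match(line.strip())
        if match:
            bursts[match.group(1)].append(match.group(2).strip())
    return {ts: names for ts, names in bursts.items() if len(names) >= min_services}
```

A single service dying on its own is filtered out; only a timestamp at which many system services disappear together is reported as a likely system_server death.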
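2 Search for fatal
If step 1 succeeded, search upward from that point in the same file for fatal. If you find a log like the one below, confirm that 771 is the system process; a fatal exception in it makes the process exit.
You can also skip step 1 and search directly for "FATAL EXCEPTION IN SYSTEM PROCESS".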
    05-29 06:23:01.132 771 787 E AndroidRuntime: *** FATAL EXCEPTION IN SYSTEM PROCESS: FinalizerWatchdogDaemon
    05-29 06:23:01.132 771 787 E AndroidRuntime: java.util.concurrent.TimeoutException: android.os.BinderProxy.finalize() timed out after 10 seconds
    05-29 06:23:01.132 771 787 E AndroidRuntime: at android.os.BinderProxy.destroy(Native Method)
    05-29 06:23:01.132 771 787 E AndroidRuntime: at android.os.BinderProxy.finalize(Binder.java:8)
    05-29 06:23:01.132 771 787 E AndroidRuntime: at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:190)
    05-29 06:23:01.132 771 787 E AndroidRuntime: at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:173)
    05-29 06:23:01.132 771 787 E AndroidRuntime: at java.lang.Thread.run(Thread.java:818)
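Note: searching the whole log for fatal is not a good approach. A system_server restart makes many other processes crash, so the fatal you find is not necessarily the root cause.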
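To avoid picking up a follow-on crash, restrict the search to system-process fatals that occur no later than the died burst. The sketch below assumes the lines are in chronological order and that all timestamps fall in the same year, so the fixed-width "MM-DD HH:MM:SS.mmm" strings can be compared as text; `last_system_fatal_before` is a hypothetical helper.

```python
import re

FATAL_RE = re.compile(
    r"^(\d\d-\d\d \d\d:\d\d:\d\d\.\d+)\s+(\d+)\s+\d+\s+E AndroidRuntime: "
    r"\*\*\* FATAL EXCEPTION IN SYSTEM PROCESS: (.+)$"
)


def last_system_fatal_before(lines, died_timestamp):
    """Return (timestamp, pid, thread_name) of the latest system-process
    fatal at or before died_timestamp, or None if there is none."""
    best = None
    for line in lines:
        match = FATAL_RE.match(line.strip())
        # Fixed-width timestamps compare correctly as plain strings.
        if match and match.group(1) <= died_timestamp:
            best = (match.group(1), int(match.group(2)), match.group(3).strip())
    return best
```

With the excerpts in this section, calling it with the died timestamp "05-29 06:23:06.285" yields the FinalizerWatchdogDaemon fatal in pid 771; that pid still has to be confirmed as system_server.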
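At this point the exception has been identified as a JE and the log search is complete. Paste the FATAL EXCEPTION excerpt into the bug description as the key log.
2.3 Locating the problem in the system log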
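1 Use a keyword to confirm whether a restart occurred
If died is not found in main_log, the log may have been overwritten. main_log generally receives more output than sys_log, so it is overwritten more easily. Search sys_log for "Entered the Android system server" to determine whether a restart occurred.
2 Search for "goodbye" to identify a watchdog kill of system_server
If no fatal exception from the system process is found in main_log, a blocked thread may have caused the watchdog to kill the process. The log looks like this: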
    Line 181147: 05-29 13:11:00.470 799 1268 W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in handler on main thread (main)
    Line 181148: 05-29 13:11:00.473 799 1268 W Watchdog: main thread stack trace:
    Line 181150: 05-29 13:11:00.475 799 1268 W Watchdog: at com.android.server.AlarmManagerService$ResultReceiver.onSendFinished(AlarmManagerService.java:2607)
    Line 181151: 05-29 13:11:00.475 799 1268 W Watchdog: at android.app.PendingIntent$FinishedDispatcher.run(PendingIntent.java:219)
    Line 181152: 05-29 13:11:00.475 799 1268 W Watchdog: at android.os.Handler.handleCallback(Handler.java:815)
    Line 181153: 05-29 13:11:00.475 799 1268 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:104)
    Line 1811: 05-29 13:11:00.475 799 1268 W Watchdog: at android.os.Looper.loop(Looper.java:194)
    Line 181155: 05-29 13:11:00.475 799 1268 W Watchdog: at com.android.server.SystemServer.run(SystemServer.java:398)
    Line 181156: 05-29 13:11:00.475 799 1268 W Watchdog: at com.android.server.SystemServer.main(SystemServer.java:276)
    Line 181157: 05-29 13:11:00.475 799 1268 W Watchdog: at java.lang.reflect.Method.invoke(Native Method)
    Line 181158: 05-29 13:11:00.475 799 1268 W Watchdog: at java.lang.reflect.Method.invoke(Method.java:372)
    Line 181159: 05-29 13:11:00.475 799 1268 W Watchdog: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:968)
    Line 181160: 05-29 13:11:00.475 799 1268 W Watchdog: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:763)
    Line 183341: 05-29 13:11:25.476 799 1268 W Watchdog: *** GOODBYE!
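Note: if the log prints "WATCHDOG KILLING SYSTEM PROCESS" as above, we also need the traces under data/anr to analyze the problem further.
Analyze the trace the same way as the Watchdog case in section 2.1: find the backtrace of the blocked thread in the trace.
Paste the Watchdog excerpt into the bug description as the key log, and use its "WATCHDOG KILLING SYSTEM PROCESS" line as the keyword in the bug title.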
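Pairing each "WATCHDOG KILLING SYSTEM PROCESS" line with the "GOODBYE!" that follows it gives both the bug-title keyword and the time of the kill. A minimal sketch, assuming the line shapes shown above; `watchdog_kills` is a hypothetical helper, and it uses `search` rather than `match` so that editor prefixes such as "Line 181147:" do not matter.

```python
import re

TS_RE = re.compile(r"\d\d-\d\d \d\d:\d\d:\d\d\.\d+")
KILL_RE = re.compile(r"W Watchdog: \*\*\* WATCHDOG KILLING SYSTEM PROCESS: (.+)$")
GOODBYE_RE = re.compile(r"W Watchdog: \*\*\* GOODBYE!")


def watchdog_kills(lines):
    """Return one dict per watchdog kill: reason, kill time, goodbye time."""
    events = []
    for line in lines:
        ts_match = TS_RE.search(line)
        ts = ts_match.group(0) if ts_match else None
        kill = KILL_RE.search(line.rstrip())
        if kill:
            event = {"reason": kill.group(1).strip(), "killing_at": ts, "goodbye_at": None}
            # Search results sometimes repeat a line; skip exact repeats.
            if not events or events[-1] != event:
                events.append(event)
        elif GOODBYE_RE.search(line) and events and events[-1]["goodbye_at"] is None:
            events[-1]["goodbye_at"] = ts
    return events
```

The `reason` field is the text to use in the bug title, for example "Blocked in handler on main thread (main)".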
3 If you are unlucky and none of these searches finds anything, just send the logs to us for analysis.
III. Case studies
3.1 Case 1
1 First search for the keyword "died" or "ServiceManager: service". Suppose the search finds the following:
    Line 279383: 05-12 16:51:33.306 787 908 I Process : Sending signal. PID: 787 SIG: 9
    Line 279385: 05-12 16:51:33.456 258 258 I ServiceManager: service 'vibrator' died
    Line 279477: 05-12 16:51:33.465 258 258 I ServiceManager: service 'power' died
Earlier log output shows that 787 is the system_server process (identifying it is straightforward and is not covered here). As seen above, 787 receives signal 9 (SIGKILL), and then the services hosted in the system_server process are reported as died. So this is the right place.
2 Then search backward for fatal. Its timestamp must be earlier than the died lines. This finds the following log:
    05-12 16:51:27.396 787 908 E AndroidRuntime: *** FATAL EXCEPTION IN SYSTEM PROCESS: WifiStateMachine
    05-12 16:51:27.396 787 908 E AndroidRuntime: java.lang.IllegalStateException: command '7358 interface clearaddrs wlan0' failed with 'null'
    05-12 16:51:27.396 787 908 E AndroidRuntime: at com.android.server.NetworkManagementService.clearInterfaceAddresses(NetworkManagementService.java:956)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at com.android.server.wifi.WifiStateMachine$L2ConnectedState.processMessage(WifiStateMachine.java:8424)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at com.android.internal.util.StateMachine$SmHandler.processMsg(StateMachine.java:966)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at com.android.internal.util.StateMachine$SmHandler.handleMessage(StateMachine.java:7)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:111)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at android.os.Looper.loop(Looper.java:194)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at android.os.HandlerThread.run(HandlerThread.java:61)
    05-12 16:51:27.396 787 908 E AndroidRuntime: Caused by: com.android.server.NativeDaemonConnector$NativeDaemonFailureException: command '7358 interface clearaddrs wlan0' failed with 'null'
    05-12 16:51:27.396 787 908 E AndroidRuntime: at com.android.server.NativeDaemonConnector.execute(NativeDaemonConnector.java:414)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at com.android.server.NativeDaemonConnector.executeForList(NativeDaemonConnector.java:365)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at com.android.server.NativeDaemonConnector.execute(NativeDaemonConnector.java:330)
    05-12 16:51:27.396 787 908 E AndroidRuntime: at com.android.server.NetworkManagementService.clearInterfaceAddresses(NetworkManagementService.java:9)
    05-12 16:51:27.396 787 908 E AndroidRuntime: ... 6 more
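3 Locating the kernel log
The logs above essentially settle the problem. If you want further confirmation, look at the kernel log. There are two ways to find the relevant part of it.
a. Compute the time. Kernel log timestamps are uptime values, so first find a kernel log line that also carries the Android time (note that kernel log time does not include sleep time). Search for "2015-05-12 16:46" and find the following:
    Line 192237: <4>[719.301126][thread:171][RT:719291182900] 2015-05-12 08:51:20.735116 UTC; android time 2015-05-12 16:51:20.735116
Kernel time 719 corresponds to Android time 16:51:20, and the kill signal was sent about 13 seconds later, at 16:51:33, so 719 + 13 = 732. Look for the kernel log around 732; this leads to the lines shown under b.
b. Search by pid. The kernel prints some pid-related information, such as binder and hang_detect messages. So search all kernel logs for "787" and take the most recent log (in mtklog, the smaller the sequence number, the more recent the log). The result:
    Line 198428: <7>[731.863094]-(2)[908:WifiStateMachin][908:WifiStateMachin] sig 9 to [787:system_server] stat=S
    Line 198438: <7>[731.8310]-(1)[787:system_server][787:system_server] exit
Analysis: the WifiStateMachine thread (908 is a child thread of 787) sends SIGKILL to the system_server thread, and system_server then exits.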
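The arithmetic in method a can be written down as a small helper. This is a sketch under the assumption that the anchor line has the "[uptime] ... android time YYYY-MM-DD HH:MM:SS.ffffff" shape shown above and that the device did not sleep between the anchor and the target time (kernel time excludes sleep, so a suspend in between would make the result too large). `kernel_time_for` is a hypothetical name.

```python
import re
from datetime import datetime

# First bracketed number is the kernel uptime; the wall-clock time follows
# the words "android time".
ANCHOR_RE = re.compile(
    r"\[\s*(\d+\.\d+)\].*android time (\d{4}-\d\d-\d\d \d\d:\d\d:\d\d\.\d+)"
)
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def kernel_time_for(anchor_line, android_time):
    """Convert an Android wall-clock time into a kernel log timestamp in seconds."""
    match = ANCHOR_RE.search(anchor_line)
    if match is None:
        raise ValueError("anchor line has no 'android time' field")
    anchor_wall = datetime.strptime(match.group(2), TIME_FORMAT)
    target_wall = datetime.strptime(android_time, TIME_FORMAT)
    return float(match.group(1)) + (target_wall - anchor_wall).total_seconds()


anchor = ("<4>[719.301126][thread:171][RT:719291182900] 2015-05-12 08:51:20.735116 UTC; "
          "android time 2015-05-12 16:51:20.735116")
print(round(kernel_time_for(anchor, "2015-05-12 16:51:33.306"), 2))  # prints 731.87
```

The result lands within about 10 ms of the [731.863094] "sig 9" line found by the pid search, which is the agreement one would hope for between the two methods.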
3.2 Case 2
Original log location: E:\Work\crash&hang\未分类\m85\20150526-monkey停止\2
Search the system log with the keywords "GOODBYE" and "WATCHDOG KILLING SYSTEM PROCESS":
    Line 50518: 05-26 11:51:22.593 817 1256 W Watchdog: SWT Watchdog after wait current time:603872
    Line 50519: 05-26 11:51:22.606 817 1256 W Watchdog: **Get SF Time **1
    Line 50520: 05-26 11:51:22.606 817 1256 W Watchdog: SWT Watchdog before synchronized:603885
    Line 50521: 05-26 11:51:22.606 817 1256 W Watchdog: SWT Watchdog after synchronized:603885
    Line 50522: 05-26 11:51:22.607 817 1256 W Watchdog: SWT Watchdog before wait timeout:30000
    Line 50523: 05-26 11:51:22.607 817 1256 W Watchdog: SWT Watchdog before wait current time:603886
    Line 50524: 05-26 11:51:22.607 817 1256 W Watchdog: SWT Watchdog before wait start:603886
    Line 50525: 05-26 11:51:22.607 817 1256 W Watchdog: SWT Watchdog before wait CHECK_INTERVAL:30000
    Line 50635: 05-26 11:52:25.227 817 1256 W Watchdog: SWT Watchdog after wait current time:666506
    Line 504: 05-26 11:52:25.273 817 1256 W Watchdog: **Get SF Time **1
    Line 508: 05-26 11:52:25.297 817 1256 E Watchdog: **SWT happen **Blocked in handler on ActivityManager (ActivityManager)
    Line 52278: 05-26 11:52:42.181 817 1256 V Watchdog: ** save all info before killnig system server **
    Line 52288: 05-26 11:52:44.204 817 1256 I Watchdog: Reporting stuck state to activity controller
    Line 522: 05-26 11:52:44.205 817 1256 I Watchdog: Binder.setDumpDisabled
    Line 52310: 05-26 11:52:49.939 817 1256 I Watchdog: Activity controller requested to reboot
    Line 52311: 05-26 11:52:49.940 817 1256 W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in handler on ActivityManager (ActivityManager)
    Line 52312: 05-26 11:52:49.959 817 1256 W Watchdog: ActivityManager stack trace:
    Line 52313: 05-26 11:52:49.973 817 1256 W Watchdog: at android.os.MessageQueue.nativePollOnce(Native Method)
    Line 52314: 05-26 11:52:49.973 817 1256 W Watchdog: at android.os.MessageQueue.next(MessageQueue.java:148)
    Line 52315: 05-26 11:52:49.973 817 1256 W Watchdog: at android.os.Looper.loop(Looper.java:151)
    Line 52316: 05-26 11:52:49.973 817 1256 W Watchdog: at android.os.HandlerThread.run(HandlerThread.java:61)
    Line 52317: 05-26 11:52:49.973 817 1256 W Watchdog: at com.android.server.ServiceThread.run(ServiceThread.java:46)
    Line 52680: 05-26 11:53:15.002 817 1256 W Watchdog: *** GOODBYE!
    Line 53632: 05-26 11:53:34.212 19721 19721 I SystemServer: Init Watchdog
kernel log
    <7>[673.4615]-(0)[1256:watchdog][1256:watchdog] sig 3 to [817:system_server] stat=S
    <7>[69016.291887]-(2)[1256:watchdog][1256:watchdog] sig 9 to [817:system_server] stat=S
main log
    05-26 11:53:15.083 286 961 W AudioFlinger_Threads: power manager service died
    05-26 11:53:15.085 263 263 I ServiceManager: service 'wifip2p' died
    05-26 11:53:15.085 263 263 I ServiceManager: service 'connectivity' died
    05-26 11:53:15.085 263 263 I ServiceManager: service 'servicediscovery' died
    05-26 11:53:15.085 263 263 I ServiceManager: service 'updatelock' died
    05-26 11:53:15.085 263 263 I ServiceManager: service 'notification' died
    05-26 11:53:15.085 263 263 I ServiceManager: service 'devicestoragemonitor' died
    05-26 11:53:15.085 263 263 I ServiceManager: service 'location' died
    05-26 11:53:15.085 263 263 I ServiceManager: service 'country_detector' died
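The kernel "sig N to [pid:name]" lines in cases 1 and 2 show which thread signalled system_server. A rough parser for that line shape is sketched below; `parse_signal_line` is a hypothetical helper, and the pattern is inferred only from the few kernel lines quoted in this article, so other kernel versions may print a different layout.

```python
import re

SIG_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]-\(\d+\)"            # kernel timestamp, parenthesised field
    r"\[(\d+):([^\]]+)\]\[\d+:[^\]]+\]"     # sender pid:name (printed twice)
    r"\s*sig (\d+) to \[(\d+):([^\]]+)\]"   # signal number and target pid:name
)


def parse_signal_line(line):
    """Return sender, signal and target of a kernel 'sig N to [...]' line, or None."""
    match = SIG_RE.search(line)
    if match is None:
        return None
    return {
        "kernel_time": float(match.group(1)),
        "sender_pid": int(match.group(2)),
        "sender": match.group(3),
        "signal": int(match.group(4)),
        "target_pid": int(match.group(5)),
        "target": match.group(6),
    }
```

In case 2 the sender is the watchdog thread (1256), which first sends signal 3 and later signal 9 to system_server (817). In case 1 the sender of signal 9 is the WifiStateMachine thread (908).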
3.3 Case 3: nativecrash
    F/libc ( 4508): Fatal signal 11 (SIGSEGV), code 1, fault addr 0x30 in tid 4508 (system_server)
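The fields of a "Fatal signal" line can be pulled apart with a regular expression. The sketch below is illustrative only: `parse_fatal_signal` is a hypothetical helper, and the pattern assumes the exact wording of the line above.

```python
import re

FATAL_SIGNAL_RE = re.compile(
    r"Fatal signal (\d+) \((SIG[A-Z0-9]+)\), code (-?\d+), "
    r"fault addr (0x[0-9a-fA-F]+) in tid (\d+) \(([^)]+)\)"
)


def parse_fatal_signal(line):
    """Return the signal, fault address and thread of a libc 'Fatal signal' line."""
    match = FATAL_SIGNAL_RE.search(line)
    if match is None:
        return None
    return {
        "signal": int(match.group(1)),
        "name": match.group(2),
        "code": int(match.group(3)),
        "fault_addr": int(match.group(4), 16),
        "tid": int(match.group(5)),
        "thread": match.group(6),
    }
```

For the line above this gives signal 11 (SIGSEGV) in tid 4508 with fault address 0x30. A fault address this close to zero often points to a field access through a null pointer, but the tombstone is needed to confirm that.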