App killed by the system after an ANR while debugging on the main thread

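Pain point

I have a realme GT that I use for day-to-day development. I debug from time to time, and whenever a breakpoint pauses the main thread, the app almost always gets killed. That makes it impossible to step through code, and the automatic restart is a real nuisance during development. I eventually got fed up and set out to fix it.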


Environment: realme GT, Android 11, realme UI 2.0

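Approach

Start by searching the logs globally for clues; the framework is bound to leave some trace. Reproduce the problem once, capture logcat, and search for the package name. Since the app is being killed, pay particular attention to the ActivityManager tag.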
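
A minimal sketch of that filtering step, assuming the log is in logcat's default threadtime layout (as in the excerpts in this post). The function names are illustrative and not part of the original investigation.

```shell
#!/usr/bin/env bash
# Print the lines of a logcat dump (read from stdin) that were logged under
# the ActivityManager tag and mention the given package name.
am_lines_for_package() {
    local pkg="$1"
    # Match "<level> ActivityManager:" so that messages from other tags that
    # merely contain "ActivityManagerService" are not picked up.
    grep -E '[[:space:]][VDIWEF][[:space:]]+ActivityManager[[:space:]]*:' \
        | grep -F -- "$pkg"
}

# Narrow the result down to the lines where a process is being killed.
am_kills_for_package() {
    am_lines_for_package "$1" | grep -F -- "Killing "
}
```

With a device attached, this could be used as `adb logcat -d | am_kills_for_package com.simple`, where `-d` dumps the current log buffer and exits.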

Finding the implementation

After filtering carefully, this log line turned up. It is the culprit:

04-23 14:07:36.545  1397  1695 I ActivityManager: Killing 13134:com.hi.dhl.startup.simple/u0a297 (adj 0): user request after error:Input dispatching timed out (3f0c272 com.simple/com.simple.MainActivity (server) is not responding. Waited 5001ms for MotionEvent)

Filtering the ActivityManager log further shows that an ANR occurred and that the app was killed as soon as the ANR had been processed:

04-23 14:07:36.527 1397 13246 E ActivityManager: 0% 12803/kworker/4:2-mm_percpu_wq: 0% user + 0% kernel
04-23 14:07:36.527 1397 13246 E ActivityManager: +0% 13207/logcat: 0% user + 0% kernel
04-23 14:07:36.527 1397 13246 E ActivityManager: 3.7% TOTAL: 1% user + 1.8% kernel + 0% iowait + 0.6% irq + 0.1% softirq
04-23 14:07:36.527 1397 13246 E ActivityManager: CPU usage from 61ms to 379ms later (2022-04-23 14:07:36.082 to 2022-04-23 14:07:36.400):
04-23 14:07:36.527 1397 13246 E ActivityManager: 37% 1397/system_server: 10% user + 27% kernel / faults: 921 minor
04-23 14:07:36.527 1397 13246 E ActivityManager: 31% 13246/AnrConsumer: 6.8% user + 24% kernel
04-23 14:07:36.527 1397 13246 E ActivityManager: 3.4% 2468/SensorService: 0% user + 3.4% kernel
04-23 14:07:36.527 1397 13246 E ActivityManager: 3.4% 2790/OplusAppBwCore: 0% user + 3.4% kernel
04-23 14:07:36.527 1397 13246 E ActivityManager: 3.2% 356/kworker/u24:6-ufs_clk_gating_0: 0% user + 3.2% kernel
04-23 14:07:36.527 1397 13246 E ActivityManager: 3.3% 1008/usbtemp_kthread: 0% user + 3.3% kernel
04-23 14:07:36.527 1397 13246 E ActivityManager: 4.2% 11938/kworker/u24:2-memlat_wq: 0% user + 4.2% kernel
04-23 14:07:36.527 1397 13246 E ActivityManager: 7.2% TOTAL: 2% user + 4.8% kernel + 0.4% irq
04-23 14:07:36.527 1397 13246 V java.lang.ASSERT: copyAnr filePath = /data/anr/anr_2022-04-23-14-07-36-428

04-23 14:07:36.528 1397 13246 D OplusManager: send on stamp success
04-23 14:07:36.528 1397 13246 D ActivityManager: Completed ANR of com.simple in 508ms, latency 1ms
04-23 14:07:36.529 1397 1695 W ActivityManager: Dismiss app ANR dialog : com.simple
04-23 14:07:36.529 1397 1695 W ContextImpl: Calling a method in the system process without a qualified user: android.app.ContextImpl.sendBroadcast:1161 com.android.server.am.OplusExtraActivityManagerService.setKeyLockModeNormal:51 com.android.server.am.ActivityManagerService.killAppAtUsersRequest:10688 com.android.server.am.AppErrors.handleShowAnrUi:966 com.android.server.am.ActivityManagerService$UiHandler.handleMessage:1884
04-23 14:07:36.529 1397 13247 I DropBoxManagerService: add tag=data_app_anr isTagEnabled=true flags=0x2

04-23 14:07:36.544 1397 1695 D ScreenMode: switch from 2 to 3
04-23 14:07:36.545 1397 1695 I ActivityManager: Killing 13134:com.simple/u0a297 (adj 0): user request after error:Input dispatching timed out (3f0c272 com.simple/com.simple.MainActivity (server) is not responding. Waited 5001ms for MotionEvent)
04-23 14:07:36.545 1397 1700 D OplusFeatureHDREnhanceBrightness: onConfigChangedthreadId=22, threadName=android.display, configId=1

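So it is fairly certain that the app is killed right after the ANR. On other phones an ANR dialog appears instead, and the corresponding setting in Developer options was definitely already enabled on this device.
So what did the vendor do here? Let's pull the framework code off the device: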

adb pull /system/framework

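First, look at framework.jar and the ActivityManager code in it.
Searching it for the strings from the log unfortunately turned up nothing.
Besides framework.jar there is also services.jar, which contains the system services.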
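
One way to check which of the pulled jars contains a given log string, without opening each one in a decompiler, is to scan the dex files inside them. The following is only a sketch: it assumes the jars actually contain `classes*.dex` entries (on some builds the code may live in separate precompiled files, in which case nothing is found) and that `unzip` is installed. `jars_with_string` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List the jars in a directory whose classes*.dex entries contain a literal
# string. Dex files store plain ASCII string constants byte for byte, so a
# binary-safe fixed-string grep is enough for ordinary log messages.
jars_with_string() {
    local dir="$1" needle="$2" jar
    for jar in "$dir"/*.jar; do
        [ -f "$jar" ] || continue
        # unzip -p streams the matching entries to stdout; grep -a treats the
        # binary stream as text and -q reports only through its exit status.
        if unzip -p "$jar" 'classes*.dex' 2>/dev/null \
            | grep -a -q -F -- "$needle"; then
            printf '%s\n' "$jar"
        fi
    done
}
```

After `adb pull /system/framework`, a call such as `jars_with_string framework "user request after error"` would print the jars worth decompiling first.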
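services.jar does contain some clues. Following them leads, as expected, to ActivityManagerService. The kill message itself is built in killAppAtUserRequestLocked, which ActivityManagerService calls through mAppErrors: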

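// Decompiled output; variable names such as killAnnotation2 come from the decompiler.
public void killAppAtUserRequestLocked(ProcessRecord app) {
    String killAnnotation;
    ProcessRecord.ErrorDialogController controller = app.getDialogController();
    // 6 presumably corresponds to ApplicationExitInfo.REASON_ANR.
    int reasonCode = 6;
    int subReason = 0;
    if (controller.hasDebugWaitingDialog()) {
        // 13 / 1 presumably correspond to REASON_OTHER / SUBREASON_WAIT_FOR_DEBUGGER.
        reasonCode = 13;
        subReason = 1;
    }
    controller.clearAllErrorDialogs();
    if (app == null || app.mAnrAnnotation == null) {
        killAnnotation = "user request after error";
    } else {
        // This is the "user request after error:Input dispatching timed out ..." text seen in the log.
        String killAnnotation2 = "user request after error:" + app.mAnrAnnotation;
        app.mAnrAnnotation = null;
        killAnnotation = killAnnotation2;
    }
    killAppImmediateLocked(app, reasonCode, subReason, "user-terminated", killAnnotation);
}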

It is called from here:

public void killAppAtUsersRequest(ProcessRecord app) {
    synchronized (this) {
        try {
            boostPriorityForLockedSection();
            if (app.pid > 0 && app.pid != MY_PID) {
                OplusExtraActivityManagerService.setKeyLockModeNormal(this.mContext, app.processName, this.mSystemReady);
            }
            this.mAppErrors.killAppAtUserRequestLocked(app);
        } catch (Throwable th) {
            resetPriorityAfterLockedSection();
            throw th;
        }
    }
    resetPriorityAfterLockedSection();
}

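Tracing the callers further, there are four call sites. Let's check them one by one.
Inside handleShowAnrUi there is the branch that makes this call, which matches the AppErrors.handleShowAnrUi and ActivityManagerService.killAppAtUsersRequest frames in the ContextImpl warning logged above.
Tracing that branch upward reveals the condition that controls it: java.lang.String r5 = "persist.sys.assert.panic".
Running getprop on it in a shell returns the default, false. Since the goal is to skip the kill logic, changing this value should be all it takes.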
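
The getprop check can be scripted from the host, for example as below. This is a small sketch that assumes a single device is connected over adb; `read_assert_panic` is an illustrative name.

```shell
#!/usr/bin/env bash
# Read the switch found in the decompiled code from the connected device.
# getprop prints an empty line for an unset property, so fall back to a
# readable marker in that case.
read_assert_panic() {
    local value
    # tr strips the carriage return that some adb versions append.
    value="$(adb shell getprop persist.sys.assert.panic | tr -d '\r')"
    printf '%s\n' "${value:-<unset>}"
}
```

On the device described in this post, `read_assert_panic` would be expected to print `false` before any change is made.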

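The change

Unfortunately, persist.sys.* properties are system properties that an ordinary user has no permission to modify, so root access is required.
Root access requires unlocking the bootloader. For the unlocking part, see XDA; it is the usual routine of applying for an unlock, booting TWRP, and flashing Magisk.
Once the device is rooted, open a shell, run su, and then run setprop persist.sys.assert.panic true (setprop takes the name and the value as two separate arguments, without an equals sign).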
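
The same change can be made from the host in one step. The sketch below assumes a Magisk-style `su` that accepts `-c <command>` and a single device connected over adb; `enable_assert_panic` is an illustrative name.

```shell
#!/usr/bin/env bash
# Set the property through su on a rooted device, then read it back.
# Returns non-zero if the command failed or the value did not stick.
enable_assert_panic() {
    local prop="persist.sys.assert.panic" value
    # The inner single quotes keep "setprop <name> true" together as one
    # argument to su -c once the command reaches the device shell.
    adb shell su -c "'setprop $prop true'" || return 1
    value="$(adb shell getprop "$prop" | tr -d '\r')"
    [ "$value" = "true" ]
}
```

Because the property name starts with `persist.`, the value is kept across reboots, so this should only need to be done once per device.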

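Verification

Now, when a breakpoint pauses the main thread, the ANR dialog appears and the app being debugged is no longer killed outright. You can stay paused at a breakpoint for as long as you like.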

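Recommendation

In China, consider Xiaomi or OnePlus devices: unlocking and flashing are relatively simple, you can flash lightweight ROMs, and they are better for both debugging and tinkering.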