0%

java | 线程诊断「栈」

这个只在 Ubuntu 中实践成功,Macbook 部分参数不能用。

某一个 java 程序 CPU 占有率 100%。需要查询究竟是哪行代码出的问题。

思路

  • 查询进程 ID
  • 查询线程 ID
  • 查询线程 ID 中的哪行代码。

首先,直接贴一下代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
package com.redisc;

import lombok.extern.slf4j.Slf4j;

@Slf4j(topic = "c.Test")
public class Run {
public static int count;

public static void main(String[] args) throws Exception {
new Thread(() -> {
while (true) {

}
}).start();

new Thread(() -> {
try {
Thread.sleep(1000000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}).start();

new Thread(() -> {
try {
Thread.sleep(1000000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}).start();

new Thread(() -> {
try {
Thread.sleep(1000000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}).start();


}

}

打包成 jar 包之后,拿到 ubuntu 上运行。

查询进程 ID

使用 top

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
top - 22:38:03 up 12 min,  1 user,  load average: 2.18, 2.01, 1.36
任务: 282 total, 1 running, 208 sleeping, 0 stopped, 0 zombie
%Cpu(s): 27.6 us, 0.5 sy, 0.0 ni, 65.3 id, 5.8 wa, 0.0 hi, 0.8 si, 0.0 st
KiB Mem : 3887968 total, 212104 free, 1023372 used, 2652492 buff/cache
KiB Swap: 2097148 total, 2096368 free, 780 used. 2384260 avail Mem

进�� USER PR NI VIRT RES SHR � %CPU %MEM TIME+ COMMAND
8360 xiniu 20 0 3597336 37596 18732 S 100.0 1.0 0:53.03 java
1141 root 20 0 1336380 51408 20740 S 10.0 1.3 1:40.14 snapd
1643 xiniu 20 0 1394936 94248 62708 S 2.0 2.4 0:15.01 Xorg
2258 xiniu 20 0 765984 50744 38820 S 1.3 1.3 0:03.97 gnome-terminal-
1794 xiniu 20 0 4178280 233584 128876 S 0.7 6.0 0:37.52 gnome-shell
243 root 20 0 0 0 0 S 0.3 0.0 0:01.69 jbd2/sda1-8
1773 xiniu 20 0 220784 6948 6248 S 0.3 0.2 0:00.12 at-spi2-registr
8679 xiniu 20 0 46028 4152 3376 R 0.3 0.1 0:00.05 top
1 root 20 0 225800 9464 6556 S 0.0 0.2 0:04.57 systemd

可以看出 cpu 100% 的进程 ID8360

查询进程中的线程

使用 ps H -eo pid,tid,%cpu | grep 进程ID 输出

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
8360  8360  0.0
8360 8362 0.2
8360 8363 0.0
8360 8364 0.0
8360 8365 0.0
8360 8366 0.0
8360 8367 0.0
8360 8368 0.0
8360 8369 0.0
8360 8370 0.0
8360 8371 0.0
8360 8372 0.0
8360 8373 0.1
8360 8374 0.0
8360 8375 0.0
8360 8376 100
8360 8377 0.0
8360 8378 0.0
8360 8379 0.0

可以看出是 8376 线程引起的。

查询哪段代码

使用 jstack 进程 ID,也就是 jstack 8360 输出

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
2022-12-17 22:39:34
Full thread dump OpenJDK 64-Bit Server VM (25.352-b08 mixed mode):

"Attach Listener" #14 daemon prio=9 os_prio=0 tid=0x00007faa64001000 nid=0x21fc waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"DestroyJavaVM" #13 prio=5 os_prio=0 tid=0x00007faa9c00b800 nid=0x20aa waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Thread-3" #12 prio=5 os_prio=0 tid=0x00007faa9c380800 nid=0x20bb waiting on condition [0x00007faa79e33000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.redisc.Run.lambda$main$3(Run.java:33)
at com.redisc.Run$$Lambda$4/2117255219.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)

"Thread-2" #11 prio=5 os_prio=0 tid=0x00007faa9c37f000 nid=0x20ba waiting on condition [0x00007faa79f34000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.redisc.Run.lambda$main$2(Run.java:25)
at com.redisc.Run$$Lambda$3/517380410.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)

"Thread-1" #10 prio=5 os_prio=0 tid=0x00007faa9c37d000 nid=0x20b9 waiting on condition [0x00007faa7a035000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.redisc.Run.lambda$main$1(Run.java:17)
at com.redisc.Run$$Lambda$2/1637506559.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)

"Thread-0" #9 prio=5 os_prio=0 tid=0x00007faa9c37b800 nid=0x20b8 runnable [0x00007faa8c1ae000]
java.lang.Thread.State: RUNNABLE
at com.redisc.Run.lambda$main$0(Run.java:10)
at com.redisc.Run$$Lambda$1/598446861.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)

"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007faa9c194000 nid=0x20b6 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007faa9c189000 nid=0x20b5 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007faa9c187800 nid=0x20b4 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007faa9c184800 nid=0x20b3 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007faa9c17d000 nid=0x20b2 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007faa9c14f000 nid=0x20b1 in Object.wait() [0x00007faa8cffe000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000ec388f00> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000ec388f00> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:188)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007faa9c14a800 nid=0x20b0 in Object.wait() [0x00007faaa0113000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000ec386b98> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000ec386b98> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"VM Thread" os_prio=0 tid=0x00007faa9c141000 nid=0x20af runnable

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007faa9c021000 nid=0x20ab runnable

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007faa9c023000 nid=0x20ac runnable

"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007faa9c024800 nid=0x20ad runnable

"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007faa9c026800 nid=0x20ae runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00007faa9c197000 nid=0x20b7 waiting on condition

JNI global references: 312

上面查出来了是 8376 线程引起的。但是,这是 10 进制的,上面的输出线程是 16 进制的,所以 8376 转化为 16 进制是 20b8

也就是

1
2
3
4
5
"Thread-0" #9 prio=5 os_prio=0 tid=0x00007faa9c37b800 nid=0x20b8 runnable [0x00007faa8c1ae000]
java.lang.Thread.State: RUNNABLE
at com.redisc.Run.lambda$main$0(Run.java:10)
at com.redisc.Run$$Lambda$1/598446861.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)

可以看出,代码出在 Run.java:10,这个代码正好是

1
2
3
while (true) {

}
请我喝杯咖啡吧~