Investigating Why a Python Program Spawns So Many Threads
In production, a Python program exited abnormally and left a core file.
Opening the core file with gdb showed a large number of threads:
gdb /usr/local/bin/python3 core.117155
(gdb) info threads
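Because the program is brought back up after it exits abnormally, you can use the ps command to see how many threads the running process has (nlwp is the number of threads in the process):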
ps -o nlwp {pid}
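The same number can be read from a script, which helps if you want to log the thread count over time. This is a minimal sketch, assuming a Linux host where /proc/{pid}/status exposes a "Threads:" field; parse_thread_count and thread_count are hypothetical helper names, not part of any tool used above.

```python
def parse_thread_count(status_text: str) -> int:
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid: int) -> int:
    """Read the current thread count of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_thread_count(f.read())
```

Calling thread_count(pid) periodically and logging the result shows whether the count keeps growing before the crash.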
Once the process is seen to have a large number of threads, the py-spy tool can print the call stack of every thread:
py-spy dump --pid {pid}
In the output I saw that a large number of threads were all sitting at the same location in the code:
Thread 0x7F3EFBC386C0 (active): "Thread-82085"
_delayed_execute (merged_execute.py:28)
run (threading.py:1378)
_bootstrap_inner (threading.py:1016)
_bootstrap (threading.py:973)
Thread 0x7F3EFB4376C0 (active): "Thread-82086"
_delayed_execute (merged_execute.py:28)
run (threading.py:1378)
_bootstrap_inner (threading.py:1016)
_bootstrap (threading.py:973)
Thread 0x7F3EFAC366C0 (active): "Thread-82087"
_delayed_execute (merged_execute.py:28)
run (threading.py:1378)
_bootstrap_inner (threading.py:1016)
_bootstrap (threading.py:973)
Thread 0x7F3EFA4356C0 (active): "Thread-82088"
_delayed_execute (merged_execute.py:28)
run (threading.py:1378)
_bootstrap_inner (threading.py:1016)
_bootstrap (threading.py:973)
Thread 0x7F3EF9C346C0 (active): "Thread-82089"
_delayed_execute (merged_execute.py:28)
run (threading.py:1378)
_bootstrap_inner (threading.py:1016)
_bootstrap (threading.py:973)
Thread 0x7F3EF94336C0 (active): "Thread-82090"
_delayed_execute (merged_execute.py:28)
run (threading.py:1378)
_bootstrap_inner (threading.py:1016)
_bootstrap (threading.py:973)
Thread 0x7F3EF8C326C0 (active): "Thread-82091"
_delayed_execute (merged_execute.py:28)
run (threading.py:1378)
_bootstrap_inner (threading.py:1016)
_bootstrap (threading.py:973)
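With tens of thousands of threads, the dump is too long to compare by eye. The following is a minimal sketch for tallying threads by their topmost frame, assuming the py-spy output was saved as text and that every thread header starts with "Thread " as in the output above. count_top_frames is a hypothetical helper, not part of py-spy.

```python
from collections import Counter


def count_top_frames(dump_text: str) -> Counter:
    """Count how many threads share each topmost stack frame."""
    counts = Counter()
    expect_top = False  # True right after a "Thread ..." header line
    for raw in dump_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Thread "):
            expect_top = True
        elif expect_top:
            # First frame listed after the header is the innermost frame.
            counts[line] += 1
            expect_top = False
    return counts
```

Printing count_top_frames(text).most_common(5) then lists the locations where most threads are sitting.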
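After reading the relevant code, I found the cause: the inspection program had detected a large amount of abnormal data and then started threads in bulk to process it. The fix was to change the inspection program so that it handles only part of the data at a time.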
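The actual change is not shown here, so the following is only a sketch of one way to limit how much work is in flight: split the records into fixed-size batches and run each batch on a bounded thread pool instead of starting one thread per record. The names chunked and process_in_batches, the handle callback, and the default sizes are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def process_in_batches(records, handle, batch_size=100, max_workers=8):
    """Apply `handle` to every record, one batch at a time.

    The pool never runs more than `max_workers` threads, no matter
    how many abnormal records the inspection finds.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in chunked(records, batch_size):
            # pool.map returns results in input order; extend() waits
            # for the whole batch before the next batch is submitted.
            results.extend(pool.map(handle, batch))
    return results
```

With this shape the thread count is bounded by max_workers, while batch_size controls how many records are picked up per round.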