Investigating why a Python program spawns so many threads

In production, a Python program exited abnormally and left a core file behind.

Opening the core file in gdb showed a large number of threads:

gdb /usr/local/bin/python3 core.117155
(gdb) info threads

Because the program is brought back up again after it crashes, the ps command can show how many threads the running process currently has:

ps -o nlwp {pid}
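
To watch the same number from a script, a minimal sketch is shown here. It assumes a Linux host, where /proc/<pid>/status has a "Threads:" field; the function name thread_count is my own and not part of any tool.

```python
import os


def thread_count(pid: int) -> int:
    """Return the thread count of a process from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            # The line looks like "Threads:\t12"
            if line.startswith("Threads:"):
                return int(line.split()[1])
    raise RuntimeError(f"no Threads field found for pid {pid}")


if __name__ == "__main__":
    # Demonstrated on the current process; pass the real pid in practice.
    print(thread_count(os.getpid()))
```

Calling this in a loop with a sleep would show whether the count keeps growing.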

Once the process turns out to have a large number of threads, the py-spy tool can print the call stack of every thread:

py-spy dump --pid {pid}

In the output I saw that a large number of threads were all sitting at the same location in the code:

Thread 0x7F3EFBC386C0 (active): "Thread-82085"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
    _bootstrap_inner (threading.py:1016)
    _bootstrap (threading.py:973)
Thread 0x7F3EFB4376C0 (active): "Thread-82086"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
    _bootstrap_inner (threading.py:1016)
    _bootstrap (threading.py:973)
Thread 0x7F3EFAC366C0 (active): "Thread-82087"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
    _bootstrap_inner (threading.py:1016)
    _bootstrap (threading.py:973)
Thread 0x7F3EFA4356C0 (active): "Thread-82088"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
    _bootstrap_inner (threading.py:1016)
    _bootstrap (threading.py:973)
Thread 0x7F3EF9C346C0 (active): "Thread-82089"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
    _bootstrap_inner (threading.py:1016)
    _bootstrap (threading.py:973)
Thread 0x7F3EF94336C0 (active): "Thread-82090"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
    _bootstrap_inner (threading.py:1016)
    _bootstrap (threading.py:973)
Thread 0x7F3EF8C326C0 (active): "Thread-82091"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
    _bootstrap_inner (threading.py:1016)
    _bootstrap (threading.py:973)
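
With tens of thousands of threads, reading the dump by eye gets tedious. Below is a rough sketch that counts threads by their top frame. It assumes the text layout shown above (a "Thread ..." header followed by indented frames); the function name count_top_frames is hypothetical, and the sample is a shortened copy of the output above.

```python
from collections import Counter


def count_top_frames(dump_text: str) -> Counter:
    """Count how many threads have each top-of-stack frame."""
    counts = Counter()
    expect_top = False
    for line in dump_text.splitlines():
        if line.startswith("Thread "):
            # The next non-empty line is this thread's top frame.
            expect_top = True
        elif expect_top and line.strip():
            counts[line.strip()] += 1
            expect_top = False
    return counts


sample = """\
Thread 0x7F3EFBC386C0 (active): "Thread-82085"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
Thread 0x7F3EFB4376C0 (active): "Thread-82086"
    _delayed_execute (merged_execute.py:28)
    run (threading.py:1378)
"""

top = count_top_frames(sample)
print(top.most_common(1))
# [('_delayed_execute (merged_execute.py:28)', 2)]
```

In practice the output of py-spy dump would be saved to a file and passed in as dump_text.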

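After reading the relevant code, I found the cause: the inspection program had found a large amount of abnormal data and then started threads in bulk to process it. The fix was to change the inspection program so that it only processes part of the data at a time.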
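
The actual code is not shown here, so the following is only a sketch of the idea of processing part of the data at a time. All names (batched, handle, process_in_batches) and the doubling in handle are placeholders of mine, not the real inspection program.

```python
import threading
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def handle(record, results, lock):
    # Hypothetical per-record work; the real handler would go here.
    with lock:
        results.append(record * 2)


def process_in_batches(records, batch_size=4):
    """Start at most `batch_size` threads at a time and wait for each batch."""
    results = []
    lock = threading.Lock()
    for batch in batched(records, batch_size):
        threads = [
            threading.Thread(target=handle, args=(r, results, lock))
            for r in batch
        ]
        for t in threads:
            t.start()
        # Wait for this batch before starting the next, so the number of
        # live worker threads never exceeds batch_size.
        for t in threads:
            t.join()
    return results


if __name__ == "__main__":
    print(sorted(process_in_batches(range(10), batch_size=3)))
```

Another standard way to cap the thread count is concurrent.futures.ThreadPoolExecutor with max_workers, which reuses a fixed set of threads instead of creating one per record.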
