29.3 Attempting to send a signal to a specific thread with `kill`
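Thread IDs are unique across the whole system, and the `kill` command also appears to deliver a signal precisely to the target thread (see test 1 in the code below; when testing, keep only one of tests 1 and 2 and comment out the other). Special cases:
- When the target thread is already a zombie, the signal goes to another thread in the same thread group. See test 2.
- If the thread exited normally and the system has already reclaimed its resources (so it is not a zombie), `kill` reports an error, because the thread no longer exists and its thread group cannot be found. For example: `bash: kill: (23196) - No such process`. In a very extreme case the PID has already been reused by another process, and `kill` sends the signal to an unrelated process. Because Linux reuses PIDs by cycling through increasing values, this takes quite a long time, so the situation is rarely encountered.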
```c
#define _GNU_SOURCE
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Not portable: gettid() is Linux-only (glibc 2.30 or later).
unsigned long get_thread_id(void) { return (unsigned long)gettid(); }

void *f(void *arg) {
    (void)arg;
    fprintf(stderr, "Thread %lu starts\n", get_thread_id());
    for (;;) {
        pause();  // returns only after a signal handler has run
        fprintf(stderr, "Thread %lu is awoken\n", get_thread_id());
        pthread_exit(NULL);
    }
    fprintf(stderr, "Thread %lu should not reach here\n", get_thread_id());
    return NULL;
}

void chld_handler(int sig) {
    (void)sig;
    // (f)printf is not async-signal-safe; it is used here only for testing.
    fprintf(stderr, "Handler thread is %lu\n", get_thread_id());
}

int main(void) {
    fprintf(stderr, "Main thread is %lu\n", get_thread_id());

    // The default disposition of SIGCHLD is to ignore it, so without a handler
    // SIGCHLD would not make thread1 return from pause().
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = chld_handler;
    sigfillset(&action.sa_mask);  // block all signals while the handler runs
    action.sa_flags = 0;
    if (sigaction(SIGCHLD, &action, NULL) != 0) {
        perror("sigaction");
        return 1;
    }

    pthread_t thread1;
    int err = pthread_create(&thread1, NULL, f, NULL);
    if (err != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return 1;
    }

    // Test 1: the main thread waits for thread1; htop shows both threads alive.
    // Sending SIGCHLD with the ID of either thread as the argument makes that
    // thread handle the signal.
    pthread_join(thread1, NULL);

    // Test 2: the main thread exits first and only thread1 stays alive. Whichever
    // of the two thread IDs the signal is sent to, the surviving thread1 handles
    // it. But why does the main thread become a zombie after it exits?
    // pthread_detach(thread1);
    // pthread_exit(NULL);  // once terminated, the main thread no longer handles signals
    return 0;
}
```
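The experiment with `kill` rests on observed behaviour: `kill` is specified as sending a process-directed signal, which the kernel may hand to any thread of the group that does not block it. For comparison, here is a minimal sketch of the documented thread-directed route, `pthread_kill`. It is not part of the original test: `SIGUSR1` and the names `current_tid`, `usr1_handler`, `worker` and `signal_reaches_targeted_thread` are illustrative choices, and the sketch assumes Linux (it uses the raw `gettid` system call) and a build with `-pthread`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thread ID recorded by the handler; sig_atomic_t can be written safely there. */
static volatile sig_atomic_t handler_tid = 0;

/* Raw system call, so the sketch does not depend on the glibc gettid() wrapper. */
static long current_tid(void) { return syscall(SYS_gettid); }

static void usr1_handler(int sig) {
    (void)sig;
    handler_tid = (sig_atomic_t)current_tid();
}

static void *worker(void *arg) {
    long *tid_out = arg;
    *tid_out = current_tid();
    /* SIGUSR1 is blocked here (mask inherited from the creating thread), so a
       signal sent early stays pending until sigsuspend() unblocks it. */
    sigset_t wait_mask;
    pthread_sigmask(SIG_SETMASK, NULL, &wait_mask);
    sigdelset(&wait_mask, SIGUSR1);
    sigsuspend(&wait_mask);  /* returns once the handler has run */
    return NULL;
}

/* Returns 1 if the handler ran in the targeted thread, 0 if it ran elsewhere,
   -1 on a setup error. */
int signal_reaches_targeted_thread(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = usr1_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    /* Block SIGUSR1 in the caller before creating the worker. */
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &block, &old);

    handler_tid = 0;
    long worker_tid = 0;
    pthread_t t;
    if (pthread_create(&t, NULL, worker, &worker_tid) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    int rc = pthread_kill(t, SIGUSR1);  /* thread-directed signal */
    if (rc != 0) pthread_cancel(t);     /* sigsuspend() is a cancellation point */
    pthread_join(t, NULL);
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    if (rc != 0) return -1;

    assert(worker_tid != 0);  /* the worker always records its ID before waiting */
    return handler_tid == worker_tid;
}
```

Unlike the `kill` tests, the target here is a `pthread_t` rather than a numeric thread ID, so this only works from inside the process that owns the thread.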
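Why does the main thread become a zombie after it exits? Why can't I join or detach the main thread so that its resources are reclaimed after it terminates (and it stops being a zombie)?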
From my understanding on how the pthreads library works, I believe that the reason for the zombie thread is that joining normally with the main thread would throw away its resources and since the main thread returns a status (through the return of the main function) that could and in some cases is expected to be consumed by the parent process (i.e., through the use of wait), this thread cannot be completely destroyed until the thread group has exited entirely (i.e., that the whole process has exited). If it was somehow allowed to return the value of the pthread_exit call to the parent process then the parent would think that the child had exited which wouldn’t be true, since the PrintHello function would be still running.
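Because the main thread shares its PID with the process, and its resources must be kept for the parent process to reap, the main thread stays a zombie for as long as it has exited while other threads are still running. Neither calling `pthread_detach()` before it terminates nor calling `pthread_join()` after it terminates reclaims its resources.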
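The zombie state can also be observed without htop. The following is a minimal sketch, assuming Linux with `/proc` mounted and a build with `-pthread`: `/proc/self/stat` describes the thread-group leader, and its third field is the state letter. The experiment runs in a forked child so the caller survives; the names `leader_state`, `watcher` and `observe_leader_state_after_exit` are hypothetical helpers, not part of the original test program.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the state letter (third field) of /proc/self/stat, which describes the
   thread-group leader. Returns 0 on failure. */
static char leader_state(void) {
    char buf[512];
    FILE *fp = fopen("/proc/self/stat", "r");
    if (fp == NULL) return 0;
    size_t n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    assert(n < sizeof buf);
    buf[n] = '\0';
    /* The command name is parenthesised and may itself contain ')', so search
       from the end of the line. */
    char *close_paren = strrchr(buf, ')');
    if (close_paren == NULL || close_paren[1] != ' ') return 0;
    return close_paren[2];
}

/* Runs in the surviving thread: poll for up to about two seconds until the
   leader is reported as a zombie, then end the whole process with the last
   observed state letter as the exit status. */
static void *watcher(void *arg) {
    (void)arg;
    char state = 0;
    for (int i = 0; i < 200; i++) {
        state = leader_state();
        if (state == 'Z') break;
        usleep(10 * 1000);
    }
    _exit(state);
    return NULL;
}

/* Returns the state letter seen for the child's main thread after it called
   pthread_exit() while another thread was still running, or 0 on error. */
char observe_leader_state_after_exit(void) {
    pid_t child = fork();
    if (child < 0) return 0;
    if (child == 0) {
        pthread_t t;
        if (pthread_create(&t, NULL, watcher, NULL) != 0) _exit(0);
        pthread_exit(NULL);  /* only the child's main thread terminates */
    }
    int status = 0;
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status)) return 0;
    return (char)WEXITSTATUS(status);
}
```

If the explanation above holds, `observe_leader_state_after_exit()` should report `Z` for the exited main thread, and the zombie disappears only when `waitpid` in the parent reaps the whole process.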
Related links: