31 Thread-Specific Data (TSD) and Thread-Local Storage (TLS)

Posted: 2024-07-11 Updated: 2024-09-07 Category: the-linux-programming-interface


TSD and TLS are both forms of per-thread storage.

One-time initialization

#include <pthread.h>

pthread_once_t once_control = PTHREAD_ONCE_INIT;

int pthread_once(pthread_once_t *once_control, void (*init_routine) (void));
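To see what "only once" means in practice, here is a minimal sketch. The names init_routine, worker and run_once_demo are made up for illustration, and assert() stands in for real error handling. Four threads all call pthread_once() on the same control variable, yet the initializer should run a single time.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static int init_calls = 0;      /* how many times init_routine() ran */

/* Runs once, no matter how many threads reach pthread_once(). */
static void init_routine(void)
{
    init_calls++;
}

static void *worker(void *arg)
{
    (void)arg;
    int rc = pthread_once(&once_control, init_routine);
    assert(rc == 0);
    return NULL;
}

/* Hypothetical helper: starts four threads, then reports the call count. */
int run_once_demo(void)
{
    pthread_t tids[4];

    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < 4; i++) {
        int rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }
    return init_calls;
}
```

Calling run_once_demo() from main() (built with -pthread) should return 1.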

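Thread-Specific Data

Before C11 the language itself had no support for thread_local variables, so the only way to create thread-specific data was through a library API.

The Pthreads thread-specific data API looks rather awkward to use: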

#include <pthread.h>

int pthread_key_create(pthread_key_t *key, void (*destr_function) (void *));

int pthread_key_delete(pthread_key_t key);

int pthread_setspecific(pthread_key_t key, const void *pointer);

void * pthread_getspecific(pthread_key_t key);

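The Pthreads implementation keeps a table of data pointers for each thread. The destructor for each key is not stored per thread: it lives in a single process-wide table of keys, together with an in-use flag. The number of keys is limited, and the limit can be queried with _SC_THREAD_KEYS_MAX. POSIX requires at least 128; on my machine it reports 1024, which is the value on Linux: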

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

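int main(void) {
    errno = 0;  // sysconf() leaves errno unchanged when the limit is indeterminate
    long ret = sysconf(_SC_THREAD_KEYS_MAX);
    int err = errno;
    if (ret == -1 && err != 0) {
        perror("sysconf");  // a real error
        return 1;
    } else if (ret == -1) {
        printf("no limit\n");  // -1 with errno still 0: no definite limit
    } else {
        printf("limit is %ld\n", ret); // 1024 on Linux
    }
    return 0;
}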

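Each key created by pthread_key_create() effectively corresponds to an index, which every thread uses to look up data in its own table. Why expose an opaque key rather than the raw index? My guess is that it has to do with storing the destructor. When a thread terminates, the destructor for each key is called on that thread's value so the data gets released properly. If the value is NULL (which is also the initial value of every slot), the destructor is not called.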
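The destructor rule can be checked with a small sketch. Everything named here (destructor, worker, run_key_demo) is hypothetical, and assert() replaces proper error handling. One thread stores a value under the key and the other leaves its slot NULL, so the destructor should fire once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t key;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int destructor_calls = 0;

/* Called at thread exit, but only for threads whose value is non-NULL. */
static void destructor(void *value)
{
    free(value);
    pthread_mutex_lock(&count_lock);
    destructor_calls++;
    pthread_mutex_unlock(&count_lock);
}

static void *worker(void *arg)
{
    /* Every thread starts with NULL in its slot for this key. */
    void *initial = pthread_getspecific(key);
    assert(initial == NULL);

    if (arg != NULL) {          /* only one of the threads stores a value */
        int *value = malloc(sizeof *value);
        assert(value != NULL);
        *value = 42;
        int rc = pthread_setspecific(key, value);
        assert(rc == 0);
    }
    return NULL;
}

/* Hypothetical helper: returns how many times the destructor ran. */
int run_key_demo(void)
{
    pthread_t storing, idle;
    int marker = 1;

    destructor_calls = 0;
    int rc = pthread_key_create(&key, destructor);
    assert(rc == 0);

    rc = pthread_create(&storing, NULL, worker, &marker);
    assert(rc == 0);
    rc = pthread_create(&idle, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_join(storing, NULL);
    assert(rc == 0);
    rc = pthread_join(idle, NULL);
    assert(rc == 0);

    rc = pthread_key_delete(key);
    assert(rc == 0);
    return destructor_calls;
}
```

run_key_demo() should return 1: the thread that never called pthread_setspecific() triggers no destructor call.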

Somewhat oddly, pthread_key_delete() releases the key itself but does not run the destructor on the data any thread still has stored under it.

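Having to call pthread_key_create() up front, before any threads are created, is inconvenient, so the call is usually wrapped in pthread_once(). The pthread_once_t control variable is then typically a global or file-scope static, so that the function using it can reach it. The book has an example that uses the Pthreads API to implement a thread-safe strerror(). (The non-thread-safe strerror() returns its result in a global array, which every call overwrites.)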
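The pattern looks roughly like the sketch below. This is not the book's listing: my_strerror, create_key and caller_buffer_survives are names I made up, and copying the text out of strerror() under a mutex is an assumption chosen to keep the sketch portable. The part that matters is the combination of pthread_once(), a key whose destructor is free(), and a buffer allocated lazily per thread.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERROR_LEN 256

static pthread_once_t once = PTHREAD_ONCE_INIT;
static pthread_key_t strerror_key;
static pthread_mutex_t strerror_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs once: creates the key; free() releases each thread's buffer at exit. */
static void create_key(void)
{
    int rc = pthread_key_create(&strerror_key, free);
    assert(rc == 0);
}

/* Hypothetical thread-safe variant: the result lives in a per-thread buffer. */
char *my_strerror(int err)
{
    int rc = pthread_once(&once, create_key);
    assert(rc == 0);

    char *buf = pthread_getspecific(strerror_key);
    if (buf == NULL) {          /* first call in this thread */
        buf = malloc(MAX_ERROR_LEN);
        assert(buf != NULL);
        rc = pthread_setspecific(strerror_key, buf);
        assert(rc == 0);
    }

    /* Assumption: serialize access to strerror() and copy its text out. */
    pthread_mutex_lock(&strerror_lock);
    strncpy(buf, strerror(err), MAX_ERROR_LEN - 1);
    buf[MAX_ERROR_LEN - 1] = '\0';
    pthread_mutex_unlock(&strerror_lock);
    return buf;
}

static void *other_thread(void *arg)
{
    (void)arg;
    my_strerror(ENOENT);        /* writes only to this thread's buffer */
    return NULL;
}

/* Returns 1 if a call in another thread left the caller's message intact. */
int caller_buffer_survives(void)
{
    char expected[MAX_ERROR_LEN];
    char *mine = my_strerror(EINVAL);
    strcpy(expected, mine);

    pthread_t tid;
    int rc = pthread_create(&tid, NULL, other_thread, NULL);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return strcmp(mine, expected) == 0;
}
```

Within one thread, repeated calls return the same buffer, so a second call still overwrites the first result, just as with plain strerror(). What changes is that other threads can no longer clobber it.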

Thread-Local Storage

TLS is simpler to use than TSD: just add the __thread specifier to the definition of a global or static variable.
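A minimal sketch of what that looks like, assuming gcc or clang (both accept __thread); worker and run_tls_demo are hypothetical names. Each thread increments its own copy of counter, so the value set by the calling thread should remain untouched.

```c
#include <assert.h>
#include <pthread.h>

static __thread int counter = 0;    /* one independent copy per thread */

static void *worker(void *arg)
{
    int *n = arg;
    for (int i = 0; i < *n; i++)
        counter++;                  /* touches only this thread's copy */
    *n = counter;                   /* report this thread's final value */
    return NULL;
}

/* Hypothetical helper: returns 1 if every thread kept its own counter. */
int run_tls_demo(void)
{
    pthread_t t1, t2;
    int a = 3, b = 5;

    counter = 100;                  /* the calling thread's copy */
    int rc = pthread_create(&t1, NULL, worker, &a);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, worker, &b);
    assert(rc == 0);
    rc = pthread_join(t1, NULL);
    assert(rc == 0);
    rc = pthread_join(t2, NULL);
    assert(rc == 0);

    /* New threads start from the initializer 0, not from 100. */
    return counter == 100 && a == 3 && b == 5;
}
```

Compared with the TSD version there is no key, no once-control variable and no destructor, which is most of the appeal.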

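Thread-local storage needs support from the kernel (provided by Linux 2.6), the Pthreads implementation (provided by NPTL), and the C compiler (on x86-32, provided by gcc 3.3 or later).

Support in the C standard

Before C11 there was no language-level support for thread-local storage; __thread is a GNU extension.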

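C11 added TLS support: _Thread_local is the keyword, and thread_local is a macro, provided by <threads.h>, that expands to it. In C23 thread_local itself became a keyword (and may also be a predefined macro at the same time).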
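Here is a small sketch using the standard spelling. It relies on the _Thread_local keyword directly, so <threads.h> is not needed, and it assumes the compiler lets a _Thread_local variable be used from Pthreads threads (gcc with glibc does). next_id and first_id_in_new_thread are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

/* Returns 1, 2, 3, ... with an independent sequence in each thread. */
int next_id(void)
{
    /* At block scope, _Thread_local must be combined with static or extern. */
    static _Thread_local int last = 0;
    return ++last;
}

static void *worker(void *arg)
{
    *(int *)arg = next_id();    /* first call in this new thread */
    return NULL;
}

/* Hypothetical helper: what does a brand-new thread get from next_id()? */
int first_id_in_new_thread(void)
{
    pthread_t tid;
    int id = 0;

    int rc = pthread_create(&tid, NULL, worker, &id);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return id;
}
```

A new thread should always get 1 from its first call, regardless of how many ids the calling thread has already handed out.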

Efficiency

https://stackoverflow.com/a/32246896/

Accesses to __thread variables are not compiled into TSD calls, and they are faster than TSD.

TLS is implemented both on i386 and amd64 Linux with a segment register (%fs for i386, %gs for amd64). The speed difference is negligible. –fuz

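As the quote says, the address is found through a segment register, although on Linux the registers are the other way round from what the quote states: %gs on i386 and %fs on amd64. This is a lot like how C++ virtual functions get polymorphism from a vtable pointer that varies and a slot index within the table that does not.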