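[Linux multithreaded concurrency] Two ways to store thread-local data, so that every thread can have its own private global variable of the same name, plus a performance comparison of the two approaches

Thread-local storage (TLS)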

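Preface

Modern CPUs are multi-core, and on Intel processors with Hyper-Threading each core can additionally expose several logical processors, so the hardware itself is built for running many tasks in parallel. Server processors differ somewhat in how memory is organized. The traditional design is SMP (symmetric multiprocessing), in which every processor has equal access to all memory and peripherals. Many current servers, including ARM server CPUs, use NUMA instead: the CPU cores are divided into groups, and each group is assigned its own memory and peripherals. A core gets the best speed when it accesses its own group's memory and peripherals, and somewhat lower performance when it accesses another group's.

As hardware keeps evolving, it does more and more to speed up ordinary applications, but it also raises the bar for server software: to reach very high concurrency and performance, the software has to exploit the characteristics of the hardware it runs on and squeeze everything out of it. At a minimum, the application must adopt a multitasking architecture, whether multithreaded or multiprocess, to make full use of the hardware and process work efficiently.

Adopting a multitasking framework is more than just running several threads. The problems that multitasking brings also have to be handled, such as collecting task return values, passing data between tasks, and coordinating the order in which tasks run. Nor do more tasks always mean faster processing: too many threads can make the operating system unresponsive, and tasks that spin without doing useful work can drive CPU usage up.

This column covers how to build a multitasking application framework with the multithreaded and multiprocess models, and shares related knowledge on data communication and synchronization between tasks, task control, and binding tasks to CPU cores, so that you can comfortably build your own multitasking programs in practice.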

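Overview

On Linux a thread is a lightweight task, and all threads of a process share that process's memory. A global variable is therefore accessible to every thread in the process. So how do we define a variable that is local to a thread?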

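TLS approaches

On Linux there are generally two ways to define thread-local variables. The technique is called Thread Local Storage (TLS).

The GCC __thread keyword
The key-based pthread API

TLS lifetime

A thread-local variable lives exactly as long as its thread: when the thread ends, the memory of its thread-local variables is reclaimed.

One caveat deserves special attention. When a thread-local variable is a pointer, the dynamically allocated memory it points to is not reclaimed automatically; only the pointer variable itself goes away with the thread. To avoid leaking memory, the thread must clean up explicitly before it exits. This will be covered in a later post.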

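Memory of the pthread structure

Before looking at how thread-local variables are stored, we need to look at the memory of the pthread structure. It is the central data structure of a thread and holds the complete user-space description of the thread.

The pthread structure is very complex. The parts related to TLS are the specific_1stblock array and the second-level specific array.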

```c
#define PTHREAD_KEY_2NDLEVEL_SIZE 32
#define PTHREAD_KEY_1STLEVEL_SIZE \
  ((PTHREAD_KEYS_MAX + PTHREAD_KEY_2NDLEVEL_SIZE - 1) \
   / PTHREAD_KEY_2NDLEVEL_SIZE)

struct pthread
{
  union
  {
#if !TLS_DTV_AT_TP
    /* This overlaps the TCB as used for TLS without threads (see tls.h).  */
    tcbhead_t header;
#else
    struct
    {
      int multiple_threads;
      int gscope_flag;
    } header;
#endif
    void *__padding[24];
  };

  list_t list;
  pid_t tid;
  ...
  struct pthread_key_data
  {
    /* Sequence number.  We use uintptr_t to not require padding on
       32- and 64-bit machines.  On 64-bit machines it helps to avoid
       wrapping, too.  */
    uintptr_t seq;

    /* Data pointer.  */
    void *data;
  } specific_1stblock[PTHREAD_KEY_2NDLEVEL_SIZE];

  /* Two-level array for the thread-specific data.  */
  struct pthread_key_data *specific[PTHREAD_KEY_1STLEVEL_SIZE];

  /* Flag which is set when specific data is set.  */
  bool specific_used;
  ...
}
```

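__thread keyword

This keyword declares TLS variables when compiling with GCC or Clang. It is a compiler extension rather than part of the C standard, and support varies between compilers. (C11 standardized an equivalent storage-class specifier, _Thread_local.)

How it works

Variables declared with __thread are stored between the thread's pthread structure and its stack. In terms of memory layout, from high addresses to low addresses the order is: the pthread structure, the TLS variable area, and the stack (the stack begins directly below the variable area, so the two are contiguous). This is the layout on x86-64 with glibc; other architectures may arrange it differently.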

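Thread-local variables declared this way cannot have complex types, such as C++ class types. Memory that such a variable points to must also be freed explicitly: when the thread ends only the variable's own storage is reclaimed, and the dynamically allocated memory leaks.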

Code example

```c
/*
 * created by senllang 2024/1/1
 * mail : [email protected]
 * Copyright (C) 2023-2024, senllang
 */
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define THREAD_NAME_LEN 32

/* Every thread gets its own private copy of these two variables. */
__thread char threadName[THREAD_NAME_LEN];
__thread int delay = 0;

typedef struct ThreadData
{
    char name[THREAD_NAME_LEN];
    int delay;
} ThreadData;

void *threadEntry(void *arg)
{
    int i = 0;
    ThreadData *data = (ThreadData *)arg;

    printf("[%lu] thread entered \n", (unsigned long)pthread_self());

    /* These writes only touch the calling thread's copies. */
    strncpy(threadName, data->name, THREAD_NAME_LEN - 1);
    threadName[THREAD_NAME_LEN - 1] = '\0';
    delay = data->delay;

    for (i = 0; i < delay; i++)
    {
        usleep(10);
    }

    printf("[%lu] %s exiting after delay %d.\n",
           (unsigned long)pthread_self(), threadName, delay);

    /* Never hand back the address of a local: it dies with this thread's stack. */
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t thid[3];
    ThreadData args[3] = {{"thread 1", 50000}, {"thread 2", 25000}, {"thread 3", 12500}};
    int i, err;

    (void)argc;
    (void)argv;
    strncpy(threadName, "Main Thread", THREAD_NAME_LEN - 1);

    for (i = 0; i < 3; i++)
    {
        /* pthread functions return the error code instead of setting errno. */
        err = pthread_create(&thid[i], NULL, threadEntry, &args[i]);
        if (err != 0)
        {
            fprintf(stderr, "pthread_create() error: %s\n", strerror(err));
            exit(1);
        }
    }

    for (i = 0; i < 3; i++)
    {
        err = pthread_join(thid[i], NULL);
        if (err != 0)
        {
            fprintf(stderr, "pthread_join() error: %s\n", strerror(err));
            exit(3);
        }
    }

    /* The main thread's copies were not touched by the workers. */
    printf("[%s]all thread exited delay:%d .\n", threadName, delay);
    return 0;
}
```

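The program defines two thread-local variables, threadName and delay. Each thread's entry function assigns to them, waits for a while, and then prints both. The output shows that every thread has its own values and can use the variables independently.

Output: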

[senllang@hatch example_04]$ gcc -lpthread threadLocalStorage_gcc.c 
[senllang@hatch example_04]$ ./a.out 
[139945977145088] thread entered 
[139945960359680] thread entered 
[139945968752384] thread entered 
[139945960359680] thread 3 exiting after delay 12500.
[139945968752384] thread 2 exiting after delay 25000.
[139945977145088] thread 1 exiting after delay 50000.
[Main Thread]all thread exited delay:0 .

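Thread API approach

The other way to use thread-local variables is the pthread key API. It consists of two groups of functions: one creates and destroys keys, the other sets and gets values.

With this approach the thread-local data is stored inside the pthread structure: the specific_1stblock and specific arrays are indexed by key and hold the corresponding thread-local values.

The number of thread-local items is limited with this approach: a process can create at most PTHREAD_KEYS_MAX keys.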

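Creation and destruction interfaces

```c
#include <pthread.h>

int pthread_key_create(pthread_key_t *key, void (*destructor)(void *));
int pthread_key_delete(pthread_key_t key);
```

pthread_key_create stores a new key in the pthread_key_t variable; in effect it reserves a slot, identified by the key, in which each thread can store its own data. The second parameter, destructor, is a function that destroys the data. It is called when a thread exits while its value for the key is not NULL, which is what makes cleanup possible when the data is a complex type or dynamically allocated.

Once the threads no longer use the key, it must be released with pthread_key_delete. Deleting a key does not call the destructor, so any values still stored under it have to be cleaned up by the application.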

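Setting and getting the value

```c
#include <pthread.h>

int pthread_setspecific(pthread_key_t key, const void *value);
void *pthread_getspecific(pthread_key_t key);
```

These functions set and get the calling thread's value for a key.

A call made in one thread only sets that thread's value and does not affect any other thread.

Code example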

```c
/*
 * created by senllang 2024/1/1
 * mail : [email protected]
 * Copyright (C) 2023-2024, senllang
 */
#include <stdio.h>
#include <pthread.h>

/* The TLS key. It is shared by all threads; each thread stores its own value under it. */
pthread_key_t tls_key;

void ShowThreadLocalData(const char *prompt, pthread_t thid)
{
    /* Fetch the calling thread's value for the key. */
    int *value = (int *)pthread_getspecific(tls_key);

    if (value == NULL)
    {
        printf("[%s]Thread: %lu, Value: NULL\n", prompt, (unsigned long)thid);
    }
    else
    {
        printf("[%s]Thread: %lu, Value: %d\n", prompt, (unsigned long)thid, *value);
    }
}

/* Thread function */
void *thread_func(void *arg)
{
    ShowThreadLocalData("pre", pthread_self());
    pthread_setspecific(tls_key, arg);
    ShowThreadLocalData("after", pthread_self());
    return NULL;
}

int main()
{
    pthread_t thread1, thread2;
    int args1 = 100, args2 = 200, mainValue = 500;

    pthread_key_create(&tls_key, NULL);

    /* Set the main thread's value. Store a pointer to an int, as the reader
       function expects; casting the integer 500 to a pointer would crash
       as soon as it was dereferenced. */
    pthread_setspecific(tls_key, &mainValue);

    /* Create 2 threads */
    pthread_create(&thread1, NULL, thread_func, &args1);
    pthread_create(&thread2, NULL, thread_func, &args2);

    /* Wait for the threads to finish */
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    pthread_key_delete(tls_key);
    return 0;
}
```

The main thread and both child threads each set a value for the key. Running the program shows that every thread sees its own value.

[senllang@hatch example_04]$ gcc -lpthread threadLocalStorage_key.c 
[senllang@hatch example_04]$ ./a.out 
[pre]Thread: 140252914022144, Value: NULL
[after]Thread: 140252914022144, Value: 100
[pre]Thread: 140252905629440, Value: NULL
[after]Thread: 140252905629440, Value: 200

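When each thread starts and reads the value, it does not see the value set by the main thread.

Comparison of the two approaches

Different storage areas and addressing: data defined through the API is located via the specific_1stblock array and the second-level specific array of the pthread structure, whereas a __thread variable is addressed by a fixed offset into the thread's own memory area (the original describes this as an offset from the stack address; on x86-64 the offset is typically taken from the thread pointer).
Performance: because a __thread access is resolved by an address offset rather than by a function call and an array lookup, it is faster than the API approach.
Supported data: __thread is limited to ordinary POD-type variables, and for pointer data the dynamically allocated memory must be freed explicitly; the API approach accepts a destructor and supports all data types.
Number of items: in theory, __thread variables can be defined without a fixed limit as long as the thread's stack space is not exhausted, whereas the API approach can create at most PTHREAD_KEYS_MAX keys, although a single key can hold multiple values through a struct or similar.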

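Summary

The code in this article has been uploaded to the hatchCode project, under the multipleThreads/example_04 directory.

Thread-local variables make concurrent threads behave more like concurrent processes: each has its own private global data. What is particular to threads is that the space available for thread-local variables depends on the size of the thread's stack. A thread-local variable can also be a pointer to a struct whose memory is allocated dynamically, in which case space is no longer a concern.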

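Closing

Thank you very much for your support. While you are here, please leave a comment, and if you think the post deserves encouragement, please like and bookmark it. I will keep working hard!

Author email: study@senllang.onaliyun.com
If you find any errors or omissions, please point them out so we can learn from each other.

Tags: linux, C language, concurrent programming

Reposted from: https://blog.csdn.net/senllang/article/details/135327922
Copyright belongs to the original author 韩楚风. In case of infringement, please contact us for removal.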