A bug of thread TLS in glibc


I first ran into this problem with a timer created by timer_create. The first time the timer fired, the program crashed with a core dump, and the crash location was different from run to run. Running the program under valgrind revealed an invalid memory write:

    ==31676== Invalid write of size 8
    ==31676==    at 0x37A540F852: _dl_allocate_tls_init (in /lib64/ld-2.5.so)
    ==31676==    by 0x4E26BD3: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.5.so)
    ==31676==    by 0x76E0B00: timer_helper_thread (in /lib64/librt-2.5.so)
    ==31676==    by 0x4E2673C: start_thread (in /lib64/libpthread-2.5.so)
    ==31676==    by 0x58974BC: clone (in /lib64/libc-2.5.so)
    ==31676==  Address 0xf84dbd0 is 0 bytes after a block of size 336 alloc'd
    ==31676==    at 0x4A05430: calloc (vg_replace_malloc.c:418)
    ==31676==    by 0x37A5410082: _dl_allocate_tls (in /lib64/ld-2.5.so)
    ==31676==    by 0x4E26EB8: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.5.so)
    ==31676==    by 0x76E0B00: timer_helper_thread (in /lib64/librt-2.5.so)
    ==31676==    by 0x4E2673C: start_thread (in /lib64/libpthread-2.5.so)
    ==31676==    by 0x58974BC: clone (in /lib64/libc-2.5.so)

Searching for _dl_allocate_tls_init turned up glibc bug 13862, which looks quite similar to my problem. This article describes that bug and the parts of the TLS implementation behind it.

To read the matching glibc source you first need to know which glibc version is in use. Running the library directly prints it:

    $ /lib/libc.so.6
    GNU C Library stable release version 2.5, by Roland McGrath et al.
    ...
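The version can also be queried from inside a program. The following is a minimal sketch, assuming a glibc-based system where the GNU extension gnu_get_libc_version() from <gnu/libc-version.h> is available; the helper name glibc_version is made up for this example and is not part of the original experiment code.

```c
#include <assert.h>
#include <gnu/libc-version.h>
#include <stdio.h>

/* Hypothetical helper: parse the running glibc version, e.g. "2.5",
   into major and minor numbers. Returns 0 on success, -1 otherwise. */
int glibc_version(int *major, int *minor)
{
    assert(major != NULL && minor != NULL);
    /* gnu_get_libc_version() returns a string such as "2.5" or "2.17". */
    return sscanf(gnu_get_libc_version(), "%d.%d", major, minor) == 2 ? 0 : -1;
}
```

This reports the version of the glibc the process is actually running against, which is the one whose source you want to read.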

For convenience you can also browse the source online at the glibc cross reference (http://osxr.org/glibc/source?v=glibc-2.17). That is a different version, but the differences have little impact here.

Bug description

To reproduce the 13862 bug, the author mentioned that the following conditions should be met:

  • The use of a relatively large number of dynamic libraries, loaded at runtime using dlopen.
  • The use of thread-local-storage within those libraries.
  • A thread exiting prior to the number of loaded libraries increasing a significant amount, followed by a new thread being created after the number of libraries has increased.

Simply put: while the program is loading a large number of dynamic libraries that contain TLS variables, one thread is started and exits, more libraries are loaded, and then another thread is started.

This is close to our scenario. The difference is that we use a timer, but every time the timer fires a new thread is started to run the callback, and that thread exits as soon as the callback returns:

/nptl/sysdeps/unix/sysv/linux/timer_routines.c

    timer_helper_thread (...)  /* helper thread that detects timer expirations */
    {
      ...
      pthread_t th;
      /* start a new thread to call the user-registered timer function */
      (void) pthread_create (&th, &tk->attr, timer_sigev_thread, td);
      ...
    }

To reproduce this bug, use my experiment code thread-TLS or use the attachment in bug 13862.

TLS implementation

A good starting point is the implementation of _dl_allocate_tls_init. This function walks all loaded modules that contain TLS variables and initializes the TLS data structures of one thread.

Each thread has its own stack region, and the TLS variables of each module are stored there per thread, which is how every thread gets its own copy of a TLS variable. The relationship between TLS and threads is as follows.

The pthread_t used by application code is actually the address of a struct pthread object. The thread's stack and its struct pthread live in one contiguous block of memory, but that address does not point to the start of the block. The relevant code is in /nptl/allocatestack.c: the allocate_stack function allocates the thread's stack memory.

The first member of struct pthread is a tcbhead_t, and the dtv field of tcbhead_t points to an array of dtv_t. The size of this array changes dynamically with the number of modules the program has loaded. Every loaded module gets an l_tls_modid, which is used directly as the index into the dtv_t array. The dtv field of tcbhead_t actually points to the second element of the array. The first element records the number of elements in the dtv_t array, the second element is also reserved for special use, and the entries that locate TLS variables start at the third element.

One dtv_t entry locates all TLS variables of one module. These variables are stored in one contiguous block of memory, and dtv_t::pointer::val is the pointer to that block. For modules that are not dynamically loaded it points into the thread's stack region; otherwise it points to dynamically allocated memory.

Expressed in code, the structures look like this:

    union dtv_t {
        size_t counter;
        struct {
            void *val; /* points to the TLS variable memory */
            bool is_static;
        } pointer;
    };

    struct tcbhead_t {
        void *tcb;
        dtv_t *dtv; /* points to a dtv_t array */
        void *padding[22]; /* other members I don't care about */
    };

    struct pthread {
        tcbhead_t tcb;
        /* more members I don't care about */
    };

In short, the DTV is an array, indexed by module, that locates the TLS variables of each module.

For the actual definitions, see /nptl/descr.h and /nptl/sysdeps/x86_64/tls.h.
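The layout described above (a length element, a reserved element, then one slot per module ID) can be modelled in a few lines of C. This is only a sketch of the description in this article, not glibc code: the names model_dtv_t, model_dtv_new, model_dtv_length, model_dtv_set and model_dtv_free are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for glibc's dtv_t (illustrative only). */
typedef union model_dtv {
    size_t counter;
    struct {
        void *val;       /* points to one module's TLS block */
        bool is_static;
    } pointer;
} model_dtv_t;

/* Allocate a DTV with `length` module slots. Two extra elements are
   allocated: raw[0] holds the length and raw[1] is reserved. The returned
   pointer is raw + 1, so the length is visible as dtv[-1]. */
model_dtv_t *model_dtv_new(size_t length)
{
    model_dtv_t *raw = calloc(length + 2, sizeof *raw);
    if (raw == NULL)
        return NULL;
    raw[0].counter = length;
    return raw + 1;
}

size_t model_dtv_length(const model_dtv_t *dtv)
{
    assert(dtv != NULL);
    return dtv[-1].counter;
}

/* Module IDs start at 1 and index the array directly.
   Returns false when the ID does not fit into this DTV. */
bool model_dtv_set(model_dtv_t *dtv, size_t modid, void *block, bool is_static)
{
    if (modid == 0 || modid > model_dtv_length(dtv))
        return false;
    dtv[modid].pointer.val = block;
    dtv[modid].pointer.is_static = is_static;
    return true;
}

void model_dtv_free(model_dtv_t *dtv)
{
    if (dtv != NULL)
        free(dtv - 1);   /* undo the +1 offset applied in model_dtv_new */
}
```

The bounds check in model_dtv_set is the important design point: the valid module IDs for a DTV are 1 through dtv[-1].counter, and nothing in the array itself prevents a write past that.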

Experiment

Compile the code with g++ -o thread -g -Wall -lpthread -ldl thread.cpp. In this configuration one .so is loaded before the thread is created:

    Breakpoint 1, dump_pthread (id=1084229952) at thread.cpp:40
    40          printf("pthread %p, dtv %p\n", pd, dtv);
    (gdb) set $dtv=pd->tcb.dtv
    (gdb) p $dtv[-1]
    $1 = {counter = 17, pointer = {val = 0x11, is_static = false}}
    (gdb) p $dtv[3]
    $2 = {counter = 18446744073709551615, pointer = {val = 0xffffffffffffffff, is_static = false}}

dtv[3] corresponds to the dynamically loaded module: is_static is false and val is initialized to -1 (TLS_DTV_UNALLOCATED).

/elf/dl-tls.c, _dl_allocate_tls_init:

    if (map->l_tls_offset == NO_TLS_OFFSET
        || map->l_tls_offset == FORCED_DYNAMIC_TLS_OFFSET)
      {
        /* For dynamically loaded modules we simply store
           the value indicating deferred allocation.  */
        dtv[map->l_tls_modid].pointer.val = TLS_DTV_UNALLOCATED;
        dtv[map->l_tls_modid].pointer.is_static = false;
        continue;
      }

The dtv array size is 17; see allocate_dtv in /elf/dl-tls.c:

    /* dl_tls_max_dtv_idx grows as modules are loaded: each loaded .so adds 1 */
    dtv_length = GL(dl_tls_max_dtv_idx) + DTV_SURPLUS; /* DTV_SURPLUS is 14 */
    dtv = calloc (dtv_length + 2, sizeof (dtv_t));
    if (dtv != NULL)
      {
        /* This is the initial length of the dtv.  */
        dtv[0].counter = dtv_length;
        ...
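The sizing rule in allocate_dtv is plain arithmetic, and it can be checked against the numbers seen so far. The sketch below uses invented names (MODEL_DTV_SURPLUS, initial_dtv_slots, initial_dtv_bytes) and takes the element size as a parameter. Treating a dtv_t as 16 bytes is an assumption that fits x86_64, where the union holds a pointer plus a bool.

```c
#include <assert.h>
#include <stddef.h>

/* Slack added on top of the highest module ID (14, as noted above). */
enum { MODEL_DTV_SURPLUS = 14 };

/* Number of module slots in a freshly allocated DTV. */
size_t initial_dtv_slots(size_t max_modid)
{
    return max_modid + MODEL_DTV_SURPLUS;
}

/* Bytes requested from calloc: the module slots plus the two extra
   elements (length and reserved) in front of them. */
size_t initial_dtv_bytes(size_t max_modid, size_t elem_size)
{
    assert(elem_size > 0);
    return (initial_dtv_slots(max_modid) + 2) * elem_size;
}
```

With a highest module ID of 3, initial_dtv_slots gives the 17 seen in gdb. Under the 16-byte assumption, initial_dtv_bytes(5, 16) is 336, which is consistent with the "block of size 336" in the valgrind report if the crashing program had a highest module ID of 5 when that DTV was allocated.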

Continuing the experiment: once a function in the .so has been called, the module's TLS is initialized, and val in dtv[3] points to the address of the initialized TLS variables:

    68          fn();
    (gdb)
    0x601808, 0x601804, 0x601800
    72          return 0;
    (gdb) p $dtv[3]
    $3 = {counter = 6297600, pointer = {val = 0x601800, is_static = false}}
    (gdb) x/3xw 0x601800
    0x601800:       0x55667788      0xaabbccdd      0x11223344

At this point you can also look at dtv[1]: its val points to memory just before the struct pthread:

    (gdb) p $dtv[1]
    $5 = {counter = 1084229936, pointer = {val = 0x40a00930, is_static = true}}
    (gdb) p/x tid
    $7 = 0x40a00940

Conclusion:

  • Within a thread, TLS variables are stored per module.

.so module loading

There is no need to read the implementation of dlopen and related functions here. The full implementation of __thread involves quite a few details of the ELF loader; we can learn enough about it by experiment.

As seen above, if a .so is dynamically loaded before a thread is created, the DTV array of that thread is sized accordingly. What happens if the .so is loaded after the thread has been created?

Compile the program with g++ -o thread -g -Wall -lpthread -ldl thread.cpp -DTEST_DTV_EXPAND -DSO_CNT=1 and debug it:

    73          load_sos();
    (gdb)
    0x601e78, 0x601e74, 0x601e70
    Breakpoint 1, dump_pthread (id=1084229952) at thread.cpp:44
    44          printf("pthread %p, dtv %p\n", pd, dtv);
    (gdb) p $dtv[-1]
    $3 = {counter = 17, pointer = {val = 0x11, is_static = false}}
    (gdb) p $dtv[4]
    $4 = {counter = 6299248, pointer = {val = 0x601e70, is_static = false}}

After the new .so is loaded, the dtv array has not grown; dtv[4] is simply put to use.

Since the dtv array starts out with 16 entries, what happens when the number of loaded .so files exceeds that?

Compile the program with g++ -o thread -g -Wall -lpthread -ldl thread.cpp -DTEST_DTV_EXPAND:

    ...
    pthread 0x40a00940, dtv 0x6016a0
    ...
    Breakpoint 1, dump_pthread (id=1084229952) at thread.cpp:44
    44          printf("pthread %p, dtv %p\n", pd, dtv);
    (gdb) p dtv
    $2 = (dtv_t *) 0x6078a0
    (gdb) p dtv[-1]
    $3 = {counter = 32, pointer = {val = 0x20, is_static = false}}
    (gdb) p dtv[5]
    $4 = {counter = 6300896, pointer = {val = 0x6024e0, is_static = false}}

As you can see, the dtv array has been reallocated (0x6016a0 -> 0x6078a0) and enlarged.

The conclusion is as follows:

  • When a thread is created, the size of its DTV is determined by the number of modules loaded at that time.
  • After a thread is created, newly loaded modules cause its DTV to be expanded dynamically (when necessary).
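The grow-on-demand behaviour can be sketched as follows. This is a model, not the glibc implementation: the grow_dtv_* names are invented, and the growth rule (module ID plus the same slack of 14 used at creation) is an assumption chosen because it is consistent with the 17 -> 32 change seen in gdb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { GROW_SURPLUS = 14 };   /* assumed slack, same value as at creation */

typedef union grow_dtv {
    size_t counter;
    struct { void *val; bool is_static; } pointer;
} grow_dtv_t;

/* Allocate a DTV with `length` module slots; the length is stored at dtv[-1]. */
grow_dtv_t *grow_dtv_new(size_t length)
{
    grow_dtv_t *raw = calloc(length + 2, sizeof *raw);
    if (raw == NULL)
        return NULL;
    raw[0].counter = length;
    return raw + 1;
}

/* Make sure slot `modid` exists. Returns the (possibly moved) DTV, or NULL
   if reallocation fails, in which case the old DTV is left untouched. */
grow_dtv_t *grow_dtv_ensure(grow_dtv_t *dtv, size_t modid)
{
    assert(dtv != NULL && modid >= 1);
    size_t old_len = dtv[-1].counter;
    if (modid <= old_len)
        return dtv;                      /* still fits, nothing to do */

    size_t new_len = modid + GROW_SURPLUS;
    grow_dtv_t *raw = realloc(dtv - 1, (new_len + 2) * sizeof *raw);
    if (raw == NULL)
        return NULL;
    /* Zero only the newly added slots; existing entries are preserved. */
    memset(raw + old_len + 2, 0, (new_len - old_len) * sizeof *raw);
    raw[0].counter = new_len;
    return raw + 1;
}

void grow_dtv_free(grow_dtv_t *dtv)
{
    if (dtv != NULL)
        free(dtv - 1);
}
```

Starting from 17 slots, asking for slot 4 leaves the array alone, while asking for slot 18 reallocates it to 32 slots. The array may move when it grows, which matches the changed dtv address in the gdb session.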

pthread stack reuse

When allocate_stack allocates a thread stack, it first tries to take one from a cache:

    allocate_stack (...)
    {
      ...
      pd = get_cached_stack (&size, &mem);
      ...
    }

    /* Get a stack frame from the cache.  We have to match by size since
       some blocks might be too small or far too large.  */
    get_cached_stack (...)
    {
      ...
      list_for_each (entry, &stack_cache)  /* pick an entry from stack_cache by size */
        {
          ...
        }
      ...
      /* Clear the DTV.  */
      dtv_t *dtv = GET_DTV (TLS_TPADJ (result));
      for (size_t cnt = 0; cnt < dtv[-1].counter; ++cnt)
        if (! dtv[1 + cnt].pointer.is_static
            && dtv[1 + cnt].pointer.val != TLS_DTV_UNALLOCATED)
          free (dtv[1 + cnt].pointer.val);
      memset (dtv, '\0', (dtv[-1].counter + 1) * sizeof (dtv_t));

      /* Re-initialize the TLS.  */
      _dl_allocate_tls_init (TLS_TPADJ (result));
    }

get_cached_stack reinitializes the DTV of the struct pthread it takes out of the cache. Note that _dl_allocate_tls_init initializes the DTV array based on the list of currently loaded modules.

Experiment

When a thread exits, its stack may be put into the cache, where get_cached_stack can retrieve it for reuse.

Compile the program with g++ -o thread -g -Wall -lpthread -ldl thread.cpp -DTEST_CACHE_STACK and run it:

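    $ ./thread
    ...
    pthread 0x413c9940, dtv 0x1be46a0
    ...
    pthread 0x413c9940, dtv 0x1be46a0

Both lines show the same pthread and dtv addresses: the second thread got the first thread's stack, struct pthread and DTV back from the cache.

Revisiting the bug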

When a newly created thread reuses the stack of a thread that has already exited, _dl_allocate_tls_init initializes the reused DTV array based on the modules loaded at that moment. If the number of modules now exceeds the size of the reused DTV array, the function writes past the end of the array. Under valgrind this produces exactly the report shown at the beginning of this article.

Because the DTV array is always allocated with some slack, the program does not misbehave as long as only a few new modules have been loaded. You can vary SO_CNT in the test program to watch how the DTV contents change.

I also checked the glibc change history: as of version 2.20, this bug had not been fixed.
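The overrun can be modelled safely, without writing out of bounds. The sketch below re-initializes a cached DTV the way this article describes, but counts the slots that would land past the end instead of writing them. The cached_dtv_* names and MODEL_UNALLOCATED are invented for the illustration, and every module is treated as dynamically loaded for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef union cached_dtv {
    size_t counter;
    struct { void *val; bool is_static; } pointer;
} cached_dtv_t;

/* Stand-in for the -1 "not yet allocated" marker seen in gdb. */
#define MODEL_UNALLOCATED ((void *)(intptr_t)-1)

/* Allocate a DTV with `length` module slots; the length is stored at dtv[-1]. */
cached_dtv_t *cached_dtv_new(size_t length)
{
    cached_dtv_t *raw = calloc(length + 2, sizeof *raw);
    if (raw == NULL)
        return NULL;
    raw[0].counter = length;
    return raw + 1;
}

/* Re-initialize a DTV taken from the stack cache for `max_modid` loaded
   modules. Returns how many slots did NOT fit. An unchecked loop would
   write those slots past the end of the array. */
size_t cached_dtv_reinit(cached_dtv_t *dtv, size_t max_modid)
{
    assert(dtv != NULL);
    size_t length = dtv[-1].counter;     /* size from the ORIGINAL thread */
    size_t overrun = 0;

    /* Clear the reserved element and all module slots, keeping dtv[-1]. */
    memset(dtv, 0, (length + 1) * sizeof *dtv);

    for (size_t modid = 1; modid <= max_modid; ++modid) {
        if (modid > length) {            /* the check this sketch adds */
            ++overrun;
            continue;
        }
        dtv[modid].pointer.val = MODEL_UNALLOCATED;
        dtv[modid].pointer.is_static = false;
    }
    return overrun;
}

void cached_dtv_free(cached_dtv_t *dtv)
{
    if (dtv != NULL)
        free(dtv - 1);
}
```

With a cached DTV of 17 slots, re-initializing for 17 modules reports no overrun, while re-initializing for 20 modules reports 3 slots that would have been written out of bounds. The recorded length stays at 17 in both cases, because nothing in the reuse path resizes the array.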

References

  • glibc bug 13862: reuse of cached stack can cause bounds overrun of thread DTV
  • glibc TLS implementation
  • Linux thread stack
  • Introduction to thread management in Linux user space, part II: creating thread stacks

Address: http://codemacro.com/2014/10/07/pthread-tls-bug/
Written by Kevin Lynx, posted at http://codemacro.com
