Analysis of the Thread Local Storage (TLS) mechanism in Windows

Source: Internet
Author: User

Multithreading is one of the most common sources of difficulty in programming. Multi-threaded programs often break the promise that a high-level language shields the programmer from the details of the underlying system, so the programmer needs a solid understanding of how the operating system invokes and schedules code. Moving from writing algorithms in a high-level language to writing multi-threaded programs can be a hard journey. A programmer who knows only general operating system principles may still manage to write a multi-threaded program, but will have less control over it and a weaker understanding of its behavior. If the program also consists of multiple modules (dynamically loaded DLLs), writing it without understanding the internal mechanisms can end in disaster.

To support calls across multiple DLL modules, Windows provides TLS (Thread Local Storage). TLS can also be used in applications that do not involve DLLs, but the operating system designers do not recommend heavy use of it, and ordinary applications should avoid it where possible. For a DLL, however, TLS is an alternative to static and global variables. Most of what follows is quoted or summarized from Windows via C/C++. For more implementation details, see Matt Pietrek's Windows 95 System Programming Secrets. I also add occasional comments of my own. For a more thorough treatment of Windows multi-threaded programming, consult texts on operating system principles and on the Windows kernel.

Thread local storage provides a mechanism for binding data to a specific thread. A thread has no memory space of its own; it shares the address space of its process with the other threads of that process, so a global variable is not private to any thread. TLS makes it possible to turn such shared global variables into per-thread private ones, which lets programs that use many global variables and were written without multi-threaded concurrency in mind run correctly with multiple threads. TLS is of course not limited to this situation.

For example, Microsoft's early C runtime library was written for single-threaded programs, and its implementation used many global and static variables. During later maintenance, TLS was used extensively to make it support multithreading.

On the use of global variables, the authors of Windows via C/C++ write:

"In my own software projects, I avoid global variables as much as possible. If your application uses global and static variables, I strongly suggest that you examine each variable and investigate the possibilities for changing it to a stack-based variable. This effort can save you an enormous amount of time if you decide to add threads to your application, and even single-threaded applications can benefit."

In short: avoid global variables as far as possible, and where they exist, try to turn them into stack-based variables. This effort saves a lot of time when threads are added later, and even single-threaded programs benefit.

There are two kinds of TLS: static and dynamic. Both can be used in ordinary applications and in DLLs, but TLS makes more sense in a DLL, because the DLL knows nothing about the internal structure of the program that calls it. In the threads of an ordinary application, use local variables whenever possible.

Dynamic TLS:

(The figure that originally appeared here showed how dynamic TLS is laid out for each thread in the operating system's memory.) Each process has an array of bit flags, and each bit is either FREE or INUSE (presumably 0 and 1 respectively). A bit records whether the dynamic storage slot with the same index has been allocated. TLS_MINIMUM_AVAILABLE, which is 64 in Windows, is the number of slots the system guarantees; in the Windows 95 structures quoted below it is also the total number of slots. Besides the bit flag array that tracks slot allocation, each thread has an array of PVOID (a void pointer type) that forms the slots themselves. It has the same number of elements as the bit array, and the elements correspond one to one. Windows via C/C++ says little about how the bit flag array and the slot array are implemented, so I consulted Matt Pietrek's Windows 95 System Programming Secrets. The relevant passages are quoted below:

"

The Windows 95 process database (PDB)
In Windows 95, each process database is a block of memory allocated from the KERNEL32 shared memory heap. KERNEL32 often uses the acronym PDB instead of the longer term "process database." Unfortunately, in Win16, PDB is a synonym for the DOS PSP that all programs have. Is this confusing? Yes! For the purposes of this chapter, I'll use PDB in the KERNEL32 sense of the term. Each PDB is considered to be a KERNEL32 object, as evidenced by the value 5 (K32OBJ_PROCESS) in the first DWORD of the structure. The PROCDB.H file from the WIN32WLK program gives a C-style view of the PDB structure.

....
88h DWORD tlsInUseBits1
These 32 bits represent the status of the lowest 32 TLS (Thread Local Storage) indexes. If a bit is set, the TLS index is in use. Each successive TLS index is represented by successively greater bit values; for example:
TLS index 0 = 0x00000001
TLS index 1 = 0x00000002
TLS index 2 = 0x00000004
Thread local storage is discussed in detail in the "Thread Local Storage" section later in this chapter.
8Ch DWORD tlsInUseBits2
This DWORD represents the status of TLS indices 32 through 63. See the previous field description (88h) for more information.

...

The thread database
The thread database is a KERNEL32 object (type K32OBJ_THREAD) that's allocated from the KERNEL32 shared heap. Like process databases, thread databases aren't directly linked together in a linked-list fashion. The THREADDB.H file from the WIN32WLK sources has a C-style structure definition for a thread database.

...
3Ch PDWORD pTLSArray
This pointer points to the thread's TLS array. The entries in this array are used by the TlsSetValue family of functions. TLS is described later in this chapter. The actual memory for the TLS array comes a bit later in the thread database.
...
98h DWORD TLSArray[64]
The TLSArray field is an array of 64 DWORDs. Each DWORD holds the value that TlsGetValue returns for a given TLS id. For instance, the first DWORD in the array is returned by TlsGetValue(0). The second DWORD is returned by TlsGetValue(1), and so on. TLS is described in a subsequent section of this chapter.
...

"

The quoted text is somewhat obscure because it involves many implementation details, such as the internals of the Windows kernel and the layout of its structures in memory. In summary: the first 32 bits and the last 32 bits of the bit flag array are each stored in a DWORD variable, and these two DWORDs live in the process database (PDB). The base address of the PVOID array and the array's actual storage live in the thread database. For details of the thread database, the process database, and other Windows internals, read Matt Pietrek's book; I will not digress further here.

TLS reaches the actual data through the DWORD-sized elements of the PVOID array. An element normally holds the address of a variable private to the thread; PVOID is a void pointer type.
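The bit flag bookkeeping described in the quoted passages can be modeled in a few lines of portable C++. This is only a sketch: TlsBitmapModel, its members, and the "lowest free bit first" search order are assumptions made for illustration, not Windows source. Only the two 32-bit words and the index-to-bit mapping come from the quoted text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not Windows source: two 32-bit words play the role of
// tlsInUseBits1 / tlsInUseBits2 from the quoted process database layout.
struct TlsBitmapModel {
    static constexpr std::uint32_t kSlotCount = 64;
    static constexpr std::uint32_t kNoSlot = 0xFFFFFFFFu;  // "no free slot" marker
    std::uint32_t inUseBits[2] = {0, 0};

    // Bit mask for a slot inside its 32-bit word: index 0 -> 0x1, 1 -> 0x2, ...
    static std::uint32_t maskFor(std::uint32_t index) {
        return std::uint32_t{1} << (index % 32);
    }

    bool inUse(std::uint32_t index) const {
        return index < kSlotCount &&
               (inUseBits[index / 32] & maskFor(index)) != 0;
    }

    // Find the lowest FREE bit (an assumed policy), mark it INUSE,
    // and return its index.
    std::uint32_t alloc() {
        for (std::uint32_t i = 0; i < kSlotCount; ++i) {
            if (!inUse(i)) {
                inUseBits[i / 32] |= maskFor(i);
                return i;
            }
        }
        return kNoSlot;
    }

    // Mark a slot FREE again; returns false for an invalid or unused index.
    bool release(std::uint32_t index) {
        if (!inUse(index)) return false;
        inUseBits[index / 32] &= ~maskFor(index);
        return true;
    }
};
```

Indexes 0 through 31 land in the first word and 32 through 63 in the second, which matches the split between the two DWORD fields in the quoted layout.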

With the dynamic TLS mechanism covered, what remains is the interface the operating system exposes for it. There are four main functions:

DWORD TlsAlloc();

BOOL TlsSetValue(DWORD dwTlsIndex, PVOID pvTlsValue);

PVOID TlsGetValue(DWORD dwTlsIndex);

BOOL TlsFree(DWORD dwTlsIndex);

In order, these functions obtain a TLS index, store a PVOID value in the calling thread's slot array, retrieve that value, and release the TLS index. The interface is not hard to understand. When TlsAlloc reserves an index, it sets the corresponding slot in the PVOID array of every thread in the process to 0, so that a thread cannot read stale data left behind by an earlier use of an index that was later freed.

On where to store the TLS index, Windows via C/C++ says:

"A DLL (or an application) usually saves the index [that is, the TLS index] in a global variable. This is one of those times when a global variable is actually the better choice because the value is used on a per-process basis rather than a per-thread basis."

The authors clearly recommend storing the TLS index in the process's global data, which is why TLS effectively acts as a global variable for a multi-threaded program.

Dynamic TLS can be understood as the operating system giving every thread an identically structured block of memory. The layout (the TLS indexes) is the same in every thread and each index has the same meaning or use everywhere, but the data stored there differs from thread to thread. Because the indexes are the same for all threads, they are kept in global variables.
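The idea of "one shared index, different data per thread" can be illustrated with a small portable model. The sketch below does not call the Windows API. It assumes a C++11 thread_local array as a stand-in for the per-thread slot array, and the names t_slots, g_tlsIndex, modelSetValue, modelGetValue, and the fixed index 5 are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the per-thread slot array (TLSArray[64] in the
// quoted thread database). C++11 thread_local gives every thread its own copy.
thread_local void* t_slots[64] = {};

// The index is process-wide, so one ordinary global holds it for all threads.
// The value 5 is arbitrary; it stands in for a result of TlsAlloc.
std::uint32_t g_tlsIndex = 5;

void  modelSetValue(std::uint32_t index, void* value) { t_slots[index] = value; }
void* modelGetValue(std::uint32_t index) { return t_slots[index]; }

// Each worker stores a pointer to its own private integer under the shared
// index, then reads it back and reports what it saw.
void worker(int id, int* seen) {
    int privateValue = id * 100;
    modelSetValue(g_tlsIndex, &privateValue);
    *seen = *static_cast<int*>(modelGetValue(g_tlsIndex));
}

// Runs two workers and returns true when each one read back its own value
// and the calling thread's slot was never touched.
bool demoSharedIndexPrivateData() {
    int seenA = 0, seenB = 0;
    std::thread a(worker, 1, &seenA);
    std::thread b(worker, 2, &seenB);
    a.join();
    b.join();
    return seenA == 100 && seenB == 200 &&
           modelGetValue(g_tlsIndex) == nullptr;
}
```

Both workers use the same g_tlsIndex, yet neither sees the other's pointer, which is the property that makes a global variable the natural home for the index.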

Static TLS:

Static TLS is easy to use. Simply put __declspec(thread) in front of the declaration of a global or static variable.

For example: __declspec(thread) DWORD gt_dwStartTime = 0;

Applying __declspec(thread) to a local variable (one that lives on the stack) is meaningless, since such a variable is already private to its thread.

For each variable declared with __declspec(thread), a separate copy is created for every thread, and the compiler generates special code for every access to such a variable.
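The per-thread copy behavior can be observed with a short portable sketch. It uses the C++11 thread_local keyword as an analogue of __declspec(thread) so that it is not tied to one compiler; that substitution is my assumption, and gt_counter, bump, and demoStaticTls are hypothetical names.

```cpp
#include <cassert>
#include <thread>

// C++11 thread_local is used here as a portable analogue of
// __declspec(thread); gt_counter is a hypothetical variable that follows
// the gt_ naming prefix used in the example above.
thread_local int gt_counter = 0;

// Increment the calling thread's copy and report its final value.
void bump(int times, int* result) {
    for (int i = 0; i < times; ++i) {
        ++gt_counter;  // touches this thread's copy only
    }
    *result = gt_counter;
}

// Returns true when each thread counted independently from zero and the
// calling thread's copy kept the value it was given.
bool demoStaticTls() {
    int r1 = 0, r2 = 0;
    gt_counter = 7;  // the calling thread's own copy
    std::thread t1(bump, 3, &r1);
    std::thread t2(bump, 5, &r2);
    t1.join();
    t2.join();
    return r1 == 3 && r2 == 5 && gt_counter == 7;
}
```

No index and no allocation call appear anywhere: the declaration alone is enough, which is what makes the static form easier to use than the dynamic one.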


This article has briefly introduced the thread local storage (TLS) mechanism in Windows, drawing mainly on a few classic books. For more depth, or before using these features in a program, consult the books listed below.


Bibliography:

Matt Pietrek, Windows 95 System Programming Secrets

Jeffrey Richter, Christophe Nasarre, Windows via C/C++, Fifth Edition

 

In addition:

One application of TLS is managing thread and module state in MFC. The following post gives a brief introduction to TLS in MFC:

 

Original article: http://www.cnblogs.com/moonz-wu/archive/2008/05/08/1189021.html

Thread Local Storage (TLS)

The Windows operating system provides a process/thread programming model. A process is the unit of resource allocation and owns the program's resources; a thread represents the execution of the program and is the unit the operating system schedules. Note that inside the operating system both are represented by KERNEL32 objects: the process database and the thread database. For more information, see Matt Pietrek's Windows 95 System Programming Secrets.

Thread local storage is a mechanism for giving a thread its own global data; the data belongs only to that thread. The data is stored in the thread's thread database, where a 64-element DWORD array is defined to hold it. The operating system also provides functions that operate on this data, such as TlsAlloc, TlsFree, TlsSetValue, and TlsGetValue.

MFC also supports TLS, and it provides a set of classes and routines for this purpose. The code is in afxtls.cpp and afxtls_.h.
The main classes involved are:

class CTypedSimpleList : public CSimpleList
struct CThreadData : public CNoTrackObject
struct CSlotData
class CThreadSlotData
class CThreadLocal : public CThreadLocalObject

CThreadSlotData is the most important class in the TLS encapsulation. CTypedSimpleList, CSlotData, and CThreadData are helper classes that only play a supporting role. CThreadLocal is a higher-level wrapper.

First, let us look at how the data is encapsulated. The important classes are defined and analyzed below. (For simplicity, only data members are listed; member functions are omitted.)

Definition:

class CThreadSlotData
{
public:
    DWORD m_tlsIndex;
    int m_nAlloc;
    int m_nRover;
    int m_nMax;
    CSlotData* m_pSlotData;
    CTypedSimpleList<CThreadData*> m_list;
    CRITICAL_SECTION m_sect;
};

Analysis:

afxtls.cpp defines a global variable of the CThreadSlotData class, _afxThreadData. The member functions of CThreadLocal use this global variable extensively to access TLS.

DWORD m_tlsIndex

Holds the index of the TLS slot, that is, the offset into the 64-element array in the thread database. It is initialized by the CThreadSlotData constructor.

int m_nAlloc
int m_nRover
int m_nMax

These three variables are used to allocate slots and record the related state. For example, m_nAlloc holds the number of slots currently allocated. A slot is allocated for each item of TLS data.

CSlotData* m_pSlotData;

Records the state of each slot: in use or not in use.

CTypedSimpleList<CThreadData*> m_list;

CThreadSlotData creates exactly one CThreadData object for each thread and manages these objects with the linked list object m_list. What is actually stored in the thread database is the pointer to the CThreadData object; the TLS data that the programmer wants to store is kept in the dynamic array pointed to by the pData member of that CThreadData object. The CThreadData objects of all threads are chained into a linked list through their pNext members, and the list is managed by CTypedSimpleList<CThreadData*> m_list.

CRITICAL_SECTION m_sect;

Because every thread performs its TLS operations through _afxThreadData, a multi-thread synchronization problem arises. m_sect is the variable used for that synchronization; it ensures that only one thread at a time accesses the member variables of _afxThreadData.

Definition:

struct CThreadData : public CNoTrackObject
{
    CThreadData* pNext; // required to be member of CSimpleList
    int nCount;         // current size of pData
    LPVOID* pData;      // actual thread local data (indexed by nSlot)
};

Analysis:

CThreadData helps CThreadSlotData implement TLS. The TLS data of each thread is managed and stored by one CThreadData object.

CThreadData* pNext

CThreadSlotData manages the CThreadData objects with a linked list, and pNext links the CThreadData objects of the individual threads together.

int nCount

The length of the dynamic array that holds the TLS data pointers.

LPVOID* pData

CThreadData is where the pointer to each item of TLS data is actually stored. It therefore defines an array of pointers: nCount gives the length of the array and pData gives its base address.

Definition:

struct CSlotData
{
    DWORD dwFlags;   // slot flags (allocated/not allocated)
    HINSTANCE hInst; // module which owns this slot
};

Analysis:

CSlotData helps CThreadSlotData implement TLS. Each thread's TLS data is held through a CThreadData object: the TLS data pointers are stored in that object's dynamic pointer array (pointed to by pData). Whether each element of this array is in use is recorded in a CSlotData array of the same length, through the dwFlags member.

From the analysis above, the TLS encapsulation in MFC works as follows. Each thread's TLS data pointers are kept in a dynamic pointer array, whose base address is held in the pData member of a CThreadData object. What is stored in the thread database is the pointer to that CThreadData object, not the TLS data pointers themselves, and the index is the same for every thread: the m_tlsIndex member of the CThreadSlotData class. In addition, CThreadSlotData keeps a linked list of the CThreadData objects of all threads, so the CThreadSlotData class can reach the TLS data of every thread. See the figure tls.bmp. (For convenience, I put the figure in the signature file, just below.)
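The layout described above (one system slot per thread pointing at a per-thread object, which in turn owns a growable pointer array) can be sketched in portable C++. This is a simplified model under stated assumptions, not the MFC implementation: ThreadDataModel and ThreadSlotDataModel are hypothetical names, std::mutex stands in for the critical section, a std::vector stands in for the linked list, and a thread_local pointer stands in for the single system TLS slot.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical per-thread record; 'data' plays the role of pData / nCount.
struct ThreadDataModel {
    std::vector<void*> data;
};

// Hypothetical manager. Assumes a single instance per process, as with the
// one global object described in the text.
class ThreadSlotDataModel {
public:
    ~ThreadSlotDataModel() {
        for (ThreadDataModel* td : list_) delete td;
    }

    // Reserve a logical slot shared by all threads (role of the slot flags).
    int allocSlot() {
        std::lock_guard<std::mutex> lock(sect_);
        for (std::size_t i = 0; i < used_.size(); ++i) {
            if (!used_[i]) { used_[i] = true; return static_cast<int>(i); }
        }
        used_.push_back(true);
        return static_cast<int>(used_.size() - 1);
    }

    // Store a value for the calling thread, growing its array on demand.
    void setValue(int slot, void* value) {
        ThreadDataModel* td = current();
        const std::size_t index = static_cast<std::size_t>(slot);
        if (td->data.size() <= index) td->data.resize(index + 1, nullptr);
        td->data[index] = value;
    }

    // Read the calling thread's value; unset slots read as nullptr.
    void* getValue(int slot) {
        ThreadDataModel* td = current();
        const std::size_t index = static_cast<std::size_t>(slot);
        return index < td->data.size() ? td->data[index] : nullptr;
    }

    // Number of threads that have touched the manager so far.
    std::size_t threadCount() {
        std::lock_guard<std::mutex> lock(sect_);
        return list_.size();
    }

private:
    // The thread_local pointer stands in for the one system TLS slot
    // (m_tlsIndex); the object it points to is also recorded in list_.
    ThreadDataModel* current() {
        thread_local ThreadDataModel* t_data = nullptr;
        if (!t_data) {
            t_data = new ThreadDataModel;
            std::lock_guard<std::mutex> lock(sect_);
            list_.push_back(t_data);
        }
        return t_data;
    }

    std::mutex sect_;                     // role of m_sect
    std::vector<bool> used_;              // role of the CSlotData flags
    std::vector<ThreadDataModel*> list_;  // role of m_list
};
```

The design point this shows is that only one system-level slot is consumed no matter how many logical slots the program allocates, because every logical slot is just an element of the per-thread array.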

The following describes how to use TLS in MFC.

To make TLS convenient to use, MFC provides the CThreadLocal class. It is a class template, defined as follows:

template<class TYPE>
class CThreadLocal : public CThreadLocalObject
{
// Attributes
public:
    AFX_INLINE TYPE* GetData()
    {
        TYPE* pData = (TYPE*)CThreadLocalObject::GetData(&CreateObject);
        ASSERT(pData != NULL);
        return pData;
    }
    AFX_INLINE TYPE* GetDataNA()
    {
        TYPE* pData = (TYPE*)CThreadLocalObject::GetDataNA();
        return pData;
    }
    AFX_INLINE operator TYPE*()
        { return GetData(); }
    AFX_INLINE TYPE* operator->()
        { return GetData(); }

// Implementation
public:
    static CNoTrackObject* AFXAPI CreateObject()
        { return new TYPE; }
};

To use CThreadLocal, simply write CThreadLocal<ClassType> name; to set up TLS data of type ClassType. Note that ClassType must derive from CNoTrackObject. Strictly speaking, this declaration defines a CThreadLocal object called name; that object is then used to create and access the per-thread data of type ClassType.
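The usage pattern (declare one wrapper object, then reach each thread's own instance through operator->) can be tried out without MFC. The following is a sketch under assumptions, not MFC code: ThreadLocalModel, NoTrackBase, RequestState, and g_state are hypothetical names, and the per-thread storage is built on C++11 thread_local instead of MFC's slot manager.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <unordered_map>

// Hypothetical base class playing the role the text assigns to
// CNoTrackObject: a common type through which per-thread objects are owned.
struct NoTrackBase {
    virtual ~NoTrackBase() = default;
};

// Sketch of a CThreadLocal-like wrapper. TYPE must derive from NoTrackBase.
template <class TYPE>
class ThreadLocalModel {
public:
    // Return this thread's object, creating it on first access.
    TYPE* GetData() {
        std::unique_ptr<NoTrackBase>& entry = table()[this];
        if (!entry) entry.reset(new TYPE);
        return static_cast<TYPE*>(entry.get());
    }

    // Non-creating variant: nullptr if this thread has no object yet.
    TYPE* GetDataNA() {
        auto it = table().find(this);
        return it == table().end() ? nullptr
                                   : static_cast<TYPE*>(it->second.get());
    }

    operator TYPE*() { return GetData(); }
    TYPE* operator->() { return GetData(); }

private:
    using Table = std::unordered_map<const void*, std::unique_ptr<NoTrackBase>>;

    // One table per thread, keyed by wrapper address so several wrappers
    // can coexist. Its objects are destroyed when the thread exits.
    static Table& table() {
        thread_local Table t;
        return t;
    }
};

// Hypothetical per-thread state type used only for this illustration.
struct RequestState : NoTrackBase {
    int hits = 0;
};

// Declared once at global scope, in the same spirit as
// CThreadLocal<ClassType> name;
ThreadLocalModel<RequestState> g_state;
```

After this declaration, an expression such as g_state->hits refers to the calling thread's own RequestState, which is created the first time that thread touches it.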


For information about module state management in MFC, see Chapter 9 of "MFC Deep Dive" by Li Jiujin, available at: http://www.vczx.com/tutorial/mfc/mfc9.php.

For more information, see the source code of MFC.
