Reverse Engineering Basics: OS-specific (2)
Chapter 65: Thread-local storage
TLS (thread-local storage) is a data area private to each thread. Each thread can store whatever data it needs there. A well-known example is the C standard global variable errno. Several threads may simultaneously call functions that return an error code through errno, so a single shared global variable would not work correctly in a multi-threaded environment. errno must therefore be stored in TLS.
The C++11 standard added a new thread_local modifier, which indicates that each thread has its own version of the variable. Such a variable can be initialized, and it is located in TLS.
Listing 65.1: C++11
#include <iostream>
#include <thread>

thread_local int tmp=3;

int main()
{
    std::cout << tmp << std::endl;
}
Compile it with MinGW GCC 4.8.1 rather than MSVC 2012, which does not accept thread_local.
If we look at the resulting PE file, we can see that the tmp variable is placed in the TLS section.
65.1 Linear congruential generator
The pseudorandom number generator from chapter 20 has a flaw: it is not thread-safe, because its internal state variable can be read and modified by different threads at the same time.
65.1.1 Win32
Uninitialized TLS data
If the __declspec(thread) modifier is added to a global variable, the variable is allocated in TLS.
#include <stdint.h>
#include <stdio.h>   // for printf()
#include <windows.h>
#include <winnt.h>

// from the Numerical Recipes book
#define RNG_a 1664525
#define RNG_c 1013904223

__declspec( thread ) uint32_t rand_state;

void my_srand (uint32_t init)
{
    rand_state=init;
}

int my_rand ()
{
    rand_state=rand_state*RNG_a;
    rand_state=rand_state+RNG_c;
    return rand_state & 0x7fff;
}

int main()
{
    my_srand(0x12345678);
    printf ("%d\n", my_rand());
}
Hiew shows that the PE file now has an additional section: .tls.
Listing 65.2: Optimizing MSVC 2013 x86
_TLS    SEGMENT
_rand_state DD 01H DUP (?)
_TLS    ENDS

_DATA   SEGMENT
$SG84851 DB '%d', 0aH, 00H
_DATA   ENDS
_TEXT   SEGMENT

_init$ = 8 ; size = 4
_my_srand PROC
; FS:0=address of TIB
    mov eax, DWORD PTR fs:__tls_array ; displayed in IDA as FS:2Ch
; EAX=address of TLS of process
    mov ecx, DWORD PTR __tls_index
    mov ecx, DWORD PTR [eax+ecx*4]
; ECX=current TLS segment
    mov eax, DWORD PTR _init$[esp-4]
    mov DWORD PTR _rand_state[ecx], eax
    ret 0
_my_srand ENDP

_my_rand PROC
; FS:0=address of TIB
    mov eax, DWORD PTR fs:__tls_array ; displayed in IDA as FS:2Ch
; EAX=address of TLS of process
    mov ecx, DWORD PTR __tls_index
    mov ecx, DWORD PTR [eax+ecx*4]
; ECX=current TLS segment
    imul eax, DWORD PTR _rand_state[ecx], 1664525
    add eax, 1013904223 ; 3c6ef35fH
    mov DWORD PTR _rand_state[ecx], eax
    and eax, 32767 ; 00007fffH
    ret 0
_my_rand ENDP

_TEXT   ENDS
rand_state is now in the TLS segment, and each thread has its own version of this variable. It is accessed as follows: load the pointer to the thread's TLS array from FS:2Ch (a field of the TIB, the Thread Information Block), index that array with __tls_index if needed, and obtain the address of the TLS segment.
The rand_state variable can then be accessed through the ECX register, which points to a data area unique to each thread.
The FS: selector is familiar to every reverse engineer. It always points to the TIB, so thread-specific data can be loaded quickly.
The GS: selector is used in Win64, where the pointer to the TLS array is at offset 0x58.
Listing 65.3: Optimizing MSVC 2013 x64
_TLS    SEGMENT
rand_state DD 01H DUP (?)
_TLS    ENDS

_DATA   SEGMENT
$SG85451 DB '%d', 0aH, 00H
_DATA   ENDS

_TEXT   SEGMENT

init$ = 8
my_srand PROC
    mov edx, DWORD PTR _tls_index
    mov rax, QWORD PTR gs:88 ; 58h
    mov r8d, OFFSET FLAT:rand_state
    mov rax, QWORD PTR [rax+rdx*8]
    mov DWORD PTR [r8+rax], ecx
    ret 0
my_srand ENDP

my_rand PROC
    mov rax, QWORD PTR gs:88 ; 58h
    mov ecx, DWORD PTR _tls_index
    mov edx, OFFSET FLAT:rand_state
    mov rcx, QWORD PTR [rax+rcx*8]
    imul eax, DWORD PTR [rcx+rdx], 1664525 ; 0019660dH
    add eax, 1013904223 ; 3c6ef35fH
    mov DWORD PTR [rcx+rdx], eax
    and eax, 32767 ; 00007fffH
    ret 0
my_rand ENDP

_TEXT   ENDS
Initialized TLS data
Suppose we want to give rand_state a fixed initial value, so that the generator still starts from a defined state if the programmer forgets to initialize it.
#include <stdint.h>
#include <stdio.h>   // for printf()
#include <windows.h>
#include <winnt.h>

// from the Numerical Recipes book
#define RNG_a 1664525
#define RNG_c 1013904223

__declspec( thread ) uint32_t rand_state=1234;

void my_srand (uint32_t init)
{
    rand_state=init;
}

int my_rand ()
{
    rand_state=rand_state*RNG_a;
    rand_state=rand_state+RNG_c;
    return rand_state & 0x7fff;
}

int main()
{
    printf ("%d\n", my_rand());
}
Apart from the initial value of rand_state, the code is no different from the previous version, but in IDA we see:
.tls:00404000 ; Segment type: Pure data
.tls:00404000 ; Segment permissions: Read/Write
.tls:00404000 _tls segment para public 'DATA' use32
.tls:00404000 assume cs:_tls
.tls:00404000 ;org 404000h
.tls:00404000 TlsStart db 0 ; DATA XREF: .rdata:TlsDirectory
.tls:00404001 db 0
.tls:00404002 db 0
.tls:00404003 db 0
.tls:00404004 dd 1234
.tls:00404008 TlsEnd db 0 ; DATA XREF: .rdata:TlsEnd_pt...
The value 1234 is stored there. Every time a new thread starts, a new TLS is allocated for it and all of this data, including 1234, is copied into it.
This is a typical scenario:
Thread A starts. A TLS is allocated for it, and 1234 is copied to rand_state.
Thread A calls my_rand() several times. Its rand_state is now different from 1234.
Thread B starts. A TLS is allocated for it, and 1234 is copied to rand_state. At this point the two threads use the same variable, but see different values in it.
TLS callbacks
But what if the variables in TLS have to be filled with data that is not known in advance? For example, the programmer may forget to call my_srand() to initialize the PRNG, yet the generator must start from a genuinely varying value rather than 1234. In this case, TLS callbacks can be used.
The following code is not very portable because of the hack it relies on, but it conveys the idea. We define a function, tls_callback(), that is called before the process and/or thread starts executing. It initializes the PRNG with the value returned by GetTickCount().
#include <stdint.h>
#include <stdio.h>   // for printf()
#include <windows.h>
#include <winnt.h>

// from the Numerical Recipes book
#define RNG_a 1664525
#define RNG_c 1013904223

__declspec( thread ) uint32_t rand_state;

void my_srand (uint32_t init)
{
    rand_state=init;
}

void NTAPI tls_callback(PVOID a, DWORD dwReason, PVOID b)
{
    my_srand (GetTickCount());
}

// place a pointer to the callback into the CRT's TLS callback table
#pragma data_seg(".CRT$XLB")
PIMAGE_TLS_CALLBACK p_thread_callback = tls_callback;
#pragma data_seg()

int my_rand ()
{
    rand_state=rand_state*RNG_a;
    rand_state=rand_state+RNG_c;
    return rand_state & 0x7fff;
}

int main()
{
    // rand_state is already initialized at the moment (using GetTickCount())
    printf ("%d\n", my_rand());
}
Check with IDA:
Listing 65.4: Optimizing MSVC 2013
.text:00401020 TlsCallback_0 proc near ; DATA XREF: .rdata:TlsCallbacks
.text:00401020 call ds:GetTickCount
.text:00401026 push eax
.text:00401027 call my_srand
.text:0040102C pop ecx
.text:0040102D retn 0Ch
.text:0040102D TlsCallback_0 endp
...
.rdata:004020C0 TlsCallbacks dd offset TlsCallback_0 ; DATA XREF: .rdata:TlsCallbacks_ptr
...
.rdata:00402118 TlsDirectory dd offset TlsStart
.rdata:0040211C TlsEnd_ptr dd offset TlsEnd
.rdata:00402120 TlsIndex_ptr dd offset TlsIndex
.rdata:00402124 TlsCallbacks_ptr dd offset TlsCallbacks
.rdata:00402128 TlsSizeOfZeroFill dd 0
.rdata:0040212C TlsCharacteristics dd 300000h
TLS callback functions are sometimes used in unpacking routines to hide what they do. Some people may be confused to find that code has already been executed before the OEP (Original Entry Point).
65.1.2 Linux
Here is how a thread-local global variable is declared in GCC:
__thread uint32_t rand_state=1234;
This is not a standard C/C++ modifier, but a GCC extension.
The GS: selector is also used to access TLS here, but in a slightly different way:
Listing 65.5: Optimizing GCC 4.8.1 x86
.text:08048460 my_srand proc near
.text:08048460
.text:08048460 arg_0 = dword ptr 4
.text:08048460
.text:08048460 mov eax, [esp+arg_0]
.text:08048464 mov gs:0FFFFFFFCh, eax
.text:0804846A retn
.text:0804846A my_srand endp

.text:08048470 my_rand proc near
.text:08048470 imul eax, gs:0FFFFFFFCh, 19660Dh
.text:0804847B add eax, 3C6EF35Fh
.text:08048480 mov gs:0FFFFFFFCh, eax
.text:08048486 and eax, 7FFFh
.text:0804848B retn
.text:0804848B my_rand endp