Reentrancy, async-signal safety, and thread safety on Linux

Last Update:2014-10-12 Source: Internet

Author: User

Tags posix signal handler

I. Reentrant functions

When a process catches a signal, its normal sequence of instructions is temporarily interrupted by the signal handler, and the instructions in the handler run first. If the handler returns (rather than calling exit or longjmp, for example), the process resumes the instruction sequence it was executing when the signal was caught, much as it would after a hardware interrupt. Inside the signal handler, however, we cannot know which code the process was executing when the signal was caught.

What happens if the process is in the middle of a malloc call, allocating additional memory on its heap, when the signal handler is invoked and itself calls malloc? Or what happens if the process is calling a function that stores its result in a static area, such as getpwnam, and the signal handler calls the same function? In the malloc case, the process can be badly corrupted, because malloc usually maintains a linked list of all the areas it has allocated, and the process may have been in the middle of updating that list when the handler ran. In the getpwnam case, the information returned to the normal caller may be overwritten by the information returned to the signal handler.

The Single UNIX Specification (SUS) specifies the functions that are guaranteed to be reentrant. They are listed in the following table:

Reentrant functions that a signal handler may call
accept	fchmod	lseek	sendto	stat
access	fchown	lstat	setgid	symlink
aio_error	fcntl	mkdir	setpgid	sysconf
aio_return	fdatasync	mkfifo	setsid	tcdrain
aio_suspend	fork	open	setsockopt	tcflow
alarm	fpathconf	pathconf	setuid	tcflush
bind	fstat	pause	shutdown	tcgetattr
cfgetispeed	fsync	pipe	sigaction	tcgetpgrp
cfgetospeed	ftruncate	poll	sigaddset	tcsendbreak
cfsetispeed	getegid	posix_trace_event	sigdelset	tcsetattr
cfsetospeed	geteuid	pselect	sigemptyset	tcsetpgrp
chdir	getgid	raise	sigfillset	time
chmod	getgroups	read	sigismember	timer_getoverrun
chown	getpeername	readlink	signal	timer_gettime
clock_gettime	getpgrp	recv	sigpause	timer_settime
close	getpid	recvfrom	sigpending	times
connect	getppid	recvmsg	sigprocmask	umask
creat	getsockname	rename	sigqueue	uname
dup	getsockopt	rmdir	sigset	unlink
dup2	getuid	select	sigsuspend	utime
execle	kill	sem_post	sleep	wait
execve	link	send	socket	waitpid
_exit & _Exit	listen	sendmsg	socketpair	write

Put simply, a reentrant function is one that can be interrupted: it can be interrupted at any point during its execution, control can pass to the OS scheduler to run another piece of code, and nothing goes wrong when control returns. A reentrant function can be used concurrently by more than one task without risk of data errors. A non-reentrant function, by contrast, cannot be shared by more than one task unless mutual exclusion is enforced (by using a semaphore or by disabling interrupts in the critical part of the code). A reentrant function can be interrupted at any time and resume later without losing data; it either uses only local variables or protects its data when it uses global variables.

Signal safety is really async-signal safety: a function is signal-safe if a thread can call it from a signal handler, in any way, without deadlocking or corrupting data. For that reason, I regard reentrancy and async-signal safety as the same concept.

II. Thread safety

Thread safety: a function is thread-safe if it always produces correct results when called repeatedly from multiple concurrent threads.

One important class of thread-safe functions is the reentrant functions, which are characterized by the property that they reference no shared data when called by multiple threads.

Although thread-safe and reentrant are sometimes (incorrectly) used as synonyms, there is a clear technical distinction between them: reentrant functions are a proper subset of thread-safe functions.

III. Differences and connections between reentrancy and thread safety

Reentrant functions:

Reentrancy starts from the fact that the function can be interrupted: apart from the variables on its own stack, it depends on nothing in its environment (including static data). Such a function is pure code. Multiple copies of it can run at once, and because each uses a separate stack, they do not interfere with one another.

A reentrant function is thread-safe, but the converse does not hold: a thread-safe function is not necessarily reentrant.
In fact, there are very few reentrant functions. APUE section 10.6 lists only 115 functions that the Single UNIX Specification describes as reentrant, and APUE section 12.5 lists 89 functions that POSIX.1 does not guarantee to be thread-safe. A signal, like a hardware interrupt, interrupts the instruction sequence being executed, and the signal handler cannot tell where the process was executing when the signal was caught. If the handler performs the same operation as the interrupted function, and that operation uses a static data structure, then when the handler returns (assuming it returns at all) and the original instruction sequence resumes, the handler's operation may have overwritten data from the earlier, normal operation.

Non-reentrant cases:

Using a static data structure, as getpwnam and getpwuid do: if a signal arrives while getpwnam is executing, a getpwnam call in the signal handler may overwrite the value obtained by the original getpwnam call.

Calling malloc or free: if a signal arrives while malloc is modifying the linked list that tracks storage on the heap, and the signal handler also calls malloc, malloc's internal data structures are corrupted.
Using standard I/O functions such as printf: many implementations of standard I/O use global data structures (for example, file offsets are global).
Calling longjmp or siglongjmp from the handler: if the program was modifying a data structure when the signal arrived and the handler jumps elsewhere instead of returning, the data structure is left partially updated.

Even with reentrant functions, errno needs care inside a signal handler. Each thread has only one errno variable, and a reentrant function called from the handler may modify it. For example, read is reentrant, yet it can still set errno. The correct approach is therefore to save errno on entry to the signal handler and restore it just before the handler returns.

For example, suppose the program is in the middle of a printf call when a signal arrives, and the corresponding signal handler also contains a printf statement. The output of the two printf calls ends up interleaved.
If printf takes a lock, the same scenario leads to deadlock instead. The usual remedy is to block certain signals within a particular region of code.

Ways to block signals:
1> signal(SIGPIPE, SIG_IGN); ignores the given signal
2> sigprocmask()
The behavior of sigprocmask() is defined only for single-threaded processes
3> pthread_sigmask()
pthread_sigmask() can be used in multithreaded programs

At this point the restrictions for async-signal safety and the restrictions for reentrancy appear to be the same, so the two are treated as equivalent here.

Thread safety:
If a function can be called by multiple threads at the same time, it is called a thread-safe function. The malloc function is thread-safe.

When sharing is not required, give each thread a private copy of the data. When sharing is important, provide explicit synchronization so that the program behaves deterministically. A procedure that is not thread-safe can be made thread-safe, and its execution serialized, by surrounding it with statements that lock and unlock a mutex.

Many functions are not thread-safe because the data they return is stored in a static memory buffer. Changing the interface so that the caller supplies the buffer makes these functions thread-safe.
Operating systems that support thread-safe functions provide thread-safe replacements for some of the non-thread-safe functions in POSIX.1.
For example, gethostbyname() is not thread-safe, and Linux provides a thread-safe version, gethostbyname_r().
The "_r" suffix on the function name indicates that the version is reentrant (reentrant with respect to threads, that is, thread-safe; it does not mean reentrant with respect to signal handlers, that is, async-signal-safe).

Common oversights in multithreaded programs
1> Passing a pointer to the caller's stack as an argument to a new thread.
2> Accessing shared, changeable state in global memory without the protection of a synchronization mechanism.
3> Creating a deadlock when two threads try to acquire the same pair of global resources in opposite order. One thread holds the first resource and the other holds the second. Neither can proceed until one of them gives up.
4> Trying to reacquire a lock that is already held (recursive deadlock).
5> Creating a hidden gap in synchronization protection. The gap appears when a protected code segment contains a function that releases the synchronization mechanism and reacquires it before returning to the caller. The result is misleading: to the caller, the global data appears to be protected, but it is not.
6> Mixing UNIX signals with threads. It is better to use the sigwait(2) model to handle asynchronous signals.
7> Calling setjmp(3C) and longjmp(3C), then jumping away without releasing the mutex locks.
8> Failing to re-evaluate the condition after returning from a call to *_cond_wait() or *_cond_timedwait().

IV. Summary

To decide whether a function is reentrant, determine whether it can be interrupted and still produce the correct result when it resumes (that is, interrupting its instruction sequence does not alter the function's data).
To decide whether a function is thread-safe, determine whether every thread gets the correct result when multiple threads execute its instruction sequence at the same time.

If a function is reentrant with respect to multiple threads, it is thread-safe, but that does not mean it is reentrant with respect to signal handlers.
If a function can be safely re-entered from an asynchronous signal handler, it is said to be async-signal-safe.

Reentrancy and thread safety are two separate concepts, both concerned with how a function handles resources.

First, reentrancy and thread safety are not equivalent: a function can be reentrant, thread-safe, both, or neither (this statement is not strictly accurate; see the second point).

Second, from a set-theoretic and logical point of view, the reentrant functions are a subset of the thread-safe functions, and reentrancy is a sufficient but not necessary condition for thread safety. A reentrant function is always thread-safe, but the converse does not hold.

Third, POSIX defines reentrancy, thread safety, and async-signal safety as follows:
Reentrant function: A function whose effect, when called by two or more threads, is guaranteed to be as if the threads each executed the function one after another in an undefined order, even if the actual execution is interleaved.

Thread-safe function: A function that may be safely invoked concurrently by multiple threads.

Async-signal-safe function: A function that may be invoked, without restriction, from signal-catching functions. No function is async-signal-safe unless explicitly described as such.

The relationship among the three is as follows: a reentrant function is always both a thread-safe function and an async-signal-safe function, while a thread-safe function is not necessarily reentrant.

The difference between reentrancy and thread safety shows up in whether a function can be called from a signal handler. A reentrant function can be called safely from a signal handler, so it is also async-signal-safe, whereas a thread-safe function is not guaranteed to be safe to call from a signal handler. A non-reentrant function also becomes async-signal-safe if the signal mask guarantees that it is never interrupted by a signal.

It is worth noting that the POSIX 1003.1 system interfaces are thread-safe by default but not async-signal-safe. Async-signal safety has to be stated explicitly, as it is for fork() and signal().

Whether a function is non-reentrant can usually (though not always) be judged from its external interface and the way it is used. For example, strtok() is non-reentrant because it stores the string being tokenized internally, and ctime() is also non-reentrant because it returns a pointer to static data that is overwritten on each call.

A thread-safe function uses locks to give multiple threads safe access to shared data. Thread safety concerns only the internal implementation of a function and does not affect its external interface. In C, local variables are allocated on the stack, so any function that uses no static data or other shared resources is thread-safe.

The following libraries are thread-safe in the current version of AIX:

* Standard C library

* BSD-compatible library

Using global data is not thread-safe. Such data should be kept per thread or encapsulated so that access to it can be serialized. Otherwise, one thread may read an error code produced by another thread. In AIX, each thread has its own errno variable.

Finally, consider a function that is thread-safe but not reentrant:

Suppose the function func() needs to access a shared resource while it runs, so to be thread-safe it locks the resource before using it and unlocks it when it is no longer needed.

Suppose that during one execution of func(), after the lock has been acquired, an asynchronous signal arrives and control transfers to the signal handler, and suppose the handler also calls func(). This second invocation of func() tries to acquire the resource lock before accessing the shared resource. But the earlier func() invocation already holds the lock, so the signal handler blocks. Meanwhile, the interrupted thread cannot resume until the signal handler finishes, so it never gets the chance to release the lock. The thread and the signal handler are deadlocked.

Thus func() is thread-safe thanks to locking, yet it is non-reentrant because its body accesses a shared resource.

Rewriting libraries

The main steps for converting an existing library into a reentrant and thread-safe one are outlined below. They apply only to C libraries.

* Identify all global variables exported by the library. These are typically declared in a header file with the extern keyword.

Exported global variables should be encapsulated. Make each variable private to the library (using the static keyword), then create access functions through which the variable is read and written.

* Identify all static variables and other shared resources. Static variables are usually defined with the static keyword.

Each shared resource should be associated with a lock. The granularity of the locking (and therefore the number of locks) affects the performance of the library. To initialize all the locks, you may need an initialization routine that is called only once.

* Identify all non-reentrant functions and make them reentrant. See the discussion of reentrant functions above.

* Identify all non-thread-safe functions and make them thread-safe. See the discussion of thread safety above.

Workarounds for using non-thread-safe functions

With a workaround, non-thread-safe functions can be called by multiple threads. This can be useful, especially when a non-thread-safe library is used in a multithreaded program, either for testing or because no thread-safe version is available. The workaround adds overhead, because it serializes calls to one function or to a group of functions.

Use a lock that covers the entire library, and take it every time the library is used (whether calling a library function or accessing a library global variable), as shown in the following pseudo-code:
/* This is pseudo-code! */

lock(library_lock);
library_call();
unlock(library_lock);

lock(library_lock);
x = library_var;
unlock(library_lock);

This workaround can create a performance bottleneck, because only one thread at a time can access or use any part of the library. It is acceptable only if the library is seldom used, or as a quick stopgap implementation.

Use a lock that covers a single library component (a function or a global variable) or a group of components, as shown in the following pseudo-code:
/* This is pseudo-code! */

lock(library_moduleA_lock);
library_moduleA_call();
unlock(library_moduleA_lock);

lock(library_moduleB_lock);
x = library_moduleB_var;
unlock(library_moduleB_lock);

This approach is more complicated to implement than the first, but it can improve performance.

Because this kind of workaround should be used only in application programs and not in libraries, mutexes can be used to lock the library.

This article is an English version of an article originally published in Chinese on aliyun.com.