When a thread calls fork, a copy of the entire process address space is made for the child. Recall the discussion of copy-on-write in Section 8.3. The child is an entirely different process from the parent, but as long as neither one modifies its memory contents, the memory pages can be shared between parent and child.
By inheriting a copy of the address space, the child also inherits the state of every mutex, reader-writer lock, and condition variable from the parent process. If the parent consists of more than one thread, the child needs to clean up the lock state unless it calls exec immediately after fork returns.
Inside the child process, only one thread exists: a copy of the thread that called fork in the parent. If threads in the parent hold any locks, the same locks are also held in the child. The problem is that the child does not contain copies of the threads that hold those locks, so it has no way of knowing which locks are held and which need to be unlocked.
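The inherited lock state can be observed directly. The following is a minimal sketch, not code from the book: a helper thread takes a mutex, the main thread forks, and the child probes the mutex with pthread_mutex_trylock. The names demo_lock, holder, and child_sees_locked_mutex are made up for this illustration. Strictly speaking, pthread_mutex_trylock is not async-signal safe, so calling it in the child of a multithreaded process is for demonstration only.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER; /* lock under test */
static pthread_mutex_t state_mtx = PTHREAD_MUTEX_INITIALIZER; /* protects the flags */
static pthread_cond_t  state_cv  = PTHREAD_COND_INITIALIZER;
static int held;     /* set once the helper thread owns demo_lock */
static int release;  /* set when the helper thread may let go */

/* Helper thread (hypothetical): holds demo_lock until told to release it. */
static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    pthread_mutex_lock(&state_mtx);
    held = 1;
    pthread_cond_broadcast(&state_cv);
    while (!release)
        pthread_cond_wait(&state_cv, &state_mtx);
    pthread_mutex_unlock(&state_mtx);
    pthread_mutex_unlock(&demo_lock);
    return NULL;
}

/* Returns 1 if the child found demo_lock locked, 0 if not, -1 on error. */
int child_sees_locked_mutex(void)
{
    pthread_t tid;
    pid_t pid;
    int status, result = -1;

    held = 0;
    release = 0;
    if (pthread_create(&tid, NULL, holder, NULL) != 0)
        return -1;

    /* Wait until the helper thread really owns demo_lock. */
    pthread_mutex_lock(&state_mtx);
    while (!held)
        pthread_cond_wait(&state_cv, &state_mtx);
    pthread_mutex_unlock(&state_mtx);

    pid = fork();
    if (pid == 0) {
        /* Child: the helper thread was not copied, but its lock state was. */
        _exit(pthread_mutex_trylock(&demo_lock) == EBUSY ? 0 : 1);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = (WEXITSTATUS(status) == 0);

    /* Let the helper thread finish and collect it. */
    pthread_mutex_lock(&state_mtx);
    release = 1;
    pthread_cond_broadcast(&state_cv);
    pthread_mutex_unlock(&state_mtx);
    pthread_join(tid, NULL);
    return result;
}
```

Calling child_sees_locked_mutex() from main should return 1 on implementations where the mutex state is simply copied into the child: the child finds the mutex busy, yet no thread in the child can ever unlock it.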
This problem can be avoided if the child calls one of the exec functions directly after returning from fork. In that case the old address space is discarded, so the lock state doesn't matter. This is not always possible, however; if the child needs to continue processing, we need a different strategy.
To avoid problems with inconsistent state in a multithreaded process, POSIX.1 states that only async-signal safe functions should be called by the child between the time fork returns and the time the child calls one of the exec functions. This limits what the child can do before calling exec, but it doesn't address the problem of lock state in the child process.
To clean up the lock state, we can establish fork handlers by calling the function pthread_atfork.
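As a rough illustration of the "fork, then exec right away" strategy, the sketch below restricts the child to calls that POSIX lists as async-signal safe (execv, write, _exit). The function name run_program and the exit code 127 for a failed exec are choices made for this example, not something the text prescribes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program at argv[0] (a full path) in a child process.
 * Returns the child's exit status, or -1 if fork/waitpid fails or the
 * child did not exit normally. (Hypothetical helper for illustration.)
 */
int run_program(char *const argv[])
{
    static const char msg[] = "exec failed\n";
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: no printf, no malloc, no locks; async-signal safe calls only. */
        execv(argv[0], argv);
        /* Reached only if execv failed; write() is safe here, printf() is not. */
        ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
        (void)n;
        _exit(127);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For instance, passing the argument vector {"/bin/sh", "-c", "exit 7", NULL} should make run_program return 7. Because the old address space is replaced by exec, any locks that were held in the parent at the time of the fork no longer matter to the new program.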
#include <pthread.h>
int pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void));
Returns: 0 if OK, error number on failure.
With pthread_atfork, we can install up to three functions to help clean up the locks. The prepare fork handler is called in the parent before fork creates the child process; its job is to acquire all locks defined by the parent. The parent fork handler is called in the context of the parent after fork has created the child process, but before fork has returned; its job is to unlock all the locks acquired by the prepare fork handler. The child fork handler is called in the context of the child process before fork returns; like the parent fork handler, it must release all the locks acquired by the prepare fork handler.
Note that the locks are not locked once and unlocked twice, as it might appear. When the child address space is created, it gets a copy of all the locks that the parent defined. Because the prepare fork handler acquired all the locks, the memory in the parent and the memory in the child start out with identical contents. When the parent and the child each unlock their "copy" of the locks, new memory is allocated for the child and the memory contents from the parent are copied into it (copy-on-write), so the situation looks as if the parent locked all its copies of the locks and the child locked all its copies of the locks. The parent and the child end up unlocking duplicate locks stored in different address spaces, as if the following sequence of events had occurred:
- The parent acquired all its locks.
- The child acquired all its locks.
- The parent released its locks.
- The child released its locks.
We can call pthread_atfork multiple times to install more than one set of fork handlers. If we don't need one of the handlers, we can pass a null pointer for that argument, and it will have no effect. When multiple fork handlers are installed, the order in which they are called differs: the parent and child fork handlers are called in the order in which they were registered, whereas the prepare fork handlers are called in the opposite order from which they were registered. This ordering allows multiple modules to register their own fork handlers and still honor the locking hierarchy.
For example, assume that module A calls functions from module B and that each module has its own set of locks. If the locking hierarchy is A before B, module B must install its fork handlers before module A. When the parent calls fork, the following steps are taken, assuming that the child process runs before the parent:
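The calling order can be checked with a small experiment. The sketch below is an illustration only: it registers three sets of handlers (the third passes null pointers for prepare and parent), records one character per handler call in a buffer, and sends the child's copy of the buffer back to the parent through a pipe. The names trace, note, and trace_fork are made up for this example.

```c
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static char trace[32];      /* one character per handler call */
static size_t trace_len;

static void note(char c)
{
    if (trace_len < sizeof(trace) - 1)
        trace[trace_len++] = c;
}

/* Set 1 logs a/A/1, set 2 logs b/B/2, set 3 has only a child handler. */
static void prepare1(void) { note('a'); }
static void parent1(void)  { note('A'); }
static void child1(void)   { note('1'); }
static void prepare2(void) { note('b'); }
static void parent2(void)  { note('B'); }
static void child2(void)   { note('2'); }
static void child3(void)   { note('3'); }

static void install(void)
{
    pthread_atfork(prepare1, parent1, child1);
    pthread_atfork(prepare2, parent2, child2);
    pthread_atfork(NULL, NULL, child3);     /* null pointers are allowed */
}

/* Fork once; fill in the handler trace seen by the parent and by the child.
 * Both buffers must hold at least sizeof(trace) bytes. Returns 0 on success. */
int trace_fork(char *parent_out, char *child_out, size_t outlen)
{
    static pthread_once_t once = PTHREAD_ONCE_INIT;
    int fds[2], status;
    pid_t pid;
    ssize_t n;

    /* Handlers stay registered for the life of the process: install once. */
    if (outlen < sizeof(trace) || pthread_once(&once, install) != 0)
        return -1;
    if (pipe(fds) < 0)
        return -1;
    memset(trace, 0, sizeof(trace));
    trace_len = 0;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                          /* child: report its trace */
        close(fds[0]);
        n = write(fds[1], trace, trace_len);
        _exit(n == (ssize_t)trace_len ? 0 : 1);
    }
    close(fds[1]);
    n = read(fds[0], child_out, outlen - 1);
    close(fds[0]);
    if (n < 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    child_out[n] = '\0';
    memcpy(parent_out, trace, trace_len);
    parent_out[trace_len] = '\0';
    return 0;
}
```

With this registration order, the parent's trace should read "baAB" (prepare handlers in reverse order, then parent handlers in registration order) and the child's trace should read "ba123" (the inherited prepare entries followed by the child handlers in registration order).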
- The prepare fork handler from module A is called to acquire all of module A's locks.
- The prepare fork handler from module B is called to acquire all of module B's locks.
- A child process is created.
- The child fork handler from module B is called to release all of module B's locks in the child process.
- The child fork handler from module A is called to release all of module A's locks in the child process.
- The fork function returns to the child.
- The parent fork handler from module B is called to release all of module B's locks in the parent process.
- The parent fork handler from module A is called to release all of module A's locks in the parent process.
- The fork function returns to the parent.
If fork handlers clean up the lock state, what cleans up the state of condition variables? On some implementations, condition variables might not need any cleanup. However, an implementation that uses a lock as part of its condition variable implementation will require cleanup, and the problem is that no interface exists to let us do this. If the lock is embedded in the condition variable data structure, then we can't use condition variables after calling fork (presumably this restriction matters only in the child process), because there is no portable way to clean up their state. On the other hand, if an implementation uses a global lock to protect all condition variable data structures in a process, then the implementation itself can clean up the lock in the fork library routine. Application programs shouldn't rely on implementation details like this, however.
Example
The program in Figure 12.17 illustrates the use of pthread_atfork and fork handlers:
#include "apue.h"
#include <pthread.h>

pthread_mutex_t lock1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t lock2 = PTHREAD_MUTEX_INITIALIZER;

/* Called in the parent before fork creates the child: acquire every lock. */
void prepare(void)
{
    int err;

    printf("prepare locks ...\n");
    if ((err = pthread_mutex_lock(&lock1)) != 0)
        err_cont(err, "can't lock lock1 in prepare handler");
    if ((err = pthread_mutex_lock(&lock2)) != 0)
        err_cont(err, "can't lock lock2 in prepare handler");
}

/* Called in the parent before fork returns: release the parent's locks. */
void parent(void)
{
    int err;

    printf("parent unlocking locks ...\n");
    if ((err = pthread_mutex_unlock(&lock1)) != 0)
        err_cont(err, "can't unlock lock1 in parent handler");
    if ((err = pthread_mutex_unlock(&lock2)) != 0)
        err_cont(err, "can't unlock lock2 in parent handler");
}

/* Called in the child before fork returns: release the child's copies. */
void child(void)
{
    int err;

    printf("child unlocking locks ...\n");
    if ((err = pthread_mutex_unlock(&lock1)) != 0)
        err_cont(err, "can't unlock lock1 in child handler");
    if ((err = pthread_mutex_unlock(&lock2)) != 0)
        err_cont(err, "can't unlock lock2 in child handler");
}

void *thr_fn(void *arg)
{
    printf("thread started ... \n");
    pause();
    return(0);
}

int main(void)
{
    int err;
    pid_t pid;
    pthread_t tid;

    if ((err = pthread_atfork(prepare, parent, child)) != 0)
        err_exit(err, "can't install fork handlers");
    if ((err = pthread_create(&tid, NULL, thr_fn, 0)) != 0)
        err_exit(err, "can't create thread");

    sleep(2);
    printf("parent about to fork...\n");

    if ((pid = fork()) < 0)
        err_quit("fork failed");
    else if (pid == 0)      /* child */
        printf("child returned from fork\n");
    else                    /* parent */
        printf("parent returned from fork\n");
    exit(0);
}
Figure 12.17 pthread_atfork example
Running the program produces the following output:
$ ./12_17.exe
thread started ...
parent about to fork...
prepare locks ...
parent unlocking locks ...
parent returned from fork
child unlocking locks ...
child returned from fork
Although the pthread_atfork mechanism is intended to make the lock state consistent after a fork, it has several drawbacks that make it usable only in limited circumstances:
- There is no good way to reinitialize the state of more complex synchronization objects, such as condition variables and barriers.
- Some implementations of error-checking mutexes generate errors when the child fork handler tries to unlock a mutex that was locked by the parent.
- Recursive mutexes can't be cleaned up in the child fork handler, because there is no way to determine how many times one has been locked.
- If child processes are allowed to call only async-signal safe functions, then the child fork handler can't clean up synchronization objects at all, because none of the functions used to manipulate them are async-signal safe. The practical problem is that a synchronization object might be in an intermediate state when one thread calls fork, but it can't be cleaned up unless it is in a consistent state.
- If an application calls fork in a signal handler (which is legal, because fork is async-signal safe), then the fork handlers registered by pthread_atfork can call only async-signal safe functions; otherwise the results are undefined.
12.9 Threads and Fork