Fork is not allowed in multi-threaded programs
C++ on UNIX design rules, part 3
Rule 3: Do not use fork in multi-threaded programs
Using fork (note *1) in a multi-threaded program can cause all sorts of problems. A typical example is that the child process created by fork deadlocks. If you do not fully understand the problem, do not fork child processes in multi-threaded programs.
What problems can this cause?
Let's look at an example. When the following code runs, the child process calls doit() right after it starts, and it is very likely to deadlock.
    #include <pthread.h>
    #include <time.h>
    #include <unistd.h>

    void *doit(void *arg) {
        static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

        (void)arg;
        pthread_mutex_lock(&mutex);
        struct timespec ts = {10, 0};   // sleep for 10 seconds
        nanosleep(&ts, 0);
        pthread_mutex_unlock(&mutex);
        return 0;
    }

    int main(void) {
        pthread_t t;

        pthread_create(&t, 0, doit, 0);   // create and start the thread
        if (fork() == 0) {
            // child process
            // when the child is created, the parent's thread is most likely
            // in the middle of nanosleep, holding the mutex
            doit(0);
            return 0;
        }
        pthread_join(t, 0);   // wait until the thread ends
        return 0;
    }
The reasons for the deadlock are as follows.
Generally, fork does the following:
1. The parent process's memory is copied to the child process as is.
2. The child process is created in a single-threaded state.
The memory holding the static variable (note *2) mutex is part of what gets copied to the child process. In addition, even if the parent process has multiple threads, they are not inherited by the child process. These two properties of fork are the cause of the deadlock.
Note: a detailed explanation of the cause of the deadlock
1. The thread executes doit() first.
2. While doit runs, it locks the mutex.
3. fork copies the contents of the mutex variable to the child process as is (by this point, the thread has already put the mutex into the locked state).
4. When the child process calls doit and tries to lock the mutex, it finds the mutex already locked, so it waits for the owner to release it. In the child process, however, no thread owns the mutex, so nobody ever releases it.
5. When the thread in the parent finishes doit, it unlocks its mutex, but by then the parent's mutex and the child's mutex are two separate pieces of memory. Releasing the lock in the parent therefore has no effect on the mutex in the child.
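Point 5 can be checked with a small experiment. The following is a minimal sketch and not part of the original article; the names value_copied_by_fork and parent_value_after_child_write are made up for illustration. The child overwrites its own copy of a static variable, and the parent then confirms that its copy is unchanged.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variable: each process gets its own copy after fork. */
static int value_copied_by_fork = 1;

/* Returns the parent's value after the child has overwritten its own copy,
 * or -1 if fork or waitpid failed. */
int parent_value_after_child_write(void) {
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        value_copied_by_fork = 2;   /* changes only the child's memory */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return value_copied_by_fork;    /* the parent still sees its own copy */
}
```

The same separation applies to the mutex: an unlock performed in the parent never reaches the copy that the child is waiting on.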
To see why a careless fork in the multi-threaded program above causes a deadlock, consider for example the following execution sequence (note *3).
1. Before fork, threads 1 and 2 are running in the parent process.
2. Thread 1 calls the doit function.
3. The doit function locks its mutex.
4. Thread 1 calls nanosleep and sleeps for 10 seconds.
5. At this point, execution switches to thread 2.
6. Thread 2 calls fork.
7. The child process is created.
8. At this moment, the mutex used by doit is in the "locked" state in the child process, and the thread that could unlock it does not exist in the child process.
9. The child process starts running.
10. The child process calls the doit function.
11. The child process tries to lock the already locked mutex and deadlocks.
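The sequence above can be observed without actually hanging. The sketch below is an illustration under assumptions, not code from the original article. It calls pthread_mutex_trylock in the child instead of pthread_mutex_lock, so the child reports the locked mutex rather than waiting forever. The names hold_mutex and child_sees_locked_mutex and the two semaphores are hypothetical. Strictly speaking, calling pthread_mutex_trylock in the child of a multi-threaded process is itself outside what POSIX guarantees; it is done here only to make the inherited state visible.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static sem_t mutex_is_locked;   /* posted by thread 1 once it holds demo_mutex */
static sem_t may_unlock;        /* posted by the forking thread after the child exits */

/* Thread 1: lock the mutex and keep holding it across the fork. */
static void *hold_mutex(void *arg) {
    (void)arg;
    pthread_mutex_lock(&demo_mutex);
    sem_post(&mutex_is_locked);
    sem_wait(&may_unlock);
    pthread_mutex_unlock(&demo_mutex);
    return NULL;
}

/* Returns 1 if the child found the mutex already locked, 0 if it did not,
 * and -1 on error. */
int child_sees_locked_mutex(void) {
    pthread_t t;
    int status = 0;
    int result = -1;

    if (sem_init(&mutex_is_locked, 0, 0) != 0 || sem_init(&may_unlock, 0, 0) != 0)
        return -1;
    if (pthread_create(&t, NULL, hold_mutex, NULL) != 0)
        return -1;
    sem_wait(&mutex_is_locked);     /* steps 2-5: thread 1 now holds the lock */

    pid_t pid = fork();             /* steps 6-7: fork while the lock is held */
    if (pid == 0) {
        /* Steps 8-11: the owner thread does not exist in the child. */
        int rc = pthread_mutex_trylock(&demo_mutex);
        _exit(rc == EBUSY ? 1 : 0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    sem_post(&may_unlock);          /* let thread 1 unlock and finish */
    pthread_join(t, NULL);
    sem_destroy(&mutex_is_locked);
    sem_destroy(&may_unlock);
    return result;
}
```

With pthread_mutex_lock in place of pthread_mutex_trylock, the child would wait at that call indefinitely, which is the deadlock in step 11.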
A function like doit, which causes problems when fork is used in a multi-threaded program, is called a "fork-unsafe function". Conversely, a function that does not cause such problems is called a "fork-safe function". Some commercial UNIX systems (note *4) document the fork-safety of the functions the OS provides (system calls), but Linux (glibc), of course, does not. POSIX has no specific rules either, so it is almost impossible to tell which functions are fork-safe. If you do not know, it is best to treat a function as unsafe. (Wolfram Gloger said that the standard requires calling only async-signal-safe functions, so I looked into it and found this in the pthread_atfork documentation: "In the meantime (note *5), only a short list of async-signal-safe library routines are promised to be available." So that does seem to be the case.)
Simply put, malloc is a typical example of a function that maintains its own internal mutex, so it is generally fork-unsafe. Many functions depend on malloc, printf for example, and they are fork-unsafe as well.
So far I have written that threads plus fork is dangerous, but there is one special case worth mentioning: calling exec immediately after fork is, exceptionally, not a problem. Why? Once an exec function (note *6) is called, the process's in-memory data is reset to a completely clean state. Therefore, even in a multi-threaded process, a child that calls no dangerous functions after fork and only calls exec will not malfunction. Note, however, the word "immediately". Calling even a single printf("I'm child process") before exec creates a risk of deadlock.
Note: when exec runs the specified command, that command's memory image overwrites the memory the child process copied from the parent. As a result, none of the parent's data remains in the child process.
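If the child really has to emit a message between fork and exec, one option is to use only calls from POSIX's async-signal-safe list, such as write and _exit, instead of printf. The following is a minimal sketch under that assumption; the function name child_message_via_write and the pipe-based setup are invented for illustration and are not from the original article.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that sends msg through a pipe using only write() and _exit().
 * The parent copies what it read into buf and NUL-terminates it.
 * Returns the number of bytes read, or -1 on error. */
ssize_t child_message_via_write(const char *msg, char *buf, size_t buf_size) {
    int fds[2];
    size_t len = strlen(msg);       /* computed in the parent, before fork */
    ssize_t n;

    if (buf_size == 0 || pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: close(), write() and _exit() are async-signal-safe. */
        close(fds[0]);
        ssize_t written = write(fds[1], msg, len);
        _exit(written == (ssize_t)len ? 0 : 1);
    }

    close(fds[1]);
    n = read(fds[0], buf, buf_size - 1);   /* short messages arrive in one read */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

Unlike printf, write does no buffering or allocation in user space, so there is no library-internal lock for the child to inherit in a locked state.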
How can we avoid disasters?
Is there a way to avoid the deadlock and use fork safely in a multi-threaded program? Let's consider a few.
Avoidance method 1: Completely terminate all other threads before calling fork
If all other threads have completely terminated before fork, the problem does not occur. However, this works only when that is actually possible. Moreover, if for some reason fork is executed while another thread has failed to terminate, the result is a defect that is hard to analyze.
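As a rough illustration of avoidance method 1 (a sketch with made-up names, not the author's code), the worker thread below is joined before fork, so no thread can be holding the mutex at the moment the child is created.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t work_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical worker: takes and releases the mutex, then ends. */
static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&work_mutex);
    pthread_mutex_unlock(&work_mutex);
    return NULL;
}

/* Returns 0 if the child could lock the mutex, 1 if it found it locked,
 * and -1 on error. */
int fork_after_join(void) {
    pthread_t t;
    int status = 0;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)     /* no other thread is left after this */
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&work_mutex);
        _exit(rc == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The difficulty the text points out remains: this only helps if every thread in the process, including ones started by libraries, can really be joined first.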
Avoidance method 2: Call an exec function in the child process immediately after fork
(Added later: I had forgotten to write this one.)
When avoidance method 1 cannot be used, call execl or another exec-family function immediately after fork, without calling any other function (such as printf) in between. If your program has no need for "fork without exec", this is probably the practical avoidance method.
Note: the author probably means that the work the child process was supposed to do should be moved into a separate program, built as its own executable, and then started with an exec function.
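A minimal sketch of this pattern might look like the following. The helper name run_program is hypothetical; the calls it uses (fork, execv, _exit, waitpid) are standard POSIX. Nothing is called in the child between fork and execv, and _exit is used on the failure path because it is async-signal-safe.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at path with the given argv in a child process.
 * Returns the program's exit status, or -1 if it could not be obtained. */
int run_program(const char *path, char *const argv[]) {
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(path, argv);   /* nothing else is called between fork and exec */
        _exit(127);          /* reached only if execv itself failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing "/bin/sh" with the arguments {"sh", "-c", "exit 7", NULL} should make run_program return 7 on a system where /bin/sh exists.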
Avoidance method 3: Do no fork-unsafe processing in the "other threads"
Every thread other than the one that calls fork must do no fork-unsafe processing at all. When threads are used to speed up numerical computation (note *7), this may well be achievable, but in general applications it is not realistic. Even just finding out which functions are fork-safe is not easy. A fork-safe function must be async-signal-safe, and there are only a handful of those. That rules out malloc/new, printf, and the like.
Avoidance method 4: Use pthread_atfork to run a prepared callback just before fork
With pthread_atfork, a callback prepared in advance is called just before fork runs. In this callback, the process's in-memory data is forced into a clean state. However, the state of functions provided by the OS (such as malloc) cannot be cleaned up in the callback, because the data structures malloc uses internally are not visible from outside. Therefore, pthread_atfork has almost no practical value.
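For a mutex that your own code owns, the usual pthread_atfork pattern looks roughly like the sketch below: the prepare handler takes the lock before fork, and the parent and child handlers release it afterwards, so the child never inherits it in a locked state. All names here (app_mutex, before_fork, after_fork, fork_with_atfork_handlers) are hypothetical, and the 100 ms hold time is an arbitrary choice for the demonstration. The sketch does nothing about locks hidden inside library functions such as malloc, which is exactly the limitation described above.

```c
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static pthread_mutex_t app_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t handlers_once = PTHREAD_ONCE_INIT;
static sem_t worker_has_lock;

static void before_fork(void) { pthread_mutex_lock(&app_mutex); }    /* prepare */
static void after_fork(void)  { pthread_mutex_unlock(&app_mutex); }  /* parent and child */

/* Register the handlers only once per process. */
static void register_handlers(void) {
    pthread_atfork(before_fork, after_fork, after_fork);
}

/* Worker: holds app_mutex for about 100 ms. */
static void *worker(void *arg) {
    struct timespec ts = {0, 100 * 1000 * 1000};
    (void)arg;
    pthread_mutex_lock(&app_mutex);
    sem_post(&worker_has_lock);
    nanosleep(&ts, NULL);
    pthread_mutex_unlock(&app_mutex);
    return NULL;
}

/* Returns 0 if the child could lock app_mutex, 1 if it found it locked,
 * and -1 on error. */
int fork_with_atfork_handlers(void) {
    pthread_t t;
    int status = 0;
    int result = -1;

    if (sem_init(&worker_has_lock, 0, 0) != 0)
        return -1;
    pthread_once(&handlers_once, register_handlers);
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    sem_wait(&worker_has_lock);     /* call fork while the worker holds the lock */

    pid_t pid = fork();             /* before_fork() waits until the worker unlocks */
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&app_mutex);
        _exit(rc == 0 ? 0 : 1);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    pthread_join(t, NULL);
    sem_destroy(&worker_has_lock);
    return result;
}
```

The design choice is that fork itself is delayed until the lock is free, which trades the deadlock for a short wait in the forking thread.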
Avoidance method 5: Do not use fork at all in multi-threaded programs
Stop using fork and use pthread_create instead. This is a more practical method than avoidance method 2 and is recommended.
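As a minimal sketch of what this replacement can look like (the struct task, task_main and run_task_in_thread names are invented for illustration, and the doubling is only a placeholder for real work), the job that would have gone to a child process is handed to a thread and its result is collected with pthread_join.

```c
#include <pthread.h>

/* Hypothetical unit of work that would otherwise have been forked off. */
struct task {
    int input;
    int output;
};

static void *task_main(void *arg) {
    struct task *t = arg;
    t->output = t->input * 2;   /* placeholder for the real work */
    return NULL;
}

/* Runs the task in a new thread and returns its result, or -1 on error. */
int run_task_in_thread(int input) {
    struct task t = { input, 0 };
    pthread_t thread;

    if (pthread_create(&thread, NULL, task_main, &t) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;
    return t.output;
}
```

Because the thread shares the process's memory, the result can be read directly from the struct, with no pipe or exit status needed.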
*1: The system call that creates a child process
*2: Global variables and static variables inside functions
*3: On Linux, see the man page for pthread_atfork. It explains part of this sequence.
*4: Solaris, HP-UX, etc.
*5: The period from fork until exec is executed
*6: The execve system call
*7: Processing that performs only the four basic arithmetic operations is fork-safe