Use fork with caution in multithreaded programs
Preface
In the single-core era, programs were typically single-process and single-threaded. As hardware moved into the multi-core era, multi-process programming was gradually adopted to reduce response time and make full use of multi-core CPUs. However, creating a process is relatively expensive, so multithreaded programming gradually gained recognition and favor.
I remember wondering, when I was first learning about processes and threads, whether it would be even better to combine multiple processes with multiple threads. Looking back, that was "too young, too simple"; the reasons are discussed below.
Process and thread model
The classic definition of a process is an instance of a program in execution. Every program in the system runs in the context of some process. The context consists of the state the program needs in order to run correctly. This state includes the program's code and data stored in memory, its stack, the contents of its general-purpose registers, its program counter (PC), its environment variables, and its set of open file descriptors.
A process provides two key abstractions to applications:
- An independent logical control flow, which gives the illusion that our program has exclusive use of the processor.
- A private virtual address space, which gives the illusion that our program has exclusive use of the memory system.
A thread is a logical flow that runs in the context of a process. Threads are scheduled automatically by the kernel. Each thread has its own thread context, including a unique integer thread ID, a stack, a stack pointer, a program counter (PC), general-purpose registers, and condition codes. Each thread shares the rest of the process context with the other threads running in the same process. This includes the entire user virtual address space, which consists of read-only text (code), read/write data, the heap, and all shared library code and data areas. Threads also share the same set of open files.
That is, the process is the minimum unit of resource management, while the thread is the minimum unit of program execution.
On Linux, POSIX threads can be regarded as lightweight processes. Both pthread_create (for threads) and fork (for processes) are implemented on top of the kernel's clone function; they differ only in the options passed, such as whether the virtual address space and the file descriptors are shared.
Fork and Multithreading
We know that a child process created with fork is almost, but not exactly, identical to its parent. The child gets its own (independent) copy of the parent's user-level virtual address space, including the text, data, and bss segments, the heap, and the user stack. The child also gets copies of all of the parent's open file descriptors, which means it can read and write any file the parent had open. The most visible difference between parent and child is that they have different PIDs.
However, on Linux, fork copies only the calling thread into the child process. The fork(2) Linux man page describes this:
The child process is created with a single thread--the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.
That is to say, every thread except the one that called fork "evaporates" in the child process.
This is the root cause of all the problems caused by fork in multithreading.
Mutex lock
Mutexes are at the heart of most fork problems in multithreaded programs.
In most operating systems, for performance reasons, locks are implemented largely in user space rather than in the kernel (user space is the most convenient place to implement them, basically through the atomic operations or memory barriers mentioned in the previous article). Therefore, when fork is called, all of the parent's locks are copied into the child process along with their current state.
This is where the problem lies. From the operating system's perspective, each lock has an owner, namely the thread that locked it. Suppose one thread acquires a lock before the fork, and another thread then calls fork to create a child process. In the child, the thread that holds the lock has "disappeared". From the child's point of view the lock is locked "permanently", because its owner has "evaporated".
If any thread in the child process then tries to acquire that lock, it deadlocks.
Of course, some people may say that you can acquire all the locks before the fork and then release each of them in the child process. Leaving aside whether business logic and other factors even allow this, the approach brings its own problem: it imposes an implicit lock order. The locks must be acquired before the fork in an order consistent with every other place that takes them, and the child must release every one of them. Otherwise a deadlock can occur.
Even if you are sure you can get the ordering right without making any mistakes, there is still a hidden problem that is beyond your control: library functions.
You cannot be sure that none of the library functions you use touch shared data. A considerable number of thread-safe library functions are implemented by holding a mutex internally, for example the standard C/C++ library functions malloc and printf, which almost every program uses.
For example, a multithreaded program will almost certainly allocate dynamic memory before the fork, which means calling malloc, and the child process will almost certainly allocate dynamic memory after the fork, which also means calling malloc. This is not safe: malloc's internal lock may have been held by another thread at the moment of the fork, and that thread no longer exists in the child process.
Exec and file descriptor
Based on the analysis above, the only sensible thing to do in a child forked from a multithreaded program seems to be to call an exec function immediately. In fact, even this has a shortcoming. The child inherits all of the parent's open file descriptors, and by default they stay open across exec, so the program started by exec can still read and write files the parent had open. What if you do not want the child to be able to read or write a file that is open in the parent?
Perhaps fcntl can be used to set file attributes:
int fd = open("file", O_RDWR | O_CREAT, 0644);  /* O_CREAT requires a mode argument */
if (fd < 0) {
    perror("open");
}
fcntl(fd, F_SETFD, FD_CLOEXEC);  /* close this descriptor on exec */
However, if another thread calls fork after the file has been opened but before fcntl sets the CLOEXEC flag, the child process can still read and write the file. Guarding this window with a lock brings us back to the situation discussed above.
Starting with Linux kernel 2.6.23, we can pass the O_CLOEXEC flag to open, which makes "open the file and set CLOEXEC" a single atomic operation. A descriptor opened this way is closed automatically when the child calls exec, so the new program cannot read or write the file opened in the parent.
pthread_atfork
If you are unlucky enough to run into a fork problem in a multithreaded program, you can try pthread_atfork:
int pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void));
- The prepare handler is called in the parent process before fork creates the child process. Its job is to acquire all the locks defined by the parent process.
- The parent handler is called in the parent process after fork has created the child process, but before fork returns. Its job is to unlock all the locks acquired by prepare.
- The child handler is called in the child process before fork returns. Like the parent handler, it must unlock all the locks acquired by prepare.
Because the child inherits copies of the parent's locks, the parent and child handlers are not unlocking the same lock twice; each unlocks its own copy. You can call pthread_atfork multiple times to register multiple sets of fork handlers, but the handlers are then called in different orders: parent and child handlers are called in the order in which they were registered, while prepare handlers are called in the reverse order. This lets multiple modules register their own handlers while preserving the lock hierarchy (similar to the construction and destruction order of multiple RAII objects).
Note that pthread_atfork can only clean up locks, not condition variables. On some systems condition variables need no cleanup. On others, however, the implementation of condition variables contains a lock that does need cleanup, and there is no interface or method for doing so.
Conclusion
- In a multithreaded program, it is best to use fork only to call an exec function immediately, without doing anything else in the child process.
- If the child process is going to call an exec function, open file descriptors with the CLOEXEC flag before the fork, so that descriptors the new program should not see are closed automatically.