Ever since multithreaded programming appeared on Linux, the development of multithreaded Linux applications has been dogged by two issues: compatibility and efficiency. Starting from the threading model, this article analyzes the implementation of LinuxThreads, currently the most popular thread library on the Linux platform, along with its shortcomings, and describes how the Linux community views and addresses the problems of compatibility and efficiency.
I. Basic knowledge: Threads and processes
According to the textbook definition, a process is the smallest unit of resource management, and a thread is the smallest unit of program execution. In operating system design, the main purposes of evolving threads out of processes are to better support symmetric multiprocessing (SMP) and to reduce (process/thread) context switching overhead.
However the two are divided, a process needs at least one thread to execute its instructions. The process manages resources (CPU, memory, files, and so on) and assigns its threads to CPUs for execution. A process can of course have multiple threads. On an SMP machine, such a process can use several CPUs at once to run its threads, achieving maximum parallelism and improving efficiency. Even on a single-CPU machine, designing a program with a multithreaded model, like replacing a single-process model with a multi-process model, makes the design cleaner, the functionality more complete, and execution more efficient; using multiple threads to respond to multiple inputs is one example. Anything the multithreaded model can do can also be done with a multi-process model, but a thread context switch costs much less than a process context switch. Semantically, too, responding to several inputs at the same time really means sharing every resource except the CPU.
Corresponding to these two purposes of the threading model, kernel-level threads and user-level threads were developed. The classification depends mainly on whether the thread scheduler lives inside or outside the kernel. The former is better suited to using multiprocessor resources concurrently, while the latter is more concerned with context switching overhead. Current commercial systems usually combine the two: the kernel provides threads to meet the needs of SMP systems, and a thread library implements a second thread mechanism in user space, so that one kernel thread also acts as the scheduler for several user-space threads. As with many technologies, such a "hybrid" usually brings higher efficiency but is also harder to implement. In keeping with its "keep it simple" design philosophy, Linux never planned to implement the hybrid model, although its implementation uses a "hybrid" of a different kind.
A thread mechanism can be implemented inside the operating system kernel or outside it. The latter obviously requires that the kernel implement at least processes, while the former generally requires that the kernel support processes as well. The kernel-level threading model obviously needs the former; the user-level threading model, however, is not necessarily built on the latter. As noted earlier, this discrepancy arises because the two classifications use different criteria.
When the kernel supports both processes and threads, a "many-to-many" thread-process model becomes possible: a thread of a process is scheduled by the kernel, and it can at the same time act as the scheduler of a pool of user-level threads, choosing a suitable user-level thread to run in its context. This is the "hybrid" threading model mentioned earlier; it meets the needs of multiprocessor systems while minimizing scheduling overhead. Most commercial operating systems (such as Digital UNIX, Solaris, and IRIX) use this model, which can fully implement the POSIX 1003.1c standard.
Threads implemented outside the kernel fall into two models, "one-to-one" and "many-to-one". The former maps each thread to one kernel process (possibly a lightweight process), so thread scheduling is equivalent to process scheduling and is left to the kernel. The latter implements multithreading entirely outside the kernel, and scheduling is also done in user mode. The many-to-one model is the pure user-level threading model mentioned earlier. Its out-of-kernel scheduler only needs to switch thread stacks, so scheduling overhead is very small. However, kernel signals (synchronous or asynchronous) are delivered per process and cannot be targeted at a thread, so this implementation cannot be used on multiprocessor systems, for which demand keeps growing. In practice, pure user-level thread implementations have almost disappeared except for algorithm research.
The Linux kernel supports only lightweight processes, which limits the implementation of more efficient threading models, but Linux has focused on optimizing process scheduling overhead, which partly compensates for this shortcoming. LinuxThreads, currently the most popular threading mechanism, uses the thread-process "one-to-one" model: scheduling is left to the kernel, while a thread management mechanism, including signal handling, is implemented at user level. How Linux and LinuxThreads work together is the focus of this article.
II. Lightweight process implementation in the Linux 2.4 kernel
The original definition of a process comprised three parts: the program, its resources, and its execution. The program usually means the code; the resources, at the operating system level, usually include memory, I/O, and signal handling; and the execution of the program is usually understood as the execution context, including its use of the CPU, which later evolved into the thread. Before the thread concept appeared, operating system designers reduced the cost of process switching by gradually revising the process concept, allowing a process's resources to be separated from the process itself so that several processes could share some of them, such as files, signals, data memory, and even code. This gave rise to the lightweight process. The Linux kernel has implemented lightweight processes since version 2.0.x. Applications use a single clone() system call interface and choose, through its parameters, whether to create a lightweight process or an ordinary process. In the kernel, clone() passes and interprets its arguments and then calls do_fork(), the kernel function that also ultimately implements the fork() and vfork() system calls:
<linux-2.4.20/kernel/fork.c>
    int do_fork(unsigned long clone_flags, unsigned long stack_start,
                struct pt_regs *regs, unsigned long stack_size)
clone_flags is the bitwise OR of the following macros:
<linux-2.4.20/include/linux/sched.h>
    #define CSIGNAL       0x000000ff  /* signal mask to be sent at exit */
    #define CLONE_VM      0x00000100  /* set if VM shared between processes */
    #define CLONE_FS      0x00000200  /* set if fs info shared between processes */
    #define CLONE_FILES   0x00000400  /* set if open files shared between processes */
    #define CLONE_SIGHAND 0x00000800  /* set if signal handlers and blocked signals shared */
    #define CLONE_PID     0x00001000  /* set if pid shared */
    #define CLONE_PTRACE  0x00002000  /* set if we want to let tracing continue on the child too */
    #define CLONE_VFORK   0x00004000  /* set if the parent wants the child to wake it up on mm_release */
    #define CLONE_PARENT  0x00008000  /* set if we want to have the same parent as the cloner */
    #define CLONE_THREAD  0x00010000  /* Same thread group? */
    #define CLONE_NEWNS   0x00020000  /* New namespace group? */
    #define CLONE_SIGNAL  (CLONE_SIGHAND | CLONE_THREAD)
In do_fork(), different clone_flags lead to different behavior. LinuxThreads calls clone() with (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND) to create a "thread", which means that the new task shares memory, file system information, file descriptors, and signal handling with its creator. The rest of this section looks at how the Linux kernel implements the sharing of each of these resources.
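As a rough user-space illustration of these flags, the following sketch creates a lightweight process with the glibc clone() wrapper and the same four sharing flags. It is not LinuxThreads code: the names run_lightweight_process, child_main, and shared_value are invented for the example, the 64 KB stack size is arbitrary, and SIGCHLD is added as the exit signal (an assumption made for convenience) so that a plain waitpid() can reap the child.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size chosen for this sketch */

/* Written by the child; the parent sees the store only because of CLONE_VM. */
static volatile int shared_value = 0;

/* Entry point of the lightweight process. */
static int child_main(void *arg)
{
    shared_value = *(int *)arg;
    return 0;
}

/* Creates a lightweight process that stores `value` into shared memory.
 * Returns the value observed by the parent afterwards, or -1 on failure. */
int run_lightweight_process(int value)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The four sharing flags named in the text, plus SIGCHLD (assumption)
     * so that the child can be reaped with an ordinary waitpid(). */
    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD;

    /* The stack grows downward, so clone() receives its highest address. */
    pid_t pid = clone(child_main, stack + CHILD_STACK_SIZE, flags, &value);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int rc = waitpid(pid, NULL, 0);
    free(stack);
    return rc == -1 ? -1 : shared_value;
}
```

Calling run_lightweight_process(42) is expected to return 42, because the child's store lands in the address space it shares with the parent; without CLONE_VM the child would modify only its own copy.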
1. CLONE_VM
do_fork() calls copy_mm() to set the mm and active_mm fields of task_struct; these two mm_struct pointers describe the memory space associated with the process. If CLONE_VM is specified in the do_fork() call, copy_mm() makes mm and active_mm in the new task_struct the same as those of current and increments the user count of that mm_struct (mm_struct::mm_users). In other words, the lightweight process shares its memory address space with the parent process.
[Figure: the place of mm_struct within a process]
2. CLONE_FS
task_struct uses fs (struct fs_struct *) to record the root directory and current directory of the process in the file system. do_fork() calls copy_fs() to copy this structure; for a lightweight process it only increments fs->count and shares the parent's fs_struct. In other words, lightweight processes have no independent file system information, and if any thread in the process changes the current directory, the root directory, or similar state, the change directly affects the other threads.
3. CLONE_FILES
A process may have files open. The files field (struct files_struct *) of task_struct holds the information about the file structures (struct file) that the process has opened, and do_fork() calls copy_files() to handle this attribute. For a lightweight process, copy_files() only increments files->count and shares the parent's files_struct. Through this sharing, any thread can access the open files maintained by the process, and operations on them are immediately visible to the other threads in the process.
4. CLONE_SIGHAND
Each Linux process can define how it handles signals. This configuration is stored in an array of struct k_sigaction inside sig (struct signal_struct) in task_struct, and copy_sighand() in do_fork() is responsible for copying it. For a lightweight process it does not copy the structure but only increments signal_struct::count, sharing the structure with the parent. In other words, the child handles signals in exactly the same way as the parent, and a change made by either one takes effect for both.
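The four copy_*() routines described above follow the same share-or-copy pattern. The sketch below models that pattern for the fs_struct case only. It is a simplified user-space model rather than kernel code: struct model_fs, struct model_task, model_copy_fs(), and MODEL_CLONE_FS are hypothetical names, and the real kernel structures carry far more state.

```c
#include <stdlib.h>
#include <string.h>

#define MODEL_CLONE_FS 0x00000200UL   /* same bit value as CLONE_FS */

/* Hypothetical, heavily reduced stand-in for struct fs_struct. */
struct model_fs {
    int  count;       /* number of tasks sharing this structure */
    char cwd[64];     /* current directory */
};

/* Hypothetical, heavily reduced stand-in for struct task_struct. */
struct model_task {
    struct model_fs *fs;
};

/* Share the parent's structure when the flag is set, otherwise give the
 * child a private copy. Returns 0 on success, -1 if allocation fails. */
int model_copy_fs(unsigned long clone_flags,
                  const struct model_task *parent, struct model_task *child)
{
    if (clone_flags & MODEL_CLONE_FS) {
        parent->fs->count++;          /* lightweight process: share and count */
        child->fs = parent->fs;
        return 0;
    }

    struct model_fs *copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return -1;
    memcpy(copy, parent->fs, sizeof(*copy));   /* ordinary process: own copy */
    copy->count = 1;
    child->fs = copy;
    return 0;
}
```

With the flag set, a change of cwd made through the child is visible through the parent, which mirrors the behavior described for threads that share an fs_struct.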
do_fork() does much more work that is not described in detail here. On SMP systems, every forked process is initially assigned to the same CPU as its parent, and a CPU is chosen only when the process is scheduled.
Although Linux supports lightweight processes, it cannot be said to support kernel-level threads: Linux "threads" and "processes" are scheduled at the same level and share one process identifier space. These limitations make it impossible to implement the POSIX threading mechanism in the full sense on Linux, so the many Linux thread library implementations can only implement most of the POSIX semantics and approximate the rest as closely as possible.
III. The LinuxThreads threading mechanism
LinuxThreads is currently the most widely used thread library on the Linux platform. It was developed by Xavier Leroy and is bundled with glibc releases. It implements a "one-to-one" threading model on top of kernel lightweight processes: each thread entity corresponds to one kernel lightweight process, and thread management is implemented in a library outside the kernel.
1. Thread description data structure and implementation limitations
LinuxThreads defines the data structure struct _pthread_descr_struct to describe a thread and uses the global array __pthread_handles to describe and reference the threads of a process. The first two entries of __pthread_handles hold two global system threads defined by LinuxThreads, __pthread_initial_thread and __pthread_manager_thread, and __pthread_main_thread identifies the parent thread of __pthread_manager_thread (initially __pthread_initial_thread).
struct _pthread_descr_struct is a circular doubly linked list structure. The list containing __pthread_manager_thread has only that one element; __pthread_manager_thread is in fact a special thread, and LinuxThreads uses only three of its fields: errno, p_pid, and p_priority. The list containing __pthread_main_thread links together all user threads in the process.
[Figure: the __pthread_handles array after a series of pthread_create() calls]
A newly created thread first takes an entry in the __pthread_handles array and is then linked, through the list pointers in its descriptor, into the list headed by __pthread_main_thread. The use of this list comes up again in the discussion of thread creation and release.
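The list handling described above can be pictured with a small model of a circular doubly linked list. This is only a sketch: struct model_descr and its field names are hypothetical, the real descriptor holds many more fields, and inserting at the tail of the ring is an assumption rather than a statement about where LinuxThreads links a new thread.

```c
#include <stddef.h>

/* Hypothetical, reduced thread descriptor: list links plus a thread ID. */
struct model_descr {
    struct model_descr *next_live;
    struct model_descr *prev_live;
    unsigned long       tid;
};

/* Turn a descriptor into a one-element ring, as for the list head. */
void model_ring_init(struct model_descr *head)
{
    head->next_live = head;
    head->prev_live = head;
}

/* Link a new descriptor into the ring just before the head (at the tail). */
void model_ring_add(struct model_descr *head, struct model_descr *th)
{
    th->prev_live = head->prev_live;
    th->next_live = head;
    head->prev_live->next_live = th;
    head->prev_live = th;
}

/* Unlink a descriptor, for example when its thread has exited. */
void model_ring_remove(struct model_descr *th)
{
    th->prev_live->next_live = th->next_live;
    th->next_live->prev_live = th->prev_live;
    th->next_live = th;
    th->prev_live = th;
}

/* Count the descriptors on the ring, including the head itself. */
size_t model_ring_length(const struct model_descr *head)
{
    size_t n = 1;
    for (const struct model_descr *p = head->next_live; p != head; p = p->next_live)
        n++;
    return n;
}
```

A ring with a single element, as described for the management thread, is simply a head whose two links point back at itself.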
LinuxThreads follows the POSIX 1003.1c standard, which places limits on the scope of a thread library implementation, such as the maximum number of threads per process and the size of the thread-private data area. The LinuxThreads implementation basically honors these limits but also changes some of them, generally by relaxing or raising them to make programming more convenient. The macros that define the limits are concentrated in sysdeps/unix/sysv/linux/bits/local_lim.h (the file location differs between platforms) and include the following:
- Number of thread-private data keys per process: POSIX defines _POSIX_THREAD_KEYS_MAX as 128; LinuxThreads uses PTHREAD_KEYS_MAX, 1024.
- Number of destructor iterations allowed when private data is released: LinuxThreads matches POSIX, defining PTHREAD_DESTRUCTOR_ITERATIONS as 4.
- Number of threads per process: POSIX defines 64; LinuxThreads raises it to 1024 (PTHREAD_THREADS_MAX).
- Minimum thread stack size: not specified by POSIX; LinuxThreads uses PTHREAD_STACK_MIN, 16384 bytes.
2. The management thread
One benefit of the "one-to-one" model is that thread scheduling is done by the kernel, while other work, such as thread cancellation and synchronization between threads, is done in the thread library outside the kernel. LinuxThreads creates a dedicated management thread for each process to handle thread-related management. When a process calls pthread_create() for the first time, it creates (with __clone()) and starts the management thread.
Within a process, the management thread communicates with the other threads through a pair of "management pipes" (manager_pipe[2]). The pipe is created before the management thread. Once the management thread has started successfully, the read and write ends of the pipe are assigned to the global variables __pthread_manager_reader and __pthread_manager_request respectively, and from then on every user thread sends its requests to the management thread through __pthread_manager_request. The management thread itself does not use __pthread_manager_reader directly: the read end of the pipe (manager_pipe[0]) is passed to it as one of the arguments of __clone(). The main job of the management thread is to listen on the read end of the pipe and respond to the requests it reads from it.
[Figure: the process of creating the management thread; the global variable __pthread_manager_request is initially -1]
When initialization is complete, __pthread_manager_thread records the lightweight process ID and the thread ID that is assigned and managed outside the kernel; this thread ID, 2*PTHREAD_THREADS_MAX+1, cannot collide with any regular user thread ID. The management thread runs as a child of the thread that called pthread_create(), and the user threads requested through pthread_create() are created by the management thread calling clone(), so they are actually children of the management thread. ("Child thread" here should be understood as child process.)
__pthread_manager() is the main loop of the management thread. After a series of initialization steps it enters a while(1) loop, in which it polls the read end of the management pipe with a 2-second timeout (__poll()). Before handling a request, it checks whether its parent thread (the main thread that created the manager) has exited, and if so it terminates the whole process. If any child threads have exited and need cleaning up, it calls pthread_reap_children().
It then reads the request from the pipe and performs the corresponding action (a switch-case) according to the request type. The source code that handles each request is fairly clear and is not repeated here.
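The request path described above, a user thread writing to one end of a pipe and the management thread polling the other end with a timeout, can be sketched as follows. This is an illustrative model, not the LinuxThreads source: struct model_request, the MODEL_REQ_* values, and both function names are hypothetical, and the real request structure carries more information.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical request kinds; the article itself only names REQ_CREATE. */
enum model_req_kind { MODEL_REQ_CREATE = 1, MODEL_REQ_OTHER = 2 };

/* Hypothetical request record written into the management pipe. */
struct model_request {
    enum model_req_kind kind;
    int                 arg;
};

/* User-thread side: send one request through the write end of the pipe.
 * Returns 0 on success, -1 on error. */
int model_send_request(int write_fd, const struct model_request *req)
{
    ssize_t n = write(write_fd, req, sizeof(*req));
    return n == (ssize_t)sizeof(*req) ? 0 : -1;
}

/* Manager side: one iteration of the loop. Waits up to timeout_ms for a
 * request on the read end. Returns 1 if a request was read into *req,
 * 0 on timeout, and -1 on error. */
int model_manager_poll_once(int read_fd, int timeout_ms,
                            struct model_request *req)
{
    struct pollfd pfd = { .fd = read_fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready;               /* 0: timed out, -1: poll() failed */
    if (!(pfd.revents & POLLIN))
        return -1;                  /* hang-up or error on the pipe */

    ssize_t n = read(read_fd, req, sizeof(*req));
    return n == (ssize_t)sizeof(*req) ? 1 : -1;
}
```

A manager loop in this style would call model_manager_poll_once() with a timeout of 2000 ms, use the timeout case to run its periodic checks, and dispatch on req->kind when a request arrives.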
3. Thread Stacks
In LinuxThreads, the stack of the management thread is separate from the stacks of user threads. The management thread allocates a region of THREAD_MANAGER_STACK_SIZE bytes from the process heap with malloc() to serve as its own stack.
How user thread stacks are allocated varies by architecture and is controlled mainly by two macros. One is NEED_SEPARATE_REGISTER_STACK, which is used only on the IA64 platform. The other is FLOATING_STACKS, used on a few platforms such as i386, in which case the location of the user thread stack is decided by the system and the stack is protected. In addition, users can specify a user-defined stack through the thread attribute structure. For reasons of space, only the two stack layouts used on the i386 platform are analyzed here: the FLOATING_STACKS mode and the user-defined mode.
In FLOATING_STACKS mode, LinuxThreads uses mmap() to allocate 8 MB from the kernel (the default maximum stack size on i386; if a resource limit (rlimit) is in effect, the size follows that limit) and uses mprotect() to make the first page inaccessible. Within the 8 MB region, this protected page at the low-address end is used to detect stack overflow.
For a user-specified stack, LinuxThreads aligns the supplied pointer, sets the top of the thread stack, and computes the bottom of the stack. No protection is applied; correctness is the user's responsibility.
Whichever layout is used, the thread descriptor is always located at the top of the stack, immediately adjacent to the stack itself.
4. Thread ID and Process ID
Each LinuxThreads thread has both a thread ID and a process ID. The process ID is the process number maintained by the kernel, while the thread ID is assigned and maintained by LinuxThreads.
The thread ID of __pthread_initial_thread is PTHREAD_THREADS_MAX, that of __pthread_manager_thread is 2*PTHREAD_THREADS_MAX+1, and that of the first user thread is PTHREAD_THREADS_MAX+2. From then on, the thread ID of the nth user thread follows the formula:
    tid = n*PTHREAD_THREADS_MAX + n + 1
This allocation scheme guarantees that no two threads in the process (including threads that have already exited) have the same thread ID. Because the thread ID type pthread_t is defined as unsigned long int, thread IDs will not repeat within any reasonable run time.
Looking up the thread data structure from a thread ID is done in pthread_handle(), which in effect just takes the thread ID modulo PTHREAD_THREADS_MAX to obtain the thread's index in __pthread_handles.
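The ID formula and the modulo lookup can be checked with a few lines of C. The function names and MODEL_THREADS_MAX are hypothetical; the value 1024 is the PTHREAD_THREADS_MAX figure quoted earlier in the text.

```c
#define MODEL_THREADS_MAX 1024UL   /* PTHREAD_THREADS_MAX as quoted in the text */

/* Thread ID of the nth user thread (n >= 1), following the formula above. */
unsigned long model_user_tid(unsigned long n)
{
    return n * MODEL_THREADS_MAX + n + 1;
}

/* Index of a thread in the handle table: the thread ID modulo the table size. */
unsigned long model_handle_index(unsigned long tid)
{
    return tid % MODEL_THREADS_MAX;
}
```

With these definitions the initial thread (ID 1024) maps to index 0, the management thread (ID 2*1024+1 = 2049) maps to index 1, and the first user thread (ID 1026) maps to index 2, which matches the statement that the two system threads occupy the first two entries of the array.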
5. Creation of threads
After pthread_create() sends a REQ_CREATE request to the management thread, the management thread calls pthread_handle_create() to create the new thread. After allocating the stack and setting the thread attributes, it calls __clone() with pthread_start_thread() as the entry function to create and start the new thread. pthread_start_thread() reads its own process ID and stores it in the thread descriptor, then configures scheduling according to the scheduling policy recorded there. When everything is ready, it calls the real thread function and, after that function returns, calls pthread_exit() to clean up.
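The role of the entry function, which runs first in the new lightweight process, records the process ID, calls the user's function, and then cleans up, can be modeled as a small trampoline. This is a sketch with hypothetical names (struct model_thread, model_start_thread, model_demo_routine); scheduling setup is omitted, and a flag stands in for the cleanup that pthread_exit() performs.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, reduced thread descriptor for this sketch. */
struct model_thread {
    pid_t  pid;                        /* filled in by the new thread itself */
    void *(*start_routine)(void *);    /* the user's thread function */
    void  *arg;                        /* argument for start_routine */
    void  *retval;                     /* value returned by start_routine */
    int    exited;                     /* stands in for exit-time cleanup */
};

/* Entry trampoline with the int (*)(void *) shape that clone() expects. */
int model_start_thread(void *descr)
{
    struct model_thread *self = descr;

    self->pid = getpid();                           /* record own process ID */
    self->retval = self->start_routine(self->arg);  /* run the real function */
    self->exited = 1;                               /* cleanup would go here */
    return 0;
}

/* Example user function: doubles the integer it is given. */
void *model_demo_routine(void *arg)
{
    int *value = arg;
    *value *= 2;
    return value;
}
```

Passing model_start_thread and a filled-in descriptor to clone() would run the user function in a new lightweight process; calling the trampoline directly, as a test might, exercises the same sequence in the current process.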
6. Shortcomings of LinuxThreads
Because of limitations in the Linux kernel and the difficulty of implementation, LinuxThreads is not fully POSIX compliant; its README describes the deviations.
1) Process ID issue
This is the most critical shortcoming, and its cause lies in the "one-to-one" model of LinuxThreads.
The Linux kernel does not support threads in the true sense. LinuxThreads is built on lightweight processes, which the kernel schedules exactly like ordinary processes. These lightweight processes have independent process IDs and have the same capabilities as ordinary processes in scheduling, signal handling, and I/O. As a reader of the source can see, clone() in the Linux kernel does not implement support for the CLONE_PID flag.
do_fork() handles CLONE_PID as follows:
    if (clone_flags & CLONE_PID) {
        if (current->pid)
            goto fork_out;
    }
This code shows that the current Linux kernel accepts the CLONE_PID flag only when the PID of the calling process is 0. In practice, CLONE_PID is used only during SMP initialization, when processes are created by hand.
According to POSIX, all threads of the same process should share one process ID and one parent process ID, which is impossible under the current "one-to-one" model.
2) Signal handling issues
Asynchronous signals are delivered by the kernel per process, and every LinuxThreads thread is a process as far as the kernel is concerned; no "thread group" is implemented. As a result, some semantics do not conform to POSIX. For example, sending a signal to all threads in a process is not implemented, as the README explains.
If the kernel does not provide real-time signals, LinuxThreads uses SIGUSR1 and SIGUSR2 as its internal restart and cancel signals, so applications cannot use these two signals, which were originally reserved for users. Kernels after Linux 2.1.60 support extended real-time signals (from _SIGRTMIN to _SIGRTMAX), so the problem does not arise there.
The default actions of some signals are hard to implement in the current design. For SIGSTOP and SIGCONT, for example, LinuxThreads can suspend only one thread, not the whole process.
3) Total number of threads
LinuxThreads sets the maximum number of threads per process to 1024, but in practice the number is also limited by the total number of processes in the system, because each thread is really a kernel process.
Kernel 2.4.x uses a new method of computing the total process limit, so that the total number of processes is limited essentially only by the amount of physical memory. The formula is in the fork_init() function in kernel/fork.c:
    max_threads = mempages / (THREAD_SIZE/PAGE_SIZE) / 8
On i386, THREAD_SIZE = 2*PAGE_SIZE, PAGE_SIZE = 2^12 (4 KB), and mempages = physical memory size / PAGE_SIZE. For a machine with 256 MB of memory, mempages = 256*2^20 / 2^12 = 256*2^8, so the maximum number of threads is 4096.
However, to ensure that the processes of any one user (other than root) cannot take up more than half of that total, fork_init() goes on to set:
    init_task.rlim[RLIMIT_NPROC].rlim_cur = max_threads/2;
    init_task.rlim[RLIMIT_NPROC].rlim_max = max_threads/2;
do_fork() checks these process counts, so for LinuxThreads the total number of threads is constrained by all three factors: the per-process maximum, the system-wide maximum, and the per-user limit.
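The arithmetic above is easy to reproduce. The sketch below recomputes the system-wide and per-user limits from a memory size, using the i386 constants given in the text; the function and macro names are hypothetical and the code is a model of the calculation, not an excerpt from fork_init().

```c
#define MODEL_PAGE_SIZE   4096UL                  /* 2^12 bytes on i386 */
#define MODEL_THREAD_SIZE (2 * MODEL_PAGE_SIZE)   /* THREAD_SIZE on i386 */

/* System-wide limit: mempages / (THREAD_SIZE/PAGE_SIZE) / 8. */
unsigned long model_max_threads(unsigned long mem_bytes)
{
    unsigned long mempages = mem_bytes / MODEL_PAGE_SIZE;
    return mempages / (MODEL_THREAD_SIZE / MODEL_PAGE_SIZE) / 8;
}

/* Per-user limit (RLIMIT_NPROC): half of the system-wide limit. */
unsigned long model_user_nproc_limit(unsigned long mem_bytes)
{
    return model_max_threads(mem_bytes) / 2;
}
```

For 256 MB of memory, model_max_threads(256UL << 20) gives 4096 and model_user_nproc_limit(256UL << 20) gives 2048, in line with the worked example in the text.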
4) Management thread issues
The management thread can easily become a bottleneck, a problem common to this kind of structure. It is also responsible for cleaning up after user threads, so although it blocks most signals, if the management thread does die, the user threads have to be cleaned up by hand. Moreover, user threads do not know the state of the management thread, so subsequent requests such as thread creation will go unhandled.
5) Synchronization Issues
Thread synchronization in LinuxThreads is largely built on signals, and synchronizing through the kernel's complex signal handling machinery has always been an efficiency problem.
6) Other POSIX compatibility issues
Many Linux system calls are semantically tied to the process, such as nice, setuid, and setrlimit. Under the current LinuxThreads, these calls affect only the calling thread.
7) Real-time issues
Threads were introduced partly with real-time requirements in mind, but LinuxThreads does not yet support them; scheduling options, for example, are not implemented. This is not unique to LinuxThreads: standard Linux itself gives little consideration to real-time behavior.
IV. Other thread implementation mechanisms
The problems of LinuxThreads, especially the compatibility problems, have seriously hindered cross-platform applications such as Apache from adopting multithreaded designs on Linux, so threaded applications on Linux have remained at a fairly low level. Many people in the Linux community are working to improve threading performance, with both user-level thread libraries and thread libraries that combine kernel and user-level improvements. Two projects are currently the most promising: NPTL (Native POSIX Thread Library), led by Red Hat, and NGPT (Next Generation POSIX Threading), funded and developed by IBM. Both aim at full POSIX 1003.1c compatibility and do work both inside and outside the kernel; NGPT implements a many-to-many model, while NPTL, as described below, keeps a 1:1 model. Both make up for the shortcomings of LinuxThreads, and both are new designs started from scratch.
1. NPTL
The design goals of NPTL can be summarized as follows:
- POSIX compatibility
- Effective use of SMP
- Low startup overhead
- Low link overhead (that is, programs that do not use threads should not be affected by the thread library)
- Binary compatibility with LinuxThreads applications
- Scalability with hardware and software
- Multi-architecture support
- NUMA support
- Integration with C++
In its technical implementation, NPTL still uses a 1:1 threading model, and it works with glibc and the latest Linux 2.5.x development kernel to optimize signal handling, thread synchronization, memory management, and other areas. Unlike LinuxThreads, NPTL does not use a management thread; kernel threads are managed directly in the kernel, which also improves performance.
Mainly because of kernel issues, NPTL is still not 100% POSIX compatible, but its performance is greatly improved relative to LinuxThreads.
2. NGPT
NGPT, an open source project from IBM, released stable version 2.2.0 on January 10, 2003, but its documentation is rather lacking. As far as is known, NGPT is an M:N implementation based on the GNU Pth (GNU Portable Threads) project, and GNU Pth is a classic user-level thread library.
According to a notice posted on the official NGPT website in March 2003, in view of the growing acceptance of NPTL and to avoid the confusion caused by different thread library versions, NGPT will not be developed further and is now in maintenance mode only. In other words, NGPT has given up competing with NPTL to become the next-generation Linux POSIX thread library standard.
3. Other efficient threading mechanisms
Scheduler activations must be mentioned here. This multithreaded kernel structure, published by the ACM in 1991, influenced the design of many multithreaded kernels, including Mach 3.0, NetBSD, and the commercial Digital UNIX (now called Compaq Tru64 UNIX). Its essence is to use user-level thread scheduling while reducing, as far as possible, the system call requests that user level makes to the kernel, which are often a significant source of run-time overhead. A thread mechanism built on this structure effectively combines the flexibility and efficiency of user-level threads with the practicality of kernel-level threads, so the design communities of several open source operating systems, including Linux and FreeBSD, are researching it and trying to implement scheduler activations in their systems.
Resources
- [Linus Torvalds, 2002] Linux kernel source v2.4.20
- [GNU, 2002] glibc source v2.2.2 (contains LinuxThreads v0.9)
- [Thomas E. Terrill, 1997] An Introduction to Threads Using the LinuxThreads Interface
- [Ulrich Drepper, Ingo Molnar, 2003] The Native POSIX Thread Library for Linux
- http://www.ibm.com/developerworks/oss/pthreads/, NGPT official website
- [Ralf S. Engelschall, 2000] Portable Multithreading
- [Thomas E. Anderson, Brian N. Bershad, Edward D. Lazowska, Henry M. Levy, 1992] Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism
- On Linux Threads
Linux Threading Implementation Mechanism analysis (reprint)