Linux Threads: Lightweight Processes

Last Update:2018-12-03 Source: Internet

Author: User

Processes are a bit like living beings: they are created, they have a more or less productive life, they may create one or more child processes, and eventually they die. One small difference is that processes have no gender: each process has exactly one parent. This raises an important operating-system question: how are threads implemented in Linux? The Linux kernel has no separate concept of a thread. Does that make Linux backward? Not at all. Instead, Linux provides the lightweight process, a more flexible and scalable mechanism.

Linux supports multithreading, but implements it through lightweight processes.

From the kernel's point of view, the purpose of a process is to act as the entity to which system resources (CPU time, memory, and so on) are allocated. When a process is created, it is almost identical to its parent. It receives a (logical) copy of the parent's address space and executes the same code as the parent, beginning at the instruction that follows the process-creation system call (fork()). Although the parent and child may share the pages that contain the program code (text), they have separate copies of the data (including the heap and the stack, which are copied on write), so changes the child makes to a memory location are invisible to the parent.
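The isolation between parent and child can be checked with a minimal sketch. The names `counter` and `counter_seen_by_parent_after_fork` are illustrative and do not come from the original article; the sketch only assumes the standard POSIX fork() and waitpid() calls.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives in the data segment; after fork() each process has its own
 * logical copy, physically duplicated only when one of them writes. */
static int counter = 1;

/* Forks a child that overwrites `counter`, waits for it, and returns
 * the value the parent still sees (or -1 if fork()/waitpid() failed). */
int counter_seen_by_parent_after_fork(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        counter = 42;                  /* modifies only the child's copy */
        _exit(counter == 42 ? 0 : 1);  /* leave without running atexit handlers */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return counter;                    /* the child's write is not visible here */
}
```

Calling `counter_seen_by_parent_after_fork()` from `main` should return 1, even though the child stored 42 in its own copy of the variable.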

Modern Unix kernels have moved beyond this simple model. Most Unix systems support multithreaded applications: user programs with many relatively independent execution flows that share a large portion of the application's data structures. In such systems, a process is composed of several user threads, each of which represents an execution flow.

Earlier versions of the Linux kernel did not support multithreaded applications. From the kernel's point of view, a multithreaded application was just an ordinary process. Its multiple execution flows were created, handled, and scheduled entirely in user mode, usually through the standard C thread library defined by POSIX 1003.1c. The library provided thread creation and termination, synchronization primitives such as mutexes and condition variables, and the standard functions for scheduling and managing threads, all without kernel support.

However, this kind of implementation is not very satisfactory. Consider a well-known example from ULK-3 (Understanding the Linux Kernel, 3rd edition): a chess program uses two threads. One controls the graphical chessboard, waiting for the human player's moves and displaying the computer's moves; the other works out the computer's next move. While the first thread waits for the human to move, the second should keep running to take advantage of the player's thinking time. However, if the chess program is just a single process, the first thread cannot simply issue a blocking system call to wait for user input; otherwise, the second thread is blocked as well. Instead, the first thread must use sophisticated non-blocking techniques to ensure that the process remains runnable.
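To give a feel for the kind of non-blocking technique this forces on a single process, here is a minimal sketch. It assumes the player's moves arrive on a file descriptor; the function names (`make_nonblocking`, `try_read_move`, `think_until_move`) and the "search step" counter are hypothetical stand-ins, not part of the chess example in the book.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Puts fd into non-blocking mode; returns 0 on success, -1 on error. */
int make_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    if (fl < 0)
        return -1;
    return fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0 ? -1 : 0;
}

/* One step of the "board" flow: try to read the player's move.
 * Returns the number of bytes read (> 0), 0 if no move is available
 * yet, or -1 on error or end of input. */
ssize_t try_read_move(int fd, char *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n > 0)
        return n;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;               /* would have blocked: nothing to read yet */
    return -1;
}

/* Interleaves both flows by hand inside one process: keep "thinking"
 * until a move arrives. Stores the move length in *move_len and returns
 * the number of search steps performed, or -1 on error. */
long think_until_move(int fd, char *buf, size_t len, ssize_t *move_len)
{
    long steps = 0;
    for (;;) {
        ssize_t n = try_read_move(fd, buf, len);
        if (n != 0) {
            *move_len = n;
            return n > 0 ? steps : -1;
        }
        steps++;                /* placeholder for one unit of move search */
    }
}
```

The program has to poll and schedule its two flows itself, which is exactly the burden that kernel-scheduled threads remove.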

Linux now uses lightweight processes to offer better support for multithreaded applications. Two lightweight processes may share some resources, such as the address space and the open files. Whenever one of them modifies a shared resource, the other immediately sees the change. Of course, the two must synchronize with each other when they access a shared resource.

A straightforward way to implement multithreaded applications is to associate a lightweight process with each thread. The threads can then access the same set of application data structures simply by sharing the same memory address space and the same set of open files. At the same time, each thread can be scheduled independently by the kernel, so that one may sleep while another remains runnable. POSIX-compliant multithreaded applications are best handled by kernels that support "thread groups". In Linux, a thread group is basically a set of lightweight processes that implement a multithreaded application and act as a whole with regard to some system calls, such as getpid(), kill(), and _exit().
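The "act as a whole" behavior of getpid() can be observed from user space. The sketch below assumes a Linux system with a pthread library built on lightweight processes and uses the raw gettid system call to read the per-thread ID; `struct ids`, `record_ids`, and `compare_thread_ids` are illustrative names. On older toolchains it may need to be compiled with `-pthread`.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

struct ids {
    pid_t pid;   /* what getpid() reports: the thread group ID */
    pid_t tid;   /* the ID of the individual lightweight process */
};

/* Records both IDs for whichever thread calls it. */
static void *record_ids(void *arg)
{
    struct ids *out = arg;
    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Fills in the IDs seen by the calling thread and by a freshly created
 * worker thread. Returns 0 on success, -1 on failure. */
int compare_thread_ids(struct ids *caller_ids, struct ids *worker_ids)
{
    pthread_t worker;

    record_ids(caller_ids);
    if (pthread_create(&worker, NULL, record_ids, worker_ids) != 0)
        return -1;
    if (pthread_join(worker, NULL) != 0)
        return -1;
    return 0;
}
```

After a successful call, both structures should hold the same `pid` but different `tid` values: the two threads are distinct lightweight processes that belong to one thread group.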

In Linux, a new process created with the clone() system call can share the same user address space as its caller. In principle, the new process is then a thread of the calling process. Linux does not treat it specially, though: the kernel defines no dedicated data structure for threads, so threads and processes are structurally identical. This is one of Linux's strengths: its process model is sound enough to provide threads without additional data structures, and the thread implementation is also more flexible, because the caller chooses which resources to share. Pretty neat, isn't it?

1 The clone() system call

In Linux, a lightweight process is created by a system call named clone:
asmlinkage int sys_clone(struct pt_regs regs)
{
    unsigned long clone_flags;
    unsigned long newsp;
    int __user *parent_tidptr, *child_tidptr;

    clone_flags = regs.ebx;
    newsp = regs.ecx;
    parent_tidptr = (int __user *)regs.edx;
    child_tidptr = (int __user *)regs.edi;
    if (!newsp)
        newsp = regs.esp;
    return do_fork(clone_flags, newsp, &regs, 0, parent_tidptr, child_tidptr);
}

In fact, clone() is a wrapper function defined in the C library. It sets up the stack of the new lightweight process and invokes the clone system call. A process created by clone() can behave like a thread because the parameters you pass determine which resources it shares with its parent. The parameters are as follows:
fn: the function to be executed by the new process. When the function returns, the child terminates. The function returns an integer, which is the child's exit code.
arg: a pointer to the data passed to fn().
flags: miscellaneous information. The low byte specifies the signal number sent to the parent process when the child terminates; SIGCHLD is generally selected. The remaining three bytes encode a group of clone flags, listed in the following table:


Flag name	Description
CLONE_VM	Shares the memory descriptor and all page tables.
CLONE_FS	Shares the table that identifies the root directory and the current working directory, as well as the value of the bitmask used to mask the initial file permissions of a new file (the so-called file umask).
CLONE_FILES	Shares the table that identifies the open files.
CLONE_SIGHAND	Shares the tables that identify the signal handlers and the blocked and pending signals. If this flag is set, the CLONE_VM flag must also be set.
CLONE_PTRACE	If traced, the parent wants the child to be traced too. Furthermore, the debugger may want to trace the child on its own; in this case, the kernel forces the flag to 1.
CLONE_VFORK	Set when the system call issued is vfork().
CLONE_PARENT	Sets the parent of the child (the parent and real_parent fields in the process descriptor) to the parent of the calling process.
CLONE_THREAD	Inserts the child into the same thread group as the parent, and forces the child to share the signal descriptor of the parent. The child's tgid and group_leader fields are set accordingly. If this flag is set, the CLONE_SIGHAND flag must also be set.
CLONE_NEWNS	Set if the clone needs its own namespace, that is, its own view of the mounted filesystems; it is not possible to specify both CLONE_NEWNS and CLONE_FS.
CLONE_SYSVSEM	Shares the System V IPC undoable semaphore operations.
CLONE_SETTLS	Creates a new Thread Local Storage (TLS) segment for the lightweight process; the segment is described in the structure pointed to by the tls parameter.
CLONE_PARENT_SETTID	Writes the PID of the child into the user-mode variable of the parent pointed to by the ptid parameter.
CLONE_CHILD_CLEARTID	When set, the kernel sets up a mechanism to be triggered when the child process exits or when it starts executing a new program. In these cases, the kernel clears the user-mode variable pointed to by the ctid parameter and wakes up any process waiting for this event.
CLONE_DETACHED	A legacy flag ignored by the kernel.
CLONE_UNTRACED	Set by the kernel to override the value of the CLONE_PTRACE flag (used for disabling tracing of kernel threads).
CLONE_CHILD_SETTID	Writes the PID of the child into the user-mode variable of the child pointed to by the ctid parameter.
CLONE_STOPPED	Forces the child to start in the TASK_STOPPED state.

child_stack: the user-mode stack pointer to be assigned to the esp register of the child process. The calling process (the parent) should always allocate a new stack for the child.
tls: the address of the data structure that defines a Thread Local Storage segment for the new lightweight process. Meaningful only if the CLONE_SETTLS flag is set.
ptid: the address of a user-mode variable of the parent process that will hold the PID of the new lightweight process. Meaningful only if the CLONE_PARENT_SETTID flag is set.
ctid: the address of a user-mode variable of the new lightweight process that will hold the PID of that process. Meaningful only if the CLONE_CHILD_SETTID flag is set.
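The fn, arg, flags, and child_stack parameters can be seen together in a minimal sketch that uses the glibc clone() wrapper. The function names (`double_in_place`, `run_lightweight`), the 64 KiB stack size, and the exit code 7 are arbitrary choices made for illustration; the optional tls, ptid, and ctid arguments are omitted because none of the corresponding flags is set.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* arbitrary size for the child's stack */

/* fn: runs in the new lightweight process; its return value becomes
 * the child's exit code. */
static int double_in_place(void *arg)
{
    int *value = arg;            /* arg: the pointer handed to clone() */
    *value *= 2;                 /* visible to the parent because of CLONE_VM */
    return 7;
}

/* Runs double_in_place(value) in a lightweight process that shares the
 * caller's address space. Returns the child's exit code, or -1 on failure. */
int run_lightweight(int *value)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Low byte: signal sent to the parent on exit. Other bits: clone flags. */
    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD;

    /* The stack grows downward on common architectures, so child_stack
     * is the highest address of the allocated block. */
    pid_t pid = clone(double_in_place, stack + STACK_SIZE, flags, value);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;               /* stack not freed: the child may still use it */

    free(stack);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With `int v = 21;`, a call to `run_lightweight(&v)` should return 7 and leave 42 in `v`, since the child wrote directly into the parent's memory.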

However, the sys_clone() service routine that implements the clone() system call has no fn or arg parameters; it only receives a set of registers. This is not a problem: the wrapper function saves the fn pointer in the child's stack, at the position where the wrapper's own return address would be stored, and saves the arg pointer on the child's stack just below fn. When the wrapper function terminates, the CPU fetches the return address from the stack and executes fn(arg).

The traditional fork() system call is implemented by Linux as a clone() system call whose flags parameter specifies the SIGCHLD signal with all clone flags cleared, and whose child_stack parameter is the current stack pointer of the parent. The parent and child therefore temporarily share the same user-mode stack. Thanks to copy-on-write, however, each gets its own copy of the user-mode stack as soon as either one tries to modify it.

The vfork() system call is also implemented by Linux as a clone() system call, whose flags parameter specifies the SIGCHLD signal together with the CLONE_VM and CLONE_VFORK flags, and whose child_stack parameter is equal to the current stack pointer of the parent.
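The effect of these two flag combinations can be compared from user space. The sketch below is only an approximation of fork() and vfork(): it goes through the glibc clone() wrapper, which expects a separately allocated child stack, whereas the real system calls reuse the parent's stack pointer. The macro and function names are illustrative.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)                     /* arbitrary child stack size */

#define FORK_LIKE_FLAGS  (SIGCHLD)                 /* no clone flags set */
#define VFORK_LIKE_FLAGS (CLONE_VM | CLONE_VFORK | SIGCHLD)

/* Child body: store 1 in the integer that arg points to. */
static int set_to_one(void *arg)
{
    *(volatile int *)arg = 1;
    return 0;
}

/* Creates a child with the given flags, lets it run set_to_one() on a
 * variable owned by the parent, and returns the value the parent sees
 * afterwards (or -1 on failure). */
int parent_view_after_child(int flags)
{
    int seen = 0;
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    pid_t pid = clone(set_to_one, stack + STACK_SIZE, flags, &seen);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;               /* stack not freed: the child may still use it */

    free(stack);
    return seen;
}
```

With `FORK_LIKE_FLAGS` the child writes to its own copy-on-write copy, so the function should return 0. With `VFORK_LIKE_FLAGS` the child runs in the parent's address space while the parent is suspended, so the function should return 1.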

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
