Linux multitasking programming, part one: tasks, processes, threads

Source: CSDN (Wang Wensong), reposted from Linux commune

Introduction to multitasking under Linux

First, what is a multitasking system? What are tasks, processes, and threads, and how do they differ? This section gives a high-level view of the three; each is then explained in detail.

What is a multitasking system? A multitasking system is one in which multiple applications can run at the same time; each running application is called a task.

Task definition: A task is a logical concept: a piece of work performed by software, or a series of operations that together achieve a purpose.

Process definition: A process is one dynamic execution of a program with an independent function over a data set. It is the basic unit of resource allocation and scheduling in the system.

Thread definition: A thread is an execution path within a process. It is the smallest unit of processor scheduling and is also called a lightweight process.

These definitions can be a little dizzying, so here are the differences in plain terms. ① A task is typically one execution of a program, and a task consists of one or more subtasks that each perform an independent function; each subtask is a process or a thread. ② A process can have multiple threads, and every thread must belong to a parent process.

Task

A task is a logical concept: a piece of work performed by software, or a series of operations that together achieve a purpose. Typically a task is one execution of a program, and a task consists of one or more subtasks that each perform an independent function; each subtask is a process or a thread. For example, one run of an anti-virus program is a task whose purpose is to protect a computer system from viruses. That task includes several subtasks (processes or threads) with independent functions, such as real-time monitoring, scheduled scanning and removal, the firewall, and user interaction. The relationship between tasks, processes, and threads is shown in Figure 1.

Process

Basic concepts of processes

A process is one dynamic execution of a program with an independent function over a data set, and it is the basic unit of resource allocation and scheduling in the system. One run of a task can start multiple processes that execute concurrently and cooperate to accomplish the task's final goal.
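As a rough illustration of one task being carried out by several cooperating processes, the sketch below splits a summation across child processes and combines their partial results in the parent. The function name parallel_sum and the use of pipes to return results are assumptions made for this example; they are not from the original article.

```python
import os
import struct


def parallel_sum(numbers, workers=4):
    """Sum `numbers` by forking one child process per slice (hypothetical helper)."""
    children = []
    for i in range(workers):
        chunk = numbers[i::workers]           # every workers-th item, starting at i
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:                          # child: compute one partial sum
            os.close(read_fd)
            try:
                os.write(write_fd, struct.pack("q", sum(chunk)))
            finally:
                os._exit(0)                   # never fall back into the parent's code
        os.close(write_fd)                    # parent keeps only the read end
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        data = os.read(read_fd, 8)            # one 8-byte signed integer per child
        os.close(read_fd)
        os.waitpid(pid, 0)                    # collect the finished child
        total += struct.unpack("q", data)[0]
    return total


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101))))
```

Each child works on its own slice and the parent only combines the results, which mirrors the idea of subtasks that each complete an independent function.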

Processes are characterized by concurrency, dynamism, interactivity, independence, and asynchrony.

Processes and programs are fundamentally different. A program is a static piece of code: an ordered collection of instructions stored in nonvolatile storage, with no notion of execution. A process is a dynamic concept: it is the program in execution, covering its entire lifetime from dynamic creation through scheduling to termination. It is the basic unit of program execution and resource management.

The process structure under Linux

A process includes not only the program's instructions and data, but also the program counter, the contents of all processor registers, and the process stack that holds temporary data. An executing process therefore encompasses all of the processor's current activity.

Because Linux is a multitasking, multi-process operating system, a process must wait until the system grants it use of the processor before it can run. When the running process has to wait for some other system resource, the Linux kernel takes back control of the processor and assigns it to another waiting process, choosing that process according to the kernel's scheduling algorithm. In other words, the kernel does not let the processor sit idle.

The kernel keeps all processes in a circular doubly linked list (the process list), whose head is the init_task descriptor. Each item in the list is a structure of type task_struct, called the process descriptor, which contains all the information about one process and is defined in <include/linux/sched.h>. The task_struct structure is large and describes a process completely: the process state, basic process information, the process identifier, memory-related information, information about the parent process, the terminal associated with the process, the current working directory, open files, received signals, and so on.

The following explains in detail the two most important fields of the task_struct structure: state (the process state) and pid (the process identifier).

(1) Process status

A process in Linux is in one of the following states.

Running state (TASK_RUNNING): The process is currently running, or is in the run queue waiting to be scheduled.

Interruptible blocking state (TASK_INTERRUPTIBLE): The process is blocked (sleeping), waiting for some event to occur or for some resource to become available. A process in this state can be interrupted by a signal. When it receives a signal or is woken by an explicit wake-up call (for example the wake_up family of macros: wake_up, wake_up_interruptible, and so on), the process moves to the TASK_RUNNING state.

Uninterruptible blocking state (TASK_UNINTERRUPTIBLE): This state is similar to the interruptible blocking state (TASK_INTERRUPTIBLE), except that the process does not handle signals: delivering a signal to a process in this state does not change its state. The state is useful in specific cases where a process must wait for an event and the wait must not be interrupted. The process is woken only by an explicit wake-up call when the event it is waiting for occurs.

Killable blocking state (TASK_KILLABLE): This state behaves like TASK_UNINTERRUPTIBLE, except that a process in this state can respond to fatal signals. It can replace TASK_UNINTERRUPTIBLE, in which a process that is effectively stuck cannot be killed, as well as TASK_INTERRUPTIBLE, which is easy to wake but less safe.

Stopped state (TASK_STOPPED): Execution of the process is suspended. A process enters this state when it receives a signal such as SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU.

Traced state (TASK_TRACED): Execution of the process has been suspended by a debugger. When a process is being monitored by another process (for example a debugger using the ptrace() system call to monitor a program under test), any signal can put the process into the traced state.

Zombie state (EXIT_ZOMBIE): The process has ended, but its parent has not yet reaped it with a system call from the wait family (such as waitpid()); the process is waiting for its parent to destroy it. In this state the process has given up almost all of its memory, has no executable code, and cannot be scheduled. It merely keeps a slot in the process list that records its exit status and other information for other processes to collect.

Dead state (EXIT_DEAD): This is the final state. After the parent process reaps the child by calling a wait-family function, the process is removed from the system completely.

The transitions between these states are shown in Figure 2.

Kernel code can use the set_task_state and set_current_state macros to change the state of a specified process and of the current process, respectively.
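Some of these states can be observed from user space. The sketch below is a minimal example, assuming a Linux system with /proc mounted: it forks a child that sleeps, reads the one-letter state code from /proc/<pid>/stat, then sends SIGSTOP and reads the state again. The helper names (read_state, wait_for_state, observe_states) are made up for this illustration, and the letter-to-name table follows the proc man page rather than the original article.

```python
import os
import signal
import time

# One-letter codes reported in /proc/<pid>/stat (per the proc man page).
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "T": "stopped by a signal",
    "t": "stopped by a tracer",
    "Z": "zombie",
    "X": "dead",
}


def read_state(pid):
    """Return the one-letter state code of process `pid`."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state letter follows the command name, which is wrapped in parentheses.
    return data[data.rindex(")") + 2]


def wait_for_state(pid, wanted, timeout=5.0):
    """Poll until `pid` reports state `wanted`, or until the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if read_state(pid) == wanted:
            break
        time.sleep(0.01)
    return read_state(pid)


def observe_states():
    pid = os.fork()
    if pid == 0:
        time.sleep(60)                        # child: blocks in an interruptible sleep
        os._exit(0)
    try:
        sleeping = wait_for_state(pid, "S")
        os.kill(pid, signal.SIGSTOP)          # move the child to the stopped state
        stopped = wait_for_state(pid, "T")
    finally:
        os.kill(pid, signal.SIGKILL)          # clean up the child
        os.waitpid(pid, 0)
    return sleeping, stopped


if __name__ == "__main__":
    for code in observe_states():
        print(code, STATE_NAMES.get(code, "unknown"))
```

The polling loop is there because the child needs a moment to reach each state after fork() or after the signal is delivered.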

(2) Process identifier

The Linux kernel identifies each process with a unique process identifier, the PID (much as a file descriptor identifies an open file). The PID is stored in the pid field of the process descriptor. A newly created process usually receives the PID of the previously created process plus 1, but PID values have an upper limit (the maximum is PID_MAX_DEFAULT - 1, usually 32767). Readers can view /proc/sys/kernel/pid_max to find the limit on their system, which in turn bounds the number of processes.
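The limit can also be read programmatically. This is a small sketch, assuming a Linux system where /proc/sys/kernel/pid_max is readable; read_pid_max is a hypothetical helper name.

```python
import os


def read_pid_max(path="/proc/sys/kernel/pid_max"):
    """Return the configured PID limit; valid PIDs are smaller than this value."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    pid_max = read_pid_max()
    print("pid_max:", pid_max)
    print("largest possible PID:", pid_max - 1)
    print("PID of this process:", os.getpid())
```

On many systems the value is larger than the 32768 default mentioned above, because the limit can be raised by the administrator or the distribution.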

When the system starts, the kernel itself is usually represented as a process. The current macro, which yields a pointer to a task_struct, refers to the running process. In kernel code, current often appears as a pointer to the process descriptor; for example, current->pid is the PID of the process the processor is executing. When the system needs to visit every process, it uses the for_each_process() macro, which walks the process list and is much faster than searching an array.

The system calls that return the current process ID (PID) and the parent process ID (PPID) in Linux are getpid() and getppid(), respectively.
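Python exposes these two system calls as os.getpid() and os.getppid(). The sketch below prints both and, as a loose user-space analogue of walking the process list, enumerates the numeric directories under /proc. It is not the kernel's for_each_process() macro, and list_pids is a name invented for this example.

```python
import os


def list_pids():
    """Return the PIDs visible in /proc, one numeric directory per process."""
    return sorted(int(name) for name in os.listdir("/proc") if name.isdigit())


if __name__ == "__main__":
    print("PID: ", os.getpid())               # wraps getpid()
    print("PPID:", os.getppid())              # wraps getppid()
    pids = list_pids()
    print("visible processes:", len(pids))
```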

Process creation, execution, termination

(1) Creation and execution of processes

First we need to be clear about what creation means and what execution means; I did not understand the distinction when I first read about it. Creating a process simply means bringing a new process into existence. As for execution, recall from the definition of a process that a subtask is what actually runs: put plainly, executing a process means having it do some work rather than sit there holding resources and doing nothing.

Many operating systems provide a spawn mechanism that creates a process in a new address space, reads in an executable file, and finally starts executing it. Process creation in Linux is unusual in that it splits these steps into two separate functions: fork() and the exec function family. First, fork() creates a child process by copying the current process (note that the resources are not actually copied at this point; see the copy-on-write technique). The child differs from the parent only in its PID, its PPID, and certain resources and statistics. The exec function family then reads the executable file, loads it into the address space, and starts running it.
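The two-step pattern can be sketched with Python's thin wrappers os.fork() and os.execv(). In this example the child replaces itself with a fresh Python interpreter that exits with a chosen code; the function name spawn_and_wait and the choice of program to execute are assumptions for illustration only.

```python
import os
import sys


def spawn_and_wait(exit_code):
    """Fork a child, exec a new program in it, and return (child PID, exit code)."""
    pid = os.fork()                           # step 1: duplicate the current process
    if pid == 0:
        try:
            # Step 2: replace the child's program image. On success execv never returns.
            os.execv(sys.executable,
                     [sys.executable, "-c", f"import sys; sys.exit({exit_code})"])
        finally:
            os._exit(127)                     # reached only if exec failed
    _, status = os.waitpid(pid, 0)            # parent: wait for the child to finish
    return pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    child_pid, code = spawn_and_wait(7)
    print("child", child_pid, "exited with", code)
```

fork() returns 0 in the child and the child's PID in the parent, which is how the two copies of the same code take different branches.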

(2) Termination of the process

Terminating a process also involves a good deal of tedious work: the system must reclaim the resources the process occupied and notify the parent process. Linux first puts the terminating process into the zombie state. At this point the process can no longer run; it exists only to provide information to its parent and to wait to be removed. When the parent process calls a wait-family function and obtains this information, the child is finally terminated and all the resources it still occupied are released.
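The zombie stage can be seen from user space. The sketch below is a minimal example, assuming Linux with /proc mounted and the default SIGCHLD handling: the child exits at once, the parent delays its wait call until /proc reports the child as a zombie, and only then reaps it. The helper names are hypothetical.

```python
import os
import time


def read_state(pid):
    """Return the one-letter state code from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    return data[data.rindex(")") + 2]         # letter after the "(command)" field


def observe_zombie(timeout=5.0):
    pid = os.fork()
    if pid == 0:
        os._exit(0)                           # child: terminate immediately

    # The parent has not called wait yet, so the exited child lingers as a zombie.
    deadline = time.monotonic() + timeout
    state = read_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = read_state(pid)

    os.waitpid(pid, 0)                        # reap: collect status, free the slot
    still_listed = os.path.exists(f"/proc/{pid}")
    return state, still_listed


if __name__ == "__main__":
    state, still_listed = observe_zombie()
    print("state before wait:", state)
    print("still in /proc after wait:", still_listed)
```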

Memory structure of the process

The Linux operating system uses virtual memory management, so each process has its own address space that does not interfere with those of other processes. On a 32-bit system this address space is a linear virtual space of 4 GB. Users see and work with virtual addresses only and cannot see the actual physical memory addresses. Virtual addressing not only protects the operating system (users have no direct access to physical addresses) but, more importantly, lets a user program use an address space larger than the actual physical memory.
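The isolation between address spaces can be demonstrated with a short sketch. After fork(), the child modifies a variable and reports both the new value and the variable's address through a pipe; the parent's copy is unchanged even though both processes see the same virtual address. The function name is made up for this example, and using id() as an address is a CPython implementation detail.

```python
import os


def child_write_is_private():
    """Return (parent's value, child's value, whether the addresses matched)."""
    value = [100]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        value[0] = 999                        # changes the child's copy only
        # In CPython, id() is the object's virtual address.
        os.write(write_fd, f"{id(value)} {value[0]}".encode())
        os._exit(0)

    os.close(write_fd)
    child_addr, child_value = os.read(read_fd, 64).decode().split()
    os.close(read_fd)
    os.waitpid(pid, 0)
    same_virtual_address = int(child_addr) == id(value)
    return value[0], int(child_value), same_virtual_address


if __name__ == "__main__":
    print(child_write_is_private())
```

The same virtual address refers to different physical memory in the two processes once the child has written to it.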

The 4 GB process address space is divided into two parts: user space and kernel space. User space runs from 0 to 3 GB (0xC0000000), and kernel space occupies 3 GB to 4 GB. Normally a user process can access only user-space virtual addresses and cannot access kernel-space virtual addresses. A user process can reach kernel space only through system calls, during which the kernel executes in kernel mode on the process's behalf. Whenever the process is switched, the user space changes with it, whereas the kernel space is mapped by the kernel and stays fixed regardless of which process is running. Kernel-space addresses have their own page tables, and each user process has a different page table. The user spaces of different processes are completely independent of one another. The virtual memory layout of a process is shown in Figure 3; user space includes the following functional areas:

Read-only segment: Contains the program code (.init and .text) and read-only data (.rodata).

Data segment: Stores global variables and static variables. The read-write data segment (.data) holds initialized global and static variables, and the BSS segment (.bss) holds uninitialized global and static variables.

Heap: Stores dynamically allocated data, which is normally allocated and released by the programmer. If the programmer does not release it, the operating system may reclaim it when the program ends.

Stack: Allocated and released automatically by the system. It stores function parameter values, local variable values, return addresses, and so on.

Memory-mapped region for shared libraries: The mapping area for the Linux dynamic linker and the code of other shared libraries.

Every process in a Linux system has a corresponding directory in the /proc file system (for example, the files in the /proc/1 directory describe the init process), so the address-space mappings of a process can be viewed through the proc file system. For example, if you run an application whose process ID is 13703, enter the command "cat /proc/13703/maps" to see the memory mappings of that process.
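The same file can be parsed from a program. The sketch below reads /proc/self/maps (the mappings of the calling process) and lists the named regions such as [heap], [stack], and the shared libraries. The function name read_maps is hypothetical, and the field layout assumed here is the one documented in the proc man page.

```python
def read_maps(pid="self"):
    """Parse /proc/<pid>/maps into (start, end, permissions, name) tuples."""
    regions = []
    with open(f"/proc/{pid}/maps") as f:
        for line in f:
            # Fields: address range, permissions, offset, device, inode, optional path.
            fields = line.split(None, 5)
            start, end = (int(part, 16) for part in fields[0].split("-"))
            name = fields[5].strip() if len(fields) > 5 else ""
            regions.append((start, end, fields[1], name))
    return regions


if __name__ == "__main__":
    for start, end, perms, name in read_maps():
        if name:                              # skip anonymous mappings
            print(f"{start:#x}-{end:#x} {perms} {name}")
```

The permissions column shows the split described above: code is mapped readable and executable, while data, heap, and stack regions are readable and writable.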

Thread

As already mentioned, the process is the basic unit of program execution and resource allocation in the system. Each process has its own data segment, code segment, and stack segment, which makes operations such as switching between processes require an expensive context switch. To further reduce processor idle time, support multiprocessors, and reduce context-switching overhead, another concept emerged as processes evolved: the thread. A thread is an execution path within a process, is the smallest unit of processor scheduling, and can also be called a lightweight process. Threads can access the memory space and resources allocated to their process and share them with the other threads of the same process. The context-switching cost of a thread is therefore much smaller than that of a process.

A process can have multiple threads, and every thread must belong to a parent process. A thread owns no system resources of its own; it has only the data structures it needs at run time, such as a stack, a register set, and a thread control block (TCB), and it shares all the resources owned by its parent process with the other threads of that process. Note that because threads share the resources and address space of the process, any thread's operations on those resources can affect the other threads. Synchronization among threads is therefore a very important issue. The relationship between processes and threads in a multithreaded system is shown in Figure 4.
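The sharing and the need for synchronization can be sketched with Python's threading module (Python 3.8 or later for threading.get_native_id()). Several threads in one process update a shared counter under a lock; each also records the process ID and its own kernel thread ID. The function name run_threads and the parameter values are assumptions for this example.

```python
import os
import threading


def run_threads(n_threads=4, increments=10000):
    """Return (final counter, list of (PID, kernel thread ID) seen by each thread)."""
    counter = 0
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)    # ensures all threads exist at once
    seen = []

    def worker():
        nonlocal counter
        barrier.wait()
        ids = (os.getpid(), threading.get_native_id())
        for _ in range(increments):
            with lock:                        # serialize access to the shared variable
                counter += 1
        with lock:
            seen.append(ids)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, seen


if __name__ == "__main__":
    total, seen = run_threads()
    print("counter:", total)
    for pid, tid in seen:
        print("process", pid, "thread", tid)
```

All threads report the same PID because they live in one process, while each has its own thread ID. The lock is what keeps the shared counter consistent when several threads update it.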

There are three kinds of threads in a Linux system: ① user threads ② lightweight processes ③ kernel threads
