The Linux kernel manages each process through a structure called the process descriptor, task_struct, which contains all the information the kernel needs about the process. It is defined in the include/linux/sched.h file.
task_struct is arguably one of the most complex structures in the Linux kernel source code: it has a large number of members and takes up a lot of memory.
Process state
/*
 * Task state bitmask. NOTE! These bits are also
 * encoded in fs/proc/array.c: get_task_state().
 *
 * We have two separate sets of flags: task->state
 * is about runnability, while task->exit_state are
 * about the task exiting. Confusing, but this way
 * modifying one set can't modify the other one by
 * mistake.
 */
#define TASK_RUNNING            0
#define TASK_INTERRUPTIBLE      1
#define TASK_UNINTERRUPTIBLE    2
#define __TASK_STOPPED          4
#define __TASK_TRACED           8
/* in tsk->exit_state */
#define EXIT_DEAD               16
#define EXIT_ZOMBIE             32
#define EXIT_TRACE              (EXIT_ZOMBIE | EXIT_DEAD)
/* in tsk->state again */
#define TASK_DEAD               64
#define TASK_WAKEKILL           128     /* wake on signals that are deadly */
#define TASK_WAKING             256
#define TASK_PARKED             512
#define TASK_NOLOAD             1024
#define TASK_STATE_MAX          2048
/* Convenience macros for the sake of set_task_state */
#define TASK_KILLABLE           (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define TASK_STOPPED            (TASK_WAKEKILL | __TASK_STOPPED)
#define TASK_TRACED             (TASK_WAKEKILL | __TASK_TRACED)
Five mutually exclusive states
Status | Description
TASK_RUNNING | The process is either executing or ready to execute, waiting to be given a CPU time slice
TASK_INTERRUPTIBLE | The process is suspended (blocked) while it waits for some condition, such as a hardware interrupt, the release of a resource, or the delivery of a signal. Once the awaited condition occurs, the process quickly returns from this blocked state to the ready state TASK_RUNNING
TASK_UNINTERRUPTIBLE | Similar to TASK_INTERRUPTIBLE, except that delivering a signal does not wake the process. A process in the TASK_UNINTERRUPTIBLE state is not woken by a signal or an external interrupt; it wakes only when the resource it waits for becomes available. This state is rarely used, but it matters a great deal in some situations, especially when a device driver probes hardware: the probing must not be interrupted, or the device could be left in an unpredictable state
TASK_STOPPED | Process execution has been stopped; the process enters this state after receiving a SIGSTOP, SIGTTIN, SIGTSTP, or SIGTTOU signal
TASK_TRACED | Process execution has been stopped by a debugger. While a process is being monitored by another process (such as a debugger), any signal may put it into this state
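Because the five states are either zero or distinct single bits, a state value can be mapped to a one-letter code by finding the position of its set bit. The following user-space sketch is written in the spirit of get_task_state(), which the header comment mentions; the helper name state_letter and the letter string "RSDTt" are assumptions made for illustration, not a copy of the kernel source.

```c
#include <stddef.h>

/* State values as listed in the bitmask above. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define __TASK_STOPPED        4
#define __TASK_TRACED         8

/*
 * Hypothetical helper: return a one-letter code for a state value.
 * Index 0 stands for TASK_RUNNING; bit n maps to index n + 1.
 * States outside the five covered here yield '?'.
 */
static char state_letter(unsigned long state)
{
    static const char letters[] = "RSDTt";   /* assumed letters */
    size_t idx = 0;

    while (state != 0) {
        idx++;
        state >>= 1;
    }
    return idx < sizeof(letters) - 1 ? letters[idx] : '?';
}
```

For example, state_letter(TASK_UNINTERRUPTIBLE) yields 'D', the letter tools such as ps conventionally show for uninterruptible sleep.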
Two termination states
Two additional process states can be stored both in the state field and in the exit_state field. A process reaches them only when it terminates.
/* task state */
int exit_state;
int exit_code, exit_signal;
Status | Description
EXIT_ZOMBIE | Process execution has terminated, but the parent process has not yet called wait() to collect its termination information; at this point the process is a zombie
EXIT_DEAD | The final state of the process
The exit_code and exit_signal members are discussed later.
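The zombie state can be observed from user space: a child that has exited stays a zombie until its parent collects the termination information with one of the wait() family of calls. The sketch below uses the POSIX calls fork(), _exit() and waitpid(); the helper name spawn_and_reap is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits with `code`, then reap it.
 * Returns the child's exit status, or -1 on error.
 */
static int spawn_and_reap(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    /* Once the child has exited, it remains a zombie until reaped here. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_reap(7) returns 7: the exit code travels from the terminated child to the parent through waitpid().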
New sleep state
As mentioned earlier, TASK_UNINTERRUPTIBLE and TASK_INTERRUPTIBLE are both sleep states. Let's look at how the kernel puts a process to sleep.
How the kernel puts a process to sleep
The Linux kernel provides two ways to put a process to sleep.
The normal way is to set the process state to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE and call the scheduler's schedule() function. This removes the process from the CPU run queue.
If the process sleeps in interruptible mode (its state is TASK_INTERRUPTIBLE), it can be woken either by an explicit wake-up call (wake_up_process()) or by a signal that needs to be handled.
If the process sleeps in uninterruptible mode (its state is TASK_UNINTERRUPTIBLE), it can be woken only by an explicit wake-up call. Prefer interruptible sleep, and use uninterruptible sleep only as a last resort, for example during device I/O, when signals are difficult to handle.
When a task in interruptible sleep receives a signal, it must handle the signal (unless the signal is masked), abandon what it was doing (which requires cleanup code), and return -EINTR to user space. Checking for such return codes and taking appropriate action is, again, the programmer's job.
As a result, lazy programmers may prefer uninterruptible sleep, because signals do not wake such tasks.
Note, however, that the wake-up call for an uninterruptible process may never arrive for some reason. The process then cannot be killed, and the only remedy is to reboot the system. On the one hand, you need to get these details right, because otherwise you introduce bugs on both the kernel side and the user side. On the other hand, you may create processes that can never be stopped (blocked processes that cannot be killed).
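On the user-space side, the -EINTR return described above shows up as a system call returning -1 with errno set to EINTR. A minimal sketch of the check the programmer is expected to write, assuming the desired policy is simply to restart the call; read_retry is a hypothetical helper name.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read(), restarted whenever a signal interrupts it. */
static ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n < 0 && errno == EINTR);   /* interrupted by a signal: retry */

    return n;                            /* bytes read, 0 at EOF, or -1 on a real error */
}
```

Restarting is only one possible policy; a program that uses signals to cancel I/O would treat EINTR as a reason to stop instead.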
The kernel now offers a new way to sleep.
Linux kernel 2.6.25 introduced a new process sleep state:
Status | Description
TASK_KILLABLE | A process in this killable sleep state behaves as in TASK_UNINTERRUPTIBLE, except that it can respond to fatal signals
It is defined as follows:
#define TASK_WAKEKILL           128     /* wake on signals that are deadly */
/* Convenience macros for the sake of set_task_state */
#define TASK_KILLABLE           (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define TASK_STOPPED            (TASK_WAKEKILL | __TASK_STOPPED)
#define TASK_TRACED             (TASK_WAKEKILL | __TASK_TRACED)
In other words, TASK_UNINTERRUPTIBLE + TASK_WAKEKILL = TASK_KILLABLE.
TASK_WAKEKILL wakes the process when a fatal signal is received. The new sleep state thus lets an otherwise uninterruptible sleep respond to fatal signals.
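The difference between the three sleep states comes down to which bits are set in the state value. The following is a simplified user-space model of that decision, not the kernel's own code; the function name signal_ends_sleep and its boolean parameter are assumptions made for illustration.

```c
#include <stdbool.h>

#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_WAKEKILL         128
#define TASK_KILLABLE         (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

/*
 * Simplified model: does a pending signal end a sleep in `state`?
 * `fatal` says whether the pending signal is a fatal one.
 */
static bool signal_ends_sleep(long state, bool fatal)
{
    if (state & TASK_INTERRUPTIBLE)
        return true;        /* any signal wakes an interruptible sleeper */
    if (state & TASK_WAKEKILL)
        return fatal;       /* killable sleep reacts to fatal signals only */
    return false;           /* plain uninterruptible sleep ignores signals */
}
```

With these definitions TASK_KILLABLE evaluates to 130 (128 | 2), and only the combination of TASK_KILLABLE with a fatal signal changes the outcome compared with TASK_UNINTERRUPTIBLE.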
Process state transitions and their causes are roughly as follows: [figure: process state transition diagram]
Process identifier (PID)
pid_t pid;
pid_t tgid;
UNIX systems use the PID to identify processes. Linux assigns a different PID to every process or lightweight process (thread) in the system, whereas UNIX programmers expect all threads of the same group to share a common PID. To comply with this standard, Linux introduces the concept of thread groups. All threads in a thread group share the PID of the group's leading thread, stored in the tgid field, and getpid() returns the tgid value of the current process rather than its pid value.
When CONFIG_BASE_SMALL is 0, PIDs range from 0 to 32767, so by default the system can hold at most 32,768 processes.
#define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000)
In other words, every thread in a thread group stores the PID of the group's leading thread (the first lightweight process in the group) in its tgid member, and only the leading thread has a pid equal to its tgid. This is why the getpid() system call returns the tgid value of the current process rather than its pid value.
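The pid/tgid distinction is visible from user space on Linux: getpid() reports the tgid, while the gettid system call reports the per-thread pid. A small sketch, assuming a Linux system where SYS_gettid is available through syscall(); the helper names thread_id and is_group_leader are hypothetical.

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Per-thread identifier of the caller (the pid member of its task_struct). */
static pid_t thread_id(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* True when the caller is the thread group leader, i.e. pid == tgid. */
static bool is_group_leader(void)
{
    return thread_id() == getpid();   /* getpid() reports the tgid */
}
```

Called from a program's initial thread, is_group_leader() returns true; called from any other thread of the same program, it returns false even though getpid() returns the same value in both.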
Process Kernel Stack
void *stack;
Kernel stacks and thread descriptors
For each process, the Linux kernel stores two different data structures in a single memory area allocated specifically for that process:
- One is the kernel-mode process stack.
- The other is thread_info, a small data structure closely tied to the process descriptor, called the thread descriptor.
Linux stores thread_info (the thread descriptor) and the kernel-mode stack together in this area, which is typically 8192 bytes (two page frames) and must start at an address that is a multiple of 8192.
In arch/x86/include/asm/page_32_types.h:
#define THREAD_SIZE_ORDER 1
#define THREAD_SIZE (PAGE_SIZE << THREAD_SIZE_ORDER)
For efficiency, the kernel places this 8 KB area in two contiguous page frames and aligns the start address of the first page frame to a multiple of 2^13.
A process running in kernel mode uses a stack in the kernel data segment, which is different from the stack it uses in user mode.
The user-mode stack lives in the linear address space of the process.
The kernel stack is used when the process switches from user space to kernel space: the privilege level changes, so the stack must be switched as well. Because kernel control paths use very little stack, a kernel stack of a few thousand bytes is enough.
Note that the kernel stack is used only by kernel routines; the Linux kernel also provides separate hard-interrupt and soft-interrupt stacks for interrupts.
The figure from ULK3 shows how the two data structures are stored in physical memory and how the kernel stack is associated with the process descriptor: the thread descriptor sits at the beginning of the memory area, while the stack grows downward from the end. [figure: kernel stack, thread_info, and process descriptor layout]
In newer kernel code, however, task_struct no longer holds a pointer of type thread_info directly. Instead it has a member of void pointer type (stack), and the thread_info structure is reached through a cast.
Related code in include/linux/sched.h:
#define task_thread_info(task) ((struct thread_info *)(task)->stack)
In the figure, the ESP register is the CPU stack pointer, which holds the address of the top of the stack. On 80x86 systems, the stack starts at the end of the memory area and grows toward its beginning. Right after the switch from user mode to kernel mode, the kernel stack of the process is always empty, so ESP points to the top of the stack. Each time data is written to the stack, the value of ESP is decremented.
Kernel stack data structures: thread_info and thread_union
thread_info is architecture-specific; the structure is defined in each architecture's thread_info.h.
The Linux kernel uses a union to represent the thread descriptor and kernel stack of a process:
union thread_union {
    struct thread_info thread_info;
    unsigned long stack[THREAD_SIZE/sizeof(long)];
};
Getting the thread_info of the process currently running on the CPU
The following shows how the thread_info structure of the process currently running on a CPU is obtained from the ESP stack pointer.
As mentioned above, the thread_info structure and the kernel stack are tightly coupled and occupy two page frames of physical memory, whose start address is aligned to 2^13.
Earlier versions did not need to support 64-bit processors, so the kernel could obtain the base address of the thread_info structure simply by masking out the low 13 bits of ESP.
The table below compares how different kernel versions obtain the thread_info of the running process.
Architecture | Version | Definition | Implementation | Explanation
x86 | 3.14 | current_thread_info(void) | return (struct thread_info *)(sp & ~(THREAD_SIZE - 1)); | Masks out the low 13 bits of ESP to obtain the address of thread_info
x86 | 3.15 | current_thread_info(void) | ti = (void *)(this_cpu_read_stable(kernel_stack) + KERNEL_STACK_OFFSET - THREAD_SIZE); |
x86 | 4.1 | current_thread_info(void) | (struct thread_info *)(current_top_of_stack() - THREAD_SIZE); |
Earlier versions
The current stack pointer (current_stack_pointer == sp) is ESP.
THREAD_SIZE is 8 KB, which in binary is 0000 0000 0000 0000 0010 0000 0000 0000.
~(THREAD_SIZE - 1) is therefore 1111 1111 1111 1111 1110 0000 0000 0000: its low 13 bits are all zero, so ANDing it with ESP clears the low 13 bits of ESP and yields the thread_info address.
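The masking step can be checked with plain integer arithmetic. The sketch below works on 32-bit values, as in the binary example above; DEMO_PAGE_SIZE (4 KB pages) and the function name thread_info_base are assumptions made for illustration.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE     4096u   /* assumption: 4 KB pages */
#define THREAD_SIZE_ORDER  1
#define THREAD_SIZE        (DEMO_PAGE_SIZE << THREAD_SIZE_ORDER)   /* 8192 */

/* Base of the 8 KB area that contains `sp`: clear the low 13 bits. */
static uint32_t thread_info_base(uint32_t sp)
{
    return sp & ~(uint32_t)(THREAD_SIZE - 1);
}
```

For a stack pointer of 0xC0123ABC the result is 0xC0122000: every address inside the same 8 KB area maps to the same base, which is where thread_info lives.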
What a process most often needs is the address of its process descriptor, task_struct, rather than that of the thread_info structure. To get the task_struct of the process running on the current CPU, the kernel provides the current macro, which is essentially equivalent to current_thread_info()->task, because thread_info begins with the member struct task_struct *task. It is defined in include/asm-generic/current.h:
#define get_current() (current_thread_info()->task)
#define current get_current()
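The whole lookup chain, from an address inside the kernel stack to thread_info to task_struct, can be imitated in user space. In this sketch the two structures are one-field stand-ins, the stack area comes from C11 aligned_alloc() rather than the kernel page allocator, and the helper names alloc_stack and task_from_sp are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u

struct task_struct { int pid; };                  /* stand-in with one field */
struct thread_info { struct task_struct *task; }; /* task is the first member */

union thread_union {
    struct thread_info thread_info;
    unsigned long stack[THREAD_SIZE / sizeof(long)];
};

/* Allocate a THREAD_SIZE-aligned stack area and link it to `tsk`. */
static union thread_union *alloc_stack(struct task_struct *tsk)
{
    union thread_union *u = aligned_alloc(THREAD_SIZE, sizeof(*u));

    if (u)
        u->thread_info.task = tsk;   /* thread_info sits at the base */
    return u;
}

/* Recover the owning task from any address inside the stack area. */
static struct task_struct *task_from_sp(const void *sp)
{
    uintptr_t base = (uintptr_t)sp & ~(uintptr_t)(THREAD_SIZE - 1);

    return ((struct thread_info *)base)->task;
}
```

Passing any address within the allocated area to task_from_sp(), for example &u->stack[100], returns the task that was linked by alloc_stack(); the caller releases the area with free().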
Allocation and destruction of thread_info
A process allocates its kernel stack through the alloc_thread_info_node function and frees the allocated kernel stack through the free_thread_info function.
# if THREAD_SIZE >= PAGE_SIZE
static struct thread_info *alloc_thread_info_node(struct task_struct *tsk,
                                                  int node)
{
    /* Allocate 2^THREAD_SIZE_ORDER contiguous pages on the given node. */
    struct page *page = alloc_kmem_pages_node(node, THREADINFO_GFP,
                                              THREAD_SIZE_ORDER);

    return page ? page_address(page) : NULL;
}

static inline void free_thread_info(struct thread_info *ti)
{
    free_kmem_pages((unsigned long)ti, THREAD_SIZE_ORDER);
}
# else
/* THREAD_SIZE is smaller than a page: use a dedicated slab cache. */
static struct kmem_cache *thread_info_cache;

static struct thread_info *alloc_thread_info_node(struct task_struct *tsk,
                                                  int node)
{
    return kmem_cache_alloc_node(thread_info_cache, THREADINFO_GFP, node);
}

static void free_thread_info(struct thread_info *ti)
{
    kmem_cache_free(thread_info_cache, ti);
}
# endif
Process flags
unsigned int flags; /* per process flags, defined below */
The flags reflect the status of the process, as opposed to its run state; the kernel uses them to identify what the process is currently doing and to decide on next steps.
The possible values of the flags member are listed below; the macros all start with PF (process flag).
See
http://lxr.free-electrons.com/source/include/linux/sched.h?v4.5#L2083
For example:
- PF_FORKNOEXEC: the process has just been created (forked) but has not yet executed a new program.
- PF_SUPERPRIV: the process used superuser privileges.
- PF_DUMPCORE: the process dumped core.
- PF_SIGNALED: the process was killed by a signal.
- PF_EXITING: the process is shutting down.
/*
 * Per process flags
 */
#define PF_EXITING        0x00000004 /* getting shut down */
#define PF_EXITPIDONE     0x00000008 /* pi exit done on shut down */
#define PF_VCPU           0x00000010 /* I'm a virtual CPU */
#define PF_WQ_WORKER      0x00000020 /* I'm a workqueue worker */
#define PF_FORKNOEXEC     0x00000040 /* forked but didn't exec */
#define PF_MCE_PROCESS    0x00000080 /* process policy on mce errors */
#define PF_SUPERPRIV      0x00000100 /* used super-user privileges */
#define PF_DUMPCORE       0x00000200 /* dumped core */
#define PF_SIGNALED       0x00000400 /* killed by a signal */
#define PF_MEMALLOC       0x00000800 /* Allocating memory */
#define PF_NPROC_EXCEEDED 0x00001000 /* set_user noticed that RLIMIT_NPROC was exceeded */
#define PF_USED_MATH      0x00002000 /* if unset the fpu must be initialized before use */
#define PF_USED_ASYNC     0x00004000 /* used async_schedule*(), used by module init */
#define PF_NOFREEZE       0x00008000 /* this thread should not be frozen */
#define PF_FROZEN         0x00010000 /* frozen for system suspend */
#define PF_FSTRANS        0x00020000 /* inside a filesystem transaction */
#define PF_KSWAPD         0x00040000 /* I am kswapd */
#define PF_MEMALLOC_NOIO  0x00080000 /* Allocating memory without IO involved */
#define PF_LESS_THROTTLE  0x00100000 /* Throttle me less: I clean memory */
#define PF_KTHREAD        0x00200000 /* I am a kernel thread */
#define PF_RANDOMIZE      0x00400000 /* randomize virtual address space */
#define PF_SWAPWRITE      0x00800000 /* Allowed to write to swap */
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */
#define PF_MCE_EARLY      0x08000000 /* Early kill for mce process policy */
#define PF_MUTEX_TESTER   0x20000000 /* Thread belongs to the rt mutex tester */
#define PF_FREEZER_SKIP   0x40000000 /* Freezer should not count it as freezable */
#define PF_SUSPEND_TASK   0x80000000 /* this thread called freeze_processes and should not be frozen */
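Each PF_ macro is a single bit, so a flag is tested with a bitwise AND and set with a bitwise OR. A minimal user-space sketch that operates on a copy of the flags word; the helper names is_kernel_thread and mark_exiting are hypothetical and only two of the flags are reproduced.

```c
#include <stdbool.h>

#define PF_EXITING  0x00000004   /* getting shut down */
#define PF_KTHREAD  0x00200000   /* I am a kernel thread */

/* Hypothetical helper: test one bit of a flags word. */
static bool is_kernel_thread(unsigned int flags)
{
    return (flags & PF_KTHREAD) != 0;
}

/* Hypothetical helper: set PF_EXITING and leave every other bit untouched. */
static unsigned int mark_exiting(unsigned int flags)
{
    return flags | PF_EXITING;
}
```

For instance, mark_exiting(PF_KTHREAD) yields 0x00200004, a kernel thread that is shutting down, and is_kernel_thread() still reports true for that value.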
Members describing process relationships
/*
 * pointers to (original) parent process, youngest child, younger sibling,
 * older sibling, respectively.  (p->father can be replaced with
 * p->real_parent->pid)
 */
struct task_struct __rcu *real_parent; /* real parent process */
struct task_struct __rcu *parent;      /* recipient of SIGCHLD, wait4() reports */
/*
 * children/sibling forms the list of my natural children
 */
struct list_head children;             /* list of my children */
struct list_head sibling;              /* linkage in my parent's children list */
struct task_struct *group_leader;      /* threadgroup leader */
In a Linux system, all processes are directly or indirectly related: each process has a parent process and may have zero or more child processes. Processes that share the same parent are siblings.
Field | Description
real_parent | Points to the parent process; if the parent that created the process no longer exists, points to the init process (PID 1)
parent | Points to the current parent, the process that must be signaled when this process terminates. Its value is usually the same as real_parent
children | Head of a linked list whose elements are all of the process's children
sibling | Links the current process into its parent's list of children (the sibling list)
group_leader | Points to the leader of the thread group the process belongs to
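The way children and sibling cooperate is easier to see in code: the parent's children member is the list head, and each child is linked into that list through its own sibling member. The following user-space sketch uses a minimal doubly linked list in the style of the kernel's list_head; struct task and all helper names are hypothetical stand-ins, not kernel definitions.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

struct task {                       /* hypothetical stand-in for task_struct */
    int pid;
    struct task *parent;
    struct list_head children;      /* head of this task's child list */
    struct list_head sibling;       /* link in the parent's child list */
};

/* Map a pointer to a sibling link back to the task that embeds it. */
#define task_of(ptr) \
    ((struct task *)((char *)(ptr) - offsetof(struct task, sibling)))

static void task_init(struct task *t, int pid)
{
    t->pid = pid;
    t->parent = NULL;
    t->children.next = t->children.prev = &t->children;   /* empty list */
    t->sibling.next = t->sibling.prev = &t->sibling;
}

/* Append `child` to the tail of `parent`'s children list. */
static void add_child(struct task *parent, struct task *child)
{
    struct list_head *head = &parent->children;

    child->parent = parent;
    child->sibling.prev = head->prev;
    child->sibling.next = head;
    head->prev->next = &child->sibling;
    head->prev = &child->sibling;
}

/* Count direct children by walking the sibling links. */
static int count_children(const struct task *parent)
{
    const struct list_head *pos;
    int n = 0;

    for (pos = parent->children.next; pos != &parent->children; pos = pos->next)
        n++;
    return n;
}
```

After two add_child() calls on the same parent, task_of(parent.children.next) is the first child, and following that child's sibling.next link leads to the second one.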