Chapter III
Main content:
- Processes and Threads
- The life cycle of a process
- Creation of processes
- Termination of the process
1. Processes and Threads
Processes and threads are the runtime state of a program, so they are dynamic. Process and thread management operations (e.g., creation and destruction) are implemented by the kernel.
Processes in Linux are lightweight compared to Windows, and Linux does not strictly distinguish between processes and threads: a thread is just a special kind of process.
So the following discusses only processes, and mentions threads only where they differ from processes.
A process provides 2 virtualization mechanisms: a virtual processor and virtual memory.
Each process has a separate virtual processor and virtual memory.
Each thread has a separate virtual processor, and threads within the same process may share virtual memory.
Information about a process in the kernel is mostly stored in task_struct (include/linux/sched.h).
For a single-threaded process (or the main thread of a multithreaded process), the process ID (PID) and the thread ID (TID) are equal.
You can use the ps command to view information for all processes in Linux:
ps -eo pid,tid,ppid,comm
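The PID/TID relationship can also be checked from inside a program. The following is a minimal sketch using Python's standard library (Python is an assumption of this example, the notes themselves only use ps, and the function name pid_and_tids is made up for illustration). On Linux the main thread's TID should equal the PID, while a second thread in the same process gets its own TID:

```python
import os
import threading


def pid_and_tids():
    """Return (pid, main-thread tid, worker-thread tid) for this process."""
    pid = os.getpid()
    # native_id is the kernel-level thread ID (the TID that ps shows).
    main_tid = threading.main_thread().native_id
    worker = {}

    def record_tid():
        worker["tid"] = threading.get_native_id()

    t = threading.Thread(target=record_tid)
    t.start()
    t.join()
    return pid, main_tid, worker["tid"]


if __name__ == "__main__":
    pid, main_tid, worker_tid = pid_and_tids()
    print(f"pid={pid} main tid={main_tid} worker tid={worker_tid}")
```

The worker thread shares the PID with the main thread but has a different TID, which matches the extra rows that ps prints for multithreaded programs when the tid column is requested with -L.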
2. The life cycle of the process
The transformation between the various states of a process constitutes the entire life cycle of the process.
3. Creation of processes
Process creation in Linux differs from most other systems in one major way: it is divided into 2 steps, fork() and exec().
fork(): creates a child process by copying the current process
exec(): reads an executable file, loads it into memory and runs it
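Seen from user space, the two steps look roughly like this. This is only a sketch under stated assumptions: it uses Python's os module wrappers around fork(), execvp() and waitpid(), and run_in_child is a hypothetical helper name, not something from the kernel or the notes.

```python
import os
import sys


def run_in_child(argv):
    """fork() a child, exec() argv in it, and return the child's exit status."""
    pid = os.fork()                    # step 1: copy the current process
    if pid == 0:
        # Child: replace the copied program image with a new executable.
        try:
            os.execvp(argv[0], argv)   # step 2: returns only if exec fails
        finally:
            os._exit(127)              # reached only when exec failed
    _, status = os.waitpid(pid, 0)     # parent: wait for the child to end
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    code = run_in_child([sys.executable, "-c", "raise SystemExit(7)"])
    print("child exit status:", code)
```

Between the two calls the child is still a copy of the parent, which is the window shells use to set up redirections before the new program starts.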
The creation procedure:
- Call dup_task_struct() to allocate the kernel stack, task_struct, and so on for the new process; at this point their contents are the same as the parent's.
- Check the new process (for example, whether the number of processes exceeds the limit).
- Clear or reset information in the new process (for example, resetting the PID to 0) so that it is distinct from the parent process.
- Set the new process's state to TASK_UNINTERRUPTIBLE.
- Update the flags member of task_struct.
- Call alloc_pid() to assign a valid PID to the new process.
- Copy or share the appropriate resources according to the flags passed to clone().
- Do some cleanup work and return a pointer to the new process.
The fork() function that creates a process ultimately calls the clone() function.
Threads and processes are created by the same steps; only the arguments finally passed to clone() differ.
For example, creating a process with a normal fork is equivalent to: clone(SIGCHLD, 0)
Creating a process that shares its address space, file system information, file descriptors, and signal handlers with the parent, that is, a thread: clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0)
The main difference between kernel threads created in the kernel and normal processes is that kernel threads do not have a separate address space and can only run in kernel space.
This is related to the previously mentioned fact that Linux is a monolithic kernel.
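The practical effect of sharing or not sharing the address space (CLONE_VM) can be observed from user space. The sketch below is an illustration only: it assumes CPython on Linux, where threading.Thread creates a native thread and os.fork() creates an ordinary child process, and the function name thread_vs_fork is made up for this example.

```python
import os
import threading


def thread_vs_fork():
    """Return the shared value as seen by the parent after a thread
    writes to it, and again after a forked child writes to it."""
    data = {"value": 0}

    # A thread shares the parent's memory, so its write is visible here.
    t = threading.Thread(target=lambda: data.update(value=1))
    t.start()
    t.join()
    after_thread = data["value"]

    # A forked child works on its own copy, so its write is not visible.
    pid = os.fork()
    if pid == 0:
        data["value"] = 2
        os._exit(0)
    os.waitpid(pid, 0)
    after_fork = data["value"]

    return after_thread, after_fork


if __name__ == "__main__":
    print(thread_vs_fork())
```

The thread's update reaches the parent, while the child's update stays in the child's private copy of the memory.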
4. Termination of the process
As with creation, ending a process involves many steps:
Work done in the exiting (child) process (do_exit())
- Set the PF_EXITING flag in the flags member of task_struct
- Call del_timer_sync() to remove any kernel timer, making sure no timer is queued or running
- Call exit_mm() to release the mm_struct that the process holds
- Call exit_sem() so that the process leaves any queue where it is waiting for an IPC semaphore
- Call exit_files() and exit_fs() to release the file descriptors and file system resources that the process holds
- Set exit_code in task_struct to the return value of the process
- Call exit_notify() to send a signal to the parent process and set the exiting process's state to EXIT_ZOMBIE
- Switch to a new process, which continues execution
After a child process enters EXIT_ZOMBIE, it will never be scheduled again and most of its resources have been freed, but not all of the memory it occupies has been released.
For example, the kernel stack and the task_struct structure allocated at creation still exist. These are freed on behalf of the parent process.
Work done on the parent's side (release_task())
After the parent process receives the signal sent by the child's exit_notify(), the child's process descriptor and all remaining resources exclusive to the child are removed.
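The zombie state can be seen from user space. The following sketch assumes a Linux system with /proc mounted and uses Python's os module. The helper names proc_state and zombie_demo are hypothetical, and the polling loop with a timeout is just one simple way to wait for the child to finish exiting.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The state field is the first field after the "(command name)" part.
    return stat.rsplit(")", 1)[1].split()[0]


def zombie_demo(timeout=5.0):
    """Return (state before the parent waits, state after it waits)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)                      # child exits immediately
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)                 # give the child time to exit
        state = proc_state(pid)
    os.waitpid(pid, 0)                   # parent collects the child
    return state, proc_state(pid)


if __name__ == "__main__":
    print(zombie_demo())
```

Until the parent waits, the exited child still has a /proc entry in state Z; once the parent has collected it, the entry disappears.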
As the steps above show, every child process must have a parent process. So what happens if the parent process ends before the child does?
The kernel takes this into account in exit_notify().
When a process that still has children exits, exit_notify() calls forget_original_parent(), which in turn calls find_new_reaper() to find a new parent for those children.
find_new_reaper() first looks for another thread in the exiting process's thread group to act as the parent; if there is none, init becomes the parent. (The init process is started at Linux boot and is always present.)
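Reparenting can be watched from user space with a double fork. The sketch below is an illustration under assumptions: it uses Python's os module, the name orphan_new_parent is made up, and the new parent it reports is typically PID 1, although on some systems it may be another designated reaper process.

```python
import os
import time


def orphan_new_parent():
    """Return (pid of the exited parent, ppid the orphan sees afterwards)."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        try:
            os.close(r)
            me = os.getpid()
            if os.fork() == 0:
                # Grandchild: wait until the kernel has assigned a new parent.
                while os.getppid() == me:
                    time.sleep(0.01)
                os.write(w, str(os.getppid()).encode())
        finally:
            os._exit(0)      # the middle process exits right after forking
    os.close(w)
    os.waitpid(child, 0)     # collect the middle process
    with os.fdopen(r) as f:
        new_ppid = int(f.read())   # blocks until the orphan has reported
    return child, new_ppid


if __name__ == "__main__":
    old_parent, new_parent = orphan_new_parent()
    print(f"old parent {old_parent}, new parent {new_parent}")
```

The middle process here is single-threaded, so there is no other thread in its thread group to take over and the orphan ends up under init (or whatever reaper the system designates).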
Chapter IV
Main content:
- What is scheduling
- Scheduling implementation principle
- The method of scheduling implementation on Linux
- Scheduling-related system calls
1. What is scheduling
Modern operating systems are multitasking. To let many tasks run well on the system at the same time, a management program is needed to manage the tasks (that is, processes) running simultaneously on the computer.
This management program is the scheduler. Put simply, its job is to:
- Decide which processes run and which processes wait
- Decide how long each process runs
In addition, for a better user experience, a running process can be preempted immediately by another, more urgent process.
In short, scheduling is a balancing act. On the one hand, each running process should make the most of the CPU (that is, switch processes as rarely as possible, since too much switching wastes CPU time on the switches themselves); on the other hand, processes should share the CPU fairly (that is, no process should monopolize the CPU for a long time).
2. Scheduling implementation principle
As mentioned earlier, the scheduler's job is to decide which process runs and how long it should run.
Deciding which process runs, and for how long, requires giving processes priorities. To determine how long a process may keep running, scheduling also introduces the concept of a timeslice.
2.1 Process priority
There are 2 ways to express process priority: one is the nice value and the other is the real-time priority.
The nice value ranges from -20 to +19, and the higher the value, the lower the priority, which means a process with a nice value of -20 has the highest priority.
The real-time priority ranges from 0 to 99 and works the opposite way from the nice value: the higher the value, the higher the priority.
Real-time processes are processes with strict response-time requirements, so whenever a process with real-time priority is in the run queue, it preempts the run time of normal processes.
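The direction of the nice value is easy to get backwards, so here is a small sketch. It assumes Python's os.nice() wrapper around the nice() system call, and the function name nice_after_increment and the exit-code encoding are choices made only for this example. The increment is applied in a forked child so the calling process's own priority is left alone.

```python
import os


def nice_after_increment(increment=5):
    """Return (parent's nice value, child's nice value after raising it)."""
    parent_nice = os.nice(0)          # an increment of 0 just reads the value
    pid = os.fork()
    if pid == 0:
        code = 255
        try:
            # A larger nice value means a LOWER priority.
            code = os.nice(increment) + 20   # map -20..19 onto 0..39
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return parent_nice, os.WEXITSTATUS(status) - 20


if __name__ == "__main__":
    print(nice_after_increment())
```

An unprivileged process may raise its nice value (lower its priority) but normally cannot lower it again, and the result is capped at +19.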
3. How to implement scheduling on Linux
The scheduling algorithm on Linux has developed continuously; since the 2.6.23 kernel it has used the "Completely Fair Scheduler", referred to as CFS.
When CFS allocates CPU time to processes, it does not assign each of them an absolute amount of CPU time; instead it assigns each a proportion of CPU time based on the process's priority.
For example, take ProcessA (ni=1), ProcessB (ni=3), and ProcessC (ni=6), where ni is read here as a simplified relative weight rather than a real nice value (a higher real nice value means a lower priority). Under CFS their shares of the CPU are: ProcessA (10%), ProcessB (30%), ProcessC (60%).
The total is 100%: ProcessB's weight is 3 times ProcessA's, and ProcessC's weight is 6 times ProcessA's.
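The arithmetic behind the percentages is just each weight divided by the sum of all weights. The sketch below treats the three numbers from the example as plain relative weights, which is an interpretation (real nice values run in the opposite direction and the kernel maps them to weights through its own table). The function name cpu_shares is hypothetical.

```python
def cpu_shares(weights):
    """Map each process name to its percentage of CPU time,
    proportional to its weight."""
    total = sum(weights.values())
    return {name: 100 * w / total for name, w in weights.items()}


if __name__ == "__main__":
    shares = cpu_shares({"ProcessA": 1, "ProcessB": 3, "ProcessC": 6})
    for name, pct in shares.items():
        print(f"{name}: {pct:.0f}%")
```

Adding a fourth process would shrink every existing share without any per-process time value having to be recalculated by hand, which is the point of allocating proportions instead of absolute times.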
4. Scheduling-related system calls
There are 2 main types of scheduling-related system calls:
1) Calls related to scheduling policy and process priority (that is, the priorities, timeslices, and other parameters mentioned above): the first 8 in the table below
2) Processor-related calls: the last 3 in the table below
System call | Description
nice() | Set the nice value of a process
sched_setscheduler() | Set the scheduling policy of a process, that is, which scheduling algorithm the process uses
sched_getscheduler() | Get the scheduling policy of a process
sched_setparam() | Set the real-time priority of a process
sched_getparam() | Get the real-time priority of a process
sched_get_priority_max() | Get the maximum real-time priority; because of permission restrictions, non-root users cannot set the real-time priority to 99
sched_get_priority_min() | Get the minimum real-time priority; otherwise similar to the above
sched_rr_get_interval() | Get the timeslice of a process
sched_setaffinity() | Set the processor affinity of a process, which is in fact the cpus_allowed mask stored in task_struct. Each bit of the mask corresponds to one processor available on the system; by default all bits are set, that is, the process can run on every processor in the system. This call lets the user set a different mask so that the process can run only on one or several processors in the system.
sched_getaffinity() | Get the processor affinity of a process
sched_yield() | Temporarily yield the processor
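Several of these calls have thin wrappers in Python's os module on Linux, which makes them easy to try out. The sketch below is illustrative only: scheduling_snapshot and pin_to_one_cpu are made-up helper names, and the code assumes the caller is allowed to narrow its own CPU mask, which is normally the case for an unprivileged process.

```python
import os


def scheduling_snapshot():
    """Read the caller's scheduling settings (pid 0 means "the caller")."""
    return {
        "policy": os.sched_getscheduler(0),
        "rt_priority": os.sched_getparam(0).sched_priority,
        "fifo_range": (
            os.sched_get_priority_min(os.SCHED_FIFO),
            os.sched_get_priority_max(os.SCHED_FIFO),
        ),
        "affinity": os.sched_getaffinity(0),
    }


def pin_to_one_cpu():
    """Restrict the caller to a single CPU, then restore the old mask.

    Returns (chosen cpu, mask while pinned, whether the mask was restored).
    """
    original = os.sched_getaffinity(0)
    cpu = min(original)
    os.sched_setaffinity(0, {cpu})           # clear every bit except one
    try:
        pinned = os.sched_getaffinity(0)
        os.sched_yield()                     # give up the processor once
    finally:
        os.sched_setaffinity(0, original)    # put the original mask back
    return cpu, pinned, os.sched_getaffinity(0) == original


if __name__ == "__main__":
    print(scheduling_snapshot())
    print(pin_to_one_cpu())
```

For an ordinary (non-real-time) process the reported real-time priority is 0, and the affinity set normally lists every CPU the process is allowed to use.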
Linux reading notes: Chapters 3 and 4