Well-known kernel threads in Linux (1) -- ksoftirqd and events
-- Lvyilong316
The Linux system creates many kernel threads, and they keep the system running normally. Here we look at two well-known ones: ksoftirqd and events.
1. ksoftirqd
When it comes to ksoftirqd, we have to talk about softirqs, because this thread exists to execute softirqs (more precisely, to execute them when there are too many). In terms of priority: hardware interrupts > softirqs > user processes. That is, a hardware interrupt can preempt a softirq, and a softirq can preempt a user process.
The kernel executes softirqs at a few specific points (note the difference between executing and raising: raising a softirq only marks it as pending and does not run it). The most common point is on return from an interrupt handler. Softirqs can be raised very frequently (for example, under heavy network traffic). Worse, softirq handlers sometimes re-raise themselves. If softirqs fire frequently and can also put themselves back into the pending state, user-space processes may not get enough processor time and will starve. To avoid starving user processes, the kernel developers settled on a compromise: the kernel does not immediately handle softirqs that were re-raised by softirq handlers themselves (softirq nesting is not allowed). Instead, when there are too many softirqs, the kernel wakes up a set of kernel threads to handle them. These threads run at the lowest priority (nice value 19), so they do not compete with more important tasks for resources, yet they are guaranteed to run eventually. Under heavy softirq load, this keeps user processes from starving while ensuring that the excess softirqs are eventually processed.
Each processor has one such thread. The threads are named ksoftirqd/n, where n is the processor number.
Next, let's look in detail at how ksoftirqd executes softirqs. First, consider how softirqs are raised and handled. A softirq must be raised (activated) before it can run; the kernel term is "raise the softirq". A raised softirq is usually not executed immediately. Instead, at some later point the kernel checks whether any softirqs are pending and, if so, executes them. In Linux the function that runs softirqs is do_softirq(), which is called from two main places: on return from an interrupt, and from the ksoftirqd kernel thread discussed here. Let's look at the interrupt-return path first.
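The difference between raising a softirq and executing it can be sketched in user space. This is only an illustration, not kernel code: every name ending in _sim is made up for the example, and a plain global variable stands in for the per-CPU pending mask.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SOFTIRQS_SIM 32              /* one bit per softirq in a 32-bit mask */

typedef void (*softirq_fn_sim)(unsigned int nr);

static uint32_t pending_sim;            /* stand-in for the per-CPU pending mask */
static softirq_fn_sim handlers_sim[NR_SOFTIRQS_SIM];
static int runs_sim[NR_SOFTIRQS_SIM];   /* how often each softirq has executed */

/* Raising only marks the softirq as pending; nothing runs here. */
static void raise_softirq_sim(unsigned int nr)
{
    assert(nr < NR_SOFTIRQS_SIM);
    pending_sim |= UINT32_C(1) << nr;
}

/* At a later check point, execute every softirq that was marked. */
static void run_pending_sim(void)
{
    uint32_t todo = pending_sim;

    pending_sim = 0;
    for (unsigned int nr = 0; todo != 0; nr++, todo >>= 1) {
        if ((todo & 1) && handlers_sim[nr])
            handlers_sim[nr](nr);
    }
}

/* Hypothetical handler: just counts how often it was executed. */
static void count_run_sim(unsigned int nr)
{
    runs_sim[nr]++;
}

/* One raise/execute cycle; returns how often softirq 3 has run. */
static int demo(void)
{
    handlers_sim[3] = count_run_sim;

    raise_softirq_sim(3);
    assert(pending_sim == (UINT32_C(1) << 3));  /* marked ... */
    assert(runs_sim[3] == 0);                   /* ... but not yet executed */

    run_pending_sim();                          /* the later check point */
    assert(pending_sim == 0);
    return runs_sim[3];
}
```

Calling demo() from a main() function exercises one full cycle. The handler only runs inside run_pending_sim(), which plays the role of the later check described above.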
1.1 irq_exit
// This function is called when do_IRQ exits, after the hardware ISR has run.
- void irq_exit(void)
- {
-     account_system_vtime(current);
-     trace_hardirq_exit();
-     sub_preempt_count(IRQ_EXIT_OFFSET); // update preempt_count
-     // Check whether we are in a nested hardware interrupt and whether any softirq is pending. do_softirq() is called only when both conditions hold, that is, only when all hardware interrupt handling has finished and a softirq is pending. in_interrupt() is analyzed in detail below.
-     if (!in_interrupt() && local_softirq_pending())
-         // This ends up calling do_softirq().
-         invoke_softirq();
-     preempt_enable_no_resched();
- }
1.2 in_interrupt
Here we need to analyze the meaning of the in_interrupt () function. In the Linux kernel, several interfaces are defined to determine the context of the current execution path:
- #define hardirq_count() (preempt_count() & HARDIRQ_MASK)
- #define softirq_count() (preempt_count() & SOFTIRQ_MASK)
- #define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK))
- /*
- * Are we doing bottom half or hardware interrupt processing?
- * Are we in a softirq context? Interrupt context?
- */
- #define in_irq() (hardirq_count())
- #define in_softirq() (softirq_count())
- #define in_interrupt() (irq_count())
- /*
- * Are we in NMI context?
- */
- #define in_nmi() (preempt_count() & NMI_MASK)
As the comments indicate, these macros test for hardware interrupt context, softirq context, and NMI (non-maskable interrupt) context. All of them are built on preempt_count(), a very important macro that is documented in detail in the Linux source code:
- /*
- * We put the hardirq and softirq counter into the preemption
- * counter. The bitmask has the following meaning:
- *
- * - bits 0-7 are the preemption count (max preemption depth: 256)
- * - bits 8-15 are the softirq count (max # of softirqs: 256)
- *
- * The hardirq count can in theory reach the same as NR_IRQS.
- * In reality, the number of nested IRQS is limited to the stack
- * size as well. For archs with over 1000 IRQS it is not practical
- * to expect that they will all nest. We give a max of 10 bits for
- * hardirq nesting. An arch may choose to give less than 10 bits.
- * m68k expects it to be 8.
- *
- * - bits 16-25 are the hardirq count (max # of nested hardirqs: 1024)
- * - bit 26 is the NMI_MASK
- * - bit 28 is the PREEMPT_ACTIVE flag
- *
- * PREEMPT_MASK: 0x000000ff
- * SOFTIRQ_MASK: 0x0000ff00
- * HARDIRQ_MASK: 0x03ff0000
- * NMI_MASK: 0x04000000
- */
The meaning of each bit field in preempt_count is as follows:
(1) Bits 0-7 hold the preemption count, giving a maximum preemption depth of 256.
(2) Bits 8-15 hold the softirq count, which would allow up to 256 softirqs. Note, however, that softirqs are also limited by the pending mask, which is a 32-bit variable, so at most 32 softirqs are actually supported.
(3) Bits 16-25 hold the hardware interrupt nesting depth, so up to 1024 nesting levels can be represented. In practice this is never reached, because the nesting depth is also limited by the size of the stack used for interrupt handling.
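To make the layout concrete, here is a small user-space sketch that packs and unpacks the fields. The mask values are copied from the kernel comment quoted above; the struct, the two shift constants, and the function names are assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Masks taken from the kernel comment quoted above. */
#define PREEMPT_MASK 0x000000ffu
#define SOFTIRQ_MASK 0x0000ff00u
#define HARDIRQ_MASK 0x03ff0000u
#define NMI_MASK     0x04000000u

/* Shifts follow from the bit ranges 8-15 and 16-25. */
#define SOFTIRQ_SHIFT 8
#define HARDIRQ_SHIFT 16

/* Hypothetical helper type: the counter split into its fields. */
struct preempt_fields {
    unsigned int preempt_depth;   /* bits 0-7 */
    unsigned int softirq_count;   /* bits 8-15 */
    unsigned int hardirq_depth;   /* bits 16-25 */
    int in_nmi;                   /* bit 26 */
};

/* Pack the three counters into one word; each must fit its field. */
static uint32_t encode_preempt_count(unsigned int preempt,
                                     unsigned int softirq,
                                     unsigned int hardirq)
{
    assert(preempt <= PREEMPT_MASK);
    assert(softirq <= (SOFTIRQ_MASK >> SOFTIRQ_SHIFT));
    assert(hardirq <= (HARDIRQ_MASK >> HARDIRQ_SHIFT));
    return preempt | (softirq << SOFTIRQ_SHIFT) | (hardirq << HARDIRQ_SHIFT);
}

/* Split a counter value back into its fields. */
static struct preempt_fields decode_preempt_count(uint32_t count)
{
    struct preempt_fields f;

    f.preempt_depth = count & PREEMPT_MASK;
    f.softirq_count = (count & SOFTIRQ_MASK) >> SOFTIRQ_SHIFT;
    f.hardirq_depth = (count & HARDIRQ_MASK) >> HARDIRQ_SHIFT;
    f.in_nmi = (count & NMI_MASK) != 0;
    return f;
}
```

For example, decode_preempt_count(0x00020102) yields a preemption depth of 2, a softirq count of 1, and a hardirq nesting depth of 2.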
With that background, what does the in_interrupt() macro mentioned above actually mean?
- #define in_interrupt() (irq_count())
- #define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK \
- | NMI_MASK))
The macro definitions show that in_interrupt() combines the hardware interrupt nesting depth, the softirq count, and the NMI bit. If in_interrupt() is nonzero, softirqs are not processed. That is, softirqs are not handled (a) inside a nested hardware interrupt, (b) while softirqs are disabled, or (c) in NMI context. One might ask: softirqs are entered from irq_exit() once the interrupt has been handled, so isn't the hardirq field of preempt_count still set while the softirq runs? No. It has already been updated by sub_preempt_count() in irq_exit(). In effect, once sub_preempt_count() has run, the interrupt handler has been exited.
Note: disabling softirqs increments the softirq count:
- __local_bh_disable((unsigned long)__builtin_return_address(0));
- static inline void __local_bh_disable(unsigned long ip)
- {
-     add_preempt_count(SOFTIRQ_OFFSET);
-     barrier();
- }
- # define add_preempt_count(val) do { preempt_count() += (val); } while (0)
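The effect of that increment on the in_interrupt() test can be sketched in user space. This is a simplified model under stated assumptions: preempt_count_sim is an ordinary global rather than a per-task counter, the two OFFSET values are simply the lowest bit of each field in the layout shown earlier, and all _sim names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SOFTIRQ_MASK   0x0000ff00u
#define HARDIRQ_MASK   0x03ff0000u
#define NMI_MASK       0x04000000u
#define SOFTIRQ_OFFSET 0x00000100u   /* lowest bit of the softirq field */
#define HARDIRQ_OFFSET 0x00010000u   /* lowest bit of the hardirq field */

static uint32_t preempt_count_sim;   /* stand-in for preempt_count() */
static uint32_t softirq_pending_sim; /* stand-in for the pending mask */

static int in_interrupt_sim(void)
{
    return (preempt_count_sim &
            (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK)) != 0;
}

static void local_bh_disable_sim(void)
{
    preempt_count_sim += SOFTIRQ_OFFSET;
}

static void local_bh_enable_sim(void)
{
    assert(preempt_count_sim & SOFTIRQ_MASK);  /* must pair with a disable */
    preempt_count_sim -= SOFTIRQ_OFFSET;
}

static void hardirq_enter_sim(void) { preempt_count_sim += HARDIRQ_OFFSET; }
static void hardirq_exit_sim(void)  { preempt_count_sim -= HARDIRQ_OFFSET; }

/* Same condition as the check in irq_exit() shown earlier. */
static int may_run_softirqs_sim(void)
{
    return !in_interrupt_sim() && softirq_pending_sim != 0;
}

static void demo(void)
{
    softirq_pending_sim = 1;
    assert(may_run_softirqs_sim());      /* plain process context */

    local_bh_disable_sim();
    assert(!may_run_softirqs_sim());     /* softirqs disabled */
    local_bh_disable_sim();              /* disabling nests ... */
    local_bh_enable_sim();
    assert(!may_run_softirqs_sim());     /* ... so one enable is not enough */
    local_bh_enable_sim();
    assert(may_run_softirqs_sim());

    hardirq_enter_sim();
    assert(!may_run_softirqs_sim());     /* inside a hardware interrupt */
    hardirq_exit_sim();
    assert(may_run_softirqs_sim());
}
```

Because the softirq field is a counter rather than a flag, nested disable calls are balanced naturally: softirqs become runnable again only when the last enable brings the field back to zero.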
1.3 do_softirq
Next, let's focus on do_softirq() to see how the Linux kernel handles softirqs.
- asmlinkage void do_softirq(void)
- {
-     __u32 pending;
-     unsigned long flags;
-     // If we are in a nested hardware interrupt, or softirqs are disabled, return immediately. This entry check mainly provides mutual exclusion with ksoftirqd.
-     if (in_interrupt())
-         return;
-     // Run the following code with interrupts disabled.
-     local_irq_save(flags);
-     // Check whether any pending softirq needs handling.
-     pending = local_softirq_pending();
-     // If so, call __do_softirq() to do the actual processing.
-     if (pending)
-         __do_softirq();
-     // Restore the previous interrupt state.
-     local_irq_restore(flags);
- }
Note that reading pending with local_softirq_pending() and clearing the pending mask to 0 must both be done with interrupts disabled. Otherwise an interrupt could arrive between the two operations, and the pending mark of a softirq raised in that interrupt would be lost.
The real softirq processing happens in __do_softirq().
1.4 __do_softirq
- // Maximum number of passes of the softirq loop: 10.
- #define MAX_SOFTIRQ_RESTART 10
- asmlinkage void __do_softirq(void)
- {
-     // Softirq descriptor, which holds the softirq callback function.
-     struct softirq_action *h;
-     __u32 pending;
-     int max_restart = MAX_SOFTIRQ_RESTART;
-     int cpu;
-     // Fetch all pending softirqs.
-     pending = local_softirq_pending();
-     account_system_vtime(current);
-     // Disable softirqs from here on. This is what guarantees that only one softirq runs at a time on each CPU.
-     __local_bh_disable((unsigned long)__builtin_return_address(0));
-     trace_softirq_enter();
-     // Get the CPU we are running on (for SMP).
-     cpu = smp_processor_id();
- restart:
-     // On every pass, clear the softirq pending flags before hardware interrupts are enabled.
-     /* Reset the pending bitmask before enabling irqs */
-     set_softirq_pending(0); // may only be called with interrupts disabled
-     // Interrupts are enabled only here. Everything above ran with interrupts disabled, so the start of softirq processing cannot be preempted by a hardware interrupt. Only the code from here on can be.
-     local_irq_enable();
-     // The code below can be preempted by hardware interrupts, but a softirq raised by such an interrupt cannot run as soon as the interrupt returns. Although hardware interrupts are enabled, the earlier __local_bh_disable() call still masks softirqs. __local_bh_disable() sets a flag that acts as a mutex: it is one of the conditions tested by in_interrupt() in irq_exit() and do_softirq() above, because in_interrupt() checks for softirq context as well as hardirq context. So a softirq raised by a hardware interrupt at this point cannot re-enter this function. It is only marked pending and gets its chance in the restart loop below (at most MAX_SOFTIRQ_RESTART passes).
-     // Get the softirq vector table.
-     h = softirq_vec;
-     // Loop over all pending softirqs and call their registered functions.
-     do {
-         // If the pending bit is set for this softirq, its registered function needs to run.
-         if (pending & 1) {
-             // Execute the callback registered for this softirq.
-             h->action(h);
-             rcu_bh_qsctr_inc(cpu);
-         }
-         // Move on until every pending softirq in the vector table has been handled.
-         h++;
-         // The shift shows that one pass handles at most 32 softirq callbacks.
-         pending >>= 1;
-     } while (pending);
-     // Run the following code with interrupts disabled. Once again, hardware interrupts cannot preempt from here on.
-     local_irq_disable();
-     // While interrupts were enabled above, hardware interrupts could preempt this code any number of times, and each of them may have raised a softirq that could not be handled at that moment. So re-read the pending softirqs here, so that the code below can jump back to restart and process them.
-     pending = local_softirq_pending();
-     // If a hardware interrupt fired while interrupts were enabled and raised a softirq, its pending bit is now set, but it could not run because softirqs were masked (as explained above, neither irq_exit() nor do_softirq() will process it in that state). This is its second chance. Softirqs are still masked here, yet a softirq raised during the interrupts-enabled window can now be executed. In the end the softirq mechanism is nothing more than calling the functions registered in the softirq vector table at certain well-defined points. If a softirq is pending and the loop has run fewer than 10 times, jump to restart and repeat the steps above: clear the pending flags, enable interrupts, and so on.
-     // Note: the loop restarts only when both conditions hold.
-     if (pending && --max_restart)
-         goto restart;
-     // If softirqs are still pending after 10 passes, the system is probably at a load peak. To even this out, the kernel wakes the ksoftirqd thread to handle them instead of doing too much work right now. ksoftirqd is itself one big loop. To keep its own load in check it can be preempted by other processes, but note that it does so by explicitly calling preempt_xxx() and schedule(). Whenever it finds, via local_softirq_pending(), that softirqs are pending, it explicitly calls do_softirq() to process them. In other words, the ksoftirqd thread woken below may end up back in this function through do_softirq(), especially when the system has many softirqs to service. That is why do_softirq() checks in_interrupt() on entry: to prevent re-entry. See the analysis of ksoftirqd() below.
-     if (pending)
-         // This ends up calling wake_up_process() to wake ksoftirqd.
-         wakeup_softirqd();
-     trace_softirq_exit();
-     account_system_vtime(current);
-     // Finally, unmask softirqs so that they can run again. Note: this does not trigger another do_softirq() call, because it is _local_bh_enable() rather than local_bh_enable().
-     _local_bh_enable();
- }
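The restart limit is easiest to see with a handler that keeps re-raising itself. The following user-space sketch keeps only the loop structure of __do_softirq(); interrupt masking, accounting, and the real wakeup are left out, a counter stands in for waking ksoftirqd, and all _sim names plus greedy_handler_sim are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SOFTIRQ_RESTART 10
#define NR_SOFTIRQS_SIM 32

typedef void (*softirq_fn_sim)(void);

static uint32_t pending_sim;                   /* stand-in for the pending mask */
static softirq_fn_sim vec_sim[NR_SOFTIRQS_SIM];
static int handler_runs_sim;
static int ksoftirqd_wakeups_sim;              /* counts simulated wakeups */
static int reraise_budget_sim;                 /* re-raises the handler has left */

static void raise_sim(unsigned int nr)
{
    pending_sim |= UINT32_C(1) << nr;
}

/* Hypothetical handler that re-raises itself while its budget lasts. */
static void greedy_handler_sim(void)
{
    handler_runs_sim++;
    if (reraise_budget_sim > 0) {
        reraise_budget_sim--;
        raise_sim(0);
    }
}

/* Loop structure of __do_softirq(), minus interrupt masking. */
static void do_softirq_loop_sim(void)
{
    int max_restart = MAX_SOFTIRQ_RESTART;
    uint32_t pending = pending_sim;

restart:
    pending_sim = 0;                           /* reset before running handlers */
    for (unsigned int nr = 0; pending != 0; nr++, pending >>= 1) {
        if ((pending & 1) && vec_sim[nr])
            vec_sim[nr]();
    }

    pending = pending_sim;                     /* anything raised meanwhile? */
    if (pending && --max_restart)
        goto restart;

    if (pending)
        ksoftirqd_wakeups_sim++;               /* defer the rest to the thread */
}

static void demo(void)
{
    vec_sim[0] = greedy_handler_sim;

    /* Light load: three re-raises are absorbed by the restart loop. */
    reraise_budget_sim = 3;
    raise_sim(0);
    do_softirq_loop_sim();
    assert(handler_runs_sim == 4 && ksoftirqd_wakeups_sim == 0);

    /* Heavy load: the loop gives up after 10 passes and defers. */
    handler_runs_sim = 0;
    reraise_budget_sim = 1000;
    raise_sim(0);
    do_softirq_loop_sim();
    assert(handler_runs_sim == 10 && ksoftirqd_wakeups_sim == 1);
    assert(pending_sim != 0);                  /* work left for the thread */
}
```

In the heavy-load case the handler runs exactly MAX_SOFTIRQ_RESTART times in this context, and the still-pending bit is what the woken thread would find on its next check.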
1.5 ksoftirqd
This function is the thread function of the ksoftirqd kernel thread. ksoftirqd calls do_softirq() to process softirqs whenever local_softirq_pending() reports that some are pending. Because it repeats this check, softirqs that are raised again are executed as well. If necessary, schedule() is called after each iteration to give more important processes a chance to run. Once there is nothing left to do, the kernel thread sets itself to the TASK_INTERRUPTIBLE state and invokes the scheduler to pick another runnable process.
- static int ksoftirqd(void *__bind_cpu)
- {
-     // Explicitly set the static priority of the current process. The priority may of course still change with the scheduler policy.
-     set_user_nice(current, 19);
-     // Mark the current process as not freezable.
-     current->flags |= PF_NOFREEZE;
-     // Put the current process into the interruptible state, in which it can respond to signals.
-     set_current_state(TASK_INTERRUPTIBLE);
-     // Main loop: keep going until the thread is asked to stop, checking on each pass whether any pending softirq needs handling.
-     while (!kthread_should_stop()) {
-         // Disable preemption of the current process while it decides what to do.
-         preempt_disable();
-         // First check whether there is no pending softirq.
-         if (!local_softirq_pending()) {
-             // If there is none, re-enable preemption before giving up the CPU, because the code so far ran with preemption disabled.
-             preempt_enable_no_resched();
-             // Explicitly call schedule() to give up the CPU voluntarily: the current process goes to sleep and another process runs (the scheduler itself is not covered here).
-             schedule();
-             // When a process that explicitly called schedule() is scheduled again, execution resumes at the statement after the call, that is, at the preempt_disable() below. So once rescheduled, the process again runs the following code with preemption disabled.
-             preempt_disable();
-         }
-         /* Set the current process to running. Preemption is disabled here. Both paths through the code above end up at this point: either a softirq was already pending on entry to the loop, or none was pending, the process slept, and it has now been scheduled back onto the CPU. */
-         __set_current_state(TASK_RUNNING);
-         /* Loop while softirqs are pending, calling do_softirq() to handle them. This is the second entry point into do_softirq(): after __do_softirq() has looped over the softirq callbacks 10 times, processing continues from here, and __do_softirq() may be called again. As noted in __do_softirq(), failing to finish within 10 passes means the system is busy. If the system is very busy, this thread and do_softirq() will run alternately, and the thread can consume a lot of CPU. The cond_resched() call below mitigates this, because the thread may be scheduled out after handling a batch of softirqs, but under heavy load it may still use a lot of CPU. */
-         while (local_softirq_pending()) {
-             /* Preempt disable stops cpu going offline. If already offline, we'll be on wrong CPU: don't process */
-             if (cpu_is_offline((long)__bind_cpu))
-                 /* If the bound CPU can no longer process softirqs, jump to wait_to_die and wait there to be stopped. */
-                 goto wait_to_die;
-             /* Call do_softirq() to run the softirq callbacks. Note: if a softirq is already being processed, it returns immediately (recall the in_interrupt() check described earlier). */
-             do_softirq();
-             /* Allow the current process to be preempted. */
-             preempt_enable_no_resched();
-             /* cond_resched() may indirectly call schedule() and switch away from the current process, which is now preemptible. That is, after a batch of softirq callbacks the CPU may go to another process. The purpose is presumably twofold: to keep this thread from hogging the CPU under heavy load, and to keep other processes responsive when many softirqs need handling. */
-             cond_resched();
-             /* Disable preemption of the current process again. */
-             preempt_disable();
-             /* Have all softirqs been handled? If not, repeat the steps above. */
-         }
-         /* Everything has been handled: allow preemption and put the process back into the interruptible state. */
-         preempt_enable();
-         set_current_state(TASK_INTERRUPTIBLE);
-     }
-     /* The thread has been asked to stop: set it to running and return. The scheduler runs it according to its priority. */
-     __set_current_state(TASK_RUNNING);
-     return 0;
-     /* Wait here until the thread is stopped. */
- wait_to_die:
-     /* Allow the current process to be preempted. */
-     preempt_enable();
-     /* Wait for kthread_stop */
-     /* Put the current process into the interruptible state, in which it can respond to signals. */
-     set_current_state(TASK_INTERRUPTIBLE);
-     /* Until the thread is asked to stop, keep setting the state to interruptible and voluntarily giving up the CPU. In other words, wait here until the thread is stopped. */
-     while (!kthread_should_stop()) {
-         schedule();
-         set_current_state(TASK_INTERRUPTIBLE);
-     }
-     /* The thread has been asked to stop: set it to running and return. The scheduler runs it according to its priority. */
-     __set_current_state(TASK_RUNNING);
-     return 0;
- }
Finally, note that because tasklets are implemented on top of softirqs, too many tasklets will also cause the ksoftirqd thread to be scheduled, and the tasklets then run in process context. (ksoftirqd runs the softirq handlers, and the softirq handler for tasklets runs all scheduled tasklets.)
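How one softirq handler can run all scheduled tasklets may be sketched as follows. This is a deliberately simplified user-space model, not the kernel's tasklet code: the struct layout, the tail-insertion order, and every _sim name are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down tasklet descriptor. */
struct tasklet_sim {
    struct tasklet_sim *next;
    int scheduled;                 /* set while queued, so it is queued once */
    void (*func)(unsigned long);
    unsigned long data;
};

static struct tasklet_sim *tasklet_head_sim;
static struct tasklet_sim **tasklet_tail_sim = &tasklet_head_sim;
static int tasklet_softirq_raised_sim;         /* stand-in for the pending bit */
static unsigned long total_sim;

/* Queue the tasklet and mark the tasklet softirq as pending. */
static void tasklet_schedule_sim(struct tasklet_sim *t)
{
    if (t->scheduled)
        return;                    /* already queued: nothing to do */
    t->scheduled = 1;
    t->next = NULL;
    *tasklet_tail_sim = t;
    tasklet_tail_sim = &t->next;
    tasklet_softirq_raised_sim = 1;
}

/* The softirq handler: detach the list, then run every queued tasklet. */
static void tasklet_action_sim(void)
{
    struct tasklet_sim *list = tasklet_head_sim;

    tasklet_head_sim = NULL;
    tasklet_tail_sim = &tasklet_head_sim;
    tasklet_softirq_raised_sim = 0;

    while (list) {
        struct tasklet_sim *t = list;

        list = list->next;
        t->scheduled = 0;          /* may be scheduled again from now on */
        t->func(t->data);
    }
}

/* Hypothetical tasklet body: accumulate its data value. */
static void add_to_total_sim(unsigned long v)
{
    total_sim += v;
}

static void demo(void)
{
    struct tasklet_sim a = { NULL, 0, add_to_total_sim, 5 };
    struct tasklet_sim b = { NULL, 0, add_to_total_sim, 7 };

    tasklet_schedule_sim(&a);
    tasklet_schedule_sim(&a);      /* second call is a no-op */
    tasklet_schedule_sim(&b);
    assert(tasklet_softirq_raised_sim == 1);
    assert(total_sim == 0);        /* scheduled, not yet executed */

    tasklet_action_sim();          /* one handler run executes both */
    assert(total_sim == 12);
    assert(tasklet_softirq_raised_sim == 0);
}
```

Whichever context ends up calling the handler, interrupt return or the ksoftirqd thread, a single handler run drains everything that was scheduled before it started.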
2. events
Now let's look at the events thread. To discuss it we have to talk about workqueues: this is the thread that the work queue uses to execute the work items in its queue.
2.1 What is workqueue?
Workqueue is another of Linux's bottom-half mechanisms (the bottom halves include softirqs, tasklets, and work queues). The workqueue mechanism aims to simplify the creation of kernel threads: calling the workqueue interface creates the kernel threads for you. In addition, the threads can be created according to the number of CPUs in the system, so that the work they process can run in parallel.
Workqueue is a simple and effective mechanism in the kernel. It noticeably simplifies the creation of kernel daemons and makes programming easier for users.
2.2 Workqueue mechanism implementation
The workqueue mechanism defines two important data structures:
1. The cpu_workqueue_struct structure. It binds a CPU to a kernel thread. When a workqueue is created, Linux creates cpu_workqueue_struct objects according to the number of CPUs in the system. The structure mainly maintains a queue of work_struct items, a wait queue on which the kernel thread sleeps, and the thread's task context, a task_struct.
2. The work_struct structure. It is an abstraction of a task: it holds the function to run, the data to process, and the time at which the task should be processed. The structure is defined as follows:
- struct work_struct {
-     unsigned long pending;
-     struct list_head entry; /* node that links the task into the queue */
-     void (*func)(void *); /* task function */
-     void *data; /* data processed by the task */
-     void *wq_data; /* owner of the work */
-     struct timer_list timer; /* timer for delayed processing */
- };
When you call the workqueue initialization interface create_workqueue or create_singlethread_workqueue, the kernel first allocates a workqueue object and links it into a global list of workqueues. Linux then allocates cpu_workqueue_struct objects for the workqueue according to the CPUs in the system, as many as there are CPUs, each with its own task queue. Next, Linux creates a kernel thread, the kernel daemon, for each cpu_workqueue_struct object to process the tasks in its queue. At that point initialization is complete, and the interface returns a pointer to the workqueue.
During workqueue initialization the kernel also sets up the kernel threads. The thread function is fairly simple: it keeps scanning the task queue of its cpu_workqueue_struct, takes a valid task from it, and executes it. If the task queue is empty, the daemon sleeps on the wait queue in cpu_workqueue_struct until someone wakes it up to process the queue.
Once the workqueue is initialized, the context for running tasks is in place, but there is nothing to run yet, so you need to define a work_struct object and add it to the task queue. Linux then wakes up the daemon to process the task.
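The queue-then-drain cycle described above can be sketched in user space with a single queue and no real thread. This is only a model: the worker's sleep and wakeup are reduced to a counter, one queue stands in for one cpu_workqueue_struct, and all _sim names are hypothetical rather than kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down counterpart of work_struct. */
struct work_sim {
    struct work_sim *next;
    int pending;                   /* set while the work sits in a queue */
    void (*func)(void *);
    void *data;
};

/* Stand-in for one cpu_workqueue_struct: a task queue plus wakeup count. */
struct workqueue_sim {
    struct work_sim *head;
    struct work_sim **tail;
    int worker_wakeups;
};

static void workqueue_init_sim(struct workqueue_sim *wq)
{
    wq->head = NULL;
    wq->tail = &wq->head;
    wq->worker_wakeups = 0;
}

/* Add work to the queue; returns 0 if it was already queued. */
static int queue_work_sim(struct workqueue_sim *wq, struct work_sim *w)
{
    if (w->pending)
        return 0;
    w->pending = 1;
    w->next = NULL;
    *wq->tail = w;
    wq->tail = &w->next;
    wq->worker_wakeups++;          /* here the real daemon would be woken */
    return 1;
}

/* Body of the worker: take tasks off the queue and run them. */
static int run_workqueue_sim(struct workqueue_sim *wq)
{
    int done = 0;

    while (wq->head) {
        struct work_sim *w = wq->head;

        wq->head = w->next;
        if (!wq->head)
            wq->tail = &wq->head;
        w->pending = 0;            /* cleared first, so func may requeue */
        w->func(w->data);
        done++;
    }
    return done;                   /* queue empty: the daemon would sleep */
}

/* Hypothetical task function: increment the int that data points to. */
static void bump_sim(void *data)
{
    (*(int *)data)++;
}

/* Queue two tasks (one of them twice) and drain; returns the counter. */
static int demo(void)
{
    struct workqueue_sim wq;
    int counter = 0;
    struct work_sim a = { NULL, 0, bump_sim, &counter };
    struct work_sim b = { NULL, 0, bump_sim, &counter };

    workqueue_init_sim(&wq);
    assert(queue_work_sim(&wq, &a) == 1);
    assert(queue_work_sim(&wq, &a) == 0);   /* already pending */
    assert(queue_work_sim(&wq, &b) == 1);
    assert(counter == 0);                   /* queued, nothing ran yet */

    assert(run_workqueue_sim(&wq) == 2);
    assert(run_workqueue_sim(&wq) == 0);    /* nothing left to do */
    return counter;
}
```

The pending flag is what keeps the same work item from being linked into the queue twice. It is cleared before the task function runs so that the function could queue the item again.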
The workqueue kernel implementation described above can be summarized as follows: [figure: workqueue kernel implementation principle]
The workqueue mechanism also provides a default workqueue, keventd_wq, which is created during Linux initialization. You can simply initialize a work_struct object and schedule it on this queue, which is more convenient.
Kernel threads such as events/0 and events/1 are the kthreads of this default work queue, one created on each CPU.
What if we create a work queue ourselves? With create_singlethread_workqueue, only one kthread is created. With create_workqueue, one kthread is created on each CPU, just as for the default work queue. The kthread's name comes from the name parameter passed in.
Workqueue programming interface
| Serial number | Interface function | Description |
|---|---|---|
| 1 | create_workqueue | Creates a workqueue and one kernel thread for each CPU in the system. Input parameters: @name: name of the workqueue |
| 2 | create_singlethread_workqueue | Creates a workqueue with only one kernel thread. Input parameters: @name: name of the workqueue |
| 3 | destroy_workqueue | Releases a workqueue. Input parameters: @workqueue_struct: pointer to the workqueue to be released |
| 4 | schedule_work | Schedules a task for execution. The task is added to keventd_wq, the workqueue provided by the Linux system. Input parameters: @work_struct: task object pointer |
| 5 | schedule_delayed_work | Similar to schedule_work, but the task is executed after a given delay. Input parameters: @work_struct: task object pointer; @delay: delay time |
| 6 | queue_work | Schedules a task for execution on the specified workqueue. Input parameters: @workqueue_struct: pointer to the workqueue; @work_struct: task object pointer |
| 7 | queue_delayed_work | Schedules a task for delayed execution on the specified workqueue. Similar to queue_work, with an additional delay parameter. |