This article is reposted from: http://linux.cn/thread-5510-1-1.html
This article analyzes the bottom-half mechanisms of the Linux 2.6 kernel (softirq, tasklet, and workqueue) mainly from the user's point of view. It does not examine in depth how the kernel implements these three mechanisms; interested readers can read the relevant parts of the Linux kernel source code directly.
Note: This document was collected and collated by meteors from the Internet and is released in the spirit of free software and open source. Anyone may use and republish it free of charge, but you may not restrict the right of others to redistribute your content. This article is published in the hope that it will be useful to readers, but without any warranty, including the implied warranty of fitness for a particular purpose. For more information, refer to the GNU General Public License (GPL) and the GNU Free Documentation License (GFDL).
Contents
1 Overview
2 Interrupt bottom-half mechanisms in the Linux 2.6 kernel
2.1 Softirq mechanism
2.2 Tasklet mechanism
2.3 Workqueue mechanism
3 Comparison of the bottom-half mechanisms
4 Choosing a bottom-half mechanism
5 Comparison of the bottom-half mechanisms in Linux and NGSA
5.1 Analysis of the NGSA interrupt bottom-half mechanism
5.2 Analysis of defects in the NGSA bottom-half mechanism
1 Overview
Interrupt service routines often need to run with interrupts disabled on the CPU, to avoid nested interrupts that would complicate control flow. Interrupts cannot stay disabled for too long, however, or interrupt signals will be lost. For this reason, Linux divides interrupt handling into two parts: the top half and the bottom half. The top half performs the critical, hardware-related work; it executes very briefly and runs with interrupts disabled. Operations that are less time-critical, and usually more time-consuming, are left to the bottom half, which runs with interrupts enabled. The top half deals with the hardware, is known as the hardware interrupt, and usually has to run immediately. The bottom half can be deferred and run by the kernel at a suitable later time; this is the soft interrupt discussed here.
This article uses Linux kernel 2.6.22, the latest version at the time of writing, as an example to discuss the Linux interrupt bottom-half mechanisms. In the 2.6 kernel, the bottom half is implemented mainly by softirq, tasklet, and workqueue; the following sections focus on these three mechanisms.
2 Interrupt bottom-half mechanisms in the Linux 2.6 kernel
In older versions of the Linux kernel, the bottom half was implemented by a mechanism called Bottom Half (BH). It was originally implemented with interrupt vectors: the system kept a set of 32 function pointers representing 32 interrupt vectors. Traces of this implementation can still be found in the 2.4 kernel, but it is gone from the 2.6 kernel. Current Linux kernels generally implement the bottom half with a soft interrupt mechanism called softirq.
2.1 Softirq mechanism
The original BH mechanism has two obvious defects: first, only one CPU in the system can execute BH code at any given time; second, BH functions cannot be nested. This may not matter on a uniprocessor system, but it is a fatal flaw on an SMP system. The softirq mechanism is different. It is closely tied to SMP, and one idea runs through its entire design and implementation: "who triggers, who executes" (who marks, who runs). That is, each CPU is solely responsible for the softirqs it triggers, and the CPUs do not interfere with each other. This makes effective use of the performance and characteristics of an SMP system and greatly improves processing efficiency.
Linux defines the softirq_action structure in include/linux/interrupt.h to describe a softirq request, as follows:
struct softirq_action
{
    void (*action)(struct softirq_action *);
    void *data;
};
Here the function pointer action points to the service function of the softirq request, and data points to parameter data that the service function interprets itself.
Based on this structure, kernel/softirq.c defines a global softirq vector table softirq_vec[32], which holds 32 softirq descriptors represented by softirq_action structures. In practice Linux does not use all 32 softirq vectors; the kernel predefines a few of them for us to use:
enum
{
    HI_SOFTIRQ=0,
    TIMER_SOFTIRQ,
    NET_TX_SOFTIRQ,
    NET_RX_SOFTIRQ,
    BLOCK_SOFTIRQ,
    TASKLET_SOFTIRQ,
    SCHED_SOFTIRQ,
#ifdef CONFIG_HIGH_RES_TIMERS
    HRTIMER_SOFTIRQ,
#endif
};
HI_SOFTIRQ is used to implement high-priority softirqs, such as the high-priority hi_tasklet, while TASKLET_SOFTIRQ is used to implement general-purpose softirqs such as the tasklet. Tasklets are introduced later. We do not need all 32 softirq vectors; the vectors predefined by the kernel already cover most needs. The remaining vectors are reserved for future kernel extensions, and we should not use them.
To use a softirq, we must first initialize it. The open_softirq() function opens a specified softirq vector nr, initializes the corresponding descriptor softirq_vec[nr], and sets the corresponding bit of the softirq mask to 1 for all CPUs. The do_softirq() function is responsible for executing the softirq service functions registered in the softirq_vec[32] array. Each CPU services softirqs by executing this function. Because softirq service routines must not nest on the same CPU, do_softirq() first checks whether the current CPU is already in interrupt context and returns immediately if it is. On any one CPU, do_softirq() therefore executes serially.
After a softirq has been registered with open_softirq(), it still has to be triggered. The kernel uses raise_softirq() to trigger a softirq. A given softirq has only one handler function, which is shared by all CPUs. Because the handler of the same softirq may run concurrently on different CPUs and cause race conditions, the handler must take care of its own synchronization. A softirq is typically activated in the top half of an interrupt: when an interrupt handler wants to activate a softirq, it calls raise_softirq(). At some later time, when do_softirq() runs on a CPU, the associated softirq handler is invoked.
It is worth noting that the softirq mechanism also includes a small kernel thread, ksoftirqd, which is designed to balance the system load. Imagine that the system keeps triggering softirq requests: the CPU would then keep servicing softirqs, because do_softirq() is executed at least once on every clock interrupt. Other important tasks in the system could then be starved of the CPU for a long time. This small kernel thread is especially useful when the system is busy: excess softirq requests are deferred to it and handled at a suitable time, which gives other processes more opportunity to run.
In the 2.6 kernel, do_softirq() is executed in irq_exit(). During top-half processing, do_softirq() is called only from irq_exit(), which is very convenient for upgrading and porting the softirq module. If we need to port Linux softirqs to our NGSA, this design saves a lot of trouble, because only minor changes to our top-half interrupt code are needed. If softirq calls were scattered throughout the top half, the port would be painful.
A question may arise: with up to 32 softirqs in the system, how does a CPU find the softirqs it has to run? Clearly, when raise_softirq() triggers a softirq, a good mechanism is needed so that the trigger is recorded quickly and accurately. Linux uses the irq_cpustat_t structure to keep track of softirqs. It is defined in include/asm-xxx/hardirq.h, where xxx stands for the processor architecture. For PowerPC processors, for example, the structure is defined in include/asm-powerpc/hardirq.h as follows:
typedef struct {
    unsigned int __softirq_pending;    /* set_bit is used on this */
    unsigned int __last_jiffy_stamp;
} ____cacheline_aligned irq_cpustat_t;
extern irq_cpustat_t irq_stat[];    /* defined in asm/hardirq.h */
#define __IRQ_STAT(cpu, member) (irq_stat[cpu].member)
The __softirq_pending member is a bitmap that indicates whether each softirq has been activated (that is, whether it is pending). The main job of raise_softirq() is to set the bit corresponding to the softirq in __softirq_pending. It is implemented as follows:
void fastcall raise_softirq(unsigned int nr)
{
    unsigned long flags;
    local_irq_save(flags);
    raise_softirq_irqoff(nr);
    local_irq_restore(flags);
}
inline fastcall void raise_softirq_irqoff(unsigned int nr)
{
    __raise_softirq_irqoff(nr);
    if (!in_interrupt())
        wakeup_softirqd();    /* wake the kernel thread ksoftirqd */
}
#define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
#define or_softirq_pending(x) (local_softirq_pending() |= (x))
#define local_softirq_pending() \
    __IRQ_STAT(smp_processor_id(), __softirq_pending)
The macro local_softirq_pending() used here yields the __softirq_pending member of irq_stat[cpu], the irq_cpustat_t structure of the current CPU. The role of __raise_softirq_irqoff(nr) is therefore to set to 1 the bit in __softirq_pending that corresponds to the softirq being triggered. do_softirq() then checks which pending bits are set in irq_stat[cpu] and executes the corresponding softirqs.
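To make the "who triggers, who executes" bookkeeping concrete, here is a minimal user-space sketch of a per-CPU pending bitmap and a dispatch loop. It is a single-threaded model, not kernel code: every name prefixed with sim_ is hypothetical and merely stands in for the kernel object named in its comment, and the CPU number is passed explicitly instead of being read with smp_processor_id().

```c
#include <assert.h>
#include <stddef.h>

#define SIM_NR_CPUS     2
#define SIM_NR_SOFTIRQS 32

typedef void (*sim_softirq_fn)(int cpu);

/* One handler per vector, shared by all CPUs (stands in for softirq_vec[]). */
static sim_softirq_fn sim_softirq_vec[SIM_NR_SOFTIRQS];

/* One pending bitmap per CPU (stands in for irq_stat[cpu].__softirq_pending). */
static unsigned int sim_pending[SIM_NR_CPUS];

/* Per-CPU run counter, so the effect of a handler can be observed. */
static int sim_runs[SIM_NR_CPUS];

static void sim_count_handler(int cpu)
{
    sim_runs[cpu]++;
}

/* Register a handler for vector nr (stands in for open_softirq()). */
static void sim_open_softirq(unsigned int nr, sim_softirq_fn fn)
{
    assert(nr < SIM_NR_SOFTIRQS);
    sim_softirq_vec[nr] = fn;
}

/* Mark vector nr pending on the raising CPU only (stands in for raise_softirq()). */
static void sim_raise_softirq(int cpu, unsigned int nr)
{
    assert(nr < SIM_NR_SOFTIRQS);
    sim_pending[cpu] |= 1U << nr;
}

/* Run every pending handler of one CPU, lowest vector number first
 * (models the core loop of do_softirq()). Returns the number of handlers run. */
static int sim_do_softirq(int cpu)
{
    unsigned int pending = sim_pending[cpu];
    int ran = 0;

    sim_pending[cpu] = 0;    /* clear the bitmap before running handlers */
    for (unsigned int nr = 0; pending != 0; nr++, pending >>= 1) {
        if ((pending & 1U) && sim_softirq_vec[nr] != NULL) {
            sim_softirq_vec[nr](cpu);
            ran++;
        }
    }
    return ran;
}
```

In this model, raising vector 5 on CPU 1 sets only bit 5 of sim_pending[1]; a later sim_do_softirq(0) finds nothing to do, while sim_do_softirq(1) runs the handler once and leaves the bitmap clear.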
2.2 Tasklet mechanism
A tasklet is actually a special kind of softirq: the softirq vectors HI_SOFTIRQ and TASKLET_SOFTIRQ are both implemented by the tasklet mechanism. The term "tasklet" means "small task", and here refers to a small piece of executable code. To some extent, the tasklet mechanism is the Linux kernel's extension of the BH mechanism, but unlike BH, different tasklets can run in parallel on multiple CPUs. A tasklet also differs from an ordinary softirq: a given tasklet can run on only one CPU at a time, whereas the service function registered for a softirq (that is, the action function pointer in the softirq_action structure) can be executed by several CPUs concurrently.
The Linux kernel uses the tasklet_struct structure to describe a tasklet. It is also defined in include/linux/interrupt.h, as follows:
struct tasklet_struct
{
    struct tasklet_struct *next;
    unsigned long state;
    atomic_t count;
    void (*func)(unsigned long);
    unsigned long data;
};
The members have the following meanings:
(1) The next pointer points to the next tasklet and is used to link multiple tasklets into a singly linked list. For this purpose, the kernel also defines a tasklet_head structure in softirq.c to represent a tasklet queue:
struct tasklet_head
{
    struct tasklet_struct *list;
};
(2) state holds the current status of the tasklet. It is an unsigned long (32 bits wide on 32-bit platforms), but currently only bit 0 and bit 1 are used. Bit 0 set to 1 indicates that the tasklet has been scheduled for execution. Bit 1 exists specifically for SMP systems: set to 1, it indicates that the tasklet is currently executing on some CPU, which prevents multiple CPUs from executing the same tasklet at the same time. The kernel predefines the meaning of these two bits:
enum
{
    TASKLET_STATE_SCHED,    /* Tasklet is scheduled for execution */
    TASKLET_STATE_RUN       /* Tasklet is running (SMP only) */
};
(3) count is an atomic counter that keeps a reference count for the tasklet. Note that the tasklet code can execute only when count is 0, which means the tasklet is enabled; if count is nonzero, the tasklet is disabled. Therefore, before the tasklet code is executed, the atomic value count must be checked to see whether it is 0.
(4) func is a function pointer to the executable tasklet code, and data is the argument passed to func.
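The way state and count gate execution can be sketched in user space as follows. This is a single-threaded illustration only: the kernel manipulates these fields with atomic operations, whereas the sketch uses plain integer operations, and all sim_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    SIM_TASKLET_STATE_SCHED,    /* bit 0: scheduled for execution */
    SIM_TASKLET_STATE_RUN       /* bit 1: running on some CPU     */
};

struct sim_tasklet {
    unsigned long state;              /* SCHED and RUN bits               */
    int count;                        /* 0 = enabled, nonzero = disabled  */
    void (*func)(unsigned long);
    unsigned long data;
};

static unsigned long sim_last_data;   /* argument of the most recent call */

static void sim_record(unsigned long data)
{
    sim_last_data = data;
}

/* Mark the tasklet scheduled; returns false if it already was. */
static bool sim_tasklet_mark_sched(struct sim_tasklet *t)
{
    unsigned long bit = 1UL << SIM_TASKLET_STATE_SCHED;

    if (t->state & bit)
        return false;
    t->state |= bit;
    return true;
}

static void sim_tasklet_disable(struct sim_tasklet *t)
{
    t->count++;
}

static void sim_tasklet_enable(struct sim_tasklet *t)
{
    assert(t->count > 0);    /* enable must pair with an earlier disable */
    t->count--;
}

/* Run the tasklet once if it is scheduled, not running, and enabled.
 * Returns true if func was called. */
static bool sim_tasklet_try_run(struct sim_tasklet *t)
{
    unsigned long sched = 1UL << SIM_TASKLET_STATE_SCHED;
    unsigned long run   = 1UL << SIM_TASKLET_STATE_RUN;

    if (!(t->state & sched) || (t->state & run))
        return false;        /* not scheduled, or already running */
    if (t->count != 0)
        return false;        /* disabled: stays scheduled for a later pass */

    t->state |= run;
    t->state &= ~sched;      /* clear SCHED before the call, so it can be set again */
    t->func(t->data);
    t->state &= ~run;
    return true;
}
```

A tasklet that is scheduled while disabled is therefore not lost in this model: sim_tasklet_try_run() skips it until sim_tasklet_enable() brings count back to 0.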
Using a tasklet is simple: define the tasklet's execution function, initialize a tasklet descriptor with that function, and then use the tasklet softirq trigger function to register the tasklet so that the system can schedule it to run at a suitable time.
The kernel provides two macros for declaring and initializing a tasklet descriptor:
#define DECLARE_TASKLET(name, func, data) \
struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(0), func, data }
#define DECLARE_TASKLET_DISABLED(name, func, data) \
struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }
As the definitions show, a tasklet initialized with DECLARE_TASKLET is enabled, while DECLARE_TASKLET_DISABLED initializes a tasklet in the disabled state.
Enable and disable operations on a tasklet always come in pairs, using the tasklet_enable() and tasklet_disable() functions respectively.
The general operation for initializing a given tasklet descriptor is tasklet_init(), while tasklet_kill() kills a tasklet, returning it to the unscheduled state. If the tasklet has not finished running, the kernel waits for it to finish. Note that because tasklet_kill() may sleep, it must not be called from interrupt context.
Although the tasklet mechanism is an implementation specific to the softirq vectors HI_SOFTIRQ and TASKLET_SOFTIRQ, it still sits within the overall softirq framework, so its design and implementation must still follow the "who triggers, who executes" idea. For this reason, Linux defines a tasklet queue head for each CPU in the system, representing the tasklet queue that each CPU is responsible for executing.
static DEFINE_PER_CPU(struct tasklet_head, tasklet_vec) = { NULL };
static DEFINE_PER_CPU(struct tasklet_head, tasklet_hi_vec) = { NULL };
The softirq vectors TASKLET_SOFTIRQ and HI_SOFTIRQ are serviced by the softirq service functions tasklet_action() and tasklet_hi_action() respectively, which are assigned in softirq_init(). As mentioned earlier, once a tasklet has been initialized it must be registered with a trigger function before the system can execute it at a suitable time; these two softirqs are triggered by tasklet_schedule() and tasklet_hi_schedule() respectively.
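The interplay between the per-CPU queue heads and the schedule and action functions can be modeled in user space roughly as follows. This is a simplified, single-threaded sketch: the sim_ names are hypothetical, the CPU number is passed explicitly, a plain flag replaces the atomic TASKLET_STATE_SCHED bit, and the step where the kernel raises TASKLET_SOFTIRQ is only mentioned in a comment.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_NR_CPUS 2

struct sim_tasklet {
    struct sim_tasklet *next;
    int scheduled;                    /* stands in for TASKLET_STATE_SCHED */
    void (*func)(unsigned long);
    unsigned long data;
};

/* One queue head per CPU (stands in for the per-CPU tasklet_vec). */
static struct sim_tasklet *sim_tasklet_vec[SIM_NR_CPUS];

static unsigned long sim_sum;         /* accumulates handler arguments */

static void sim_add(unsigned long data)
{
    sim_sum += data;
}

/* Queue t on the scheduling CPU's list unless it is already scheduled
 * (models tasklet_schedule(); the kernel would also raise TASKLET_SOFTIRQ here). */
static void sim_tasklet_schedule(int cpu, struct sim_tasklet *t)
{
    if (t->scheduled)
        return;
    t->scheduled = 1;
    t->next = sim_tasklet_vec[cpu];
    sim_tasklet_vec[cpu] = t;
}

/* Detach this CPU's list and run every tasklet on it
 * (models tasklet_action()). Returns the number of tasklets run. */
static int sim_tasklet_action(int cpu)
{
    struct sim_tasklet *list = sim_tasklet_vec[cpu];
    int ran = 0;

    sim_tasklet_vec[cpu] = NULL;
    while (list != NULL) {
        struct sim_tasklet *t = list;

        list = list->next;
        t->next = NULL;
        t->scheduled = 0;
        t->func(t->data);
        ran++;
    }
    return ran;
}
```

Because each CPU only ever walks its own list, a tasklet scheduled on CPU 0 is run by the action function on CPU 0, and scheduling the same tasklet twice before it runs queues it only once.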
2.3 Workqueue mechanism
Because of the limitations of the BH mechanism, it was extended with the task queue mechanism as early as the 2.0 kernel. In the 2.6 kernel, another mechanism, the workqueue (work queue), replaces the task queue.
A workqueue looks somewhat like a tasklet: it also lets kernel code request that a function be called at some point in the future. The difference is that a workqueue function runs in the context of a special kernel thread, whereas a tasklet runs in interrupt context, so a tasklet's execution must be short and atomic. Another difference from a tasklet is that you can ask for a work queue function to be deferred for a definite interval before it runs. Workqueues are usually used for events that are not very urgent, so they tend to have higher latency than tasklets, but they do not need to be atomic and are allowed to sleep.
The workqueue mechanism is defined and implemented in include/linux/workqueue.h and kernel/workqueue.c. A work queue is maintained by the workqueue_struct structure, defined as follows:
struct workqueue_struct {
    struct cpu_workqueue_struct *cpu_wq;
    struct list_head list;
    const char *name;
    int singlethread;
    int freezeable;    /* Freeze threads during suspend */
};
A cpu_workqueue_struct structure is defined for each CPU. The kernel attaches a work queue to every CPU, so that new work can be placed dynamically on the work queues of different CPUs to support load balancing (distributing work across CPUs). The structure is defined as follows:
struct cpu_workqueue_struct {
    spinlock_t lock;                    /* lock protecting this structure */
    struct list_head worklist;          /* list of work items */
    wait_queue_head_t more_work;        /* wait queue the worker thread sleeps on */
    struct work_struct *current_work;   /* work item currently being processed */
    struct workqueue_struct *wq;        /* work queue this structure belongs to */
    struct task_struct *thread;         /* worker thread pointer */
    int run_depth;                      /* Detect run_workqueue() recursion depth */
} ____cacheline_aligned;
Here we see a work_struct structure, called the work item structure. To submit a task to a work queue, you must fill in a work item. The structure is defined as follows:
struct work_struct {
    atomic_long_t data;
#define WORK_STRUCT_PENDING 0    /* T if work item pending execution */
#define WORK_STRUCT_FLAG_MASK (3UL)
#define WORK_STRUCT_WQ_DATA_MASK (~WORK_STRUCT_FLAG_MASK)
    struct list_head entry;      /* links the work item into a work list */
    work_func_t func;            /* function pointer to the specific work to be done */
};
To make work queues easier to maintain, the kernel creates a list onto which every work queue is linked:
static LIST_HEAD(workqueues);
A work queue task can be created statically or dynamically; either way, a work_struct structure has to be filled in when it is created. The kernel provides a macro that declares and initializes a work queue task:
#define DECLARE_WORK(n, f) \
    struct work_struct n = __WORK_INITIALIZER(n, f)
To initialize a work queue task dynamically at run time, or to set up an existing work structure again, use the following two interfaces:
#define PREPARE_WORK(_work, _func) \
    do { \
        (_work)->func = (_func); \
    } while (0)
#define INIT_WORK(_work, _func) \
    do { \
        (_work)->data = (atomic_long_t) WORK_DATA_INIT(); \
        INIT_LIST_HEAD(&(_work)->entry); \
        PREPARE_WORK((_work), (_func)); \
    } while (0)
In practice, INIT_WORK alone is sufficient, since INIT_WORK calls PREPARE_WORK.
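To show what the pending flag, the list entry, and the function pointer of a work item are for, here is a small user-space model of initializing a work item, placing it on a queue, and draining the queue the way a worker thread would after waking up. The sim_ names are hypothetical, a plain bool replaces the WORK_STRUCT_PENDING bit, a hand-rolled singly linked FIFO replaces list_head, and nothing here is thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_work;
typedef void (*sim_work_func_t)(struct sim_work *work);

struct sim_work {
    bool pending;                 /* stands in for the WORK_STRUCT_PENDING bit */
    struct sim_work *next;        /* stands in for the list_head entry          */
    sim_work_func_t func;         /* the deferred function                      */
    int runs;                     /* illustration only: times func has run      */
};

struct sim_workqueue {
    struct sim_work *head;        /* FIFO of queued work items */
    struct sim_work *tail;
};

static void sim_count_run(struct sim_work *work)
{
    work->runs++;
}

/* Models INIT_WORK(): reset the state and bind the function. */
static void sim_init_work(struct sim_work *work, sim_work_func_t func)
{
    work->pending = false;
    work->next = NULL;
    work->func = func;
    work->runs = 0;
}

/* Models queue_work(): returns false if the item was already pending. */
static bool sim_queue_work(struct sim_workqueue *wq, struct sim_work *work)
{
    if (work->pending)
        return false;
    work->pending = true;
    work->next = NULL;
    if (wq->tail != NULL)
        wq->tail->next = work;
    else
        wq->head = work;
    wq->tail = work;
    return true;
}

/* Models a woken worker thread: drain the queue in submission order.
 * Returns the number of work items executed. */
static int sim_run_workqueue(struct sim_workqueue *wq)
{
    int ran = 0;

    while (wq->head != NULL) {
        struct sim_work *work = wq->head;

        wq->head = work->next;
        if (wq->head == NULL)
            wq->tail = NULL;
        work->pending = false;    /* cleared before the call, so func may requeue */
        work->func(work);
        ran++;
    }
    return ran;
}
```

Submitting the same item twice before it has run queues it only once in this model, and once the worker has picked it up it can be submitted again.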
Using a work queue is actually very simple. First create the work queue, typically with create_workqueue(name), where name is the name of the work queue. This creates one worker thread per CPU. If a single thread is enough for your work, you can instead use create_singlethread_workqueue(name) to create a single-threaded work queue. Next, submit the work to the work queue. First create a work queue task as described above, then call queue_work(wq, work) to submit it, where wq is the work queue to submit to and work is the work_struct structure for your task. To defer a task for a period of time, submit it with queue_delayed_work(wq, work, delay), where delay is the time to wait, in ticks; the delay guarantees that your task will not run until at least the specified time has elapsed. Because submitting a delayed task requires a timer, the work_struct must be replaced with a different structure, delayed_work, which simply adds a timer to the work_struct structure:
struct delayed_work {
    struct work_struct work;
    struct timer_list timer;
};
Accordingly, the interfaces for initializing the work task change to DECLARE_DELAYED_WORK and INIT_DELAYED_WORK:
#define DECLARE_DELAYED_WORK(n, f) \
    struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f)
#define PREPARE_DELAYED_WORK(_work, _func) \
    PREPARE_WORK(&(_work)->work, (_func))
#define INIT_DELAYED_WORK(_work, _func) \
    do { \
        INIT_WORK(&(_work)->work, (_func)); \
        init_timer(&(_work)->timer); \
    } while (0)
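The "not before the delay has elapsed" guarantee of delayed work can be illustrated with a user-space model driven by a simulated tick counter. All sim_ names are hypothetical; an armed flag plus an expiry tick stand in for the embedded timer_list, and when the timer expires the function is called directly, whereas the kernel would hand the work to a worker thread at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_delayed_work {
    void (*func)(struct sim_delayed_work *dwork);
    bool timer_armed;             /* stands in for the embedded timer_list */
    unsigned long expires;        /* tick at which the timer fires         */
    int runs;                     /* illustration only: times func has run */
};

static unsigned long sim_jiffies; /* simulated tick counter */

static void sim_count_run(struct sim_delayed_work *dwork)
{
    dwork->runs++;
}

/* Models queue_delayed_work(): arm the timer `delay` ticks from now.
 * Returns false if the work is already waiting on its timer. */
static bool sim_queue_delayed_work(struct sim_delayed_work *dwork,
                                   unsigned long delay)
{
    if (dwork->timer_armed)
        return false;
    dwork->timer_armed = true;
    dwork->expires = sim_jiffies + delay;
    return true;
}

/* Advance time by one tick and run the work if its timer has expired. */
static void sim_tick(struct sim_delayed_work *dwork)
{
    sim_jiffies++;
    if (dwork->timer_armed && sim_jiffies >= dwork->expires) {
        dwork->timer_armed = false;
        dwork->func(dwork);
    }
}
```

With a delay of 3 ticks, the function does not run during the first two calls to sim_tick() and runs exactly once on the third.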
The tasks on a work queue are executed by the associated worker thread at a time that cannot be predicted (it depends on system load, interrupts, and so on), or at the earliest after the requested delay. If your task has been waiting on a work queue indefinitely without running, you can cancel it as follows:
int cancel_delayed_work(struct delayed_work *work);
If the task is already executing when the cancel call returns, it continues to run and is not terminated by the cancellation, but it will no longer be added to the work queue. To flush all tasks on a work queue, use:
void flush_workqueue(struct workqueue_struct *wq);
If the work queue still holds submitted tasks that have not finished, the kernel waits until they have all completed. flush_workqueue() ensures that every submitted task has completed, which is particularly useful when a device driver shuts down.
When you are finished with a work queue, you can destroy it:
void destroy_workqueue(struct workqueue_struct *queue);
Note that if unfinished tasks remain on the queue when a workqueue is destroyed, the function executes them first. The destroy operation ensures that all outstanding tasks complete before the work queue is destroyed, so you need not worry about pending work when destroying a work queue.
Because a work queue runs in the context of a kernel thread and its execution may sleep, work queues should handle tasks that are not very urgent, which are usually performed when the system is idle.
The workqueue initialization function defines an event work queue, keventd_wq, that is available to all threads in the kernel; work items created by other kernel threads can be attached to this queue:
static struct workqueue_struct *keventd_wq __read_mostly;
void __init init_workqueues(void)
{
    /* …… */
    keventd_wq = create_workqueue("events");
    /* …… */
}
To use the kernel-provided event work queue keventd_wq, you only need to submit work with schedule_work(work) or schedule_delayed_work(work, delay).
When writing device drivers, not every driver needs its own work queue. In fact, in many cases there is no need to create one. If you submit tasks to a work queue only occasionally, it may be more efficient simply to use the shared default work queue provided by the kernel. However, because this queue may be shared by many drivers, a task may wait a long time before it starts to execute. To mitigate this, delays inside work functions should be kept to a minimum, or avoided altogether.
One more point about work queues: they were introduced in the 2.5 development kernel to replace task queues, and their data structures are fairly complex. You may still be confused about the three data structures above. Here we put the three together and briefly explain how they relate. Their relationships are shown in the following illustration:
As the illustration shows, the worker thread (worker_thread) sits at the top level; it is the thread member we saw in the cpu_workqueue_struct structure. The kernel creates a worker thread for each CPU and associates it with a cpu_workqueue_struct structure. Each worker thread is an ordinary kernel thread that executes the worker_thread() function, which, once initialized, enters an infinite loop and goes to sleep. When work is submitted to the work queue, the thread wakes up to perform it; otherwise it stays asleep.
Work sits at the bottom level and is described by the work_struct structure. The most important part of this structure is a pointer to the function that handles the specific task being deferred. When work is submitted to a work queue, it is actually handed to a specific worker thread, which then wakes up and executes it.
Most device drivers use the system's default worker threads, which are simple and convenient. In some cases with stricter requirements, however, a driver needs its own worker threads, and the system allows the driver to create them as needed. In other words, the system allows several types of worker thread to exist. For each type, there is one worker thread of that type on each CPU, corresponding to one cpu_workqueue_struct structure, and a workqueue_struct structure represents all worker threads of a given type. In this way, several work queues may exist on one CPU, each maintaining its own cpu_workqueue_struct structure, which is associated with a worker thread of one type.
For example, suppose our driver adds a falcon worker type on top of the events worker type that the system already has (the default worker type created in init_workqueues):
struct workqueue_struct *mydriver_wq;
mydriver_wq = create_workqueue("falcon");
Suppose also that we are working on a machine with 4 processors. There are now 4 events threads and 4 falcon threads (and, correspondingly, 8 cpu_workqueue_struct structures, 4 for each of the 2 worker types). There is also one workqueue_struct for the events type and one for the falcon type. When work is submitted, our work is handed to a dedicated falcon thread for processing.
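The counting in this example can be checked with a small user-space model in which creating a work queue allocates one per-CPU structure for every CPU. The sim_ names and the fixed CPU count of 4 are assumptions made for illustration; no threads are actually started, and the counter simply tracks how many per-CPU structures (and therefore worker threads) would exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SIM_NR_CPUS 4

struct sim_workqueue;

struct sim_cpu_workqueue {
    struct sim_workqueue *wq;     /* back pointer to the owning work queue */
    int cpu;                      /* CPU whose worker thread serves it      */
};

struct sim_workqueue {
    const char *name;
    struct sim_cpu_workqueue cpu_wq[SIM_NR_CPUS];
};

static int sim_worker_count;      /* total per-CPU structures currently alive */

/* Models create_workqueue(): one per-CPU structure for every CPU.
 * Returns NULL if memory allocation fails. */
static struct sim_workqueue *sim_create_workqueue(const char *name)
{
    struct sim_workqueue *wq = malloc(sizeof(*wq));

    if (wq == NULL)
        return NULL;
    wq->name = name;
    for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
        wq->cpu_wq[cpu].wq = wq;
        wq->cpu_wq[cpu].cpu = cpu;
        sim_worker_count++;
    }
    return wq;
}

/* Models destroy_workqueue(): release the queue and its per-CPU structures. */
static void sim_destroy_workqueue(struct sim_workqueue *wq)
{
    if (wq == NULL)
        return;
    sim_worker_count -= SIM_NR_CPUS;
    free(wq);
}
```

Creating an "events" queue and a "falcon" queue in this model yields 2 queue structures and 2 x 4 = 8 per-CPU structures, which matches the count given above.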
3 Comparison of the bottom-half mechanisms
The bottom-half mechanisms provided by the Linux kernel are all used to defer work, but they differ in how they are used and in where they apply.
The soft interrupt mechanisms provided by the Linux 2.6 kernel all follow the "who triggers, who executes" idea, but each has its own characteristics. Softirq is the core of the whole soft interrupt framework and its most basic mechanism. Kernel programmers rarely use it directly; for most applications, tasklets are all we need. The kernel provides 32 softirqs, but only a few of them are used. Softirqs are statically allocated at compile time and cannot be created and deleted dynamically the way tasklets can. The meaning of each softirq vector is expressed through an enumeration