Understanding the Linux Kernel, day 03: interrupts and exceptions

Source: Internet
Author: User
Tags signal handler terminates

Interrupts and exceptions
An interrupt is usually defined as an event that alters the sequence of instructions executed by the processor. Such events correspond to electrical signals generated by hardware circuits both inside and outside the CPU chip.
Interrupts are often divided into synchronous and asynchronous interrupts:
Synchronous interrupts are produced by the CPU control unit while it executes instructions. They are called synchronous because the control unit issues them only after terminating the execution of an instruction.
Asynchronous interrupts are generated by other hardware devices at arbitrary times with respect to the CPU clock signals.
Interrupts (asynchronous interrupts) are issued by interval timers and I/O devices; for example, a user's keystroke raises an interrupt.
Exceptions (synchronous interrupts) are caused either by programming errors or by anomalous conditions that the kernel must handle.

Interrupt signals provide a way to divert the processor to code outside the normal flow of control.
When an interrupt signal arrives, the CPU must stop what it is currently doing and switch to a new activity. It does this by saving the current value of the program counter on the Kernel Mode stack and placing an address related to the interrupt type into the program counter.
Interrupt handling is one of the most sensitive tasks performed by the kernel, because it must satisfy the following constraints:
Interrupts can arrive at any time, when the kernel may be busy with something else. The kernel's goal is therefore to get the interrupt out of the way as soon as possible and defer as much processing as it can. The activities performed in response to an interrupt are divided into two parts: a critical, urgent part that the kernel executes right away, and a deferrable part that the kernel executes later.
Because interrupts can arrive at any time, the kernel may be handling one of them while another one, of a different type, occurs. This should be allowed as much as possible, because it keeps the I/O devices busy. The interrupt handlers must therefore be coded so that the corresponding kernel control paths can be executed in a nested manner.
Although the kernel may accept a new interrupt signal while handling a previous one, some critical regions exist inside the kernel code where interrupts must be disabled.

Classification of interrupts and exceptions
The Intel documentation classifies interrupts and exceptions as follows:
Interrupts:
Maskable interrupts: All interrupt requests (IRQs) issued by I/O devices give rise to maskable interrupts. A maskable interrupt can be in two states: masked or unmasked. The control unit ignores a masked interrupt for as long as it remains masked.
Nonmaskable interrupts: Only a few critical events, such as hardware failures, give rise to nonmaskable interrupts. Nonmaskable interrupts are always recognized by the CPU.
Exceptions:
Processor-detected exceptions: Generated when the CPU detects an anomalous condition while executing an instruction. They are further divided into three groups:
Fault: Can generally be corrected; once corrected, the program is allowed to restart with no loss of continuity.
Trap: Reported immediately after the execution of the trapping instruction. After the kernel returns control to the program, it can continue its execution with no loss of continuity. Traps are mainly used for debugging.
Abort: A serious error occurred. Aborts are used to report severe errors, such as hardware failures and invalid or inconsistent values in system tables.
Programmed exceptions: Occur at the request of the programmer, that is, when the program executes an instruction that explicitly raises an exception.
Each interrupt or exception is identified by a number ranging from 0 to 255, called its vector.

IRQs and interrupts
Each hardware device controller capable of issuing interrupt requests usually has a single output line designated as the Interrupt ReQuest (IRQ) line. All existing IRQ lines are connected to the input pins of a hardware circuit called the Programmable Interrupt Controller (PIC), which performs the following actions:
1. Monitors the IRQ lines, checking for raised signals. If two or more IRQ lines are raised, selects the one with the lower pin number.
2. If a raised signal occurs on an IRQ line:
a. Converts the raised signal received into a corresponding vector.
b. Stores the vector in an I/O port of the interrupt controller, thus allowing the CPU to read it via the data bus.
c. Sends a raised signal to the processor's INTR pin, that is, issues an interrupt.
d. Waits until the CPU acknowledges the interrupt signal by writing into one of the PIC's I/O ports; when this occurs, clears the INTR line.
3. Goes back to step 1.
The mapping between IRQs and vectors can be modified by issuing suitable I/O instructions to the interrupt controller ports.
Each IRQ line can be selectively disabled by programming the PIC. Disabled interrupts are not lost: the PIC sends them to the CPU as soon as they are enabled again. Most interrupt handlers use this feature, because it allows them to process IRQs of the same type serially.
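The controller's loop can be pictured with a small software model. The sketch below is only an illustration: `struct pic_model`, its fields, and the function names are hypothetical, and a real PIC is a hardware circuit programmed through I/O ports. It shows the three properties described above: the lowest raised pin wins, the vector is computed from a programmable base, and a request on a disabled line is remembered rather than lost.

```c
#include <assert.h>
#include <stdint.h>

#define NR_PINS 8          /* assumption: one chip with eight IRQ inputs */
#define NO_VECTOR (-1)

/* Hypothetical software model of a PIC; all names are illustrative. */
struct pic_model {
    uint8_t raised;        /* bit i set: IRQ line i has a signal waiting */
    uint8_t disabled;      /* bit i set: IRQ line i is disabled */
    uint8_t vector_base;   /* programmable mapping: vector = base + pin */
};

/* A device raises its IRQ line. The request is recorded even while the
 * line is disabled, which is why disabled interrupts are not lost. */
void pic_raise(struct pic_model *pic, unsigned pin)
{
    assert(pin < NR_PINS);
    pic->raised |= (uint8_t)(1u << pin);
}

/* Disable (off != 0) or re-enable (off == 0) one IRQ line. */
void pic_set_disabled(struct pic_model *pic, unsigned pin, int off)
{
    assert(pin < NR_PINS);
    if (off)
        pic->disabled |= (uint8_t)(1u << pin);
    else
        pic->disabled &= (uint8_t)~(1u << pin);
}

/* Steps 1 and 2a: pick the raised, enabled line with the lowest pin
 * number and convert it to a vector. Returns NO_VECTOR if nothing can
 * be delivered right now. */
int pic_next_vector(const struct pic_model *pic)
{
    uint8_t ready = pic->raised & (uint8_t)~pic->disabled;
    for (unsigned pin = 0; pin < NR_PINS; pin++)
        if (ready & (1u << pin))
            return pic->vector_base + (int)pin;
    return NO_VECTOR;
}

/* Step 2d: the CPU acknowledges the vector, so the request is cleared. */
void pic_ack(struct pic_model *pic, int vector)
{
    int pin = vector - pic->vector_base;
    assert(pin >= 0 && pin < NR_PINS);
    pic->raised &= (uint8_t)~(1u << pin);
}
```

With a base of 32, raising pins 4 and 1 yields vector 33 first; disabling pin 1 lets vector 36 through, and pin 1's request is delivered once the line is enabled again.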

Advanced Programmable Interrupt Controller
Interrupt requests coming from external hardware devices can be distributed among the available CPUs in two ways:
Static distribution: The IRQ signal is delivered to the local APICs listed in the corresponding Redirection Table entry. The interrupt is delivered to one specific CPU, to a subset of CPUs, or to all CPUs at once.
Dynamic distribution: The IRQ signal is delivered to the local APIC of the processor that is executing the process with the lowest priority.
Interprocessor interrupts (IPIs) are a crucial component of the SMP architecture. Linux uses them actively to exchange messages among CPUs.

The 80x86 microprocessors issue roughly 20 different exceptions. The kernel must provide a dedicated exception handler for each exception type. For some exceptions, the CPU control unit also generates a hardware error code and pushes it on the Kernel Mode stack before starting the exception handler.

Interrupt Descriptor Table
The Interrupt Descriptor Table (IDT) is a system table that associates each interrupt or exception vector with the address of the corresponding interrupt or exception handler. The kernel must properly initialize the IDT before enabling interrupts.
The idtr register must be initialized with the lidt assembly language instruction before interrupts are enabled.
The IDT may include three types of descriptors:
Task gate descriptors, interrupt gate descriptors, and trap gate descriptors.
Linux uses interrupt gates to handle interrupts and trap gates to handle exceptions.

Nested execution of exception and interrupt handlers
Every interrupt or exception gives rise to a kernel control path, that is, a separate sequence of instructions executed in Kernel Mode on behalf of the current process.
Kernel control paths may be arbitrarily nested: an interrupt handler may be "interrupted" by another interrupt handler, giving rise to nested execution of kernel control paths.
An interrupt handler may preempt both other interrupt handlers and exception handlers. Conversely, an exception handler never preempts an interrupt handler.
The only exception that can be triggered in Kernel Mode is the Page Fault. Interrupt handlers, however, never perform operations that can induce page faults (and thus, potentially, a process switch).
Linux interleaves kernel control paths for two major reasons:
To improve the throughput of programmable interrupt controllers and device controllers.
To implement an interrupt model without priority levels.

Initializing the Interrupt Descriptor Table
Before the kernel enables interrupts, it must load the initial address of the IDT into the idtr register and initialize every entry of the table. This is done while the system is being initialized.
The int instruction allows a User Mode process to issue an interrupt signal with any vector from 0 to 255. The IDT must therefore be initialized carefully, to block illegal interrupts and exceptions simulated by User Mode processes through int instructions. This is achieved by setting the DPL field of the interrupt or trap gate descriptor to 0: when a User Mode process issues such an interrupt signal, the control unit compares the CPL with the DPL and raises a "General protection" exception. In the few cases where a User Mode process must be able to issue a programmed exception, the DPL field of the corresponding gate is set to 3.

Interrupt, trap, and system gates
Linux uses a slightly different breakdown and terminology from Intel, classifying the descriptors in the IDT as follows:
Interrupt gate: An Intel interrupt gate that cannot be accessed by a User Mode process (the gate's DPL field is 0). All Linux interrupt handlers are activated by means of interrupt gates, and all are restricted to Kernel Mode.
System gate: An Intel trap gate that can be accessed by a User Mode process (the gate's DPL field is 3). The three Linux exception handlers associated with vectors 4, 5, and 128 are activated by means of system gates.
System interrupt gate: An Intel interrupt gate that can be accessed by a User Mode process (the gate's DPL field is 3). The exception handler associated with vector 3 is activated by means of a system interrupt gate.
Trap gate: An Intel trap gate that cannot be accessed by a User Mode process (the gate's DPL field is 0). Most Linux exception handlers are activated by means of trap gates.
Task gate: An Intel task gate that cannot be accessed by a User Mode process (the gate's DPL field is 0). The Linux handler for the "Double fault" exception is activated by means of a task gate.
The following architecture-dependent functions are used to insert gates in the IDT:
set_intr_gate(n,addr): Inserts an interrupt gate in the nth IDT entry. The DPL field is set to 0.
set_trap_gate(n,addr): Inserts a trap gate in the nth IDT entry. The DPL field is set to 0.
set_system_intr_gate(n,addr): Inserts an interrupt gate in the nth IDT entry. The DPL field is set to 3.
set_system_gate(n,addr): Inserts a trap gate in the nth IDT entry. The DPL field is set to 3.
set_task_gate(n,gdt): Inserts a task gate in the nth IDT entry; gdt is the index in the GDT of the TSS that contains the function to be activated. The DPL field is set to 3.
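To make the role of the DPL field concrete, here is a minimal user-space sketch of a table of gates. It is not kernel code: `struct gate_model`, `idt_model`, and the `model_` functions are hypothetical stand-ins, and a real 80x86 gate descriptor packs its fields into 8 bytes. The sketch models only the four helpers that take a handler address, plus the privilege check applied to a software `int n` instruction.

```c
#include <assert.h>
#include <stdint.h>

enum gate_type { GATE_NONE, GATE_TASK, GATE_INTERRUPT, GATE_TRAP };

/* Simplified gate; the layout is illustrative, not the hardware format. */
struct gate_model {
    enum gate_type type;
    unsigned dpl;            /* descriptor privilege level, 0..3 */
    uintptr_t handler;       /* address of the handler */
};

struct gate_model idt_model[256];   /* zero-initialized: all GATE_NONE */

void set_gate_model(unsigned n, enum gate_type type, unsigned dpl, uintptr_t addr)
{
    assert(n < 256 && dpl <= 3);
    idt_model[n].type = type;
    idt_model[n].dpl = dpl;
    idt_model[n].handler = addr;
}

/* Counterparts of the four address-taking helpers named in the text. */
void model_set_intr_gate(unsigned n, uintptr_t a)        { set_gate_model(n, GATE_INTERRUPT, 0, a); }
void model_set_trap_gate(unsigned n, uintptr_t a)        { set_gate_model(n, GATE_TRAP, 0, a); }
void model_set_system_intr_gate(unsigned n, uintptr_t a) { set_gate_model(n, GATE_INTERRUPT, 3, a); }
void model_set_system_gate(unsigned n, uintptr_t a)      { set_gate_model(n, GATE_TRAP, 3, a); }

/* For a software "int n" the CPU compares the current privilege level
 * (CPL, 3 in User Mode and 0 in Kernel Mode) with the gate's DPL and
 * refuses the request when CPL > DPL. */
int int_instruction_allowed(unsigned n, unsigned cpl)
{
    assert(n < 256);
    return idt_model[n].type != GATE_NONE && cpl <= idt_model[n].dpl;
}

/* Interrupt gates clear the IF flag on entry; trap gates leave it as is. */
int gate_clears_if(unsigned n)
{
    assert(n < 256);
    return idt_model[n].type == GATE_INTERRUPT;
}
```

In this model a User Mode `int 128` passes the check because vector 128 sits behind a system gate, while `int 0` or `int 32` from User Mode is refused.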

Preliminary initialization of the IDT
The IDT is initialized and used by the BIOS routines while the computer still operates in Real Mode. Once Linux takes over, however, the IDT is moved to another area of RAM and initialized a second time, because Linux does not use any BIOS routine.
The IDT is stored in the idt_table table, which has 256 entries.
During kernel initialization, the setup_idt() assembly language function fills all 256 entries of idt_table with the same interrupt gate, which refers to the ignore_int() interrupt handler.

Exception handling
Most exceptions issued by the CPU are interpreted by Linux as error conditions. When one of them occurs, the kernel sends a signal to the process that caused the exception, to notify it of the anomalous condition.
Exception handlers have a standard structure consisting of three steps:
1. Save the contents of most registers on the Kernel Mode stack.
2. Handle the exception by means of a high-level C function.
3. Exit from the handler by means of the ret_from_exception() function.
To take advantage of exceptions, the IDT must be properly initialized with an exception handler function for each recognized exception. It is the job of the trap_init() function to insert the final values (the functions that handle the exceptions) into all IDT entries that refer to nonmaskable interrupts and exceptions.

Saving the registers for the exception handler
Each exception handler starts with a short assembly language fragment. For the "Divide error" exception it reads:
ENTRY(divide_error)
    pushl $0                  # no error code
    pushl $do_divide_error
If the control unit does not automatically push a hardware error code on the stack when the exception occurs, the corresponding assembly language fragment includes a pushl $0 instruction to pad the stack with a null value. The address of the high-level C function is then pushed on the stack.

Entering and leaving the exception handler
The names of the C functions that implement exception handlers always consist of the prefix do_ followed by the handler name. Most of these functions store the hardware error code and the exception vector in the process descriptor of the current process, and then send a suitable signal to that process.
The current process takes care of the signal right after the exception handler terminates. The signal is handled either by the process's own signal handler or by the kernel.

Interrupt handling
Interrupt handling depends on the type of interrupt:
I/O interrupts: An I/O device requires attention; the corresponding interrupt handler must query the device to determine the proper course of action.
Timer interrupts: A timer has issued an interrupt, which tells the kernel that a fixed time interval has elapsed.
Interprocessor interrupts: A CPU of a multiprocessor system issues an interrupt to another CPU.
I/O interrupt handling
In general, an I/O interrupt handler must be flexible enough to service several devices at the same time.
Interrupt handler flexibility is achieved in two distinct ways:
IRQ sharing: The interrupt handler executes several interrupt service routines (ISRs). Each ISR is a function related to a single device that shares the IRQ line.
IRQ dynamic allocation: An IRQ line is associated with a device driver at the last possible moment, so the same line can be used by several devices as long as they are not in use at the same time.
Linux divides the actions to be performed following an interrupt into three classes:
Critical, noncritical, and noncritical deferrable.

Interrupt vectors
Physical IRQs may be assigned any vector in the range 32 to 238. However, Linux uses vector 128 to implement system calls.
There are three ways to select a line for an IRQ-configurable device:
By setting hardware jumpers (only on very old device cards).
By running a utility program when the device is installed.
By a hardware protocol executed at system startup.
The kernel must discover which I/O device corresponds to each IRQ number before enabling interrupts. The correspondence between IRQ numbers and I/O devices is established while each device driver is initialized.

IRQ distribution in multiprocessor systems
Linux follows the Symmetric Multiprocessing (SMP) model; this means, essentially, that the kernel should not favor any CPU over the others. As a consequence, the kernel tries to distribute the IRQ signals coming from hardware devices among all CPUs in a round-robin fashion, so that all CPUs spend approximately the same fraction of their execution time servicing I/O interrupts.

Multiple Kernel Mode stacks
The thread_info descriptor of each process is coupled with a Kernel Mode stack in a thread_union data structure, which occupies one or two page frames depending on an option selected when the kernel is compiled.
If the size of the thread_union structure is 8 KB, the Kernel Mode stack of the current process is used for every type of kernel control path: exceptions, interrupts, and deferrable functions.
Conversely, if the size of the thread_union structure is 4 KB, the kernel makes use of three types of Kernel Mode stacks:
The exception stack is used when handling exceptions (including system calls). It is the stack contained in the per-process thread_union data structure, so the kernel uses a different exception stack for each process in the system.
The hard IRQ stack is used when handling interrupts. There is one hard IRQ stack for each CPU in the system, and each stack occupies a single page frame.
The soft IRQ stack is used when handling deferrable functions (softirqs or tasklets). There is one soft IRQ stack for each CPU in the system, and each stack occupies a single page frame.
All hard IRQ stacks are contained in the hardirq_stack array, and all soft IRQ stacks are contained in the softirq_stack array. Each array element is a union of type irq_ctx that spans a single page frame.
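Keeping the descriptor and the stack in one aligned area has a practical consequence: the descriptor can be found from the stack pointer alone. The sketch below illustrates the idea in user space; `thread_info_model`, `thread_union_model`, and `thread_info_from_sp()` are hypothetical names, the 8 KB size is an assumption matching the two-page-frame configuration, and the descriptor fields are placeholders.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u   /* assumption: the 8 KB (two page frame) layout */

struct thread_info_model {  /* stand-in for thread_info; fields are placeholders */
    int cpu;
    int preempt_count;
};

/* The descriptor sits at the lowest address of the area; the stack
 * grows downward toward it from the end of the same area. */
union thread_union_model {
    struct thread_info_model thread_info;
    unsigned long stack[THREAD_SIZE / sizeof(unsigned long)];
};

/* If the union is aligned on a THREAD_SIZE boundary, clearing the low
 * bits of any address inside the stack gives the descriptor's address. */
struct thread_info_model *thread_info_from_sp(uintptr_t sp)
{
    /* masking only works when the size is a power of two */
    assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0);
    return (struct thread_info_model *)(sp & ~(uintptr_t)(THREAD_SIZE - 1));
}
```

The same masking works with a 4 KB area; only the constant changes, which is one reason the area size is a compile-time option.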

Saving the registers for the interrupt handler
When a CPU receives an interrupt, it starts executing the code of the corresponding interrupt handler, whose address is stored in the corresponding gate of the IDT.
The do_IRQ() function is then invoked to execute all interrupt service routines associated with the interrupt.

Interrupt service routines
An interrupt service routine (ISR) handles an interrupt by executing an operation specific to one type of device. When an interrupt handler must execute the ISRs of an IRQ, it invokes the handle_IRQ_event() function.
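When several devices share an IRQ line, the handler cannot know in advance which device raised it, so it runs every ISR registered on that line and lets each one check its own device. The following user-space sketch illustrates that loop. The names `isr_entry`, `run_isrs()`, and `example_isr()` are hypothetical and much simpler than the kernel's own list of actions per IRQ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified list of ISRs registered on one IRQ line. */
struct isr_entry {
    int (*handler)(int irq, void *dev_id);  /* returns 1 if its device raised the IRQ */
    void *dev_id;                           /* identifies the device to the ISR */
    struct isr_entry *next;                 /* next ISR sharing the same line */
};

/* Run every ISR on the line; each one asks its own device whether it
 * needs attention. Returns how many ISRs claimed the interrupt. */
int run_isrs(int irq, struct isr_entry *list)
{
    int handled = 0;
    for (struct isr_entry *e = list; e != NULL; e = e->next) {
        assert(e->handler != NULL);
        handled += e->handler(irq, e->dev_id);
    }
    return handled;
}

/* Example ISR: dev_id points to a flag meaning "this device is signalling". */
int example_isr(int irq, void *dev_id)
{
    int *needs_service = dev_id;
    (void)irq;
    if (!*needs_service)
        return 0;            /* not our device: leave it to the other ISRs */
    *needs_service = 0;      /* pretend to service the device */
    return 1;
}
```

A return value of 0 from `run_isrs()` means no registered device claimed the interrupt, which a real kernel would treat as a spurious or unexpected interrupt.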

Interprocessor interrupt handling
Interprocessor interrupts allow a CPU to send interrupt signals to any other CPU in the system.
On multiprocessor systems, Linux defines the following three kinds of interprocessor interrupts:
CALL_FUNCTION_VECTOR: Sent to all CPUs but the sender, forcing those CPUs to run a function passed by the sender. The corresponding interrupt handler is named call_function_interrupt().
RESCHEDULE_VECTOR: When a CPU receives this type of interrupt, the corresponding handler, named reschedule_interrupt(), limits itself to acknowledging the interrupt.
INVALIDATE_TLB_VECTOR: Sent to all CPUs but the sender to force them to invalidate their Translation Lookaside Buffers (TLBs). The corresponding handler, named invalidate_interrupt(), flushes some TLB entries of the processor.
Issuing interprocessor interrupts (IPIs) is an easy task thanks to the following group of functions:
send_IPI_all(): Sends an IPI to all CPUs, including the sender.
send_IPI_allbutself(): Sends an IPI to all CPUs except the sender.
send_IPI_self(): Sends an IPI to the sender CPU.
send_IPI_mask(): Sends an IPI to the group of CPUs specified by a bit mask.
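The four sending functions differ only in the set of target CPUs. As a rough illustration, the sketch below computes that set as a bit mask; it does not send anything. The type `cpumask_model` and the `ipi_targets_` functions are hypothetical, and the 32-CPU limit is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t cpumask_model;   /* bit n set: CPU n is a target (up to 32 CPUs) */

/* online: CPUs currently running; self: number of the sending CPU. */
cpumask_model ipi_targets_all(cpumask_model online)
{
    return online;                              /* sender included */
}

cpumask_model ipi_targets_allbutself(cpumask_model online, unsigned self)
{
    assert(self < 32);
    return online & ~((cpumask_model)1 << self);
}

cpumask_model ipi_targets_self(unsigned self)
{
    assert(self < 32);
    return (cpumask_model)1 << self;
}

/* An explicit mask can only reach CPUs that are online. */
cpumask_model ipi_targets_mask(cpumask_model online, cpumask_model mask)
{
    return online & mask;
}
```

On a four-CPU system (`online == 0xF`) with CPU 2 as the sender, the "all but self" set is `0xB`, which is the target set the text describes for CALL_FUNCTION_VECTOR and INVALIDATE_TLB_VECTOR.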

Softirqs and tasklets
We mentioned earlier, in the section on interrupt handling, that some of the tasks executed by the kernel are not critical: they can be deferred for some time if necessary.
Linux 2.6 answers this challenge with two kinds of non-urgent, interruptible kernel functions: the so-called deferrable functions (softirqs and tasklets) and the functions executed by means of work queues.
Softirqs and tasklets are strictly correlated, because tasklets are implemented on top of softirqs.
As a matter of fact, the term "softirq", which appears in the kernel source code, often denotes both kinds of deferrable functions.
Softirqs are statically allocated (that is, defined at compile time), while tasklets can also be allocated and initialized at runtime (for instance, when loading a kernel module). Softirqs can run concurrently on several CPUs, even if they are of the same type.
Softirqs are therefore reentrant functions and must explicitly protect their data structures with spin locks.
Tasklets do not have to worry about this, because the kernel controls their execution more strictly. Tasklets of the same type are always serialized: the same type of tasklet cannot be executed by two CPUs at the same time. Tasklets of different types, however, can be executed concurrently on several CPUs.
Generally speaking, four kinds of operations can be performed on deferrable functions:
Initialization: Defines a new deferrable function; this is usually done when the kernel initializes itself or when a module is loaded.
Activation: Marks a deferrable function as "pending", to be run the next time the kernel schedules a round of executions of deferrable functions. Activation can be done at any time (even while handling interrupts).
Masking: Selectively disables a deferrable function so that the kernel does not execute it even if it is activated.
Execution: Executes a pending deferrable function together with all other pending deferrable functions of the same type; execution is performed at well-specified times.
Activation and execution are bound together: a deferrable function that has been activated by a given CPU must be executed on the same CPU.

Softirqs
The main data structure used to represent softirqs is the softirq_vec array, which includes 32 elements of type softirq_action. The priority of a softirq is the index of the corresponding softirq_action element inside the array.

Handling softirqs
The open_softirq() function takes care of softirq initialization. It uses three parameters: the softirq index, a pointer to the softirq function to be executed, and a pointer to a data structure that may be required by the softirq function.
void open_softirq(int nr, void (*action)(struct softirq_action *), void *data)
open_softirq() limits itself to initializing the proper entry of the softirq_vec array.
Softirqs are activated by means of the raise_softirq() function, which receives the softirq index nr as its parameter and performs the following actions:
1. Executes the local_irq_save macro to save the state of the IF flag of the eflags register and to disable interrupts on the local CPU.
2. Marks the softirq as pending by setting the bit corresponding to the index nr in the softirq bit mask of the local CPU.
3. If in_interrupt() yields the value 1, jumps to step 5. This situation indicates either that raise_softirq() has been invoked in interrupt context, or that softirqs are currently disabled.
4. Otherwise, invokes wakeup_softirqd() to wake up, if necessary, the ksoftirqd kernel thread of the local CPU.
5. Executes the local_irq_restore macro to restore the state of the IF flag saved in step 1.
do_softirq() function:
The kernel checks for pending softirqs at a few well-defined points in its code. If pending softirqs are detected at one such checkpoint (local_softirq_pending() is not zero), the kernel invokes do_softirq() to take care of them.
__do_softirq() function:
The __do_softirq() function reads the softirq bit mask of the local CPU and executes the deferrable function corresponding to every set bit.
While a softirq function is executing, new pending softirqs might pop up. To ensure a low latency for deferrable functions, __do_softirq() keeps running until all pending softirqs have been executed. To avoid delaying User Mode processes for too long, however, it performs only a fixed number of iterations and then returns; any softirqs still pending are handled later by the ksoftirqd kernel thread.
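The interplay between activation and execution can be sketched in user space as follows. This is a simplified single-CPU model with hypothetical names (everything carrying a `_model` suffix, plus the two example actions). It leaves out interrupt disabling and the in_interrupt() check, and the bound of 10 rescans per call is an assumption of the sketch rather than a value taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SOFTIRQS_MODEL 32
#define MAX_RESTART_MODEL 10      /* assumption: bound on rescans per call */

struct softirq_action_model {
    void (*action)(void *data);
    void *data;
};

struct softirq_action_model softirq_vec_model[NR_SOFTIRQS_MODEL];
uint32_t pending_model;           /* per-CPU in the kernel; one CPU modelled here */

/* Initialization: fill in one entry of the array. */
void open_softirq_model(unsigned nr, void (*action)(void *), void *data)
{
    assert(nr < NR_SOFTIRQS_MODEL);
    softirq_vec_model[nr].action = action;
    softirq_vec_model[nr].data = data;
}

/* Activation only sets a bit; the function itself runs later. */
void raise_softirq_model(unsigned nr)
{
    assert(nr < NR_SOFTIRQS_MODEL);
    pending_model |= (uint32_t)1 << nr;
}

/* Execution: run every pending function, lowest index first, and rescan
 * because a function may raise new softirqs. Returns the bits still
 * pending when the rescan budget ran out. */
uint32_t do_softirq_model(void)
{
    for (int pass = 0; pass < MAX_RESTART_MODEL && pending_model; pass++) {
        uint32_t snapshot = pending_model;
        pending_model = 0;
        for (unsigned nr = 0; nr < NR_SOFTIRQS_MODEL; nr++)
            if ((snapshot & ((uint32_t)1 << nr)) && softirq_vec_model[nr].action)
                softirq_vec_model[nr].action(softirq_vec_model[nr].data);
    }
    return pending_model;
}

/* Example actions: data points to an int holding the softirq's index. */
int run_log[64];
int run_count;

void record_action(void *data)    /* appends the index to run_log */
{
    if (run_count < 64)
        run_log[run_count++] = *(int *)data;
}

void greedy_action(void *data)    /* re-raises itself on every run */
{
    record_action(data);
    raise_softirq_model((unsigned)*(int *)data);
}
```

A softirq that keeps re-raising itself, like `greedy_action`, runs once per pass and is still pending when the budget is exhausted. That leftover work is what a per-CPU helper thread would pick up.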

The ksoftirqd kernel threads
In recent kernel versions, each CPU has its own ksoftirqd/n kernel thread, where n is the logical number of the CPU.
Each ksoftirqd/n kernel thread runs the ksoftirqd() function.

Tasklets
Tasklets are the preferred way to implement deferrable functions in I/O drivers. They are built on top of two softirqs named HI_SOFTIRQ and TASKLET_SOFTIRQ. Several tasklets may be associated with the same softirq, each tasklet carrying its own function. There is no real difference between the two softirqs, except that do_softirq() executes HI_SOFTIRQ's tasklets before TASKLET_SOFTIRQ's tasklets.
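A minimal user-space sketch of this arrangement is shown below: two lists, one per softirq, with the high-priority list drained first, and a flag that keeps a tasklet from being queued twice. All names (`tasklet_model`, `hi_list`, `run_all_tasklets()`, and so on) are hypothetical, and the sketch does not model the cross-CPU serialization of tasklets of the same type.

```c
#include <assert.h>
#include <stddef.h>

struct tasklet_model {
    struct tasklet_model *next;
    int scheduled;                       /* 1 while queued and not yet run */
    void (*func)(unsigned long data);
    unsigned long data;
};

struct tasklet_model *hi_list, *normal_list;   /* per-CPU lists in a real kernel */

/* Queue the tasklet unless it is already pending; returns 1 if queued. */
int tasklet_schedule_model(struct tasklet_model *t, int high_priority)
{
    struct tasklet_model **head = high_priority ? &hi_list : &normal_list;
    assert(t != NULL && t->func != NULL);
    if (t->scheduled)
        return 0;
    t->scheduled = 1;
    t->next = *head;
    *head = t;
    return 1;
}

/* Detach one list and run each tasklet on it once. */
int run_tasklet_list(struct tasklet_model **head)
{
    struct tasklet_model *t = *head;
    int ran = 0;
    *head = NULL;
    while (t != NULL) {
        struct tasklet_model *next = t->next;
        t->scheduled = 0;                /* may be scheduled again from now on */
        t->func(t->data);
        ran++;
        t = next;
    }
    return ran;
}

/* High-priority tasklets run before the normal ones. */
int run_all_tasklets(void)
{
    int ran = run_tasklet_list(&hi_list);
    return ran + run_tasklet_list(&normal_list);
}

/* Example tasklet function: appends its argument to a small trace. */
unsigned long trace[8];
int trace_len;

void trace_func(unsigned long data)
{
    if (trace_len < 8)
        trace[trace_len++] = data;
}
```

Scheduling the same tasklet twice before it runs has no extra effect, which is why a driver can call the scheduling function from every interrupt without tracking whether earlier work has been consumed.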

Work queues
Work queues were introduced in Linux 2.6 to replace the older "task queues". They allow kernel functions to be activated (much like deferrable functions) and later executed by special kernel threads called worker threads.
Despite their similarities, deferrable functions and work queues are quite different. The main difference is that deferrable functions run in interrupt context, while functions in work queues run in process context. Running in process context is the only way to execute functions that can block.
Work queue data structures
The main data structure associated with a work queue is a descriptor called workqueue_struct, which contains, among other things, an array of NR_CPUS elements, where NR_CPUS is the maximum number of CPUs in the system. Each element is a descriptor of type cpu_workqueue_struct.
struct cpu_workqueue_struct {
    spinlock_t lock;                /* spin lock protecting this structure */

    long remove_sequence;           /* least-recently added (next to run) */
    long insert_sequence;           /* next to add */

    struct list_head worklist;      /* head of the list of pending functions */
    wait_queue_head_t more_work;    /* where the worker thread sleeps waiting for work */
    wait_queue_head_t work_done;    /* where processes sleep waiting for the queue to be flushed */

    struct workqueue_struct *wq;    /* the workqueue_struct that contains this descriptor */
    task_t *thread;                 /* process descriptor of the worker thread */

    int run_depth;                  /* current run_workqueue() recursion depth */
} ____cacheline_aligned;
Work queue functions
The create_workqueue("foo") function receives a string of characters as its parameter and returns the address of the workqueue_struct descriptor of the newly created work queue. The function also creates n worker threads (where n is the number of CPUs effectively present in the system) and names them after the string passed to the function: foo/0, foo/1, and so on.
queue_work() inserts a function into a work queue. It receives two pointers: wq points to the workqueue_struct descriptor, and work points to the work_struct descriptor.
queue_work() essentially performs the following steps:
1. Checks whether the function to be inserted is already present in the work queue (the work->pending field is equal to 1); if so, terminates.
2. Adds the work_struct descriptor to the work queue list, and sets work->pending to 1.
3. If a worker thread is sleeping in the more_work wait queue of the local CPU's cpu_workqueue_struct descriptor, wakes it up.
The queue_delayed_work() function is nearly identical to queue_work(), except that it receives an additional parameter, a time delay expressed in system ticks, which is used to ensure a minimum delay before the pending function is executed.
Every worker thread continuously executes a loop inside the worker_thread() function; most of the time the thread is asleep, waiting for some work to be queued. Once awakened, the worker thread invokes the run_workqueue() function, which removes every work_struct descriptor from the thread's work queue list and executes the corresponding pending function.
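The queueing steps and the worker's drain loop can be illustrated with a small user-space model. It is only a sketch: `work_model`, `workqueue_model`, and the two functions are hypothetical and single-threaded, and a counter stands in for waking the worker thread. It therefore shows the bookkeeping around the pending flag, not the ability of work functions to block.

```c
#include <assert.h>
#include <stddef.h>

struct work_model {
    int pending;                      /* 1 while the work sits in a queue */
    void (*func)(void *data);
    void *data;
    struct work_model *next;
};

struct workqueue_model {              /* one CPU's list, kept in FIFO order */
    struct work_model *head, *tail;
    int worker_wakeups;               /* stands in for waking the worker thread */
};

/* Steps 1 to 3: skip if already queued, append, wake the worker.
 * Returns 1 if the work was queued, 0 if it was already pending. */
int queue_work_model(struct workqueue_model *wq, struct work_model *work)
{
    assert(wq != NULL && work != NULL && work->func != NULL);
    if (work->pending)
        return 0;
    work->pending = 1;
    work->next = NULL;
    if (wq->tail)
        wq->tail->next = work;
    else
        wq->head = work;
    wq->tail = work;
    wq->worker_wakeups++;
    return 1;
}

/* What the worker does once awake: drain the list, oldest entry first. */
int run_workqueue_model(struct workqueue_model *wq)
{
    int ran = 0;
    while (wq->head) {
        struct work_model *work = wq->head;
        wq->head = work->next;
        if (!wq->head)
            wq->tail = NULL;
        work->pending = 0;            /* cleared first so func may requeue it */
        work->func(work->data);
        ran++;
    }
    return ran;
}

/* Example work function: adds 1 to the counter it is given. */
void bump(void *data)
{
    ++*(int *)data;
}
```

Clearing the pending flag before calling the function is a deliberate choice in this sketch: it lets a work function put itself back on the queue for another round.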

The predefined work queue
In most cases, creating a whole set of worker threads just to run a function is overkill. The kernel therefore offers a predefined work queue called events, which every kernel developer can use freely.
The predefined work queue is nothing more than a standard work queue that may include functions of different kernel layers and I/O drivers; its workqueue_struct descriptor is stored in the keventd_wq array.
Helper functions for the predefined work queue
Predefined work queue function      Equivalent standard work queue function
schedule_work(w)                    queue_work(keventd_wq,w)
schedule_delayed_work(w,d)          queue_delayed_work(keventd_wq,w,d) (on any CPU)
schedule_delayed_work_on(cpu,w,d)   queue_delayed_work(keventd_wq,w,d) (on a given CPU)
flush_scheduled_work()              flush_workqueue(keventd_wq)

Returning from interrupts and exceptions
The main objective of the termination phase of interrupt and exception handlers is, clearly, to resume execution of some program. Several issues must be considered before doing so:
Number of kernel control paths being concurrently executed: If there is just one, the CPU must switch back to User Mode.
Pending process switch requests: If there is any request, the kernel must perform process scheduling; otherwise, control is returned to the current process.
Pending signals: If a signal has been sent to the current process, it must be handled.
Single-step mode: If a debugger is tracing the execution of the current process, single-step mode must be restored before switching back to User Mode.
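These checks can be read as a small decision procedure. The sketch below is a loose illustration rather than the kernel's actual return path: the `return_state` fields and the `next_return_action()` function are hypothetical, and the order in which the last three conditions are tested is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

enum return_action {
    RESUME_KERNEL,        /* another kernel control path was interrupted */
    DO_SCHEDULE,          /* a process switch was requested */
    DELIVER_SIGNAL,       /* a signal is pending for the current process */
    RESTORE_SINGLE_STEP,  /* a debugger is tracing the process */
    RESUME_USER           /* nothing left to do: back to User Mode */
};

struct return_state {
    int nested_paths;     /* kernel control paths still active, this one included */
    int need_resched;     /* nonzero if a process switch request is pending */
    int signal_pending;   /* nonzero if a signal awaits the current process */
    int single_step;      /* nonzero if single-step mode must be restored */
};

/* Returns the next thing to do. After DO_SCHEDULE, DELIVER_SIGNAL, or
 * RESTORE_SINGLE_STEP the caller clears the matching flag and asks again. */
enum return_action next_return_action(const struct return_state *s)
{
    assert(s != NULL);
    if (s->nested_paths > 1)
        return RESUME_KERNEL;
    if (s->need_resched)
        return DO_SCHEDULE;
    if (s->signal_pending)
        return DELIVER_SIGNAL;
    if (s->single_step)
        return RESTORE_SINGLE_STEP;
    return RESUME_USER;
}
```

The first test matters most: pending work such as signals is only dealt with when the handler is about to return to User Mode, never while it is resuming another kernel control path.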
