Interrupt handling for network devices in the kernel


First, let's look at the ways a device can notify the driver when a frame arrives from the network.

1. Polling
The kernel constantly reads the device's registers to check whether new network frames have arrived.

2. Interrupts
When data arrives, the device raises a hardware interrupt and the kernel calls the corresponding handler. Under high network load this is very inefficient (interrupts fire too frequently), and it can also cause receive-livelock. The reason is that the kernel processes input frames in two stages: first the frame is copied to the input queue, then the relevant kernel code runs to process it. The first stage has higher priority than the second, so under load the input queue overflows because it is full, while the interrupts driving the copies monopolize the CPU and leave no time to drain the queue.

3. Processing multiple frames per interrupt
With the old approach described above, every frame triggers an interrupt, and interrupts must be disabled for each one. What the kernel's NAPI interface does instead is disable the device's interrupts after the first hardware interrupt and then switch to polling, which greatly mitigates the drawbacks of frequent interrupts under high load.

4. Timer-driven interrupts

This method enhances the previous one but requires hardware support: the driver programs the device to raise an interrupt at a fixed interval, and the handler then checks whether frames have arrived and processes all accumulated frames (still held in the device's memory) in one batch. The timer must be implemented in hardware, so the device has to provide one.

For related interrupt registration functions, see my previous blog:

http://simohayha.javaeye.com/blogs/361971

Next, let's take a look at vortex_interrupt, the interrupt handler in 3c59x.c. This function is registered as the interrupt handler in the probe function through request_irq.

Note that a network device may raise an interrupt for several different reasons (reflected in the status variable below).

static irqreturn_t
vortex_interrupt(int irq, void *dev_id)
{
	.....................................
	/* Maximum number of loop iterations; defaults to 32. */
	int work_done = max_interrupt_work;
	int handled = 0;

	ioaddr = vp->ioaddr;
	spin_lock(&vp->lock);
	/* Read the register to obtain the current interrupt status. */
	status = ioread16(ioaddr + EL3_STATUS);

	if (vortex_debug > 6)
		printk("vortex_interrupt(). status=0x%4x\n", status);
	..................................................

	do {
		if (vortex_debug > 5)
			printk(KERN_DEBUG "%s: In interrupt loop, status %4.4x.\n",
			       dev->name, status);
		/* RxComplete means new frames have been received and can be
		 * retrieved by the driver, so vortex_rx is called to fetch the
		 * data. Even though interrupts are disabled while we are in
		 * this handler, the loop keeps re-reading the interrupt status
		 * register, so newly latched events are still picked up. */
		if (status & RxComplete)
			vortex_rx(dev);

		if (status & TxAvailable) {
			if (vortex_debug > 5)
				printk(KERN_DEBUG "	TX room bit was handled.\n");
			/* There's room in the FIFO for a full-sized packet. */
			iowrite16(AckIntr | TxAvailable, ioaddr + EL3_CMD);
			netif_wake_queue(dev);
		}

		......................................................

		/* Loop budget exhausted: defer the remaining work. */
		if (--work_done < 0) {
			printk(KERN_WARNING "%s: Too much work in interrupt, status "
			       "%4.4x.\n", dev->name, status);
			/* Disable all pending interrupts. */
			do {
				vp->deferred |= status;
				iowrite16(SetStatusEnb | (~vp->deferred & vp->status_enable),
					  ioaddr + EL3_CMD);
				iowrite16(AckIntr | (vp->deferred & 0x7ff), ioaddr + EL3_CMD);
			} while ((status = ioread16(ioaddr + EL3_CMD)) & IntLatch);
			/* The timer will reenable interrupts. */
			mod_timer(&vp->timer, jiffies + 1*HZ);
			break;
		}
		/* Acknowledge the IRQ. */
		iowrite16(AckIntr | IntReq | IntLatch, ioaddr + EL3_CMD);
		/* Loop condition: keep going as long as an interrupt is latched
		 * or a new network frame is ready to be received. */
	} while ((status = ioread16(ioaddr + EL3_STATUS)) & (IntLatch | RxComplete));

	if (vortex_debug > 4)
		printk(KERN_DEBUG "%s: exiting interrupt, status %4.4x.\n",
		       dev->name, status);
handler_exit:
	spin_unlock(&vp->lock);
	return IRQ_RETVAL(handled);
}

Next, let's look at top halves and bottom halves.

In the Linux kernel, interrupt processing is split into a top half and a bottom half. The top half runs in interrupt context, while the bottom half does not: after the top half finishes, interrupts are re-enabled, and the bottom half then runs with interrupts enabled. The reasons for this split are the following (paraphrased from Linux Kernel Development):

- Interrupt handlers run asynchronously and may interrupt the execution of other important code, so they should finish as quickly as possible.
- While an interrupt handler runs, interrupts at the same level are blocked at best; at worst, all other interrupts are blocked. So the sooner the handler finishes, the better.
- Interrupt handlers often need to operate on hardware, so they usually have tight timing constraints.
- Interrupt handlers do not run in process context, so they cannot block.

In short, the top half grabs the data and the bottom half processes it, all so that the interrupt can end as early as possible.

In the kernel there are three mechanisms for implementing bottom halves: softirqs, tasklets, and work queues. Since network devices mainly use the first two (mostly softirqs), work queues are not covered here.

The main difference between tasklets and softirqs is that a tasklet of a given type runs in only one instance at any time, even on SMP. A softirq is only serialized per CPU, so the same softirq can run simultaneously on several CPUs; you must therefore pay attention to locking when implementing a softirq handler.

Tasklets can be created dynamically, while softirqs are registered statically.

Sometimes we need to disable software or hardware interrupts; the kernel provides a set of functions and macros for this, covering both kinds.

Because the kernel is now preemptible, developers must disable preemption in many places (for example, while handling hardware or software interrupts).

Some network code does not call the preemption APIs directly; it reaches them indirectly through other functions, such as rcu_read_lock and spin_lock.

Each process has a preempt_count bitmap variable indicating whether the current process may be preempted. It can be read with preempt_count(), and the count can be raised and lowered with inc_preempt_count and dec_preempt_count. It is divided into three parts: a hardware interrupt count, a software interrupt count, and a preemption-disable count.

Next, let's look at bottom-half processing, starting with softirqs.

Softirqs come in the following types, in priority order from highest to lowest (that is, HI_SOFTIRQ has the highest priority):
enum
{
	HI_SOFTIRQ = 0,
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,
	NET_RX_SOFTIRQ,
	BLOCK_SOFTIRQ,
	TASKLET_SOFTIRQ,
	SCHED_SOFTIRQ,
#ifdef CONFIG_HIGH_RES_TIMERS
	HRTIMER_SOFTIRQ,
#endif
	RCU_SOFTIRQ,	/* Preferable RCU should always be the last softirq */
};

The networking types are NET_TX_SOFTIRQ and NET_RX_SOFTIRQ. Softirqs mainly run on return from a hardware interrupt. While a softirq is running on a CPU it will not be re-entered on that CPU, although the same softirq may run on several CPUs at the same time. Each CPU has a softnet_data structure that stores the state of its network softirq processing.

Softirqs are registered through open_softirq:

static struct softirq_action softirq_vec[32] __cacheline_aligned_in_smp;

void open_softirq(int nr, void (*action)(struct softirq_action *))
{
	softirq_vec[nr].action = action;
}

softirq_vec is the global vector that holds the registered softirq handlers. A softirq can be marked pending through __raise_softirq_irqoff, raise_softirq_irqoff, or raise_softirq.

To prevent softirqs from monopolizing the CPU, the kernel runs a dedicated softirq thread on each CPU, such as ksoftirqd/0.

Pending softirqs are checked for and executed at the following points:

1. On return from hardware interrupt code
2. In the ksoftirqd kernel thread (described later)
3. In code that explicitly checks for and runs pending softirqs (such as the networking code)

In network device code, a softirq is generally marked pending with raise_softirq, so that it runs at the next processing opportunity.

In every case, softirqs are ultimately handled through do_softirq. Let's look at its code:

asmlinkage void do_softirq(void)
{
	__u32 pending;
	unsigned long flags;

	/* If this CPU is already handling a hard or soft interrupt, return directly. */
	if (in_interrupt())
		return;

	local_irq_save(flags);

	/* Check whether this CPU has pending softirqs waiting to run. */
	pending = local_softirq_pending();

	/* If so, call the processing function. */
	if (pending)
		__do_softirq();

	local_irq_restore(flags);
}

asmlinkage void __do_softirq(void)
{
	struct softirq_action *h;
	__u32 pending;
	/* New softirqs may be raised while this function runs. To keep
	 * softirqs from monopolizing the CPU, only a fixed number of passes
	 * over the pending mask is made here. */
	int max_restart = MAX_SOFTIRQ_RESTART;
	int cpu;

	pending = local_softirq_pending();
	account_system_vtime(current);

	__local_bh_disable((unsigned long)__builtin_return_address(0));
	trace_softirq_enter();

	cpu = smp_processor_id();
restart:
	/* Reset the pending bitmask before enabling irqs */
	set_softirq_pending(0);

	/* Re-enable hardware interrupts. */
	local_irq_enable();

	/* Walk the global softirq vector. */
	h = softirq_vec;

	do {
		if (pending & 1) {
			/* Call the corresponding handler. */
			h->action(h);
			rcu_bh_qsctr_inc(cpu);
		}
		h++;
		pending >>= 1;
	} while (pending);

	local_irq_disable();

	/* Check for softirqs raised in the meantime. */
	pending = local_softirq_pending();

	/* If any are pending and the pass budget is not yet exhausted, go again. */
	if (pending && --max_restart)
		goto restart;

	/* Otherwise, if softirqs are still pending, wake up the per-CPU
	 * ksoftirqd thread to handle them. */
	if (pending)
		wakeup_softirqd();

	trace_softirq_exit();

	account_system_vtime(current);
	_local_bh_enable();
}

Now let's look at the source code of ksoftirqd:
static int ksoftirqd(void *__bind_cpu)
{
	/* Set the kernel thread's run state. */
	set_current_state(TASK_INTERRUPTIBLE);

	/* Main loop: handle softirqs until asked to stop. */
	while (!kthread_should_stop()) {
		preempt_disable();
		/* No pending softirqs: yield the CPU (re-enabling preemption
		 * first), then disable preemption again once rescheduled. */
		if (!local_softirq_pending()) {
			preempt_enable_no_resched();
			schedule();
			preempt_disable();
		}

		__set_current_state(TASK_RUNNING);

		/* Process the pending softirqs. */
		while (local_softirq_pending()) {
			/* Preempt disable stops cpu going offline.
			   If already offline, we'll be on wrong CPU:
			   don't process */
			/* If this CPU has gone offline, stop processing and
			 * jump to wait_to_die (described below). */
			if (cpu_is_offline((long)__bind_cpu))
				goto wait_to_die;
			/* Call do_softirq to handle the softirqs. */
			do_softirq();
			preempt_enable_no_resched();
			cond_resched();
			preempt_disable();
		}
		preempt_enable();
		set_current_state(TASK_INTERRUPTIBLE);
	}
	__set_current_state(TASK_RUNNING);
	return 0;

wait_to_die:
	/* Re-enable preemption. */
	preempt_enable();
	/* Wait for kthread_stop */
	set_current_state(TASK_INTERRUPTIBLE);
	/* Until the thread is told to stop, keep yielding the CPU. */
	while (!kthread_should_stop()) {
		schedule();
		set_current_state(TASK_INTERRUPTIBLE);
	}
	/* Mark the thread running and exit. */
	__set_current_state(TASK_RUNNING);
	return 0;
}

A tasklet is a function whose execution is deferred until a later time. Tasklets are implemented on top of softirqs: a tasklet runs as either the HI_SOFTIRQ or the TASKLET_SOFTIRQ softirq.

Let's look at the tasklet structure:

struct tasklet_struct
{
	/* Next tasklet in the list of pending tasklets on the same CPU. */
	struct tasklet_struct *next;
	/* Current state of the tasklet (a bitmap). */
	unsigned long state;
	/* Reference count. */
	atomic_t count;
	/* The function to execute. */
	void (*func)(unsigned long);
	/* Data passed to func. */
	unsigned long data;
};

Each CPU has two tasklet lists, one for HI_SOFTIRQ and one for TASKLET_SOFTIRQ:

static DEFINE_PER_CPU(struct tasklet_head, tasklet_vec);
static DEFINE_PER_CPU(struct tasklet_head, tasklet_hi_vec);

Next, let's look at how the two tasklet softirq types are initialized:
void __init softirq_init(void)
{
	int cpu;

	/* Initialize both tasklet lists on every CPU. */
	for_each_possible_cpu(cpu) {
		per_cpu(tasklet_vec, cpu).tail =
			&per_cpu(tasklet_vec, cpu).head;
		per_cpu(tasklet_hi_vec, cpu).tail =
			&per_cpu(tasklet_hi_vec, cpu).head;
	}

	/* Register the two tasklet softirqs. */
	open_softirq(TASKLET_SOFTIRQ, tasklet_action);
	open_softirq(HI_SOFTIRQ, tasklet_hi_action);
}

It should be mentioned that the network device softirqs themselves are registered in net_dev_init.

HI_SOFTIRQ, the highest-priority softirq, is used only by sound card drivers. TASKLET_SOFTIRQ, which has a lower priority than the network softirqs, is what NIC drivers use more often.

Finally, a word about cpu_chain: it is a notifier chain that informs subscribed subsystems of CPU events. For example, once a CPU finishes initializing, its softirq thread can be started; the corresponding event is CPU_ONLINE. For the other events, see notifier.h.
