A supplemental description of the interrupt stack and kernel stack in the Linux kernel (repost)

Source: Internet
Author: User

Transferred from: http://blog.chinaunix.net/uid-12461657-id-3487463.html

Original post: A supplemental description of the interrupt stack and kernel stack in the Linux kernel (author: MagicBoy2010)

The interrupt stack and the kernel stack are more of a core-kernel topic, so Chapter 5, "Interrupt Processing", of "Deep Linux Device Driver Kernel Mechanism" barely touches on them. Only Section 5.4 has a short passage on the potential overflow of the interrupt stack when interrupts nest.

This post supplements that chapter with a discussion of the kernel stack and the interrupt stack, based on 32-bit x86. The Linux kernel's stack support on 64-bit systems follows the same principles, although some features are 64-bit specific, such as the IST (Interrupt Stack Table). If possible, a future post in the processor section will discuss those separately.

1.  Whether the kernel stack is shared with the interrupt stack on x86

We know that every user process in a Linux system is represented by a task_struct object, and that at the processor level a TSS (Task State Segment) holds the ring 0 stack pointer (ESP0) for the running process. When an interrupt occurs, the user process is either in user mode (ring 3) or in kernel mode (ring 0). If it is in user mode, a stack switch takes place, that is, the processor switches to the kernel-mode stack. If it is already in kernel mode, no stack switch is needed. However, the x86 processor has only one ESP for ring 0, which means that only one stack can be used once an interrupt has occurred, and that stack is the kernel stack. The processor's hardware logic pushes the address of the next instruction of the interrupted process (CS, EIP) and EFLAGS onto the stack. If a switch from the user-mode stack to the kernel stack took place, the processor also pushes the user-mode (SS, ESP) onto the stack, and it is the kernel stack that receives all of these values. This behavior belongs to the hardware logic of the processor, not to the system software.
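The hardware behavior described above can be modelled in user space. The following is a minimal sketch, not kernel code: struct cpu_state, push32 and hw_interrupt_entry are hypothetical names introduced for illustration, the stack is an ordinary array of 32-bit words, and the error code that some exceptions push is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register state of the interrupted context (illustrative model only). */
struct cpu_state {
    uint32_t ss, esp;   /* stack segment and stack pointer */
    uint32_t eflags;
    uint32_t cs, eip;   /* address of the next instruction */
    int      cpl;       /* current privilege level: 0 or 3 */
};

/* Push one 32-bit word on a downward-growing stack. */
static uint32_t *push32(uint32_t *sp, uint32_t value)
{
    *--sp = value;
    return sp;
}

/*
 * Model of the frame the processor builds on interrupt entry.
 * esp0_top: top of the ring 0 stack taken from the TSS; used only when
 *           the interrupt arrives in ring 3.
 * ring0_sp: the ring 0 stack pointer already in use; used only when the
 *           interrupt arrives in ring 0.
 * Returns the new stack pointer; *words receives the number of words pushed.
 */
static uint32_t *hw_interrupt_entry(const struct cpu_state *st,
                                    uint32_t *esp0_top, uint32_t *ring0_sp,
                                    size_t *words)
{
    uint32_t *start;
    uint32_t *sp;

    assert(st->cpl == 0 || st->cpl == 3);

    if (st->cpl == 3) {
        /* Privilege change: switch stacks, then save the user SS:ESP. */
        sp = start = esp0_top;
        sp = push32(sp, st->ss);
        sp = push32(sp, st->esp);
    } else {
        /* Already in ring 0: keep using the current stack. */
        sp = start = ring0_sp;
    }
    sp = push32(sp, st->eflags);
    sp = push32(sp, st->cs);
    sp = push32(sp, st->eip);

    *words = (size_t)(start - sp);
    return sp;
}
```

Calling hw_interrupt_entry with cpl set to 3 pushes five words (SS, ESP, EFLAGS, CS, EIP) starting at esp0_top, while cpl set to 0 pushes only three words below ring0_sp. In both cases the saved EIP ends up at the lowest address, which is where the interrupt return expects to find it.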

Whether the kernel stack is shared with the interrupt stack on x86 is really a kernel design decision. In other words, the interrupt stack can share the kernel stack, or a separate interrupt stack can be allocated. The 2.4 kernel appears to have shared the kernel stack with the interrupt stack. The benefit of that design is relatively simple code, since, as mentioned earlier, ESP0 is used directly. The drawback is that nested interrupts may corrupt data on the kernel stack: the stack is shared after all, so stack space can run short. During 2.5 kernel development, a developer from IBM therefore submitted a patch (see http://lwn.net/Articles/21846/) that attempted to switch from the kernel stack to a separate interrupt stack when an interrupt occurs. I do not know whether the kernel community later adopted it. In any case, I do not see that patch's code in the 3.2 kernel source, although it may of course have evolved into what the current code looks like.

The Linux kernel now separates the kernel stack from the interrupt stack. Below we look at the source to see how this separation is done.

The core code that separates the kernel stack from the interrupt stack lies on the call path do_IRQ() -> handle_irq() -> execute_on_irq_stack().
As its name suggests, the last function executes the interrupt handling routine on the interrupt stack, which means that the interrupt handler runs in a context separate from that of the interrupted process. The execute_on_irq_stack function is implemented as:

<arch/x86/kernel/irq_32.c>

    static inline int
    execute_on_irq_stack(int overflow, struct irq_desc *desc, int irq)
    {
        union irq_ctx *curctx, *irqctx;
        u32 *isp, arg1, arg2;

        curctx = (union irq_ctx *) current_thread_info();
        irqctx = __this_cpu_read(hardirq_ctx);

        /*
         * this is where we switch to the IRQ stack. However, if we are
         * already using the IRQ stack (because we interrupted a hardirq
         * handler) we can't do that and just have to keep using the
         * current stack (which is the irq stack already after all)
         */
        if (unlikely(curctx == irqctx))
            return 0;

        /* build the stack frame on the IRQ stack */
        isp = (u32 *) ((char *)irqctx + sizeof(*irqctx));
        irqctx->tinfo.task = curctx->tinfo.task;
        irqctx->tinfo.previous_esp = current_stack_pointer;

        /*
         * Copy the softirq bits in preempt_count so that the
         * softirq checks work in the hardirq context.
         */
        irqctx->tinfo.preempt_count =
            (irqctx->tinfo.preempt_count & ~SOFTIRQ_MASK) |
            (curctx->tinfo.preempt_count & SOFTIRQ_MASK);

        if (unlikely(overflow))
            call_on_stack(print_stack_overflow, isp);

        asm volatile("xchgl %%ebx,%%esp \n"
                     "call  *%%edi      \n"
                     "movl  %%ebx,%%esp \n"
                     : "=a" (arg1), "=d" (arg2), "=b" (isp)
                     : "0" (irq), "1" (desc), "2" (isp),
                       "D" (desc->handle_irq)
                     : "memory", "cc", "ecx");

        return 1;
    }

In this code, curctx = (union irq_ctx *) current_thread_info() obtains the context of the currently interrupted process, and irqctx = __this_cpu_read(hardirq_ctx) obtains the hardirq context, which is in fact the start address of the separate interrupt stack. The interrupt stack has exactly the same size and layout as the kernel stack. isp then points to the top of the interrupt stack, and the actual stack switch takes place in the assembly code: the kernel stack pointer (ESP) of the current process is saved in EBX, and the interrupt stack's isp is loaded into ESP, so the code that follows uses the interrupt stack. The call instruction invokes desc->handle_irq(), where interrupt processing happens and where the interrupt handler registered by the device driver is called. When the interrupt handling routine returns, ESP is pointed back at the kernel stack of the interrupted process. (Note that the kernel stack still holds CS, EIP and the other registers that the processor hardware pushed when the interrupt occurred, so returning from the interrupt on the kernel stack works correctly.)
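The bookkeeping that precedes the stack switch can be tried out in user space. The sketch below is a simplified model under stated assumptions: THREAD_SIZE is taken to be 8 KB, the SOFTIRQ_MASK value is an assumed bit layout for preempt_count, struct thread_info is reduced to the three fields used here, and ctx_of and prepare_irq_stack are hypothetical names. The real switch performed by the xchgl/call/movl sequence is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE   8192u         /* assumed: two 4 KB pages */
#define SOFTIRQ_MASK  0x0000ff00u   /* assumed bit layout of preempt_count */

struct thread_info {                /* reduced to the fields used here */
    void     *task;
    uintptr_t previous_esp;
    uint32_t  preempt_count;
};

/* thread_info sits at the lowest address of the stack area. */
union irq_ctx {
    struct thread_info tinfo;
    uint32_t stack[THREAD_SIZE / sizeof(uint32_t)];
};

/*
 * Model of current_thread_info(): round the stack pointer down to the
 * start of its THREAD_SIZE-aligned stack area.
 */
static union irq_ctx *ctx_of(uintptr_t sp)
{
    return (union irq_ctx *)(sp & ~(uintptr_t)(THREAD_SIZE - 1));
}

/*
 * Returns 0 if sp already lies in the IRQ stack (a nested hardirq).
 * Otherwise prepares irqctx, stores the initial IRQ stack pointer in
 * *isp_out and returns 1.
 */
static int prepare_irq_stack(uintptr_t sp, union irq_ctx *irqctx,
                             uint32_t **isp_out)
{
    union irq_ctx *curctx = ctx_of(sp);

    /* The mask trick only works for THREAD_SIZE-aligned stack areas. */
    assert(((uintptr_t)irqctx & (THREAD_SIZE - 1)) == 0);

    if (curctx == irqctx)
        return 0;

    /* Empty stack: one past the highest word of the IRQ stack area. */
    *isp_out = (uint32_t *)((char *)irqctx + sizeof(*irqctx));

    /* Keep the task pointer and the way back to the interrupted stack. */
    irqctx->tinfo.task = curctx->tinfo.task;
    irqctx->tinfo.previous_esp = sp;

    /* Carry over only the softirq bits of the interrupted context. */
    irqctx->tinfo.preempt_count =
        (irqctx->tinfo.preempt_count & ~SOFTIRQ_MASK) |
        (curctx->tinfo.preempt_count & SOFTIRQ_MASK);
    return 1;
}
```

Given two THREAD_SIZE-aligned union irq_ctx objects, passing a stack pointer inside the first one together with the address of the second returns 1 and fills in the second one's thread_info. Passing a stack pointer that already lies inside the second one returns 0, which corresponds to the nested hardirq case where the handler simply keeps running on the current stack.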

2. Allocation of the interrupt stacks

The memory for the separate interrupt stack is allocated in the irq_ctx_init function in arch/x86/kernel/irq_32.c (on a multiprocessor system, each processor has its own interrupt stack). The function uses __alloc_pages to allocate 2 physical pages (2^THREAD_ORDER) in low memory, that is, 8 KB of space. Interestingly, this function also allocates a separate stack of the same size for softirq, so that softirq does not execute on the hardirq interrupt stack but in its own context.

To sum up, each process in the system has its own kernel stack, and each CPU in the system has two separate interrupt stacks for interrupt processing, namely the hardirq stack and the softirq stack. A sketch of the layout is as follows:
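The size arithmetic and the per-CPU pairing of stacks can be illustrated with a user-space stand-in. This is only a sketch under assumptions: irq_ctx_init_model is a hypothetical name, NR_CPUS is set to an arbitrary 4, THREAD_ORDER is assumed to be 1, and C11 aligned_alloc takes the place of the kernel's page allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE     4096u                        /* x86 32-bit page size */
#define THREAD_ORDER  1u                           /* assumed: 2^1 = 2 pages */
#define THREAD_SIZE   (PAGE_SIZE << THREAD_ORDER)  /* 8192 bytes */
#define NR_CPUS       4u                           /* hypothetical CPU count */

/* One pair of interrupt stacks per CPU (model of the per-CPU pointers). */
static void *hardirq_ctx[NR_CPUS];
static void *softirq_ctx[NR_CPUS];

/*
 * Allocate one hardirq stack and one softirq stack for the given CPU,
 * each THREAD_SIZE bytes long and THREAD_SIZE aligned.
 * Returns 0 on success and -1 on allocation failure. Calling it again
 * for a CPU that is already set up does nothing.
 */
static int irq_ctx_init_model(unsigned int cpu)
{
    void *hard;
    void *soft;

    assert(cpu < NR_CPUS);

    if (hardirq_ctx[cpu])
        return 0;

    hard = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    soft = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    if (!hard || !soft) {
        free(hard);
        free(soft);
        return -1;
    }
    memset(hard, 0, THREAD_SIZE);
    memset(soft, 0, THREAD_SIZE);
    hardirq_ctx[cpu] = hard;
    softirq_ctx[cpu] = soft;
    return 0;
}
```

With these values THREAD_SIZE works out to 4096 << 1 = 8192 bytes, matching the 8 KB mentioned above. After irq_ctx_init_model has been called for each CPU, every CPU owns two distinct 8 KB areas, one for hardirq and one for softirq processing.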



Finally, the problem of calling a function that may block from a device driver's interrupt handling routine boils down to the possibility of scheduling in interrupt context. In practice this should never be done, because it causes many problems. In theory, though, the scheduler faces no technical barrier to finding the context of the interrupted process. This means that if a process switch occurred in an interrupt handler, the interrupted process could still be scheduled again, provided the scheduler were willing to do so.

(Original: http://www.embexperts.com/forum.php/forum.php?mod=viewthread&tid=499&extra=page%3D1, slightly modified)
