About the idle process
The idle process is the process with pid=0. It is the first process to exist once the kernel has finished initializing, and it runs whenever the system has nothing else to do. Its code is simple:
for (;;) pause();
Note that the idle process is a user-mode process. The kernel, however, runs in kernel mode all the way from boot through initialization, so how does it create idle and switch to user mode?
The most straightforward idea would be for the kernel to call user-space code directly and thereby drop from kernel mode to user mode, but this is not possible: the CPU's protection rules do not allow a call or jump to transfer control to less privileged code. So how is it done? That is the question this article answers.
Process-related structures
A process is a dynamic concept, so to manage processes we first have to describe them with static data structures. This structure is usually called task_struct. Among other things it holds the protected-mode state of the process, such as its LDT and TSS. For background on protected mode, please refer to the earlier chapters of the fully annotated linux0.11 kernel book. So to create idle, we must first prepare its task_struct. In linux0.11 it is initialized statically:
static union task_union init_task = {INIT_TASK,};
#define INIT_TASK \
/* state etc */	{ 0,15,15, \
/* signals */	0,{{},},0, \
/* ec,brk... */	0,0,0,0,0,0, \
/* pid etc.. */	0,-1,0,0,0, \
/* uid etc */	0,0,0,0,0,0, \
/* alarm */	0,0,0,0,0,0, \
/* math */	0, \
/* fs info */	-1,0022,NULL,NULL,NULL,0, \
/* filp */	{NULL,}, \
	{ \
		{0,0}, \
/* ldt */	{0x9f,0xc0fa00}, \
		{0x9f,0xc0f200}, \
	}, \
/*tss*/	{0,PAGE_SIZE+(long)&init_task,0x10,0,0,0,0,(long)&pg_dir,\
	 0,0,0,0,0,0,0,0, \
	 0,0,0x17,0x17,0x17,0x17,0x17,0x17, \
	 _LDT(0),0x80000000, \
		{} \
	}, \
}
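To make the two LDT entries in INIT_TASK easier to read, here is a minimal sketch that decodes an x86 segment descriptor given as a {low, high} pair of 32-bit words. The type desc_sketch and the helper names are hypothetical and are not part of linux0.11; only the bit layout follows the standard x86 descriptor format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: one 8-byte descriptor as two 32-bit words,
 * in the same {low, high} order as the LDT initializers above. */
struct desc_sketch {
    uint32_t low;
    uint32_t high;
};

static_assert(sizeof(struct desc_sketch) == 8, "a descriptor is 8 bytes");

/* Base address: bits 16..31 of low, bits 0..7 and 24..31 of high. */
static inline uint32_t desc_base(struct desc_sketch d)
{
    return (d.low >> 16) | ((d.high & 0xffu) << 16) | (d.high & 0xff000000u);
}

/* Raw 20-bit limit: bits 0..15 of low, bits 16..19 of high. */
static inline uint32_t desc_limit(struct desc_sketch d)
{
    return (d.low & 0xffffu) | (d.high & 0x000f0000u);
}

/* Segment size in bytes; with G=1 (bit 23 of high) the limit counts 4 KiB pages. */
static inline uint32_t desc_size_bytes(struct desc_sketch d)
{
    uint32_t units = desc_limit(d) + 1u;
    return (d.high & (1u << 23)) ? units * 4096u : units;
}

/* Descriptor privilege level: bits 13..14 of high. */
static inline unsigned desc_dpl(struct desc_sketch d)
{
    return (d.high >> 13) & 3u;
}

/* 1 for a code segment, 0 for a data segment (bit 11 of high). */
static inline int desc_is_code(struct desc_sketch d)
{
    return (int)((d.high >> 11) & 1u);
}
```

Applied to the values above, {0x9f,0xc0fa00} decodes as a 640 KiB code segment at base 0 with DPL=3, and {0x9f,0xc0f200} as a data segment of the same size and privilege level.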
The descriptors for the idle task's LDT and TSS are then installed in the global descriptor table (GDT):
set_tss_desc(gdt+FIRST_TSS_ENTRY,&(init_task.task.tss));
set_ldt_desc(gdt+FIRST_LDT_ENTRY,&(init_task.task.ldt));
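The relationship between a task number and its GDT slots can be sketched as follows. This assumes the layout I recall from linux0.11's include/linux/sched.h: FIRST_TSS_ENTRY is 4, FIRST_LDT_ENTRY is 5, each task occupies two consecutive GDT entries, and there are at most 64 tasks. The function names and the _SKETCH constants are hypothetical; the kernel itself uses the macros _TSS(n) and _LDT(n).

```c
#include <assert.h>

/* Assumed values, mirroring FIRST_TSS_ENTRY / FIRST_LDT_ENTRY in linux0.11. */
enum {
    FIRST_TSS_ENTRY_SKETCH = 4,
    FIRST_LDT_ENTRY_SKETCH = 5,
    NR_TASKS_SKETCH = 64
};

static_assert(FIRST_LDT_ENTRY_SKETCH == FIRST_TSS_ENTRY_SKETCH + 1,
              "a task's LDT descriptor directly follows its TSS descriptor");

/* Selector of task n's TSS descriptor: each task takes two 8-byte
 * GDT entries, so the byte offset advances by 16 (n << 4) per task. */
static inline unsigned long tss_selector(unsigned long n)
{
    assert(n < NR_TASKS_SKETCH);
    return (n << 4) + (FIRST_TSS_ENTRY_SKETCH << 3);
}

/* Selector of task n's LDT descriptor. */
static inline unsigned long ldt_selector(unsigned long n)
{
    assert(n < NR_TASKS_SKETCH);
    return (n << 4) + (FIRST_LDT_ENTRY_SKETCH << 3);
}

/* GDT index encoded in a selector (the selector without TI and RPL bits). */
static inline unsigned long selector_gdt_index(unsigned long sel)
{
    return sel >> 3;
}
```

Under these assumptions the idle task (task 0) gets TSS selector 0x20 and LDT selector 0x28, which is the value that _LDT(0) places in the ldt field of the TSS in INIT_TASK.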
Kernel mode ---> user mode
With the process-related data in place, the next step is the mode switch.
A direct call obviously cannot change the privilege level, but we know that interrupt handling can move between privilege levels. So the kernel uses a trick: it simulates a return from an interrupt.
Let's take a look at how the CPU handles an interrupt:
When an interrupt changes the privilege level, the CPU pushes the original SS, ESP, EFLAGS, CS, and EIP (the colored upper half of the stack frame), that is, the SS, ESP, EFLAGS, CS, and EIP of the interrupted program. Pushing these registers on entry and popping them again (with the iret instruction) is done automatically by the CPU; all other registers have to be saved and restored by the programmer. Note that CS and EIP are used here in protected mode, so CS is a segment selector that locates the code segment through the descriptor tables. For idle, the code and data segments live in the task's LDT, whose descriptor was placed in the GDT in the previous section.
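The frame that iret consumes when it returns to a less privileged level can be pictured as a C struct, lowest stack address first. This is only an illustrative sketch: the struct and function names are hypothetical, and the real frame lives on the kernel stack rather than in a C object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the five 32-bit slots iret pops,
 * starting at the top of the stack (lowest address). */
struct iret_frame_sketch {
    uint32_t eip;     /* popped first: where execution resumes */
    uint32_t cs;      /* code segment selector (low 16 bits used) */
    uint32_t eflags;
    uint32_t esp;     /* popped only when the privilege level changes */
    uint32_t ss;      /* stack segment selector (low 16 bits used) */
};

static_assert(sizeof(struct iret_frame_sketch) == 20,
              "five 32-bit slots, no padding");

/* Build a frame in the order the fields are later popped.
 * Pushing happens in the reverse order: ss, esp, eflags, cs, eip. */
static inline struct iret_frame_sketch
make_iret_frame(uint32_t eip, uint16_t cs, uint32_t eflags,
                uint32_t esp, uint16_t ss)
{
    struct iret_frame_sketch f = { eip, cs, eflags, esp, ss };
    return f;
}

/* Privilege level that iret will switch to: the low two bits of saved CS. */
static inline unsigned frame_target_rpl(const struct iret_frame_sketch *f)
{
    return f->cs & 3u;
}
```

Whoever fills in these five slots decides where iret lands and at which privilege level, regardless of whether a real interrupt ever occurred.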
Next, look at how linux0.11 mimics an interrupt return:
#define move_to_user_mode() /* switch to user mode */ \
__asm__ ("movl %%esp,%%eax\n\t" \
	"pushl $0x17\n\t" /* push the "original" SS: idle's data/stack segment in its LDT; low two bits RPL=3, i.e. user mode */ \
	"pushl %%eax\n\t" /* push the "original" ESP */ \
	"pushfl\n\t" /* push the "original" EFLAGS */ \
	"pushl $0x0f\n\t" /* push the "original" CS: idle's code segment in its LDT; low two bits RPL=3, i.e. user mode */ \
	"pushl $1f\n\t" /* push the "original" EIP: label 1, the code right after iret */ \
	/* The pushes above are normally done by the CPU when an interrupt occurs. \
	   Here they are done by hand to fake an interrupt, so that the iret below \
	   pops these values and completes the switch from kernel mode to user mode. */ \
	"iret\n" /* interrupt return: the CPU pops the values pushed above */ \
	"1:\tmovl $0x17,%%eax\n\t" /* now in user mode: load the user data selector */ \
	"movw %%ax,%%ds\n\t" \
	"movw %%ax,%%es\n\t" \
	"movw %%ax,%%fs\n\t" \
	"movw %%ax,%%gs" \
	:::"ax")
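The selector values 0x17 and 0x0f pushed by the macro can be taken apart with a few bit operations. The sketch below follows the standard x86 selector format (bits 0..1 RPL, bit 2 table indicator, bits 3..15 index); the struct and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical decoded form of a 16-bit segment selector. */
struct selector_fields {
    unsigned index;  /* entry number within the GDT or LDT */
    unsigned ti;     /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* requested privilege level, 0..3 */
};

static inline struct selector_fields decode_selector(unsigned sel)
{
    struct selector_fields f;

    assert(sel <= 0xffffu);      /* selectors are 16 bits wide */
    f.rpl = sel & 3u;            /* bits 0..1 */
    f.ti = (sel >> 2) & 1u;      /* bit 2 */
    f.index = sel >> 3;          /* bits 3..15 */
    return f;
}
```

Decoded this way, 0x0f is LDT entry 1 with RPL=3 (the code segment in INIT_TASK) and 0x17 is LDT entry 2 with RPL=3 (the data segment), while the 0x10 stored as ss0 in the TSS is GDT entry 2 with RPL=0, the kernel data segment.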