Process switching TSS

Source: Internet
Author: User

[Transfer] http://www.eefocus.com/article/09-06/74895s.html

The Intel i386 architecture includes a special segment type called the task state segment (TSS), shown in Figure 5.4. Each task has a TSS whose minimum length is 104 bytes. It is defined as the tss_struct structure in include/asm-i386/processor.h:

struct tss_struct {
    unsigned short back_link, __blh;
    unsigned long  esp0;
    unsigned short ss0, __ss0h;   /* level 0 stack pointer, i.e. kernel level in Linux */
    unsigned long  esp1;
    unsigned short ss1, __ss1h;   /* level 1 stack pointer, unused */
    unsigned long  esp2;
    unsigned short ss2, __ss2h;   /* level 2 stack pointer, unused */
    unsigned long  __cr3;
    unsigned long  eip;
    unsigned long  eflags;
    unsigned long  eax, ecx, edx, ebx;
    unsigned long  esp;
    unsigned long  ebp;
    unsigned long  esi;
    unsigned long  edi;
    unsigned short es, __esh;
    unsigned short cs, __csh;
    unsigned short ss, __ssh;
    unsigned short ds, __dsh;
    unsigned short fs, __fsh;
    unsigned short gs, __gsh;
    unsigned short ldt, __ldth;
    unsigned short trace, bitmap;
    unsigned long  io_bitmap[IO_BITMAP_SIZE + 1];
    /*
     * Pads the TSS to be cacheline-aligned (size is 0x100)
     */
    unsigned long  __cacheline_filler[5];
};
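The 104-byte minimum can be checked against the hardware-defined part of the structure. The following is a user-space sketch, not kernel code: the name hw_tss and the fixed-width types from <stdint.h> are assumptions made for illustration (unsigned long is 8 bytes on many 64-bit hosts, so the kernel declaration cannot be reused as-is), and the io_bitmap and padding fields are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hardware-defined part of the 32-bit TSS, restated with fixed-width types.
 * "hw_tss" is a name chosen for this sketch only. */
struct hw_tss {
    uint16_t back_link, blh;
    uint32_t esp0;
    uint16_t ss0, ss0h;
    uint32_t esp1;
    uint16_t ss1, ss1h;
    uint32_t esp2;
    uint16_t ss2, ss2h;
    uint32_t cr3;
    uint32_t eip;
    uint32_t eflags;
    uint32_t eax, ecx, edx, ebx;
    uint32_t esp, ebp, esi, edi;
    uint16_t es, esh;
    uint16_t cs, csh;
    uint16_t ss, ssh;
    uint16_t ds, dsh;
    uint16_t fs, fsh;
    uint16_t gs, gsh;
    uint16_t ldt, ldth;
    uint16_t trace, bitmap;
};

/* Compile-time checks: total size and the two fields Linux actually uses. */
static_assert(sizeof(struct hw_tss) == 104, "minimum TSS is 104 bytes");
static_assert(offsetof(struct hw_tss, esp0) == 4, "esp0 sits at offset 4");
static_assert(offsetof(struct hw_tss, bitmap) == 102, "bitmap offset field is last");
```

If the file compiles, the layout matches the 104-byte figure quoted above.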

Each TSS has its own 8-byte task state segment descriptor (TSSD). The descriptor contains a 32-bit base field holding the start address of the TSS and a 20-bit limit field. Because the TSS is at least 104 bytes long, the limit cannot be smaller than 0x67 (103, the offset of the last byte of a minimal TSS).
TSS descriptors are stored in the GDT, one GDT entry per descriptor.
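To make the descriptor layout concrete, here is a minimal sketch that packs a base and a limit into the 8-byte format. The function names are hypothetical, and the access byte 0x89 (present, DPL 0, type 9 for an available 32-bit TSS) and byte granularity are choices made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 32-bit base and a 20-bit limit into an 8-byte TSS descriptor.
 * Hypothetical helper for illustration; granularity bit is left at 0. */
uint64_t make_tss_descriptor(uint32_t base, uint32_t limit)
{
    assert(limit <= 0xFFFFF);                      /* limit field is 20 bits wide */
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);               /* limit 15:0  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;        /* base 23:0   -> bits 16-39 */
    d |= (uint64_t)0x89 << 40;                     /* access byte -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;    /* limit 19:16 -> bits 48-51 */
    d |= (uint64_t)(base >> 24) << 56;             /* base 31:24  -> bits 56-63 */
    return d;
}

/* Recover the base address that the descriptor scatters over two fields. */
uint32_t tss_descriptor_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFF) | ((uint32_t)((d >> 56) & 0xFF) << 24);
}

/* Recover the 20-bit limit. */
uint32_t tss_descriptor_limit(uint64_t d)
{
    return (uint32_t)(d & 0xFFFF) | ((uint32_t)((d >> 48) & 0xF) << 16);
}
```

For example, make_tss_descriptor(0x12345678, 0x67) describes a minimal 104-byte TSS located at 0x12345678.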

As you will see later, Linux uses only a small part of the information in the TSS during process switching. The Linux kernel therefore defines another data structure, thread_struct:

struct thread_struct {
    unsigned long esp0;
    unsigned long eip;
    unsigned long esp;
    unsigned long fs;
    unsigned long gs;
    /* Hardware debugging registers */
    unsigned long debugreg[8];    /* %db0-7 debug registers */
    /* Fault info */
    unsigned long cr2, trap_no, error_code;
    /* Floating point info */
    union i387_union i387;
    /* Virtual 86 mode info */
    struct vm86_struct *vm86_info;
    unsigned long screen_bitmap;
    unsigned long v86flags, v86mask, v86mode, saved_esp0;
    /* IO permissions */
    int ioperm;
    unsigned long io_bitmap[IO_BITMAP_SIZE + 1];
};

This structure holds the CR2 register, the floating-point registers, the debug registers and other state specific to the 80x86 processor. The I/O bitmap is needed because the ioperm() and iopl() system calls let user-mode processes access particular I/O ports directly. In particular, if the IOPL field of the eflags register is set to 3, a user-mode process may access any I/O port; otherwise it may access only those ports whose bit in the I/O permission bitmap is 0.
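The access rule can be sketched as a small predicate. This is an illustration only: io_port_allowed and SKETCH_IO_PORTS are hypothetical names, the bitmap size is an arbitrary choice, and only single-byte port accesses are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IO_PORTS 1024   /* hypothetical number of ports covered by the bitmap */

/* Returns true if code running at privilege level cpl may access `port`.
 * bitmap must hold SKETCH_IO_PORTS / 8 bytes; a 0 bit means "allowed". */
bool io_port_allowed(unsigned cpl, unsigned iopl, const uint8_t *bitmap, unsigned port)
{
    assert(cpl <= 3 && iopl <= 3);
    if (cpl <= iopl)
        return true;                 /* privileged enough: bitmap is not consulted */
    if (port >= SKETCH_IO_PORTS)
        return false;                /* ports beyond the bitmap are denied */
    return ((bitmap[port / 8] >> (port % 8)) & 1u) == 0;
}
```

With IOPL set to 3, a user-mode caller (cpl 3) passes the first test for every port, which is the special case described above.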

So how does a process switch?

From Chapter 3 we know that the Interrupt Descriptor Table (IDT) can contain task gates in addition to interrupt gates and trap gates. A task gate holds a TSS segment selector. When an interrupt is delivered through a task gate, the CPU automatically loads that selector into the TR register, so that TR points to the new TSS, and the task switch is done. The CPU can also switch tasks with a jmp or call instruction.
In other words, when the target segment of the jump or call (nominally a code segment) actually refers to a TSS descriptor in the GDT, a task switch takes place.

Intel's design is thorough and provides a very simple mechanism for task switching. However, because the i386 is basically a CISC architecture, completing a task switch through a call (or an interrupt) is really the execution of one "complex instruction" that takes more than 300 CPU cycles (for comparison, a pop instruction takes 12 CPU cycles). For this reason, the Linux kernel does not fully use the task-switching mechanism provided by the i386 CPU.

Because the i386 CPU requires software to set up TR and a TSS, the Linux kernel does so only as a formality, to satisfy the CPU. The kernel uses neither task gates nor jmp or call instructions to switch tasks. It sets TR once during initialization to point to a TSS and never changes TR afterwards. In other words, each CPU (if there is more than one) keeps using its initial TSS for everything it does after initialization. Moreover, the kernel does not rely on the TSS to hold the register copies made at each process switch; it stores them on each process's own kernel stack (see the discussion of how the task_struct structure is stored in the previous chapter).

As a result, most of the TSS has lost its original meaning. How, then, is the stack changed automatically when tasks are switched? The kernel stack pointer (ss0 and esp0) of the new task should be taken from the current TSS, yet in Linux there is not one TSS per task, only one TSS per CPU. Intel intended the content of TR (that is, the TSS) to change with each task switch. In the Linux kernel the switch instead replaces only ss0 and esp0 inside the TSS, not the TSS itself; TR is left unchanged. The reason is that changing ss0 and esp0 in the TSS costs far less than loading TR to install another TSS. In the Linux kernel, therefore, the TSS is not a resource owned by a particular process but a global, shared resource. On a multiprocessor system the kernel does have several TSSs, but still only one per CPU.
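The idea of one TSS per CPU whose esp0 is overwritten at every switch can be modelled in a few lines. All names below (percpu_tss, sketch_task, sketch_switch_stack, SKETCH_NR_CPUS) are hypothetical, and the structures keep only the fields needed for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 2                 /* hypothetical CPU count */

struct percpu_tss  { uint32_t esp0; uint16_t ss0; };
struct sketch_task { uint32_t kernel_stack_top; };

/* One TSS per CPU, set up once; TR on each CPU would keep pointing at its entry. */
static struct percpu_tss cpu_tss[SKETCH_NR_CPUS];

/* What a process switch does to the TSS: overwrite esp0 and leave the
 * identity of the TSS (what TR selects) untouched. */
struct percpu_tss *sketch_switch_stack(unsigned cpu, const struct sketch_task *next)
{
    assert(cpu < SKETCH_NR_CPUS);
    cpu_tss[cpu].esp0 = next->kernel_stack_top;
    return &cpu_tss[cpu];
}
```

Calling sketch_switch_stack twice on the same CPU with different tasks returns the same TSS both times; only the esp0 value inside it differs.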

5.4.2 Process switching

The schedule() function mentioned above calls the switch_to macro, which performs the actual switch between processes. Its code is in include/asm-i386/system.h:

 1  #define switch_to(prev, next, last) do {                              \
 2      asm volatile("pushl %%esi\n\t"                                    \
 3                   "pushl %%edi\n\t"                                    \
 4                   "pushl %%ebp\n\t"                                    \
 5                   "movl %%esp,%0\n\t"     /* save ESP */               \
 6                   "movl %3,%%esp\n\t"     /* restore ESP */            \
 7                   "movl $1f,%1\n\t"       /* save EIP */               \
 8                   "pushl %4\n\t"          /* restore EIP */            \
 9                   "jmp __switch_to\n"                                  \
10                   "1:\t"                                               \
11                   "popl %%ebp\n\t"                                     \
12                   "popl %%edi\n\t"                                     \
13                   "popl %%esi\n\t"                                     \
14                   : "=m" (prev->thread.esp), "=m" (prev->thread.eip),  \
15                     "=b" (last)                                        \
16                   : "m" (next->thread.esp), "m" (next->thread.eip),    \
17                     "a" (prev), "d" (next),                            \
18                     "b" (prev));                                       \
19  } while (0)

The switch_to macro is written in inline assembly, which makes it hard to read. For ease of description the code lines are numbered. A detailed explanation follows:

· thread is of type thread_struct, the structure described earlier.
· There are three output operands, meaning that three items change when the code executes. They map to variables and registers as follows: %0 corresponds to prev->thread.esp and %1 to prev->thread.eip, both stored in memory; %2 corresponds to the ebx register, meaning that last is held in ebx.
· There are five input operands: %3 corresponds to next->thread.esp and %4 to next->thread.eip, both stored in memory; %5, %6 and %7 correspond to the eax, edx and ebx registers respectively, which hold prev, next and prev. Table 5.1 lists these mappings:

    Operand   Constraint   Bound to
    %0        "=m"         prev->thread.esp (memory)
    %1        "=m"         prev->thread.eip (memory)
    %2        "=b"         last (ebx)
    %3        "m"          next->thread.esp (memory)
    %4        "m"          next->thread.eip (memory)
    %5        "a"          prev (eax)
    %6        "d"          next (edx)
    %7        "b"          prev (ebx)

· Lines 2 to 4 push the esi, edi and ebp registers onto the kernel stack of the current process, prev.
· Line 5 saves prev's kernel stack pointer, esp, in prev->thread.esp.
· Line 6 loads the kernel stack pointer of the process about to run, next->thread.esp, into the esp register. From this point on the kernel operates on next's kernel stack, so this instruction performs the real context switch from prev to next. Because the address of the process descriptor is tied to the address of the kernel stack (see Chapter 4), changing the kernel stack means changing the current process. If current is referenced after this point, it refers to the task_struct structure of next. In this sense the process switch is accomplished on this line. The flow of execution, however, is another element that makes up a process, and that part of the switch is not yet complete.
· Line 7 saves the address of label 1, that is, the address of the first popl instruction (line 11), in prev->thread.eip. This is the "return" address used the next time prev is scheduled in.
· Line 8 pushes next->thread.eip onto next's kernel stack. Which address does next->thread.eip hold? It is the address saved when next was last switched out, that is, the address stored by line 7: the popl instruction at line 11. Because every process is switched out through line 7, every process (except a newly created one) resumes execution at line 11 when it is scheduled again.
· Line 9 transfers control to the __switch_to() function with a jmp instruction rather than a call; the function itself is described below. When the CPU executes the ret instruction of __switch_to(), the next->thread.eip value pushed last becomes the return address, which is the address of label 1.
· Lines 11 to 13 pop the registers that were pushed when next was last switched out. From this point on, next is the current process and actually starts executing.
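The numbered steps can be traced with a small user-space model in which each task's kernel stack is an array and esp is an index into it. This is only a simulation of the bookkeeping, not a real context switch: every name below is hypothetical, SKETCH_LABEL_1 is a made-up stand-in for the address of label 1, and sketch_prepare sets a task up as if it had already been switched out once (a real newly created process resumes elsewhere).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_STACK_WORDS 16
#define SKETCH_LABEL_1 0x1f1f1f1fu   /* stand-in for the address of label "1:" */

struct sketch_thread { uint32_t esp, eip; };   /* esp is an index into stack[] */
struct sketch_task {
    uint32_t stack[SKETCH_STACK_WORDS];
    struct sketch_thread thread;
};
struct sketch_cpu {
    uint32_t esi, edi, ebp;
    uint32_t esp;                    /* index of the top word of the current stack */
    struct sketch_task *current;     /* follows the stack, as in the kernel */
};

static void push(struct sketch_cpu *c, uint32_t v)
{
    assert(c->esp > 0);
    c->current->stack[--c->esp] = v;
}

static uint32_t pop(struct sketch_cpu *c)
{
    assert(c->esp < SKETCH_STACK_WORDS);
    return c->current->stack[c->esp++];
}

/* Make a task look as if it had been switched out with these register values. */
void sketch_prepare(struct sketch_task *t, uint32_t esi, uint32_t edi, uint32_t ebp)
{
    uint32_t sp = SKETCH_STACK_WORDS;
    t->stack[--sp] = esi;
    t->stack[--sp] = edi;
    t->stack[--sp] = ebp;
    t->thread.esp = sp;
    t->thread.eip = SKETCH_LABEL_1;
}

void sketch_switch_to(struct sketch_cpu *c, struct sketch_task *prev,
                      struct sketch_task *next)
{
    assert(c->current == prev);
    push(c, c->esi);                     /* lines 2-4: save on prev's stack */
    push(c, c->edi);
    push(c, c->ebp);
    prev->thread.esp = c->esp;           /* line 5: save ESP */
    c->esp = next->thread.esp;           /* line 6: now on next's stack */
    c->current = next;                   /* changing the stack changes current */
    prev->thread.eip = SKETCH_LABEL_1;   /* line 7: save EIP */
    push(c, next->thread.eip);           /* line 8: push the resume address */
    uint32_t resume = pop(c);            /* line 9: the ret in __switch_to pops it */
    assert(resume == SKETCH_LABEL_1);
    c->ebp = pop(c);                     /* lines 11-13: restore next's registers */
    c->edi = pop(c);
    c->esi = pop(c);
}
```

Switching from a task A to a prepared task B and back again leaves A's esi, edi and ebp exactly as they were before the first switch, which is the property lines 2 to 4 and 11 to 13 exist to guarantee.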

Next we discuss the __switch_to() function.

Before __switch_to() is called, it is declared with FASTCALL:

    extern void FASTCALL(__switch_to(struct task_struct *prev, struct task_struct *next));

A FASTCALL function differs from an ordinary function call: __switch_to() takes its parameters from registers (see Table 5.1) rather than from the stack as an ordinary function does. That is, the prev and next parameters are passed to __switch_to() in the eax and edx registers.

void __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
{
    struct thread_struct *prev = &prev_p->thread,
                         *next = &next_p->thread;
    struct tss_struct *tss = init_tss + smp_processor_id();

    unlazy_fpu(prev_p);    /* If the math coprocessor was in use, save its registers */

    /* Replace the kernel-level (level 0) stack pointer in the TSS with
     * next->esp0, the kernel stack pointer the next process runs on */
    tss->esp0 = next->esp0;

    /* Save fs and gs. es and ds need not be saved, because inside the
     * kernel they always hold the kernel segments */
    asm volatile("movl %%fs,%0" : "=m" (*(int *)&prev->fs));
    asm volatile("movl %%gs,%0" : "=m" (*(int *)&prev->gs));

    /* Restore the fs and gs of the next process */
    loadsegment(fs, next->fs);
    loadsegment(gs, next->gs);

    /* If next was using the debug registers when it was suspended, reload
     * debug registers 0 to 7; registers 4 and 5 are not used */
    if (next->debugreg[7]) {
        loaddebug(next, 0);
        loaddebug(next, 1);
        loaddebug(next, 2);
        loaddebug(next, 3);
        /* no 4 and 5 */
        loaddebug(next, 6);
        loaddebug(next, 7);
    }

    if (prev->ioperm || next->ioperm) {
        if (next->ioperm) {
            /* Copy the I/O permission bitmap of the next process into the TSS */
            memcpy(tss->io_bitmap, next->io_bitmap,
                   IO_BITMAP_SIZE * sizeof(unsigned long));
            /* Store the offset of io_bitmap within the TSS in tss->bitmap */
            tss->bitmap = IO_BITMAP_OFFSET;
        } else
            /* If a process executes an I/O instruction while the bitmap
             * offset lies outside the TSS limit, a controlled SIGSEGV is
             * generated. The first call to sys_ioperm() sets up a
             * proper bitmap */
            tss->bitmap = INVALID_IO_BITMAP_OFFSET;
    }
}
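The I/O bitmap branch at the end of the function can be exercised in isolation. The sketch below is a simplified user-space model: the structure and macro names are hypothetical, the bitmap length is arbitrary, and 0x8000 for the invalid offset is an assumed value chosen only because it lies beyond the end of this small structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BITMAP_LONGS 32    /* hypothetical bitmap length in 32-bit words */

struct io_tss {
    uint16_t bitmap;              /* offset of the I/O bitmap within the TSS */
    uint32_t io_bitmap[SKETCH_BITMAP_LONGS + 1];
};
struct io_thread {
    int      ioperm;              /* non-zero once the task has called ioperm() */
    uint32_t io_bitmap[SKETCH_BITMAP_LONGS + 1];
};

#define SKETCH_BITMAP_OFFSET  ((uint16_t)offsetof(struct io_tss, io_bitmap))
#define SKETCH_INVALID_OFFSET ((uint16_t)0x8000)   /* assumed: beyond the TSS limit */

void sketch_switch_io_bitmap(struct io_tss *tss, const struct io_thread *prev,
                             const struct io_thread *next)
{
    assert(tss && prev && next);
    if (!prev->ioperm && !next->ioperm)
        return;                   /* neither task uses a bitmap: nothing to do */
    if (next->ioperm) {
        /* Install next's bitmap and point the offset field at it. */
        memcpy(tss->io_bitmap, next->io_bitmap,
               SKETCH_BITMAP_LONGS * sizeof(uint32_t));
        tss->bitmap = SKETCH_BITMAP_OFFSET;
    } else {
        /* next has no bitmap: make every user-mode port access fault. */
        tss->bitmap = SKETCH_INVALID_OFFSET;
    }
}
```

The common case, in which neither process has ever called ioperm(), returns immediately, so the copy is paid only by tasks that actually use the bitmap.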

As the description above shows, although Intel provides hardware support for process (task) switching, the Linux kernel designers did not fully adopt it and implemented process switching in software instead. The software implementation is also more efficient and more flexible than the hardware one.

-----------------------------------------------------

[Transfer] http://www.linuxidc.com/Linux/2011-03/33367.htm

Example of the role of the TSS: it holds the stack pointers that a task uses at the different privilege levels. This matters because when a task is interrupted and the interrupt involves a privilege-level change (one form of task switching), the CPU must first switch stacks, and the new stack is obviously a kernel stack. How is its address found? It is read from the TSS, so that the code that runs next has a stack to rely on (on x86 machines, C function calls are implemented with the stack). Whenever a task moves from a lower-privilege ring to a higher-privilege ring, the stack for the higher-privilege ring must be located, so the TSS holds three stack pointers: esp2, esp1 and esp0. Linux uses only esp0.
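The lookup described here amounts to indexing the TSS by the target ring. A minimal sketch, with hypothetical names (ring_stacks, stack_ptr, stack_for_ring) and only the stack fields of the TSS:

```c
#include <assert.h>
#include <stdint.h>

/* Only the per-ring stack fields of the TSS are modelled here. */
struct ring_stacks {
    uint32_t esp0, esp1, esp2;
    uint16_t ss0, ss1, ss2;
};

struct stack_ptr { uint16_t ss; uint32_t esp; };

/* Pick the stack the CPU loads when control moves to a more privileged ring. */
struct stack_ptr stack_for_ring(const struct ring_stacks *tss, unsigned target_ring)
{
    assert(target_ring <= 2);    /* ring 3 is least privileged: never a target */
    switch (target_ring) {
    case 0:
        return (struct stack_ptr){ tss->ss0, tss->esp0 };
    case 1:
        return (struct stack_ptr){ tss->ss1, tss->esp1 };
    default:
        return (struct stack_ptr){ tss->ss2, tss->esp2 };
    }
}
```

Since Linux runs user code in ring 3 and the kernel in ring 0, only the case 0 branch is ever relevant there.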

What the TSS is: the TSS is a segment, and segments are an x86 concept. In protected mode, segment selectors take part in addressing. Selectors are held in segment registers, and the selector of the TSS is held in the TR register.

Intel's suggestion: give every process its own TSS. On a process switch, reload the TR register so that it points to the TSS of the incoming process; whenever a task switch then occurs (for example, an interrupt that involves a privilege-level change), that segment is used to save all the registers.

Linux practices:

1. Linux does not prepare a TSS for each process. Instead, each CPU has one TSS, and the TR register holds its selector. On a process switch, only the esp0 field of that single TSS is updated, so that it points to the kernel stack of the new process.

2. In the Linux TSS, only fields such as esp0 and the I/O bitmap (iomap) are used; the TSS is not used to save registers. When a user process is interrupted and enters ring 0, the CPU reads esp0 from the TSS and switches to that stack. The other registers are saved on the kernel stack that esp0 points to, not in the TSS.

3. As a result, each CPU in Linux has exactly one TSS, and the TR register always points to it. This satisfies the x86 processor's requirements without following Intel's recommendation. The benefit is lower overhead, because the TR register never has to be reloaded.

Linux implementation:

1. Define the TSS (arch/i386/kernel/init_task.c):

    struct tss_struct init_tss[NR_CPUS] __cacheline_aligned = { [0 ... NR_CPUS-1] = INIT_TSS };

INIT_TSS is defined as:

    #define INIT_TSS {                                                \
        .esp0           = sizeof(init_stack) + (long)&init_stack,     \
        .ss0            = __KERNEL_DS,                                \
        .esp1           = sizeof(init_tss[0]) + (long)&init_tss[0],   \
        .ss1            = __KERNEL_CS,                                \
        .ldt            = GDT_ENTRY_LDT,                              \
        .io_bitmap_base = INVALID_IO_BITMAP_OFFSET,                   \
        .io_bitmap      = { [0 ... IO_BITMAP_LONGS] = ~0 },           \
    }
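The [0 ... N] = value form used in both definitions is a GNU C range designator, accepted by GCC and Clang but not part of standard C. The sketch below shows the same pattern on a cut-down structure; the names and the initial values are placeholders chosen for the example, not the kernel's.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4          /* hypothetical CPU count */

struct mini_tss {
    unsigned long  esp0;
    unsigned short ss0;
    unsigned short io_bitmap_base;
};

/* Placeholder values; the kernel computes esp0 from init_stack instead. */
#define SKETCH_INIT_TSS { .esp0 = 0x1000, .ss0 = 0x18, .io_bitmap_base = 0x8000 }

/* Every element of the array receives the same initializer. */
static struct mini_tss sketch_init_tss[SKETCH_NR_CPUS] = {
    [0 ... SKETCH_NR_CPUS - 1] = SKETCH_INIT_TSS
};

/* Read back the initial kernel stack pointer of one CPU's TSS. */
unsigned long sketch_cpu_esp0(unsigned cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    return sketch_init_tss[cpu].esp0;
}
```

Each CPU thus starts from an identical TSS image, and later code only has to patch the fields that differ per CPU or per process.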

 

 
