When a process makes a system call, it first calls a function defined in the system call library. This function is usually expanded from the _syscallN macro form mentioned above and enters the kernel through int 0x80; its parameters are passed to the kernel in registers.
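As a rough modern illustration of the user-side half, the sketch below uses the generic syscall() wrapper from the C library instead of the older _syscallN macros discussed here (that substitution is an assumption made for the example, since current C libraries no longer ship those macros). The helper name raw_getpid is hypothetical. On 32-bit x86 the call number travels in eax, as described in this article; which trap instruction the library actually executes depends on the architecture and the library version.

```c
#define _GNU_SOURCE        /* makes glibc declare syscall() in <unistd.h> */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid and the other call numbers */
#include <unistd.h>

/* Hypothetical helper: fetch our PID through the generic syscall()
 * entry point rather than the dedicated getpid() wrapper. */
long raw_getpid(void)
{
    /* syscall() places the call number and any arguments where the
     * kernel expects them and then traps into the kernel. */
    long pid = syscall(SYS_getpid);

    assert(pid > 0);       /* getpid cannot fail */
    return pid;
}
```

Calling raw_getpid() should give the same value as getpid(), since both end up in the same kernel handler.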
This section introduces system_call, the handler for int 0x80.
If you think about it, the execution state is completely different before and after the call: before, the process runs user-mode code on the user stack; after, it runs kernel-mode code on the kernel stack. So that the user code can resume at the call point once the system call has finished inside the kernel, a context layer must be pushed onto the kernel stack on entry to kernel mode, saving the user-mode state; on return from the kernel that context layer is popped so that the user process can continue to run.
How is the context information saved, and what exactly is saved? The following uses x86 as an example.
When the int instruction is executed, the processor actually performs the following operations:
(1) Because the int instruction transfers control between privilege levels, the stack information (SS and ESP) of the higher-privilege kernel stack is first obtained from the TSS (Task State Segment);
(2) The stack information (SS and ESP) of the lower-privilege level is saved on the higher-privilege stack (that is, the kernel stack);
(3) EFLAGS and the outer-level CS and EIP are pushed onto the higher-privilege stack (the kernel stack);
(4) CS and EIP are loaded through the IDT (control transfers to the interrupt handler).
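The push order in steps (1) to (3) can be sketched with a toy model in C. This is only an illustration under stated assumptions: the types user_state and kstack, the function int_entry, and the 16-word stack size are all hypothetical, and step (4), loading CS and EIP from the IDT, is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* User-mode state at the moment int 0x80 executes (hypothetical type). */
struct user_state { uint32_t ss, esp, eflags, cs, eip; };

/* Toy kernel stack that grows downward, like the real x86 stack. */
enum { KSTACK_WORDS = 16 };
struct kstack {
    uint32_t mem[KSTACK_WORDS];
    int top;                    /* index of the most recently pushed word */
};

static void push(struct kstack *k, uint32_t v)
{
    assert(k->top > 0);         /* toy overflow check */
    k->mem[--k->top] = v;
}

/* Models steps (1)-(3) of the list above. */
void int_entry(struct kstack *k, const struct user_state *u)
{
    k->top = KSTACK_WORDS;      /* (1) switch to the empty kernel stack (from the TSS) */
    push(k, u->ss);             /* (2) save the outer stack pointer SS:ESP */
    push(k, u->esp);
    push(k, u->eflags);         /* (3) save EFLAGS and the return address CS:EIP */
    push(k, u->cs);
    push(k, u->eip);
}
```

After int_entry returns, the word at the top of the toy stack is the saved EIP and the deepest of the five words is the saved SS, which is the frame that a later return to user mode has to pop.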
Control then enters system_call, the handler for interrupt 0x80. This function uses the macro SAVE_ALL, which is defined as follows:
#define SAVE_ALL \
	cld; \
	pushl %es; \
	pushl %ds; \
	pushl %eax; \
	pushl %ebp; \
	pushl %edi; \
	pushl %esi; \
	pushl %edx; \
	pushl %ecx; \
	pushl %ebx; \
	movl $(__KERNEL_DS),%edx; \
	movl %edx,%ds; \
	movl %edx,%es;
This macro pushes the register context onto the kernel stack. For a system call it is also how the parameters get passed in: because control is transferred between privilege levels, the int instruction, unlike a call instruction going through a call gate, does not automatically copy parameters from the outer stack to the inner stack. A caller must therefore first place each parameter in a register, as in the earlier example; after entering the kernel, SAVE_ALL pushes the parameters held in those registers onto the kernel stack in order, so that the kernel can use the parameters passed by the user.
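To see why this push order hands the parameters to the kernel, the following toy model in C replays the pushes of the macro on a simulated stack. The types cpu_regs and kstack and the functions save_all and stack_arg are hypothetical names for this sketch, and the reload of the segment registers is left out. The point is that ebx is pushed last, so it ends up where a C function taking its arguments from the stack looks for its first parameter, with ecx and edx right behind it.

```c
#include <assert.h>
#include <stdint.h>

/* Register values at kernel entry (hypothetical type for this sketch). */
struct cpu_regs { uint32_t eax, ebx, ecx, edx, esi, edi, ebp, ds, es; };

/* Toy kernel stack that grows downward. */
enum { STACK_WORDS = 32 };
struct kstack {
    uint32_t mem[STACK_WORDS];
    int top;                    /* index of the most recently pushed word */
};

static void push(struct kstack *k, uint32_t v)
{
    assert(k->top > 0);         /* toy overflow check */
    k->mem[--k->top] = v;
}

/* Same push order as the macro; the segment register reload is omitted. */
void save_all(struct kstack *k, const struct cpu_regs *r)
{
    push(k, r->es);
    push(k, r->ds);
    push(k, r->eax);
    push(k, r->ebp);
    push(k, r->edi);
    push(k, r->esi);
    push(k, r->edx);
    push(k, r->ecx);
    push(k, r->ebx);            /* pushed last, so it sits on top */
}

/* n-th (0-based) word counted from the top of the stack, which is where
 * a callee with stack-passed arguments finds its n-th parameter. */
uint32_t stack_arg(const struct kstack *k, int n)
{
    assert(n >= 0 && k->top + n < STACK_WORDS);
    return k->mem[k->top + n];
}
```

With ebx, ecx and edx holding the first three parameters, stack_arg(&k, 0), stack_arg(&k, 1) and stack_arg(&k, 2) return them in that order after save_all has run.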
The source code of system_call is given below:
ENTRY(system_call)
	pushl %eax			# save orig_eax
	SAVE_ALL
	GET_CURRENT(%ebx)
	cmpl $(NR_syscalls),%eax
	jae badsys
	testb $0x20,flags(%ebx)		# PF_TRACESYS
	jne tracesys
	call *SYMBOL_NAME(sys_call_table)(,%eax,4)
	......
The work done here is:
i. Save the original eax (orig_eax), because the eax slot saved by SAVE_ALL will be overwritten by the return value of the call;
ii. Invoke SAVE_ALL to save the register context;
iii. Check whether the current call is a legal system call (eax holds the system call number, which must be smaller than NR_syscalls);
iv. If the PF_TRACESYS flag is set, control jumps to tracesys, which calls syscall_trace; there the current process is suspended and a SIGTRAP stop is reported to its parent process. This is mainly intended to let debuggers set breakpoints;
v. If the PF_TRACESYS flag is not set, the handler for the requested system call is called. eax (the system call number mentioned above) is used as the index into the system call table sys_call_table, scaled by 4 because each entry is a 4-byte address; the handler's entry address is read from the table and called.
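The range check and table lookup in steps iii and v can be sketched in C as a bounds-checked function pointer table. Everything prefixed with demo_ is hypothetical and exists only for this illustration; the tracing branch of step iv is left out. Returning -ENOSYS for an out-of-range number mirrors what the badsys path does in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: every handler takes three register-sized arguments. */
typedef long (*demo_syscall_fn)(long, long, long);

static long demo_getpid(long a, long b, long c)
{
    (void)a; (void)b; (void)c;
    return 1234;                        /* made-up PID */
}

static long demo_add(long a, long b, long c)
{
    (void)c;
    return a + b;
}

/* Stand-in for sys_call_table: the call number is the array index. */
static const demo_syscall_fn demo_call_table[] = { demo_getpid, demo_add };

#define DEMO_NR_SYSCALLS (sizeof demo_call_table / sizeof demo_call_table[0])

long demo_dispatch(unsigned long nr, long a, long b, long c)
{
    /* Unsigned comparison, like "cmpl ...; jae badsys": a negative
     * number wraps to a huge value and is rejected as well. */
    if (nr >= DEMO_NR_SYSCALLS)
        return -ENOSYS;

    assert(demo_call_table[nr] != NULL);
    /* Indexed indirect call, like "call *sys_call_table(,%eax,4)". */
    return demo_call_table[nr](a, b, c);
}
```

For example, demo_dispatch(1, 2, 3, 0) reaches demo_add and returns 5, while demo_dispatch(2, 0, 0, 0) is rejected with -ENOSYS because the table has only two entries.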