1. User processes use C-library functions such as open.
2. The library function (open) executes the "int 0x80" or syscall instruction to enter system_call.
Relevant source code in glibc (sysdeps/unix/sysv/linux/x86_64/sysdep.h). The call chain is INLINE_SYSCALL ---> INTERNAL_SYSCALL ---> INTERNAL_SYSCALL_NCS ---> syscall. The syscall instruction inside INTERNAL_SYSCALL_NCS is the key instruction on x86_64.
# define INLINE_SYSCALL(name, nr, args...) \
  ({ \
    unsigned long int resultvar = INTERNAL_SYSCALL (name, , nr, args); \
    if (__builtin_expect (INTERNAL_SYSCALL_ERROR_P (resultvar, ), 0)) \
      { \
        __set_errno (INTERNAL_SYSCALL_ERRNO (resultvar, )); \
        resultvar = (unsigned long int) -1; \
      } \
    (long int) resultvar; })
# define INTERNAL_SYSCALL(name, err, nr, args...) \
  INTERNAL_SYSCALL_NCS (__NR_##name, err, nr, ##args)
# define INTERNAL_SYSCALL_NCS(name, err, nr, args...) \
  ({ \
    unsigned long int resultvar; \
    LOAD_ARGS_##nr (args) \
    LOAD_REGS_##nr \
    asm volatile ( \
    "syscall\n\t" \
    : "=a" (resultvar) \
    : "0" (name) ASM_ARGS_##nr : "memory", "cc", "r11", "cx"); \
    (long int) resultvar; })
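To make the macro chain above more concrete, here is a minimal sketch of the same pattern written as ordinary C functions. The names raw_syscall0 and cook_result are hypothetical helpers for illustration, not glibc functions. The sketch assumes GCC or Clang on Linux; the inline assembly path is taken only on x86_64, and other targets fall back to the libc syscall() wrapper.

```c
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not part of glibc): issue a system call that takes
 * no arguments and return the raw kernel result (-errno on failure). */
static long raw_syscall0(long nr)
{
    assert(nr >= 0);
#if defined(__linux__) && defined(__x86_64__)
    unsigned long resultvar;
    /* rax carries the call number in and the result out. The syscall
     * instruction itself overwrites rcx and r11, so both are clobbered. */
    __asm__ volatile ("syscall"
                      : "=a" (resultvar)
                      : "0" (nr)
                      : "memory", "cc", "r11", "rcx");
    return (long) resultvar;
#else
    /* Assumed fallback for other targets: let libc issue the call, then
     * convert its (-1, errno) convention back to -errno. */
    long r = syscall(nr);
    return (r == -1) ? -(long) errno : r;
#endif
}

/* Hypothetical helper mirroring what INLINE_SYSCALL does with the result:
 * values in [-4095, -1] are error codes, so errno is set and -1 returned. */
static long cook_result(long raw)
{
    if ((unsigned long) raw >= (unsigned long) -4095L) {
        errno = (int) -raw;
        return -1;
    }
    return raw;
}
```

For example, cook_result(raw_syscall0(SYS_getpid)) should return the same value as getpid().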
This raises a question. On an i386 CPU, Linux installs an interrupt handler for vector 0x80 during initialization (trap_init --> set_system_trap_gate(SYSCALL_VECTOR, &system_call);). When int 0x80 executes, the CPU reaches that handler through a series of hardware actions (described in many other articles). On an x86_64 CPU, where does the syscall instruction jump? How does the CPU handle this instruction? What does the OS need to set up for it? And how does execution return to user space?
3. The CPU completes a series of switching actions.
4. Enter the interrupt handler; more precisely, enter the kernel.
See arch/x86/kernel/entry_64.S (line 455) or arch/x86/kernel/entry_32.S (line 498). The listing below is from entry_32.S:
ENTRY(system_call)
        RING0_INT_FRAME                 # can't unwind into user space anyway
        pushl_cfi %eax                  # save orig_eax
        SAVE_ALL
        GET_THREAD_INFO(%ebp)
                                        # system call tracing in operation / emulation
        testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags(%ebp)
        jnz syscall_trace_entry
        cmpl $(nr_syscalls), %eax
        jae syscall_badsys
syscall_call:
        call *sys_call_table(,%eax,4)
        movl %eax,PT_EAX(%esp)          # store the return value
syscall_exit:
        LOCKDEP_SYS_EXIT
        DISABLE_INTERRUPTS(CLBR_ANY)    # make sure we don't miss an interrupt
                                        # setting need_resched or sigpending
                                        # between sampling and the iret
        TRACE_IRQS_OFF
        movl TI_flags(%ebp), %ecx
        testl $_TIF_ALLWORK_MASK, %ecx  # current->work
        jne syscall_exit_work
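After entering system_call:
SAVE_ALL pushes the user-mode registers onto the kernel stack.
...
The handler then calls the sys_call_table entry selected by the system call number in %eax.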
Where is sys_call_table defined, and where is it referenced? Why does it appear in so many files in the kernel and under arch/x86?
sys_call_table is defined in arch/x86/kernel/syscall_table_32.S:
ENTRY(sys_call_table)
        .long sys_restart_syscall       /* 0 - old "setup()" system call, used for restarting */
        .long sys_exit
        .long ptregs_fork
        .long sys_read
        .long sys_write
        .long sys_open                  /* 5 */
Indexing this table with the system call number yields the sys_xxx function defined by the kernel.
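The bounds check and indirect call in system_call can be modelled in user space with an array of function pointers. This is only a sketch: the demo_ names, the two handlers, and the three-argument signature are assumptions made for illustration and are not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: every entry takes three long arguments. */
typedef long (*demo_syscall_fn)(long, long, long);

/* Hypothetical stand-ins for kernel handlers such as sys_read. */
static long demo_restart(long a, long b, long c)
{
    (void) a; (void) b; (void) c;
    return 0;
}

static long demo_add(long a, long b, long c)
{
    return a + b + c;
}

/* The table position is the call number, as in sys_call_table. */
static const demo_syscall_fn demo_call_table[] = {
    demo_restart,   /* 0 */
    demo_add,       /* 1 */
};

#define DEMO_NR_SYSCALLS (sizeof demo_call_table / sizeof demo_call_table[0])

static long demo_dispatch(unsigned long nr, long a, long b, long c)
{
    /* Models: cmpl $(nr_syscalls), %eax ; jae syscall_badsys */
    if (nr >= DEMO_NR_SYSCALLS)
        return -ENOSYS;
    assert(demo_call_table[nr] != NULL);
    /* Models: call *sys_call_table(,%eax,4) */
    return demo_call_table[nr](a, b, c);
}
```

Calling demo_dispatch(1, 1, 2, 3) runs demo_add, while an out-of-range number such as 99 yields -ENOSYS, which is the value the syscall_badsys path stores as the return value.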
5. Run the function that implements the system call.
So where are the various sys_xxx functions defined? Example:
SYSCALL_DEFINE3(semop, int, semid, struct sembuf __user *, tsops, unsigned, nsops)
{
        /* function body */
        ...
}
Definitions of this form can be found in a number of kernel source files.
syscalls.h (include/linux/syscalls.h) defines the macros as follows:
#define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE2(name, ...) SYSCALL_DEFINEx(2, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE4(name, ...) SYSCALL_DEFINEx(4, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name, __VA_ARGS__)
#define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
The function is built up step by step through a series of macros. How is this implemented, and what do these macros mean?
SYSCALL_DEFINE3(semop, int, semid, ...) expands to a definition of the function
sys_semop(int semid, struct sembuf __user *tsops, unsigned nsops);
This sys_semop function is one of the entries in sys_call_table.
In this way, when a user process issues a system call such as read, write, semget, or semop, the kernel calls the corresponding sys_xxxx function.
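A simplified model shows how token pasting turns a SYSCALL_DEFINE3-style invocation into an ordinary function named sys_xxx. The DEMO_ macros below are a reduced sketch written for this article. The real kernel macros add asmlinkage, tracing metadata, and argument conversion, all omitted here. struct demo_sembuf and the function body are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Turn a flat "type, name, type, name, ..." list into a parameter list. */
#define DEMO_SC_DECL1(t1, a1)       t1 a1
#define DEMO_SC_DECL2(t2, a2, ...)  t2 a2, DEMO_SC_DECL1(__VA_ARGS__)
#define DEMO_SC_DECL3(t3, a3, ...)  t3 a3, DEMO_SC_DECL2(__VA_ARGS__)

/* name arrives as _xxx, so sys##name becomes sys_xxx. */
#define DEMO_SYSCALL_DEFINEx(x, name, ...) \
    long sys##name(DEMO_SC_DECL##x(__VA_ARGS__))

#define DEMO_SYSCALL_DEFINE3(name, ...) \
    DEMO_SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)

/* Hypothetical stand-in for struct sembuf. */
struct demo_sembuf {
    short sem_op;
};

/* Expands to:
 * long sys_demo_semop(int semid, struct demo_sembuf *tsops, unsigned nsops) */
DEMO_SYSCALL_DEFINE3(demo_semop, int, semid,
                     struct demo_sembuf *, tsops, unsigned, nsops)
{
    long total = 0;
    unsigned i;

    (void) semid;
    assert(tsops != NULL || nsops == 0);
    /* Illustrative body only: sum the requested operations. */
    for (i = 0; i < nsops; i++)
        total += tsops[i].sem_op;
    return total;
}
```

After preprocessing, the caller sees a plain function sys_demo_semop, which is the kind of symbol that a table such as sys_call_table can then reference.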
Appendix: cond_syscall
In sys_ni.c: cond_syscall(sys_fanotify_mark);
cond_syscall is defined as follows:
/*
 * "Conditional" syscalls
 *
 * What we want is __attribute__((weak,alias("sys_ni_syscall"))),
 * but it doesn't work on all toolchains, so we just do it by hand
 */
#ifndef cond_syscall
#define cond_syscall(x) asm(".weak\t" #x "\n\t.set\t" #x ",sys_ni_syscall")
#endif
asm(".weak\t" #x "\n\t.set\t" #x ",sys_ni_syscall") expands to the following assembler directives:
        .weak x
        .set x, sys_ni_syscall
In other words, the assembler marks x as a weak symbol and gives it the value of sys_ni_syscall. If no other (strong) definition of x is linked in, references to x resolve to sys_ni_syscall.
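The same trick can be tried in user space. The sketch below assumes GCC or Clang targeting an ELF platform such as Linux. demo_ni_syscall, demo_cond_syscall, and demo_optional_call are hypothetical names that mirror sys_ni_syscall and cond_syscall without being kernel code.

```c
#include <errno.h>

/* Fallback used when a call is not implemented (plays the role of the
 * kernel's sys_ni_syscall in this user-space sketch). */
long demo_ni_syscall(void)
{
    return -ENOSYS;
}

/* Same shape as the kernel macro: make x weak, then alias it to the fallback. */
#define demo_cond_syscall(x) \
    __asm__(".weak\t" #x "\n\t.set\t" #x ",demo_ni_syscall")

/* Declared but never defined in C. Without the line below, linking a
 * call to it would fail with an undefined reference. */
long demo_optional_call(void);
demo_cond_syscall(demo_optional_call);

/* Small wrapper so the effect is easy to observe from a caller. */
long demo_try_optional(void)
{
    return demo_optional_call();
}
```

With no strong definition of demo_optional_call in the program, demo_try_optional() returns -ENOSYS. Linking in another object file that defines demo_optional_call would make the call reach that definition instead.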
From http://www.acsu.buffalo.edu/~charngda/cc.html:
#pragma weak symbol1 = symbol2
Declares symbol1 as a weak alias of symbol2. Equivalently, one can use
__asm__(".weak symbol1");           // declare symbol1 as a weak symbol
__asm__(".set symbol1, symbol2");   // associate symbol1 with symbol2
A better way to achieve this is through the "weak, alias" function attributes.
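A minimal sketch of the attribute form follows, assuming GCC or Clang on an ELF target. The function names are hypothetical. The alias target must be defined in the same translation unit as the alias.

```c
#include <errno.h>

/* Hypothetical fallback implementation. */
long demo_not_implemented(void)
{
    return -ENOSYS;
}

/* demo_optional_feature is a weak alias of demo_not_implemented: it is
 * used only if no strong definition of demo_optional_feature is linked in. */
long demo_optional_feature(void)
    __attribute__((weak, alias("demo_not_implemented")));
```

Compared with the hand-written .weak/.set directives, the compiler checks here that the alias target exists and has a compatible declaration.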