In-depth understanding of Linux system calls

Source: Internet
Author: User

I. What is a system call

In the Linux world we often meet the term system call. A system call is one of a set of functions, implemented inside the kernel, that provide powerful services such as process creation and file access. The kernel exposes these functions to user programs through a controlled entry mechanism, generally a trap gate. System calls are the interface through which user programs interact with the kernel.


II. Functions of system calls

System calls play a central role in Linux. Without them, applications would lose the support of the kernel.

Many functions we use in programming, such as fork and open, are ultimately implemented through system calls. For example, consider the following program:

    #include <unistd.h>
    #include <stdlib.h>

    int main(void)
    {
        fork();     /* create a child process */
        exit(0);    /* both processes terminate here */
    }

 



Here we use two functions, fork and exit, both of which are provided by glibc. If we trace their execution and look at how glibc implements them, we find that the glibc code traps into the kernel with a software interrupt, and the kernel then carries out the work as a system call. The details are described in the section on the system call implementation below.

In other words, system calls are the interface that the kernel presents to users. Without system calls, user programs cannot use the kernel.

 

III. System call implementation and call process

Before describing system calls in detail, we first review some of the protection mechanisms in Linux.

In protected mode, the x86 CPU provides four privilege levels. The Linux kernel currently uses only two of them, privilege level 0 and privilege level 3. Level 0 is what we usually call kernel mode, and level 3 is what we usually call user mode. The two levels exist mainly to protect the system: code running in kernel mode can execute privileged instructions and can switch to user mode, while code running in user mode cannot execute privileged instructions.

Note in particular that kernel mode and user mode each use their own stack, so a mode switch also requires a stack switch.

Each process has its own address space (also called the process space). The address space is divided into two parts, user space and system space. In user mode a process can access only its user space; in kernel mode it can access its entire address space. Addresses in this space are logical addresses. Under the segmentation and paging mechanism, a memory access goes through two stages of address translation: logical address -> linear address -> physical address.

To the kernel, a system call is just a function. The key issues are the switch from user mode to kernel mode, the stack switch, and parameter passing.

The following analysis is based on the kernel source code. The environment is FC2 (Fedora Core 2) with kernel 2.6.5.

Below is a piece of code from arch/i386/kernel/entry.S in the kernel source.

  

    __SWITCH_KERNELSPACE \
        cmpl $, %
        movl $swapper_pg_dir-__PAGE_OFFSET, %

    __SWITCH_USERSPACE \
        movl EIP(%esp), %
        jb 22f;
        movl EFLAGS(%esp),%
        movb CS(%esp),%
        testl $(VM_MASK   ),%
        GET_THREAD_INFO(%
        movl TI_virtual_stack(%ebp), %
        movl TI_user_pgd(%ebp), %
        movl %esp, %
        andl $(THREAD_SIZE-), %
        orl %ebx, %
        movl %edx, %
        movl %ecx, %

    /* !CONFIG_X86_HIGH_ENTRY */
    __SWITCH_KERNELSPACE
    __SWITCH_USERSPACE

    __SAVE_ALL \

    __RESTORE_INT_REGS \

    __RESTORE_REGS \
        : popl %
        : popl %
        .section .fixup,
        : movl $,(%
        : movl $,(%
        .section __ex_table,
        .align
        .
        .

    __RESTORE_ALL \
        addl $, %
        .section .fixup,
        movl $(__USER_DS), %
        movl %edx, %
        movl %edx, %
        pushl $
        .section __ex_table,
        .align
        .

    SAVE_ALL \

    RESTORE_ALL \

 



The code above defines two very important macros, SAVE_ALL and RESTORE_ALL.

 

SAVE_ALL first saves the user-mode registers and stack information and then switches to kernel mode; the macro __SWITCH_KERNELSPACE performs the address space switch. RESTORE_ALL does the opposite of SAVE_ALL.

The kernel source also contains a system call table (in the entry.S file):
  

    ENTRY(sys_call_table)
        .long sys_restart_syscall
        .long sys_exit
        .long sys_fork
        .long sys_read
        .long sys_write
        .long sys_open
        ...
        .long sys_mq_timedreceive
        ...

    syscall_table_size=(.-sys_call_table)

 

In kernel 2.6.5, there are more than 280 system calls. The names of these system calls are all in this system call table.

The same source file contains another very important section:

  

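    ENTRY(system_call)
        pushl %eax                      # save orig_eax
        SAVE_ALL
        GET_THREAD_INFO(%ebp)
        cmpl $(nr_syscalls), %eax
        jae syscall_badsys
                                        # system call tracing in operation
        testb $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),TI_flags(%ebp)
        jnz syscall_trace_entry
    syscall_call:
        call *sys_call_table(,%eax,4)
        movl %eax,EAX(%esp)             # store the return value
    syscall_exit:
        cli                             # make sure we don't miss an interrupt
                                        # setting need_resched or sigpending
                                        # between sampling and the iret
        movl TI_flags(%ebp), %ecx
        testw $_TIF_ALLWORK_MASK, %cx   # current->work
        jne syscall_exit_work
    restore_all:
        RESTORE_ALL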

 

This section completes the execution of system calls.

The system_call function uses the system call number passed in by the user program to look up the corresponding entry in the system call table, and then calls it.

The system call number is the key link between a glibc function and the system call it invokes.

The system call numbers are defined in include/asm-i386/unistd.h.

  

    #define __NR_restart_syscall 0
    #define __NR_exit 1
    #define __NR_fork 2
    #define __NR_read 3
    #define __NR_write 4
    #define __NR_open 5
    #define __NR_close 6
    #define __NR_waitpid 7
    ...

 

Each system call number corresponds to a system call.

The next step is the expansion of the system call macros:

  

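    #define _syscall0(type,name) \
    type name(void) \
    { \
    long __res; \
    __asm__ volatile ("int $0x80" \
        : "=a" (__res) \
        : "0" (__NR_##name)); \
    __syscall_return(type,__res); \
    }

    #define _syscall1(type,name,type1,arg1) \
    type name(type1 arg1) \
    { \
    long __res; \
    __asm__ volatile ("int $0x80" \
        : "=a" (__res) \
        : "0" (__NR_##name),"b" ((long)(arg1))); \
    __syscall_return(type,__res); \
    }

    #define _syscall2(type,name,type1,arg1,type2,arg2) \
        ...

    #define _syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
        ...

    #define _syscall4(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4) \
        ...

    #define _syscall5(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
              type5,arg5) \
        ...

    #define _syscall6(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
              type5,arg5,type6,arg6) \
        ...
    __syscall_return(type,__res); \
    }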

 

From this code we can see that the system call is triggered by the software interrupt int $0x80. When a macro is expanded, name is replaced by the name of the system call, so __NR_##name becomes that call's number, and control then reaches system_call. For this to work, the system call entry must be initialized beforehand. The source code for this initialization is in arch/i386/kernel/traps.c.

Each time int $0x80 is executed, the CPU raises the interrupt and hands control to the kernel's system_call routine.

The entire system call process can be summarized as follows:

1. The user program calls a library function (for example, fork).

2. The glibc implementation of the function obtains the system call number and executes int $0x80 to raise a software interrupt.

3. The address space and stack are switched and SAVE_ALL runs (kernel mode is entered).

4. The interrupt is handled and the kernel function is looked up in the system call table.

5. The kernel function runs.

6. RESTORE_ALL runs and execution returns to user mode.

With an understanding of how system calls are implemented and invoked, we can modify existing system calls or add new ones to the kernel as needed.
