Linux system calls in detail

Source: Internet
Author: User

1. Linux system call principle

2. Implementation of system calls

3. Relationship between system calls, the user programming interface (API), system commands, and kernel functions

4. Linux system call classification and list

5. Linux system call instance

6. Linux custom system call

1. System call principle

System calls, as the name suggests, are a set of "special" interfaces that the operating system provides for user programs to invoke. Through them, a user program obtains services from the operating system kernel: for example, it can use the file-system calls to open, close, read, or write a file, and the clock-related calls to read the system time or set a timer.

Logically, a system call can be viewed as an interface between the kernel and a user-space program. It acts like a middleman: it conveys the user process's request to the kernel and, once the kernel has handled the request, returns the result to user space.

The fundamental reason system services are offered to user space through system calls is to protect the system. Linux divides its address space into kernel space and user space; the two run at different privilege levels and are logically isolated from each other. A user process is therefore normally not allowed to access kernel data or call kernel functions; it can only manipulate user data and call user-space functions. For example, the familiar "Hello World" program, when executed, is an ordinary user-space process: the printf function it calls is a user-space function, and the "Hello World" string it prints is user-space data.

In many cases, however, a user process needs a system service (that is, it needs to run kernel code). It must then go through the "special interface" the system provides, the system call. What makes this interface special is that it fixes the exact point at which a user process enters the kernel. In other words, the path from user space into the kernel is predetermined: a process can enter only at the specified entry point and cannot jump into the kernel arbitrarily. Funnelling every entry through this single access path is what keeps the kernel safe. To picture the mechanism: as a tourist you can buy a ticket to a safari park, but you must stay in the sightseeing car and follow the prescribed route. You certainly may not get off, because that is too dangerous: either you lose your life or you frighten the animals.

Note: Some embedded operating systems expose their interface to the user as an API that is statically linked into the application. In that model the distinction between kernel mode and user mode is not obvious: the user can call system functions directly from its own thread without switching to kernel mode.

2. Implementation of system calls

Linux implements system calls with the software interrupt mechanism of the x86 architecture. A software interrupt differs from what we usually call an interrupt (a hardware interrupt) in that it is triggered by a software instruction rather than by a peripheral. It is an exception raised deliberately by the programmer (a normal, expected exception), specifically by executing the int $0x80 assembly instruction, which raises a programmed exception with vector 0x80.

System calls are implemented with an exception because a user-mode process that invokes a system call must switch the CPU to kernel mode to run kernel functions. On the i386 architecture, entering the kernel (that is, moving to a higher privilege level) must go through the gate mechanism, and this exception enters the kernel through a system gate. (Besides int $0x80, user space can also enter the kernel through the int3 (vector 3), into (vector 4), and bound (vector 5) instructions; all other exceptions are reserved for the system and cannot be raised by user-space programs.)

Let us look at the process in more detail. The int $0x80 instruction raises a programmed exception with vector 0x80, which corresponds to entry 128 of the Interrupt Descriptor Table (IDT), a system gate descriptor. The gate descriptor holds a preset kernel-space address that points to the system call handler, system_call() (not to be confused with the system call service routines), which is written in assembly language in the entry.S file.

Clearly, every system call transfers control to this one address. But Linux has two to three hundred system calls, so how are they dispatched from here to their respective service routines? The solution is simple. Linux assigns each system call a number (from 0 to NR_syscalls - 1) and keeps a system call table in the kernel that maps each number to its service routine. Before entering the kernel through the system gate, the caller must pass the system call number to the kernel; on x86 this is done by loading the number into the eax register before executing int $0x80. Once the system call handler is running, it reads the number from eax and looks up the corresponding service routine in the system call table.
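
The number-to-routine lookup described above can be sketched in user space. This is a toy model only, not kernel code: toy_call_table, toy_dispatch, and the two toy service routines are hypothetical names invented for illustration, and the fixed return value of toy_getpid is an arbitrary placeholder.

```c
#include <errno.h>

/* Toy user-space model of a system call table. All names are hypothetical. */
typedef long (*toy_service_fn)(long a, long b, long c);

/* Two stand-in "service routines". */
static long toy_getpid(long a, long b, long c)
{
    (void)a; (void)b; (void)c;
    return 42;                      /* placeholder value, not a real pid */
}

static long toy_add(long a, long b, long c)
{
    (void)c;
    return a + b;
}

/* Call numbers run from 0 to TOY_NR_SYSCALLS - 1. */
enum { TOY_NR_GETPID = 0, TOY_NR_ADD = 1, TOY_NR_SYSCALLS = 2 };

/* The table maps each call number to its service routine. */
static const toy_service_fn toy_call_table[TOY_NR_SYSCALLS] = {
    [TOY_NR_GETPID] = toy_getpid,
    [TOY_NR_ADD]    = toy_add,
};

/* Plays the role of the handler: validate the number, then dispatch. */
long toy_dispatch(long nr, long a, long b, long c)
{
    if (nr < 0 || nr >= TOY_NR_SYSCALLS)
        return -ENOSYS;             /* unknown call number */
    return toy_call_table[nr](a, b, c);
}
```

Calling toy_dispatch(TOY_NR_ADD, 2, 3, 0) looks up entry 1 and returns 5, while an out-of-range number is rejected before any table access.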

Besides the system call number, many system calls also pass parameters to the kernel. For example, sys_write(unsigned int fd, const char *buf, size_t count) needs the file descriptor fd, the buffer buf holding the data to write, and the number of bytes count. Linux uses six registers for this: eax carries the system call number, and ebx, ecx, edx, esi, and edi carry the parameters, in that order (ascending alphabetical order). The SAVE_ALL macro in system_call() then saves the values of these registers on the kernel stack.
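
As a rough illustration of this register convention, the sketch below issues the write system call by number. It assumes GCC-style inline assembly; raw_write is a hypothetical helper name. The int $0x80 path is compiled only on 32-bit x86; on any other architecture the sketch falls back to the C library's generic syscall() function, which loads the registers appropriate for that platform.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: write count bytes from buf to fd by call number. */
long raw_write(int fd, const void *buf, size_t count)
{
#if defined(__i386__)
    long ret;
    /* eax = system call number; ebx, ecx, edx = first three parameters.
     * The kernel leaves its result (or a negative errno) in eax. */
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "0"((long)SYS_write), "b"((long)fd),
                        "c"(buf), "d"(count)
                      : "memory");
    return ret;
#else
    /* Portable fallback: returns -1 and sets errno on failure. */
    return syscall(SYS_write, fd, buf, count);
#endif
}
```

Note that the two paths report errors differently: the raw instruction returns the kernel's negative error code, whereas syscall() follows the usual libc convention.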

A system call is therefore quite simple: it is an operating system API invoked dynamically through a software interrupt. Executing int $0x80 triggers the software interrupt, a few registers carry the parameters, and the kernel carries out the requested service. Note that some embedded operating systems also have a concept called a soft interrupt, which means something different: there, the lower-priority part of a hardware interrupt's work is deferred to a soft interrupt, which runs on the system stack at a priority above ordinary tasks. The software interrupt described in this chapter is a different thing; it is handled in the same way as a hardware interrupt, except that it is triggered by software.

3. Relationship between system calls, the user programming interface (API), system commands, and kernel functions

System calls do not deal directly with programmers or system administrators; a system call is simply an interface that submits a request to the kernel through the software interrupt mechanism and obtains a kernel service. In practice, programmers mostly call the user programming interface (API), while administrators mostly use system commands.

The user programming interface is really a set of function definitions that specify how to obtain a given service, such as read(), malloc(), free(), and abs(). An API may correspond directly to a system call, as the read() interface corresponds to the read system call, but the correspondence is not one to one. Several APIs often use the same system call internally: malloc() and free() both use the brk() system call to grow or shrink the process heap. A single API may also combine several system calls to deliver its service. Some APIs need no system call at all because they do not use any kernel service, for example abs(), which computes the absolute value of an integer.
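
The difference between an API that wraps a system call and one that needs none can be sketched as follows. The two function names are hypothetical; syscall() and SYS_getpid are the C library's generic system call interface, used here only to compare its result with the getpid() wrapper.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* getpid() is a thin wrapper, so it should agree with the system call
 * issued directly by number. Returns 1 when the two values match. */
int wrapper_matches_syscall(void)
{
    return getpid() == (pid_t)syscall(SYS_getpid);
}

/* abs() is computed entirely in user space; no kernel service is used. */
int api_distance(int a, int b)
{
    return abs(a - b);
}
```

Running a program that calls these under a tracer such as strace would be one way to observe that only the first function reaches the kernel.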

It is also worth adding that the Linux user programming interface follows POSIX, the most popular application programming interface standard in the UNIX world, which defines a series of APIs. On Linux (and on UNIX as well) these APIs are implemented mainly by the C library (libc). Besides defining the standard C functions, the library has another very important job: it provides a set of wrapper routines that package system calls for use by user-space programs.

Next we need to explain the relationship between kernel functions and system calls. Kernel functions are not as complicated as they may sound. They are much like ordinary functions, except that they are implemented inside the kernel and so must obey the rules of kernel programming. A system call is the interface layer through which a user enters the kernel; it is not itself a kernel function. Once inside the kernel, each system call locates its corresponding kernel function, known more formally as the system call service routine. It is the kernel function, not the calling interface, that actually services the request.

For example, the getpid system call actually invokes the kernel function sys_getpid:

    asmlinkage long sys_getpid(void)
    {
        return current->tgid;
    }

Linux has many kernel functions. Some are used only within the kernel source file that defines them, while others are exported for use by other parts of the kernel; which applies depends on the particular function.

Kernel functions that the kernel exposes (exports) can be listed with the ksyms command or with cat /proc/ksyms. There is also a book that catalogs and classifies the kernel functions, "The Linux Kernel API"; interested readers may want to consult it.

In short, going from the user toward the kernel, the layers are, in order: system commands, the programming interface, system calls, and kernel functions. Once the implementation of system calls has been covered, we will come back and trace the entire execution path.

Note: Kernel functions are used by the operating system itself. They are not exposed externally and are not available to the user, so their interfaces can change. The user programming interface (API) is what is presented directly to the user: one API may be built from several system calls, one system call may be used by several APIs, and an API may use no system call at all. The Linux API differs from the API of an operating system such as uC/OS, where the application calls API functions directly through static linking and the system code is linked together with the application. A command, in my view, is an executable program built on the API and compiled into a standalone executable.

4. Linux system call classification and list

The following list covers the most common Linux system calls and the functions derived from them. Lists of Linux system calls are hard to find on the Internet; even a simple alphabetical list is quite rare.

By convention, the list is modelled on section 2 of the man pages, the system call section. It groups the calls into broad categories according to the author's understanding and makes a few small changes: system calls that are reserved for kernel use and cannot be invoked by users have been removed, a few slightly inaccurate points have been corrected, and every listed call has a brief description.

Some of these functions do exactly the same job and differ only in their parameters. (Readers familiar with C++ may think of function overloading, but the Linux kernel is written in C, so each variant must have a different name.) Other functions are obsolete and have been superseded by newer, better ones (gcc warns when you link against them) but are retained for compatibility; these are marked with a "*" in front of the name.

Linux inherits many of its system calls from UNIX, but it has been selective about what it keeps: it drops many redundant system calls found in traditional UNIX systems and retains only the most basic and useful ones. As a result Linux has only about 250 system calls in total, whereas some operating systems have more than 1000.

System calls fall mainly into the following categories:

Hardware control: system calls often serve as the abstract interface between hardware resources and user space, for example the read and write calls used to read and write files.

Setting system state or reading kernel data: because system calls are the only means of communication between user space and the kernel, setting system state (such as turning a kernel service on or off by setting a kernel variable) and reading kernel data must both go through system calls. Examples are getpgid, getpriority, setpriority, and sethostname.

Process management: this group of system calls ensures that the processes in the system can run in a multitasking, virtual-memory environment. Examples are fork, clone, execve, and exit.

1.1 Process control

fork                    Create a new process
clone                   Create a child process with specified properties
execve                  Execute an executable file
exit                    Terminate the process
_exit                   Terminate the current process immediately
getdtablesize           Get the maximum number of files the process can open
getpgid                 Get the process group ID of the specified process
setpgid                 Set the process group ID of the specified process
getpgrp                 Get the process group ID of the current process
setpgrp                 Set the process group ID of the current process
getpid                  Get the process ID
getppid                 Get the parent process ID
getpriority             Get the scheduling priority
setpriority             Set the scheduling priority
modify_ldt              Read or write the process's local descriptor table
nanosleep               Put the process to sleep for a specified time
nice                    Change the priority of a time-sharing process
pause                   Suspend the process until a signal arrives
personality             Set the process execution domain
prctl                   Perform specific operations on a process
ptrace                  Trace a process
sched_get_priority_max  Get the upper limit of the static priority
sched_get_priority_min  Get the lower limit of the static priority
sched_getparam          Get the scheduling parameters of a process
sched_getscheduler      Get the scheduling policy of the specified process
sched_rr_get_interval   Get the time slice length of a real-time process scheduled by the RR algorithm
sched_setparam          Set the scheduling parameters of a process
sched_setscheduler      Set the scheduling policy and parameters of the specified process
sched_yield             Voluntarily yield the processor and rejoin the queue to be scheduled
vfork                   Create a child process to run a new program, usually together with execve
wait                    Wait for a child process to terminate
wait3                   See wait
waitpid                 Wait for the specified child process to terminate
wait4                   See waitpid
capget                  Get process capabilities
capset                  Set process capabilities
getsid                  Get the session ID
setsid                  Create a session and set the session ID
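
A few of the process control calls listed above can be combined in a short sketch. The function name run_child_and_wait is hypothetical; the sketch forks a child that terminates at once with a chosen exit code, and the parent collects that code with waitpid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` and return
 * the exit status the parent collects, or -1 on any error. */
int run_child_and_wait(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                  /* fork failed */
    if (pid == 0)
        _exit(code);                /* child: terminate immediately */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;                  /* wait failed */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit rather than exit so that it leaves without running the parent's exit handlers or flushing copies of the parent's buffered output.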

1.2 File operations

fcntl       File control
open        Open a file
creat       Create a new file
close       Close a file descriptor
read        Read from a file
write       Write to a file
readv       Read data from a file into an array of buffers
writev      Write data from an array of buffers to a file
pread       Read from a file at a given offset
pwrite      Write to a file at a given offset
lseek       Move the file pointer
_llseek     Move the file pointer within a 64-bit offset space
dup         Duplicate an open file descriptor
dup2        Duplicate a file descriptor onto a specified descriptor number
flock       Lock or unlock a file
poll        I/O multiplexing
truncate    Truncate a file
ftruncate   See truncate
umask       Set the file permission mask
fsync       Write the in-memory portion of a file back to disk
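
As an illustration of how several of these file operations fit together, the sketch below writes a message to a file, moves the file pointer back to the start, and reads the data back. The function name write_then_read is hypothetical, and the caller is assumed to pass a path in a writable directory; the file is deleted again with unlink.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write msg to path, rewind, and read it back into
 * out (at most cap bytes). Returns the number of bytes read, or -1. */
ssize_t write_then_read(const char *path, const char *msg,
                        char *out, size_t cap)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = -1;
    /* write advances the file pointer, so lseek must rewind before read. */
    if (write(fd, msg, len) == (ssize_t)len && lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, out, cap);

    close(fd);
    unlink(path);                   /* remove the scratch file */
    return n;
}
```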

1.3 File system operations

access      Check whether a file is accessible
chdir       Change the current working directory
fchdir      See chdir
chmod       Change the file mode
fchmod      See chmod
chown       Change the owner or group of a file
fchown      See chown
lchown      See chown
chroot      Change the root directory
stat        Get file status information
lstat       See stat
fstat       See stat
statfs      Get file system information
fstatfs     See statfs
readdir     Read a directory entry
getdents    Read directory entries
mkdir       Create a directory
mknod       Create an index node (inode)
rmdir       Remove a directory
rename      Rename a file
link        Create a link
symlink     Create a symbolic link
unlink      Delete a link
readlink    Read the value of a symbolic link
mount       Mount a file system
umount      Unmount a file system
ustat       Get file system information
utime       Change a file's access and modification times
utimes      See utime
quotactl    Control disk quotas
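
The following sketch strings together four of the file system calls listed above: mkdir, chmod, stat, and rmdir. The function name dir_mode_roundtrip is hypothetical, and the path is assumed not to exist yet and to lie in a writable directory.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a directory, change its mode to 0700,
 * read the mode back with stat, then remove the directory.
 * Returns the permission bits that stat reported, or -1 on error. */
int dir_mode_roundtrip(const char *path)
{
    struct stat st;
    int mode = -1;

    if (mkdir(path, 0755) != 0)
        return -1;                  /* e.g. the path already exists */
    if (chmod(path, 0700) == 0 && stat(path, &st) == 0 &&
        S_ISDIR(st.st_mode))
        mode = (int)(st.st_mode & 0777);

    rmdir(path);                    /* clean up the scratch directory */
    return mode;
}
```

Because chmod sets the mode explicitly, the result does not depend on the process umask, which only affects the mode passed to mkdir.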

1.4 System control

ioctl           General I/O control function
_sysctl         Read or write system parameters
acct            Enable or disable process accounting
getrlimit       Get system resource limits
setrlimit       Set system resource limits
getrusage       Get system resource usage
uselib          Select a shared library to use
ioperm          Set port I/O permissions
iopl            Change the process I/O privilege level
outb            Low-level port operation
reboot          Reboot the system
swapon          Enable swapping to a file or device
swapoff         Disable swapping to a file or device
bdflush         Control the bdflush daemon
sysfs           Get the file system types supported by the kernel
sysinfo         Get system information
adjtimex        Adjust the system clock
alarm           Set an alarm clock for the process
getitimer       Get the value of an interval timer
setitimer       Set the value of an interval timer
gettimeofday    Get the time and time zone
settimeofday    Set the time and time zone
stime           Set the system date and time
time            Get the system time
times           Get process run times
uname           Get the name, version, and host information of the current UNIX system
vhangup         Hang up the current terminal
nfsservctl      Control the NFS daemon
vm86            Enter virtual 8086 mode
create_module   Create a loadable module entry
delete_module   Delete a loadable module entry
init_module     Initialize a module
query_module    Query module information
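
Two of the read-only system control calls above, uname and gettimeofday, are easy to try from user space. The sketch below is illustrative only; both function names are hypothetical.

```c
#include <string.h>
#include <sys/time.h>
#include <sys/utsname.h>

/* Hypothetical helper: returns 1 if uname reports a kernel named "Linux". */
int running_on_linux(void)
{
    struct utsname u;
    return uname(&u) == 0 && strcmp(u.sysname, "Linux") == 0;
}

/* Hypothetical helper: seconds since the epoch as reported by
 * gettimeofday, or -1 on error. The time zone argument is not requested. */
long seconds_since_epoch(void)
{
    struct timeval tv;
    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    return (long)tv.tv_sec;
}
```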