Reproduced from: http://blog.csdn.net/orange_os/article/details/7485069
Contents:
1. Linux system call principle
2. Implementation of system calls
3. Relationship between system calls, the user programming interface (API), system commands, and kernel functions
4. Linux system call classification and list
5. Linux system call example
6. Custom Linux system calls
1. System call principle
System calls, as the name suggests, are a set of "special" interfaces that the operating system provides for user programs to invoke. Through these interfaces a user program obtains services from the operating system kernel: for example, it can use file-system calls to ask the system to open, close, read, or write a file, and clock-related calls to get the system time or set a timer.
Logically, a system call can be viewed as an interface between the kernel and a user-space program. It acts like a middleman that conveys the user process's request to the kernel and, once the kernel has handled the request, sends the result back to user space.
The fundamental reason system services are offered to user space through system calls is to protect the system. Linux divides its address space into kernel space and user space; the two run at different privilege levels and are logically isolated from each other. A user process is therefore normally not allowed to access kernel data or call kernel functions; it can only manipulate user data and call user-space functions. Our familiar "Hello World" program, for example, is a standard user-space process when executed: the printf function it uses is a user-space function, and the "Hello World" string it prints is user-space data.
In many cases, however, a user process needs a system service (that is, it needs to invoke system code), and then it must use the "special interface" the system provides: the system call. What makes it special is that it fixes the exact location at which a user process enters the kernel. In other words, the path by which a user program reaches the kernel is predetermined: it can enter only at the specified location and may not jump into the kernel at will. This single, restricted entry path is what keeps the kernel safe. The mechanism can be pictured this way: as a tourist you can buy a ticket to a safari park, but you must sit in the sightseeing bus and follow the prescribed route. And of course you must not get off the bus, because that is too dangerous: either you lose your life or you frighten the animals.
Note: In some embedded operating systems, the operating system exposes its interface to the user as an API that is statically linked into the application. In that model the distinction between system mode and user mode is not sharp: a user thread can call system functions directly, without switching to kernel mode.
2. Implementation of system calls
Linux implements system calls using software interrupts on the x86 architecture. A software interrupt differs from what we usually call an interrupt (a hardware interrupt) in that it is triggered by a software instruction rather than by a peripheral; that is, it is an exception raised deliberately by the programmer (a normal one), specifically by executing the int $0x80 assembly instruction. This instruction generates a programmed exception with vector 0x80.
System calls have to be implemented through an exception because, when a user-mode process invokes a system call, the CPU must switch to kernel mode to execute kernel functions. On the i386 architecture, entering the kernel, that is, moving to a higher privilege level, must go through a gate mechanism, and here the exception enters the kernel through a system gate. (Besides int $0x80, user space can also enter the kernel through the exception instructions int3 (vector 3), into (vector 4), and bound (vector 5); the other exceptions cannot be raised by user-space programs and are reserved for the system.)
Let's look at the process in more detail. The int $0x80 instruction raises a programmed exception numbered 0x80, which corresponds to entry 128 of the interrupt descriptor table (IDT), namely the system gate descriptor. That gate descriptor contains a preset kernel-space address pointing to the system call handler, system_call() (not to be confused with the system call service routines), which is written in assembly language in the file entry.S.
Clearly, every system call arrives at this one address, yet Linux has 200 to 300 system calls. How are they dispatched from here to their respective service routines? The solution is simple. First, Linux assigns each system call a number (0 to NR_syscalls - 1) and keeps a system call table in the kernel that maps each number to its service routine. Before entering the kernel through the system gate, the caller must therefore pass the system call number to the kernel; on x86 this is done by loading the number into the eax register before executing int $0x80. Once the system call handler runs, it reads the number from eax and looks up the corresponding service routine in the system call table.
Besides the system call number, many system calls also need to pass parameters to the kernel. For example, sys_write(unsigned int fd, const char *buf, size_t count) must pass the file descriptor fd, the buffer buf holding the data to write, and the number of bytes count. Linux uses six registers for this: eax carries the system call number, and ebx, ecx, edx, esi, and edi carry the parameters, in that order. The SAVE_ALL macro in system_call() then stores the values of these registers on the kernel stack.
Note:
A system call is really quite simple: it is an operating system API that is invoked dynamically through a software interrupt. Executing int $0x80 triggers the software interrupt, a few registers carry the parameters, and the API call is carried out. Embedded operating systems also have a concept called a soft interrupt, but it means something else: lower-priority work from a hardware interrupt is deferred to soft interrupt processing, which runs on the system stack at a priority higher than tasks. The software interrupt discussed in this chapter is very different: it is handled in the same way as a hardware interrupt, except that it is triggered by software.
3. Relationship between system calls, the user programming interface (API), system commands, and kernel functions
System calls do not deal directly with programmers or system administrators; they are simply an interface through which requests for kernel services are submitted to the kernel via the soft interrupt mechanism (described above). In practice, programmers mostly call the user programming interface (the API), while administrators mostly use system commands.
A user programming interface is really a function definition that specifies how to obtain a given service, for example read(), malloc(), free(), and abs(). It may correspond directly to a system call, as the read() interface corresponds to the read system call, but the correspondence is not one-to-one. Often several different APIs use the same system call internally, as malloc() and free() both use the brk() system call to grow or shrink the process heap, or one API combines several system calls to do its job. Some APIs need no system call at all because they do not use any kernel service, such as abs(), which computes the absolute value of an integer.
It should also be added that the Linux user programming interface follows POSIX, the most popular application programming interface standard in the UNIX world, which defines a set of APIs. On Linux (and on UNIX as well) these APIs are implemented mainly by the C library (libc). Besides defining the standard C functions, the C library has a very important job: providing user space with a set of wrapper routines that package system calls for use in user programs.
The next thing to explain is the relationship between kernel functions and system calls. Kernel functions should not be thought of as especially complicated; they are much like ordinary functions, except that they are implemented inside the kernel and must therefore satisfy certain kernel programming requirements. A system call is the interface layer through which a user enters the kernel and is not itself a kernel function. Once inside the kernel, each system call finds its corresponding kernel function, known more formally as the system call service routine. It is the kernel function, not the calling interface, that actually serves the request.
For example, the getpid system call actually invokes the kernel function sys_getpid:
asmlinkage long sys_getpid(void)
{
	return current->tgid;
}
The Linux kernel contains many kernel functions. Some are used only within a single kernel source file, while others can be exported for use by other parts of the kernel; which applies depends on the case.
Kernel functions exposed (exported) by the kernel can be listed with the ksyms command or with cat /proc/ksyms. There is also a book that catalogues and classifies kernel functions, "The Linux Kernel API", which interested readers may consult.
In short, going from the user down to the kernel, the layers are, in order: system commands, the programming interface, system calls, and kernel functions. After describing the implementation of system calls, we will look back at the entire execution path.
Note: Kernel functions are functions the operating system uses internally. They are not exposed to users and cannot be called by them, so their interfaces may change. The user programming interface (API) is what is presented directly to the user. An API may be built from several system calls, one system call may be used by several APIs, and an API may use no system call at all. The Linux API differs from the API of the uC/OS operating system: the latter is called directly as statically linked functions, and the system code is linked together with the application. A command, as I understand it, is an executable program built on the API that is run on its own.
4. Linux system call classification and list
The following is a list of Linux system calls. It contains most of the commonly used system calls and the functions derived from them. It may be the only such list of Linux system calls you can find on the Internet; even a plain alphabetical list is quite rare.
By convention, the list follows section 2 of the man pages, the system call section. It is roughly classified according to the author's understanding, with a few minor changes: several system calls that are for kernel use only and may not be called by users have been removed, a few slightly inappropriate points have been corrected, and a brief comment has been attached to every listed system call.
Some of these functions do exactly the same thing and differ only in their parameters. (Readers familiar with C++ may immediately think of function overloading, but remember that the Linux kernel is written in C, so each function must have a different name.) Some other functions are obsolete and have been replaced by newer, better ones (gcc warns when linking against them) but are retained for compatibility; these are marked with a "*" in front of the name.
Linux system calls inherit from UNIX system calls in many respects, but compared with traditional UNIX, Linux has kept what is useful and discarded the rest: it eliminates many redundant UNIX system calls and keeps only the most basic and useful ones. As a result Linux has only about 250 system calls in total (whereas some operating systems have more than 1000).
System calls fall mainly into the following categories:
Controlling hardware: system calls often serve as the abstract interface between hardware resources and user space, for example the write/read calls used to read and write files.
Setting system state or reading kernel data: because system calls are the only means of communication between user space and the kernel, a user must go through a system call to set system state, such as turning a kernel service on or off (setting a kernel variable), or to read kernel data. Examples are getpgid, getpriority, setpriority, and sethostname.
Process management: a set of system calls ensures that the processes in the system can run in a multitasking, virtual-memory environment. Examples are fork, clone, execve, and exit.
1.1 Process control
fork | Create a new process
clone | Create a child process according to specified options
execve | Run an executable file
exit | Terminate the process
_exit | Terminate the current process immediately
getdtablesize | Get the maximum number of files the process can open
getpgid | Get the process group ID of the specified process
setpgid | Set the process group ID of the specified process
getpgrp | Get the process group ID of the current process
setpgrp | Set the process group ID of the current process
getpid | Get the process ID
getppid | Get the parent process ID
getpriority | Get the scheduling priority
setpriority | Set the scheduling priority
modify_ldt | Read or write the process's local descriptor table
nanosleep | Put the process to sleep for a specified time
nice | Change the priority of a time-sharing process
pause | Suspend the process until a signal arrives
personality | Set the process execution domain
prctl | Perform specific operations on a process
ptrace | Trace a process
sched_get_priority_max | Get the upper limit of the static priority
sched_get_priority_min | Get the lower limit of the static priority
sched_getparam | Get the scheduling parameters of a process
sched_getscheduler | Get the scheduling policy of the specified process
sched_rr_get_interval | Get the time slice length of a real-time process scheduled by the round-robin (RR) algorithm
sched_setparam | Set the scheduling parameters of a process
sched_setscheduler | Set the scheduling policy and parameters of the specified process
sched_yield | Voluntarily give up the processor and return to the queue to wait for scheduling
vfork | Create a child process to execute a new program, usually used together with execve and similar calls
wait | Wait for a child process to terminate
wait3 | See wait
waitpid | Wait for the specified child process to terminate
wait4 | See waitpid
capget | Get process capabilities
capset | Set process capabilities
getsid | Get the session ID
setsid | Set the session ID
1.2 File operations
fcntl | File control
open | Open a file
creat | Create a new file
close | Close a file descriptor
read | Read from a file
write | Write to a file
readv | Read data from a file into an array of buffers
writev | Write data from an array of buffers to a file
pread | Read from a file at a given offset
pwrite | Write to a file at a given offset
lseek | Move the file pointer
_llseek | Move the file pointer in a 64-bit address space
dup | Duplicate an open file descriptor
dup2 | Duplicate a file descriptor onto a specified descriptor number
flock | Lock or unlock a file
poll | I/O multiplexing
truncate | Truncate a file
ftruncate | See truncate
umask | Set the file permission mask
fsync | Write the in-memory portion of a file back to disk
1.3 File system operations
access | Determine whether a file is accessible
chdir | Change the current working directory
fchdir | See chdir
chmod | Change the file mode
fchmod | See chmod
chown | Change the owner or group of a file
fchown | See chown
lchown | See chown
chroot | Change the root directory
stat | Get file status information
lstat | See stat
fstat | See stat
statfs | Get file system information
fstatfs | See statfs
readdir | Read a directory entry
getdents | Read directory entries
mkdir | Create a directory
mknod | Create an index node
rmdir | Delete a directory
rename | Rename a file
link | Create a link
symlink | Create a symbolic link
unlink | Delete a link
readlink | Read the value of a symbolic link
mount | Mount a file system
umount | Unmount a file system
ustat | Get file system information
utime | Change the access and modification times of a file
utimes | See utime
quotactl | Control disk quotas
1.4 System Control
ioctl | General I/O control
_sysctl | Read or write system parameters
acct | Enable or disable process accounting
getrlimit | Get system resource limits
setrlimit | Set system resource limits
getrusage | Get system resource usage
uselib | Select the binary library to use
ioperm | Set port I/O permissions
iopl | Change the process I/O privilege level
outb | Low-level port operation
reboot | Reboot the system
swapon | Enable swap files and devices
swapoff | Disable swap files and devices
bdflush | Control the bdflush daemon
sysfs | Get the file system types supported by the kernel
sysinfo | Get system information
adjtimex | Adjust the system clock
alarm | Set an alarm for the process
getitimer | Get a timer value
setitimer | Set a timer value
gettimeofday | Get the time and time zone
settimeofday | Set the time and time zone
stime | Set the system date and time
time | Get the system time
times | Get process run times
uname | Get the name, version, and host information of the current UNIX system
vhangup | Hang up the current terminal
nfsservctl | Control the NFS daemon
vm86 | Enter virtual 8086 mode
create_module | Create a loadable module entry
delete_module | Delete a loadable module entry
init_module | Initialize a module
query_module | Query module information