Linux System Call Execution Process

Source: Internet
Author: User
In Linux, system calls are the only means by which user space accesses the kernel; they are the only legitimate entry points into the kernel.

In general, applications are written against application programming interfaces (APIs) rather than directly against system calls, and an API does not need to correspond one-to-one with the system calls the kernel provides. An API defines a set of programming interfaces used by applications. A given interface can be implemented as a single system call, by calling several system calls, or without using any system call at all. In fact, the same API can exist on several operating systems, presenting an identical interface to applications while the implementation differs from system to system.

In the Unix world, the most popular application programming interfaces are based on the POSIX standard, and Linux is largely POSIX compliant.

From the programmer's point of view, only the API matters; the kernel, in turn, deals only with system calls. How library functions and applications use those system calls is not the kernel's concern.

System calls (often called syscalls in Linux) are typically invoked through function calls. They can take one or more arguments (inputs) and may have side effects, such as writing to a file or copying data into a supplied buffer. A system call also returns a value of type long that indicates success or failure: a negative value usually denotes an error, and zero usually denotes success (some calls return a meaningful non-negative value instead, such as a file descriptor). When a system call fails, the C library stores the error code in the global errno variable and the library wrapper returns -1. Calling perror() translates that variable into a human-readable error string.

The implementation of a system call has two notable features:

1 The function declaration carries the asmlinkage qualifier, which tells the compiler to look for the function's arguments only on the stack.

2 The system call getxxx() is defined in the kernel as sys_getxxx(). This is the naming convention that all system calls in Linux follow.

System call number: In Linux, each system call is assigned a unique system call number that identifies it. When a user-space process executes a system call, the number, not the name, indicates which call to execute. Once assigned, a system call number cannot change, or already compiled applications would break. Likewise, if a system call is removed, its number is not recycled. Linux provides a "not implemented" system call, sys_ni_syscall(), which does nothing except return -ENOSYS, the error code reserved for invalid system calls. Although removal is rare, this function fills the gap when a system call is removed.

The kernel keeps the list of all registered system calls in the system call table, stored in sys_call_table. The table is architecture dependent and is typically defined in entry.S. It assigns a unique system call number to each valid system call.

User-space programs cannot execute kernel code directly. They cannot simply call a function in kernel space, because the kernel resides in a protected address space. Instead, the application must somehow signal the kernel that it wants to execute a system call and have the system switch to kernel mode, where the kernel can execute the system call on the application's behalf. This signaling mechanism is a software interrupt. On x86, the software interrupt is raised by the int $0x80 instruction. It triggers an exception that switches the system to kernel mode and runs exception vector 128, which is the system call handler, named system_call(). The handler is closely tied to the hardware architecture and is usually written in assembly in entry.S.

Simply entering kernel space is not enough, because all system calls enter the kernel in the same way. The system call number must therefore be passed to the kernel as well. On x86, this is done by loading the number into the eax register before raising the software interrupt, so the system call handler can read it from eax once it runs. The system_call() function mentioned above validates the number by comparing it with NR_syscalls. If the number is greater than or equal to NR_syscalls, the function returns -ENOSYS. Otherwise, the corresponding system call is invoked: call *sys_call_table(,%eax,4)

Because each entry in the system call table is 32 bits (4 bytes) wide, the kernel multiplies the given system call number by 4 to find the location of the entry in the table.

In addition to the system call number, most system calls need one or more arguments passed in. The easiest way to do this is to store them in registers, just like the system call number. On x86, the registers ebx, ecx, edx, esi, and edi hold the first five arguments, in order. In the rare case of six or more arguments, a single register holds a pointer to a block in user space where all the arguments are stored. The return value is also passed back to user space in a register; on x86 it is placed in eax.

System calls must carefully verify that all of their arguments are valid. System calls run in kernel space, so if users could pass invalid input into the kernel unchecked, the security and stability of the system would suffer. One of the most important checks is the validity of any pointer the user supplies. Before following a pointer into user space, the kernel must ensure that:

1 The memory region the pointer refers to is in user space.
2 The memory region the pointer refers to is in the process's address space.
3 If reading, the memory is marked readable. If writing, the memory is marked writable.

The kernel provides two functions that perform the required checks and copy data between kernel space and user space. Kernel code must always use one of them rather than following a user-space pointer directly.

copy_to_user(): Writes data to user space and takes 3 parameters. The first is the destination memory address in the process's address space. The second is the source address in kernel space. The third is the length of the data to copy, in bytes.
copy_from_user(): Reads data from user space and takes 3 parameters. The first is the destination memory address in kernel space. The second is the source address in the process's address space. The third is the length of the data to copy, in bytes.
Both functions return the number of bytes they failed to copy, which is zero on success.
Note: Both functions may block. This happens when the page containing the user data is not in physical memory but swapped out to disk. In that case the process sleeps until the page-fault handler brings the page back from disk into physical memory.

The kernel runs in process context while executing a system call, and the current pointer refers to the current task, which is the process that issued the system call. In process context the kernel can sleep (for example, when the system call blocks or explicitly calls schedule()) and can be preempted. When the system call returns, control continues in system_call(), which ultimately switches back to user space and lets the user process resume.

Adding a new system call to Linux is fairly easy; the hard part is designing and implementing it. The first step is to decide on its purpose. A system call should have exactly one clear purpose, and multi-purpose system calls should be avoided; ioctl() is the classic example of what not to do. The arguments, return value, and error codes of the new call are all critical design decisions. Once the system call is written, registering it as an official system call is trivial and generally follows these steps:

1 Add an entry to the end of the system call table (typically located in entry.S). Counting from 0, an entry's position in the table is its system call number; for example, the 10th entry is assigned system call number 9.
2 For each supported architecture, define the system call number in include/asm/unistd.h.
3 Compile the system call into the kernel image (it cannot be built as a module). This can be as simple as putting it in a relevant file under kernel/.

To recap how the new call is reached from user space: a user program cannot execute kernel code or call kernel functions directly, because the kernel resides in a protected address space. The application raises a software interrupt, the resulting exception switches the system to kernel mode, and the exception handler that runs is the system call handler, which executes the system call on the application's behalf.

Typically, the C library provides support for system calls. A user program includes the standard header files and links with the C library to use a system call (or a library function that in turn makes the actual call). Older versions of Linux also provided a set of macros for invoking system calls directly, without library support; these were later removed from the kernel headers, and current programs usually use the C library's syscall() function instead. The macros set up the registers and issue the int $0x80 instruction. They are named _syscalln(), where n ranges from 0 to 6 and is the number of arguments passed to the system call, because the macro must know exactly how many arguments to load into registers and in what order. Take the open() system call as an example:

The open() system call is defined as follows:
long open(const char *filename, int flags, int mode)
The macro that invokes this system call directly has the form:
#define __NR_open 5
_syscall3(long, open, const char *, filename, int, flags, int, mode)

Placing this macro in an application lets the application call open() directly. Each macro takes 2+2*n parameters: the first is the return type of the system call, the second is its name, and the rest are the type and name of each argument, in order.
