System Call (2): System Call
(5): implementation of system calls
1: Implement system calls
The first step in implementing a system call is to decide on its purpose. Each system call should have exactly one well-defined purpose. In Linux, multiplexed system calls (a single system call that performs different tasks depending on the value of a parameter) are discouraged.
2: parameter verification
System calls must carefully verify that all of their parameters are valid and legal. The most important check is the validity of any pointer supplied by the user.
Before following a pointer into user space, the kernel must ensure that:
1: The memory area the pointer refers to lies in user space. A process must not be able to trick the kernel into reading kernel-space data on its behalf.
2: The memory area the pointer refers to lies in the address space of the calling process. A process must not be able to trick the kernel into reading another process's data.
3: If the memory is to be read, it is marked readable; if it is to be written, it is marked writable; if it is to be executed, it is marked executable. A process must not be able to bypass memory access restrictions.
The kernel provides two functions that perform the required checks and copy data back and forth between kernel space and user space.
To write data to user space, the kernel provides copy_to_user(), which takes three parameters. The first is the destination address in the process's address space, the second is the source address in kernel space, and the third is the length (in bytes) of the data to copy.
To read data from user space, the kernel provides copy_from_user(), which mirrors copy_to_user(): it copies the data at the location given by the second parameter to the location given by the first parameter. The number of bytes to copy is given by the third parameter.
On failure, both functions return the number of bytes that could not be copied. On success, they return 0. When such an error occurs, the system call conventionally returns -EFAULT.
The following silly_copy() system call is an example.
/*
 * silly_copy - a system call of no practical value. It copies len bytes
 * from 'src' to 'dst', pointlessly using the kernel as an intermediary.
 */
SYSCALL_DEFINE3(silly_copy,
                unsigned long *, src,
                unsigned long *, dst,
                unsigned long, len)
{
        unsigned long buf;

        /* buf holds only sizeof(buf) bytes, so reject larger requests */
        if (len > sizeof(buf))
                return -EINVAL;

        /* copy src, which is in the user's address space, into buf */
        if (copy_from_user(&buf, src, len))
                return -EFAULT;

        /* copy buf into dst, which is in the user's address space */
        if (copy_to_user(dst, &buf, len))
                return -EFAULT;

        /* return the amount of data copied */
        return len;
}
Note that both copy_to_user() and copy_from_user() may block. This happens when the page containing the user data has been swapped out to disk rather than being resident in physical memory. In that case the process sleeps until the page fault handler brings the page back from disk into physical memory.
The final check is for valid permission. In Linux, the capable() function checks whether the caller holds the capability needed to operate on a given resource. A nonzero return value means the caller may perform the operation; a return value of 0 means it may not.
Next, let's look at how capable() is used in the reboot system call.
/*
 * Reboot system call: for obvious reasons only root may call it,
 * and even root needs to set up some magic numbers in the registers
 * so that some mistake won't make this reboot the whole machine.
 * You can also set the meaning of the ctrl-alt-del-key here.
 *
 * reboot doesn't sync: do that yourself before calling this.
 */
SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
                void __user *, arg)
{
        char buffer[256];
        int ret = 0;

        /* We only trust the superuser with rebooting the system. */
        if (!capable(CAP_SYS_BOOT))
                return -EPERM;

        /* For safety, we require "magic" arguments. */
        if (magic1 != LINUX_REBOOT_MAGIC1 ||
            (magic2 != LINUX_REBOOT_MAGIC2 &&
             magic2 != LINUX_REBOOT_MAGIC2A &&
             magic2 != LINUX_REBOOT_MAGIC2B &&
             magic2 != LINUX_REBOOT_MAGIC2C))
                return -EINVAL;

        /*
         * Instead of trying to make the power_off code look like
         * halt when pm_power_off is not set do it the easy way.
         */
        if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
                cmd = LINUX_REBOOT_CMD_HALT;

        mutex_lock(&reboot_mutex);
        switch (cmd) {
        case LINUX_REBOOT_CMD_RESTART:
                kernel_restart(NULL);
                break;

        case LINUX_REBOOT_CMD_CAD_ON:
                C_A_D = 1;
                break;

        case LINUX_REBOOT_CMD_CAD_OFF:
                C_A_D = 0;
                break;

        case LINUX_REBOOT_CMD_HALT:
                kernel_halt();
                do_exit(0);
                panic("cannot halt");

        case LINUX_REBOOT_CMD_POWER_OFF:
                kernel_power_off();
                do_exit(0);
                break;

        case LINUX_REBOOT_CMD_RESTART2:
                if (strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1) < 0) {
                        ret = -EFAULT;
                        break;
                }
                buffer[sizeof(buffer) - 1] = '\0';

                kernel_restart(buffer);
                break;

#ifdef CONFIG_KEXEC
        case LINUX_REBOOT_CMD_KEXEC:
                ret = kernel_kexec();
                break;
#endif

#ifdef CONFIG_HIBERNATION
        case LINUX_REBOOT_CMD_SW_SUSPEND:
                ret = hibernate();
                break;
#endif

        default:
                ret = -EINVAL;
                break;
        }
        mutex_unlock(&reboot_mutex);
        return ret;
}
First, the code checks whether the calling process holds the CAP_SYS_BOOT capability. The file <linux/capability.h> lists all capabilities and the rights each one grants. Here is an excerpt:
/**
 ** POSIX-draft defined capabilities.
 **/

/* In a system with the [_POSIX_CHOWN_RESTRICTED] option defined, this
   overrides the restriction of changing file ownership and group
   ownership. */

#define CAP_CHOWN            0

/* Override all DAC access, including ACL execute access if
   [_POSIX_ACL] is defined. Excluding DAC access covered by
   CAP_LINUX_IMMUTABLE. */

#define CAP_DAC_OVERRIDE     1

/* Overrides all DAC restrictions regarding read and search on files
   and directories, including ACL restrictions if [_POSIX_ACL] is
   defined. Excluding DAC access covered by CAP_LINUX_IMMUTABLE. */

#define CAP_DAC_READ_SEARCH  2

/* Overrides all restrictions about allowed operations on files, where
   file owner ID must be equal to the user ID, except where CAP_FSETID
   is applicable. It doesn't override MAC and DAC restrictions. */

#define CAP_FOWNER           3

/* ........ */
(6): context of system calls
From the previous article, we know that system_call() handles the system calls made by a process. When the system call itself returns, control passes back to system_call(), which eventually switches to user space and lets the user process continue running.
1: The last step of binding a system call
After a system call has been written, registering it as an official system call is a routine procedure:
1): Add an entry to the end of the system call table.
2): For each supported architecture, define the system call number in <asm/unistd.h>.
3): Compile the system call into the kernel image (it cannot be built as a module). It is enough to put it in a relevant file under kernel/, such as sys.c, which holds a variety of system calls.
Let's walk through these steps with a system call named foo().
First, add sys_foo as the last entry of the system call table, which on x86 is located in the arch/x86/kernel/syscall_table_32.S file.
ENTRY(sys_call_table)
        .long sys_restart_syscall   /* 0 - old "setup()" system call, used for restarting */
        .long sys_exit
        .long ptregs_fork
        .long sys_read
        .long sys_write
        .long sys_open              /* 5 */
        .long sys_close
        .long sys_waitpid
        .long sys_creat
        .long sys_link
        .long sys_unlink            /* 10 */
        /* .............................................. */
        .long sys_rt_tgsigqueueinfo /* 335 */
        .long sys_perf_event_open
        .long sys_recvmmsg
        .long sys_foo               /* 338 */
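Although the number is not written out explicitly, our system call is implicitly assigned the next free number, 338. System call numbers are architecture-specific, so the entry must be added to the table of each architecture that is to support the call.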
Next, we add the system call number to the <asm/unistd.h> file.
#define __NR_rt_tgsigqueueinfo 240
__SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
#define __NR_perf_event_open 241
__SYSCALL(__NR_perf_event_open, sys_perf_event_open)
#define __NR_accept4 242
__SYSCALL(__NR_accept4, sys_accept4)
#define __NR_recvmmsg 243
__SYSCALL(__NR_recvmmsg, sys_recvmmsg)

#undef __NR_syscalls
#define __NR_syscalls 244

/* new entry: the number must equal foo's position in the call table */
#define __NR_foo 338
Finally, we implement the foo() system call itself. It can go into whichever file best matches its function. Here, foo() is placed in the kernel/sys.c file.
#include <asm/page.h>

/*
 * sys_foo - return the size of the kernel stack
 */
asmlinkage long sys_foo(void)
{
        return THREAD_SIZE;
}
Now you can boot this kernel and invoke the foo() system call from user space.
2: access the system call from the user space
Linux provides a set of macros for invoking a system call from user space. We use one of them to test the foo() system call created above.
#include <stdio.h>

#define __NR_foo 338
_syscall0(long, foo)

int main()
{
        long stack_size;

        stack_size = foo();
        printf("The kernel stack size is %ld.\n", stack_size);
        return 0;
}
Here, #define __NR_foo 338 gives the system call number of the foo system call.
_syscall0(long, foo): the 0 means that no parameters are passed to foo; the digit in the macro name is the number of parameters passed to the system call. The first argument, long, is the return type.
3: Why not implement a system call?
First, let's look at the benefits of system calls:
1: System calls are easy to implement and convenient to use.
2: System call performance on Linux is very high.
The problems with system calls:
1: A system call number is required, and it must be officially assigned during a kernel development cycle.
2: Once a system call is in a stable kernel, its interface is fixed. It cannot change without breaking user-space applications.
3: The system call must be registered separately for each architecture that is to support it.
4: System calls are not easy to use from scripts, and they cannot be accessed directly from the file system.
5: Because a system call number must be assigned, it is difficult to maintain and use a system call outside the mainline kernel tree.
6: For a simple exchange of information, a system call is overkill.