Tracing syscall() system calls into the Linux kernel

Source: Internet
Author: User
In Linux user space we make system calls all the time. This article traces the read system call. The kernel version used is Linux 2.6.37; the implementation differs slightly between Linux versions. Some applications contain definitions such as the following:

#define real_read(fd, buf, count ) (syscall(SYS_read, (fd), (buf), (count)))

This calls the library function syscall() with the call number SYS_read, which ends up in the kernel function sys_read(). In Linux 2.6.37, sys_read() is implemented through several macro definitions.
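As a minimal user-space sketch of the definition above, the following wraps syscall(SYS_read, ...) in a function and reads back a few bytes through a pipe. The helper name read_back_through_pipe is made up for this illustration and is not part of the original article; the sketch assumes a Linux system with glibc.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_read */
#include <unistd.h>        /* syscall(), pipe(), write(), close() */

/* Same idea as the real_read macro in the text, written as a function. */
long real_read(int fd, void *buf, size_t count)
{
    return syscall(SYS_read, fd, buf, count);
}

/* Hypothetical helper: push msg through a pipe and read it back with
 * real_read(). Returns the number of bytes read, or -1 on error. */
long read_back_through_pipe(const char *msg, size_t len,
                            char *out, size_t out_len)
{
    int fds[2];
    long n;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n = real_read(fds[0], out, out_len);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling read_back_through_pipe("hi", 2, out, sizeof out) should return the number of bytes that went through the pipe, which is the same value the ordinary read() wrapper would return.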

The Linux system call interface (SCI) works by aggregation followed by dispatch. The aggregation point is the entry for interrupt 0x80 (on the x86 architecture). In other words, every system call from user space funnels into interrupt 0x80, with its system call number saved along the way (in the eax register on x86). When the interrupt 0x80 handler runs, it uses that number to select the kernel function that handles the request.
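The aggregate-then-dispatch idea can be sketched in plain user-space C as a table of function pointers indexed by a call number. Everything here (the DEMO_NR_* numbers, demo_call_table, demo_syscall_entry) is invented for illustration; it models the mechanism and is not kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical call numbers, valid for this sketch only. */
enum { DEMO_NR_add = 0, DEMO_NR_mul = 1, DEMO_NR_max = 2 };

typedef long (*demo_syscall_fn)(long, long, long);

long demo_sys_add(long a, long b, long c) { return a + b + c; }
long demo_sys_mul(long a, long b, long c) { return a * b * c; }

/* One slot per call number, playing the role of a system call table. */
static const demo_syscall_fn demo_call_table[DEMO_NR_max] = {
    [DEMO_NR_add] = demo_sys_add,
    [DEMO_NR_mul] = demo_sys_mul,
};

/* Single entry point: every request arrives here with its number,
 * and the number selects the handler. Unknown numbers get -ENOSYS. */
long demo_syscall_entry(unsigned long nr, long a, long b, long c)
{
    if (nr >= DEMO_NR_max || demo_call_table[nr] == NULL)
        return -ENOSYS;
    return demo_call_table[nr](a, b, c);
}
```

All callers go through demo_syscall_entry(), so adding a new call in this model only means adding a number and a table slot.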

Two ways to trigger a system call

(1) int $0x80, the only method available in old Linux kernel versions

(2) the sysenter assembly instruction

In the Linux kernel, the read system call is defined with the following macro:

SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;

    file = fget_light(fd, &fput_needed);
    if (file) {
        loff_t pos = file_pos_read(file);
        ret = vfs_read(file, buf, count, &pos);
        file_pos_write(file, pos);
        fput_light(file, fput_needed);
    }

    return ret;
}

The SYSCALL_DEFINE3 macro is defined as follows:

#define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)

The ## operator pastes the tokens in a macro together directly. If name = read, __NR_##name in a macro is replaced with __NR_read. __NR_##name denotes the system call number. Using ## results in two rounds of replacement: first name is replaced with the actual system call name, and then the resulting __NR_... token is expanded. For example, with name = ioctl the result is __NR_ioctl. In SYSCALL_DEFINE3 above, the same operator turns _##name into _read.
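A small stand-alone sketch of this token pasting follows. The DEMO_NR_* constants are defined locally for the example (3 and 54 happen to match the i386 numbers for read and ioctl) so that it does not depend on kernel headers; DEMO_NR and demo_nr_of_read are hypothetical names.

```c
/* Demo constants, defined here so the sketch needs no kernel headers. */
#define DEMO_NR_read  3
#define DEMO_NR_ioctl 54

/* ## pastes the prefix and the argument into a single token. That token
 * is then rescanned and replaced by the constant defined above. */
#define DEMO_NR(name) DEMO_NR_##name

/* DEMO_NR(read) -> DEMO_NR_read -> 3 */
long demo_nr_of_read(void)
{
    return DEMO_NR(read);
}
```

Because the pasted token is rescanned, DEMO_NR(ioctl) ends up as the integer constant, not as the identifier DEMO_NR_ioctl.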

 

#ifdef CONFIG_FTRACE_SYSCALLS
#define SYSCALL_DEFINEx(x, sname, ...)                \
    static const char *types_##sname[] = {            \
        __SC_STR_TDECL##x(__VA_ARGS__)            \
    };                            \
    static const char *args_##sname[] = {            \
        __SC_STR_ADECL##x(__VA_ARGS__)            \
    };                            \
    SYSCALL_METADATA(sname, x);                \
    __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
#else
#define SYSCALL_DEFINEx(x, sname, ...)                \
    __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
#endif

Whether or not the CONFIG_FTRACE_SYSCALLS macro is defined, the following macro is expanded:

__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)

#ifdef CONFIG_HAVE_SYSCALL_WRAPPERS

#define SYSCALL_DEFINE(name) static inline long SYSC_##name

#define __SYSCALL_DEFINEx(x, name, ...)                    \
    asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__));        \
    static inline long SYSC##name(__SC_DECL##x(__VA_ARGS__));    \
    asmlinkage long SyS##name(__SC_LONG##x(__VA_ARGS__))        \
    {                                \
        __SC_TEST##x(__VA_ARGS__);                \
        return (long) SYSC##name(__SC_CAST##x(__VA_ARGS__));    \
    }                                \
    SYSCALL_ALIAS(sys##name, SyS##name);                \
    static inline long SYSC##name(__SC_DECL##x(__VA_ARGS__))

#else /* CONFIG_HAVE_SYSCALL_WRAPPERS */

#define SYSCALL_DEFINE(name) asmlinkage long sys_##name
#define __SYSCALL_DEFINEx(x, name, ...)                    \
    asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__))

#endif /* CONFIG_HAVE_SYSCALL_WRAPPERS */

When CONFIG_HAVE_SYSCALL_WRAPPERS is not defined, this expands to the following declaration:

asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__))
This is the sys_read() function mentioned earlier.
asmlinkage tells the compiler to take this function's arguments only from the stack. All system calls need this qualifier. It is similar to the macro definition mentioned for quagga in the previous article.
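To see how the argument list turns into an ordinary function prototype, here is a user-space imitation of the non-wrapper branch. The DEMO_ macros are written for this illustration and only mirror the shape of the kernel macros; asmlinkage is left out because it is kernel-specific, and the add3 call is invented.

```c
/* Turn (type, name) pairs into "type name" parameter declarations. */
#define DEMO_SC_DECL1(t1, a1)      t1 a1
#define DEMO_SC_DECL2(t2, a2, ...) t2 a2, DEMO_SC_DECL1(__VA_ARGS__)
#define DEMO_SC_DECL3(t3, a3, ...) t3 a3, DEMO_SC_DECL2(__VA_ARGS__)

/* Shape of the #else branch: paste the name and build the prototype. */
#define DEMO_SYSCALL_DEFINEx(x, name, ...) \
    long demo_sys##name(DEMO_SC_DECL##x(__VA_ARGS__))

#define DEMO_SYSCALL_DEFINE3(name, ...) \
    DEMO_SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)

/* Expands to: long demo_sys_add3(int a, int b, int c) { ... } */
DEMO_SYSCALL_DEFINE3(add3, int, a, int, b, int, c)
{
    return a + b + c;
}
```

The braces after the macro invocation become the function body, which is why the kernel source can write SYSCALL_DEFINE3(read, ...) followed directly by a block.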

The function body is the code that follows the macro invocation:

    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;

    file = fget_light(fd, &fput_needed);
    if (file) {
        loff_t pos = file_pos_read(file);
        ret = vfs_read(file, buf, count, &pos);
        file_pos_write(file, pos);
        fput_light(file, fput_needed);
    }

    return ret;

Code parsing:

  • fget_light(): looks up the corresponding file object in the current process descriptor, using fd as the index (see figure 3).
  • If the specified file object is not found, an error is returned.
  • If the specified file object is found:
  • Call file_pos_read() to get the file's current read/write position.
  • Call vfs_read() to perform the read. This function ends up calling the function that file->f_op->read points to. The code is as follows:

if (file->f_op->read)
    ret = file->f_op->read(file, buf, count, pos);

  • Call file_pos_write() to update the file's current read/write position.
  • Call fput_light() to release the reference on the file if fget_light() took one.
  • Finally, return the number of bytes read.
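The steps above can be modeled in user space with an in-memory "file" and a table of operations. This is a simplified sketch: the demo_ types and functions are invented for illustration, the descriptor table is a plain array, and reference counting (fget_light/fput_light) is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct demo_file;

/* Per-file operation table, modeled loosely on struct file_operations. */
struct demo_file_ops {
    long (*read)(struct demo_file *f, char *buf, size_t count, long *pos);
};

struct demo_file {
    const struct demo_file_ops *f_op;
    long f_pos;          /* current read/write position */
    const char *data;    /* backing bytes of this in-memory file */
    size_t size;
};

/* A "filesystem" read method: copy from memory and advance *pos. */
long demo_mem_read(struct demo_file *f, char *buf, size_t count, long *pos)
{
    size_t left;

    if (*pos < 0 || (size_t)*pos >= f->size)
        return 0;                       /* end of file */
    left = f->size - (size_t)*pos;
    if (count > left)
        count = left;
    memcpy(buf, f->data + *pos, count);
    *pos += (long)count;
    return (long)count;
}

const struct demo_file_ops demo_mem_ops = { .read = demo_mem_read };

/* Stand-in for vfs_read(): dispatch through f_op->read if present. */
long demo_vfs_read(struct demo_file *f, char *buf, size_t count, long *pos)
{
    if (f->f_op && f->f_op->read)
        return f->f_op->read(f, buf, count, pos);
    return -EINVAL;
}

/* Stand-in for sys_read(): look up the file, fetch the position,
 * read, then store the updated position back. */
long demo_sys_read(struct demo_file *table[], size_t table_len,
                   unsigned int fd, char *buf, size_t count)
{
    long ret = -EBADF;
    struct demo_file *file = (fd < table_len) ? table[fd] : NULL;

    if (file) {
        long pos = file->f_pos;
        ret = demo_vfs_read(file, buf, count, &pos);
        file->f_pos = pos;
    }
    return ret;
}
```

Keeping the position in a local variable and writing it back afterwards follows the same pattern as file_pos_read() and file_pos_write() in the kernel code shown earlier.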

At this point the virtual file system layer has finished its work, and control passes to the ext2 file system layer.

Http://blogold.chinaunix.net/u3/104447/showart_2527011.html
