A Method for Interpreting the Linux Kernel Source Code

Source: Internet
Author: User

Introduction to Linux kernel Interpretation

 

 

For the many Linux enthusiasts who are interested in the kernel, this article introduces an entry-level method for reading the Linux kernel source code. It does not try to explain the kernel's complicated mechanisms.

 

1. File organization of the kernel source:

 

1. The Linux kernel source code is usually installed in /usr/src/linux, and it follows a very simple numbering convention: a kernel whose second version number is even (such as 2.0.30) is a stable release, and a kernel whose second version number is odd (such as 2.1.42) is a development release.

 

 

This article is based on the stable 2.2.5 source code. The platform used in Part 2 is Red Hat Linux 6.0.

 

2. The kernel source files are organized as a tree. At the top of the source tree you will see the following directories:

 

● arch: the arch subdirectory contains all kernel code that depends on the architecture. Each of its subdirectories represents one supported architecture. For example, i386 holds the code for Intel CPUs and compatible architectures. PCs are generally based on this directory;

 

● include: the include subdirectory contains most of the header files needed to compile the kernel. Platform-independent header files are in the include/linux subdirectory, header files specific to Intel CPUs are in the include/asm-i386 subdirectory, and include/scsi holds the header files for SCSI devices;

 

● init: this directory contains the kernel initialization code (note: not the system boot code). It contains two files, main.c and version.c, and is a very good starting point for studying how the kernel works.

 

● mm: this directory contains all memory management code that is independent of the CPU architecture, such as page-based allocation and freeing of memory. The architecture-specific memory management code is located in arch/*/mm/, for example arch/i386/mm/fault.c;

 

 

● kernel: the main kernel code. The files in this directory implement most of the kernel functions of a Linux system. The most important file is sched.c. As with mm, the architecture-specific code is in arch/*/kernel;

 

 

● drivers: all device drivers in the system. Each class of driver occupies its own subdirectory; for example, /block holds the block device drivers, such as IDE (ide.c). To see how all devices that may contain a file system are initialized, look at device_setup() in drivers/block/genhd.c. It initializes not only hard disks but also the network, because the network is needed to mount an NFS file system;

● lib: the kernel library code;

● net: the kernel's networking code;

● ipc: the kernel's inter-process communication code;

● fs: all file system code and the code for the various file operations. Each of its subdirectories supports one file system, such as fat and ext2;

 

● scripts: the script files used to configure the kernel.

 

Generally, each directory contains a .depend file and a Makefile. Both are auxiliary files used during compilation, and reading them carefully helps clarify the relationships and dependencies between files. Some directories also contain a README file that describes the files in that directory, which likewise helps in understanding the kernel source code.

 

 

2. Reading in practice: adding a system call to your kernel

 

The Linux kernel source code is organized sensibly in a tree structure, with functionally related files placed in the same subdirectory, which makes it more readable. However, the kernel source is very large and very complex. Even with a sensible file organization, there are still many relationships between files in different directories. Analyzing one piece of kernel code usually means looking at several other related files, and those files may not be in the same subdirectory.

 

 

The size of the system and the intricate relationships between files are probably the main reasons many people are daunted by it. The rewards of this daunting work are equally compelling: you not only learn a great deal of low-level computer knowledge (such as the system boot process described below), you also come to appreciate how elegant the overall operating system architecture is and how clever the algorithms are when they solve specific problems. More importantly, reading the source makes you more professional, bit by bit. After analyzing even a small fraction of the code, you will understand what distinguishes code written by a professional programmer from code written by a hobbyist.

 

To give readers a better feel for this, the following walks through a concrete kernel analysis example. The hope is that it conveys some specific knowledge about how the Linux kernel is organized, and that readers can also pick up some kernel analysis methods from it.

 

The following is an analysis instance:

 

[1] operating platform:

 

Hardware: CPU Intel Pentium II;

 

Software: Red Hat Linux 6.0; kernel version 2.2.5

[2] Kernel source code analysis:

 

1. System boot and initialization: Linux can be booted in several ways: LILO, Loadlin, and Linux's own boot sector (bootsect-loader). The last corresponds to the source file arch/i386/boot/bootsect.S, a real-mode assembly program; for reasons of space it is not analyzed here. Whichever boot method is used, control eventually jumps to arch/i386/boot/setup.S, which performs initialization in real mode and prepares the system to enter protected mode. After that, the system runs arch/i386/kernel/head.S (for a compressed kernel, arch/i386/boot/compressed/head.S runs first).

setup_idt, an assembly routine defined in head.S, creates the IDT (Interrupt Descriptor Table), which holds the entry points of all traps and interrupts, including the entry of system_call, the general handler for all system calls. head.S also performs some other initialization work;

 

2. The first kernel function that runs after system initialization, asmlinkage void __init start_kernel(void), is defined in /usr/src/linux/init/main.c. It calls void __init trap_init(void), defined in /usr/src/linux/arch/i386/kernel/traps.c, which installs the entry address of each trap and interrupt service routine in the IDT. The general system call handler system_call is one of these interrupt service routines. trap_init() contains the call

    set_system_gate(SYSCALL_VECTOR, &system_call);

which attaches the entry of the system call handler to interrupt 0x80.

SYSCALL_VECTOR is the constant 0x80, defined in /usr/src/linux/arch/i386/kernel/irq.h. system_call is the entry address of the handler, which is written in assembly language in /usr/src/linux/arch/i386/kernel/entry.S;

 

3. The handler is mainly responsible for saving the processor state from before the system call, checking that the call is valid, and dispatching through sys_call_table to the entry of the corresponding system service routine. When the service routine returns, the handler restores the processor state and returns to the user program;

The system call numbers are defined in /usr/src/linux/include/asm-i386/unistd.h, and sys_call_table is defined in /usr/src/linux/arch/i386/kernel/entry.S. /usr/src/linux/include/asm-i386/unistd.h also defines the user programming interface for system calls;

 

4. As this shows, Linux system calls work much like the int 21h interrupt service in DOS. Interrupt 0x80 is the single entry point, and control is then dispatched to the entry addresses of the individual service routines stored in sys_call_table, which provide the different services;

According to the analysis above, adding a system call requires adding an entry to sys_call_table that holds the entry address of your system service routine, and then recompiling the kernel. The service routine itself is, of course, also required.

 

In this version of the kernel source, the source files related to system calls include the following:

1. arch/i386/boot/bootsect.S
2. arch/i386/boot/setup.S
3. arch/i386/boot/compressed/head.S
4. arch/i386/kernel/head.S
5. init/main.c
6. arch/i386/kernel/traps.c
7. arch/i386/kernel/entry.S
8. arch/i386/kernel/irq.h
9. include/asm-i386/unistd.h

 

These are only the main files involved. In practice, the only files that must be modified to add a system call are include/asm-i386/unistd.h and arch/i386/kernel/entry.S, plus whichever file holds the service routine;

 

 

[3] Modifying the kernel source code:

 

1. Add the system service routine to kernel/sys.c as follows:

    /* Returns the sum of the integers from 0 to numdata. */
    asmlinkage int sys_addtotal(int numdata)
    {
        int i = 0, enddata = 0;

        while (i <= numdata)
            enddata += i++;

        return enddata;
    }

 

This function takes one int parameter, numdata, and returns the sum of the integers from 0 to numdata.

You can also put the system service routine in a file of your own or in another existing file, as long as you make the necessary declarations in that file;

 

2. Add the entry address of asmlinkage int sys_addtotal(int) to sys_call_table.

The last few lines of arch/i386/kernel/entry.S originally read:

    ......
        .long SYMBOL_NAME(sys_sendfile)
        .long SYMBOL_NAME(sys_ni_syscall)    /* streams1 */
        .long SYMBOL_NAME(sys_ni_syscall)    /* streams2 */
        .long SYMBOL_NAME(sys_vfork)         /* 190 */

        .rept NR_syscalls-190
            .long SYMBOL_NAME(sys_ni_syscall)
        .endr

After the modification they read:

    ......
        .long SYMBOL_NAME(sys_sendfile)
        .long SYMBOL_NAME(sys_ni_syscall)    /* streams1 */
        .long SYMBOL_NAME(sys_ni_syscall)    /* streams2 */
        .long SYMBOL_NAME(sys_vfork)         /* 190 */

        /* added by me */
        .long SYMBOL_NAME(sys_addtotal)

        .rept NR_syscalls-191
            .long SYMBOL_NAME(sys_ni_syscall)
        .endr

 

3. Add the call number that corresponds to the new sys_call_table entry to include/asm-i386/unistd.h, so that user processes and other system processes can look it up or invoke the call.

The relevant part of /usr/src/linux/include/asm-i386/unistd.h after the addition is as follows:

    ......
    #define __NR_sendfile   187
    #define __NR_getpmsg    188
    #define __NR_putpmsg    189
    #define __NR_vfork      190

    /* added by me */
    #define __NR_addtotal   191

 

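4. The test program (test.c) is as follows:

    #include <linux/unistd.h>   /* _syscall1 and __NR_addtotal (new kernel headers) */
    #include <stdio.h>          /* printf, scanf */

    /* Generates the user-space stub: int addtotal(int num) */
    _syscall1(int, addtotal, int, num)

    int main(void)
    {
        int i, j;

        printf("Please input a number\n");
        if (scanf("%d", &i) != 1)
            return 1;

        if ((j = addtotal(i)) == -1)
            printf("Error occurred in syscall-addtotal();\n");
        else
            printf("Total from 0 to %d is %d\n", i, j);

        return 0;
    }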

 

After making these changes, compile the new kernel and boot it. Run a few programs to confirm that everything works normally, then compile the test program on the new system (note: because the original kernel does not provide this system call, the test program only compiles against the new kernel). It runs as follows:

 

 

    $ gcc -o test test.c
    $ ./test
    Please input a number
    36
    Total from 0 to 36 is 666

 

As the output shows, the modification was successful.
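The printed total can be checked by hand with the closed form for the sum of the first n integers (a standard identity, not part of the original article):

```latex
\sum_{i=0}^{n} i = \frac{n(n+1)}{2},
\qquad
\sum_{i=0}^{36} i = \frac{36 \cdot 37}{2} = 666
```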

 

Further analysis of the source code shows that, in this kernel version, the sys_call_table settings in /usr/src/linux/arch/i386/kernel/entry.S point several system calls at the same service routine, defined in /usr/src/linux/kernel/sys.c:

    asmlinkage int sys_ni_syscall(void)
    {
        return -ENOSYS;
    }

 

For example, this is true of entries 188 and 189:

    ......
        .long SYMBOL_NAME(sys_sendfile)
        .long SYMBOL_NAME(sys_ni_syscall)    /* streams1 */
        .long SYMBOL_NAME(sys_ni_syscall)    /* streams2 */
        .long SYMBOL_NAME(sys_vfork)         /* 190 */
    ......

 

The two entries are declared in /usr/src/linux/include/asm-i386/unistd.h as follows:

    ......
    #define __NR_sendfile   187
    #define __NR_getpmsg    188    /* some people actually want streams */
    #define __NR_putpmsg    189    /* some people actually want streams */
    #define __NR_vfork      190

 

In this version of the kernel source, asmlinkage int sys_ni_syscall(void) does nothing except return -ENOSYS, so the system calls that map to it, including getpmsg and putpmsg, perform no operation. They are empty calls waiting to be implemented.

They nevertheless occupy sys_call_table entries, presumably reserved by the designers to make it easier to extend the set of system calls later.

For such a reserved call, you only need to supply the corresponding service routine (for example, a service routine for getpmsg or putpmsg) and point its table entry at it.

 

Conclusion: For something as large and complex as the Linux kernel, a single article is far from enough, and the code related to system calls is only a tiny part of the kernel. What matters is having a method and a good grasp of how to analyze the source. The analysis here serves only as a guide; the real understanding still depends on the reader's own effort.
