Exclusive: in-depth introduction to how the Linux kernel works

Source: http://news.csdn.net/n/20090327/124513.html

[CSDN, March 27, compiled] This article was published in Linux Format magazine. The author explains in depth how the Linux kernel works, which should be of great help to Linux developers.

The Oxford dictionary defines the word "kernel" as "a soft, usually edible part of a nut." There is, of course, a second definition: "the core or most important part of something." For Linux, the kernel clearly fits the second definition. Let's take a look at how this important piece works, starting with a little theory.

In a broad sense, a kernel is a piece of software that provides a layer between the hardware and the applications running on a computer. Strictly speaking, from a computer science perspective, the term "Linux" refers only to the kernel, the code that Linus Torvalds first wrote in the early 1990s.

Everything else you see in a Linux distribution (the Bash shell, the KDE window manager, web browsers, the X server, Tux Racer, and all the rest) is just an application running on Linux, not part of the operating system itself. To give you a more intuitive feel for this: an installation of RHEL5 occupies gigabytes of hard disk space (depending on the options you choose), of which the kernel and its modules account for only 47 MB, about 2%.

Inside the kernel

How does the kernel work? See the following chart. The kernel makes its services available to the applications running on it through a large set of entry points, known technically as system calls. System calls such as "read" and "write" provide an abstraction of the hardware.

[Figure: kernel diagram, http://info-database.csdn.net/Upload/2009-03-27/LXF107.tut_adv.diagram.jpg]

From the programmer's point of view, these look just like ordinary function calls. In reality, a system call involves a distinct switch in the processor's operating mode, from user space to kernel space. Taken together, the system calls provide a kind of "Linux virtual machine" that can be thought of as an abstraction of the hardware.

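One of the more obvious abstractions provided by the kernel is the file system. As an example, here is a short program written in C that opens a file and copies its contents to standard output:

#include <fcntl.h>   /* open, O_RDONLY */
#include <unistd.h>  /* read, write, close */
int main(void)
{
    int fd, count; char buf[1000];
    fd = open("mydata", O_RDONLY);    /* open the file read-only */
    if (fd < 0) return 1;             /* the file could not be opened */
    count = read(fd, buf, 1000);      /* read up to 1000 bytes */
    if (count > 0) write(1, buf, count);  /* 1 is standard output */
    close(fd);
    return 0;
}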

Here you can see four examples of system calls: open, read, write, and close. Leaving aside the details of the program's syntax, the key point is this: through these system calls the Linux kernel provides the illusion of a "file", which is really just a collection of data with a name. You do not have to negotiate with the low-level details of the disk hardware, such as heads and sectors; you simply work with the file. This is what we mean by an abstraction: presenting the underlying machinery in a more understandable way.

The file system is one abstraction provided by the kernel. Others are less obvious, such as process scheduling. At any moment there may be several processes or programs waiting to run. The kernel's scheduler allocates CPU time to each process in turn, so that over a period of time we have the illusion that the computer is running several programs at once. Here is another C program:

#include <stdlib.h>    /* exit */
#include <sys/wait.h>  /* wait */
#include <unistd.h>    /* fork, write */
int main(void)
{
    if (fork()) {
        /* Parent: fork() returned nonzero (the child's process ID) */
        write(1, "Parent\n", 7);
        wait(NULL);               /* wait for the child to finish */
        exit(0);
    } else {
        /* Child: fork() returned 0 */
        write(1, "Child\n", 6);
        exit(0);
    }
}

This program creates a new process; the original process (the parent) and the new process (the child) each write to standard output and then exit. Note the system calls fork(), exit(), and wait(), which create, terminate, and synchronize processes. These are the most typical simple calls in process management and scheduling.

The kernel has another function that is even less visible to programmers: memory management. Each program runs as if it had an address space of its own to use. In reality, it shares the computer's physical memory with the other processes, and if the system runs low on memory, parts of its address space may even be temporarily moved out to the swap area on disk. Another aspect of memory management is preventing one process from accessing the address space of another, a necessary precaution in a multi-process operating system.

The kernel also implements network protocols such as IP, TCP, and UDP, which provide machine-to-machine and process-to-process communication over the network. This creates another illusion: TCP appears to provide a fixed connection between two processes, like a copper wire connecting two telephones, but in reality no fixed connection exists. Note that application-level protocols such as FTP, DNS, and HTTP are implemented by user-level programs and are not part of the kernel.

Linux (like Unix before it) has a good reputation for security. This is because the kernel keeps track of the user ID and group ID of every running process. Each time an application attempts to access a resource (for example, opening a file for writing), the kernel checks the access permissions on the file and decides whether to allow or deny the access. This access control model is ultimately responsible for much of the security of the Linux system as a whole.

The kernel also provides a collection of modules that handle the details of communicating with hardware devices: how to read a sector from a disk, how to fetch a packet from a network interface card, and so on. These are sometimes called device drivers.

Modular Kernel

Now that we have some idea of what the kernel does, let's look at its physical organization. Early versions of the Linux kernel were monolithic. That is, all the components were statically linked into one (very large) executable file.

By contrast, the current Linux kernel is modular: much of its functionality is contained in modules that are loaded into the kernel dynamically. This keeps the core of the kernel small and makes it possible to load and replace modules in a running kernel without rebooting.

The core of the kernel is loaded into memory at boot time from a file in the /boot directory, usually called vmlinuz-KERNELVERSION, where KERNELVERSION is the kernel version. (To find out what your kernel version is, run the command uname -r.) The kernel modules are located under the directory /lib/modules/KERNELVERSION. All of these components are copied into place when the kernel is installed.

Managing Modules

In most cases, Linux manages its modules without your help. If necessary, however, you can inspect and manage the modules manually from the command line. For example, the lsmod command shows which modules are currently loaded into the kernel. Here is a sample of its output:

# lsmod
pcspkr            4224  0
hci_usb          18204  2
psmouse          38920  0
bluetooth        55908  7 rfcomm,l2cap,hci_usb
yenta_socket     27532  5
rsrc_nonstatic   14080  1 yenta_socket
isofs            36284  0

The output shows each module's name, its size, its usage count, and a list of the modules that depend on it. The usage count matters because it prevents the unloading of modules that are currently active: Linux only allows a module to be removed when its usage count is zero.

You can use modprobe to load and unload modules manually. (There are also two lower-level commands, insmod and rmmod, but modprobe is easier to use because it automatically resolves module dependencies.) For example, the lsmod output above shows a loaded module named isofs with a usage count of zero and no dependent modules. (isofs is the module that supports the ISO file system format used on CDs.) In this case, the kernel will let us unload the module:

# modprobe -r isofs

Now isofs no longer appears in the lsmod output, and the kernel has freed 36,284 bytes of memory. If you insert a CD and let it be mounted automatically, the kernel will automatically reload the isofs module, and its usage count will rise to 1. If you try to remove the module at this point, it will fail because the module is in use:

# modprobe -r isofs
FATAL: Module isofs is in use.

Whereas lsmod lists only the modules currently loaded, modprobe -l lists all available modules. (The -l option belongs to the modprobe of this article's era; newer kmod-based versions have dropped it.) It essentially outputs every module under the /lib/modules/KERNELVERSION directory, so the list is very long!

In practice, it is not common to load a module manually with modprobe. If you do, however, you can pass parameters to the module on the modprobe command line. For example:

# modprobe usbcore blinkenlights=1

This is not made up: blinkenlights is a real parameter of the usbcore module.

So how do we know what parameters a module accepts? A good way is the modinfo command, which lists various information about a module. Here is an example for the module snd-hda-intel:

# modinfo snd-hda-intel
description:    Intel HDA driver
license:        GPL
srcversion:     A3552B2DF3A932D88FFC00C
alias:          pci:v000010DEd0000055Dsv*sd*bc*sc*i*
alias:          pci:v000010DEd0000055Csv*sd*bc*sc*i*
depends:        snd-pcm,snd-page-alloc,snd-hda-codec,snd
vermagic:       2.6.20-16-generic SMP mod_unload 586
parm:           index:Index value for Intel HD audio interface. (int)
parm:           id:ID string for Intel HD audio interface. (charp)
parm:           model:Use the given board model. (charp)
parm:           position_fix:Fix DMA pointer (0 = auto, 1 = none, 2 = POSBUF, 3 = FIFO size). (int)
parm:           probe_mask:Bitmask to probe codecs (default = -1). (int)
parm:           single_cmd:Use single command to communicate with codecs (for debugging only). (bool)
parm:           enable_msi:Enable Message Signaled Interrupt (MSI) (int)
parm:           enable:bool

The lines of interest to us are those beginning with "parm", which show the parameters the module accepts. These descriptions are terse. If you want more information, install the kernel source code; you will find documentation in a directory such as /usr/src/KERNELVERSION/Documentation.

There is some interesting material there. For example, the file /usr/src/KERNELVERSION/Documentation/sound/alsa/ALSA-Configuration.txt describes the parameters recognized by many of the ALSA sound modules. The file /usr/src/KERNELVERSION/Documentation/kernel-parameters.txt is also useful.

An example of passing parameters to a module came up on the Ubuntu forums a few days ago (see https://help.ubuntu.com/community/HdaIntelSoundHowto for details). The crux of the problem was that the snd-hda-intel module needed some help to drive the sound hardware correctly and would hang when loaded at boot time. Part of the solution was to give the module the option probe_mask=1. If you were loading the module manually, you would enter:

# modprobe snd-hda-intel probe_mask=1

More likely, you would put a line like this in the file /etc/modprobe.conf: options snd-hda-intel probe_mask=1

This tells modprobe to include the probe_mask=1 option every time it loads the snd-hda-intel module. In some Linux distributions, this information is split across separate files under /etc/modprobe.d rather than kept in modprobe.conf.

The /proc File System

The Linux kernel also exposes many of its internal details through the /proc file system. To make sense of /proc, we first need to extend our notion of what a file is. Rather than thinking of a file only as persistent information stored on a hard disk, CD, or other storage medium, we should think of it as any information that can be accessed through the traditional system calls such as open, read, write, and close, and therefore also by ordinary programs.

The "files" under /proc are entirely a fiction created by the kernel. They give us a view into the kernel's internal data structures. In fact, many Linux reporting tools simply present nicely formatted versions of the information found in files under /proc. For example, /proc/modules lists the currently loaded modules.

Similarly, /proc/meminfo provides detailed information about the current state of the virtual memory system, and tools like vmstat present the same information in a more readable form. /proc/net/arp shows the current contents of the system's ARP cache; from the command line, arp -a displays the same information.

Particularly interesting are the "files" under /proc/sys. For example, the setting in /proc/sys/net/ipv4/ip_forward tells us whether the kernel forwards IP packets, that is, whether the machine acts as a gateway. Right now, the kernel tells us that forwarding is disabled:

# cat /proc/sys/net/ipv4/ip_forward
0

Things get more interesting when you discover that you can also write to these files. For example:

# echo 1 > /proc/sys/net/ipv4/ip_forward

enables IP forwarding in the running kernel.

Besides using cat and echo to inspect and change the settings under /proc/sys, you can also use the sysctl command:

# sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 0

This is equivalent to:
# cat /proc/sys/net/ipv4/ip_forward

Likewise:
# sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1

is equivalent to:
# echo 1 > /proc/sys/net/ipv4/ip_forward

Note that changes made this way affect only the currently running kernel and are lost on reboot. To make settings permanent, put them in the file /etc/sysctl.conf. At boot time, sysctl automatically re-applies any settings it finds in this file.

A line in /etc/sysctl.conf looks something like this: net.ipv4.ip_forward = 1

Performance Tuning

It has been said that the writable parameters under /proc/sys have given rise to a whole subculture of Linux performance tuning. I personally think this is something of an exaggeration, but here are a few examples in case you want to try it. The Oracle 10g installation instructions (www.oracle.com/policy/obe/ob... st/linuxpreinst.htm) ask you to set a number of parameters, including kernel.shmmax=2147483648, which sets the maximum size of a shared memory segment to 2 GB. (Shared memory is an interprocess communication mechanism that allows the same memory to be visible in the address spaces of several processes at once.)

The IBM Redpaper on Linux performance and tuning (www.redbooks.ibm.com/?acts/redp4285.html) offers many suggestions for parameter settings, including vm.swappiness=100. This parameter controls how aggressively memory pages are swapped out to disk.

Some parameters can be set to improve security. For example, net.ipv4.icmp_echo_ignore_broadcasts=1 tells the kernel not to respond to ICMP echo requests sent to broadcast addresses, which protects your network against denial-of-service attacks such as Smurf attacks.
net.ipv4.conf.all.rp_filter=1 tells the kernel to enforce reverse-path checking of packets, also known as ingress or egress filtering.

Is there a single description that covers all of these parameters? Well, the command # sysctl -a shows all of the parameter names and their current values. The list is very long, but it gives no clue as to what the parameters actually do. Another useful reference is the Red Hat Enterprise Linux Reference Guide, which devotes an entire chapter to the subject. You can download it from www.redhat.com/docs/manuals/ise. (Translated by Wang Yulei)