A Must-Read for Linux Driver Development: A Detailed Look at the Mysterious Kernel


Source: http://tech.chinaunix.net/a2010/0312/860/000000860010_2.shtml

IT168 technical documentation: "Before setting out into the mysterious world of Linux device drivers, let's look at a few kernel components from the driver developer's perspective and familiarize ourselves with some basic kernel concepts." We will learn about kernel timers, synchronization mechanisms, and memory allocation methods. However, we have to start at the beginning of this journey of exploration. This chapter therefore first looks at the startup messages emitted by the kernel and then explains the interesting points one by one.

2.1 Startup process

Figure 2-1 shows the boot sequence of an x86-based Linux system. First, the BIOS loads the master boot record (MBR) from the boot device; the code in the MBR then examines the partition table and reads a boot loader, such as GRUB, LILO, or SYSLINUX, from the active partition. The boot loader next loads the compressed kernel image and passes control to it. After gaining control, the kernel decompresses itself and gets down to business.

x86-based processors have two modes of operation: real mode and protected mode. In real mode, only 1 MB of memory is addressable, and there is no memory protection. Protected mode is far more sophisticated and enables advanced features such as paging. The CPU must switch from real mode to protected mode along the way. However, this switch is one-way: you cannot go back from protected mode to real mode.

The first step of kernel initialization executes assembly code in real mode, followed by the start_kernel() function defined in init/main.c (the source file modified in the previous chapter), which runs in protected mode. start_kernel() first initializes the CPU subsystem, then puts memory and process management in place, starts the external buses and I/O devices, and as its final step activates the init program, the parent of all Linux processes. The init process executes user-space scripts that start the necessary kernel services, and it finally spawns the console terminal program and displays the login prompt.

Figure 2-1 Boot process based on Linux on x86 hardware

The third-level headings in this section are lines taken from Figure 2-2, a printout of the boot messages from the Linux boot process of an x86 laptop. If you boot the kernel on another architecture, the messages and their semantics may differ.

2.1.1 BIOS-provided Physical RAM Map

The kernel parses the system memory map read from the BIOS and prints the following information first:

BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
...
BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)

The real-mode initialization code obtains the system memory map by invoking the int 0x15 service of the BIOS with function 0xe820 (hence the BIOS-e820 string above). The memory map marks reserved and usable memory, and the kernel then uses this information to create its pool of available memory. Section B.1 in Appendix B gives a more in-depth explanation of the BIOS-provided memory map.

Figure 2-2 Kernel Boot information

2.1.2 758MB Lowmem Available

Regularly addressable memory below 896 MB is called low memory. The memory allocation function kmalloc() allocates memory from this region. Memory above 896 MB is called high memory and can be accessed only after being mapped in special ways.

During the boot process, the kernel calculates and displays the total number of pages in these memory areas.

2.1.3 Kernel command line: ro root=/dev/hda1

Linux boot loaders usually pass a command line to the kernel. The parameters on the command line are similar to the argv[] list passed to main() in C programs, except that they are handed to the kernel. You can add command-line arguments in the boot loader's configuration file or, of course, edit the boot loader's prompt line [1] at run time. If you use the GRUB boot loader, the configuration file is /boot/grub/grub.conf or /boot/grub/menu.lst, depending on the distribution. If you use LILO, the configuration file is /etc/lilo.conf. An example grub.conf file (with comments added) follows; once you read the line following title kernel 2.6.23, you will understand where this printed message comes from.

default 0  #Boot the 2.6.23 kernel by default
timeout 5  #5 seconds to alter boot order or parameters
title kernel 2.6.23  #Boot option 1
#The boot image resides in the first partition of the first disk
#under the /boot/ directory and is named vmlinuz-2.6.23. 'ro'
#indicates that the root partition should be mounted read-only.
kernel (hd0,0)/boot/vmlinuz-2.6.23 ro root=/dev/hda1
#Look under the section "Freeing initrd memory: 387k freed"
initrd (hd0,0)/boot/initrd
#...

Command-line arguments affect the code path traversed during startup. As a simple example, assume a command-line argument of interest called bootmode. If this parameter is set to 1, you would like to print some debug messages during boot and switch to runlevel 3 at the end of the boot (runlevels are explained in the discussion of the init process printout); if bootmode is set to 0, you prefer a relatively laconic boot and a runlevel of 2. Since you are now familiar with init/main.c, add the following modification to that file:

static unsigned int bootmode = 1;

static int __init
is_bootmode_setup(char *str)
{
    get_option(&str, &bootmode);
    return 1;
}

/* Handle parameter "bootmode=" */
__setup("bootmode=", is_bootmode_setup);

if (bootmode) {
    /* Print verbose output */
    /* ... */
}

/* ... */

/* If bootmode is 1, choose an init runlevel of 3, else
   switch to a runlevel of 2 */
if (bootmode) {
    argv_init[++args] = "3";
} else {
    argv_init[++args] = "2";
}

/* ... */

Recompile the kernel and try out the change.

2.1.4 Calibrating delay... 1197.46 BogoMIPS (lpj=2394935)

During boot, the kernel calculates the number of times the processor can run an internal delay loop in one jiffy, the interval between two consecutive ticks of the system timer. As you would expect, this calculation has to be calibrated to the processing speed of your CPU. The result of the calibration is stored in a kernel variable called loops_per_jiffy. One place the kernel uses loops_per_jiffy is when a device driver wants to delay for small durations on the order of microseconds.

To understand the delay-loop calibration code, let's look at the calibrate_delay() function defined in init/calibrate.c. This function cleverly uses integer operations to obtain floating-point precision. The following code fragment (with some comments) shows the beginning of the function, which derives a coarse value for loops_per_jiffy:

loops_per_jiffy = (1 << 12);  /* Initial approximation = 4096 */
printk(KERN_DEBUG "Calibrating delay loop... ");
while ((loops_per_jiffy <<= 1) != 0) {
    ticks = jiffies;  /* As you will find out in the section "Kernel
                         Timers," the jiffies variable contains the
                         number of timer ticks since the kernel
                         started, and is incremented in the timer
                         interrupt handler */

    while (ticks == jiffies); /* Wait until the start of the next jiffy */
    ticks = jiffies;
    /* Delay */
    __delay(loops_per_jiffy);
    /* Did the delay outlast the current jiffy? Continue if it didn't */
    ticks = jiffies - ticks;
    if (ticks) break;
}

loops_per_jiffy >>= 1; /* This fixes the most significant bit and is
                          the lower bound of loops_per_jiffy */

The preceding code begins by assuming that loops_per_jiffy is greater than 4096, which translates to a processor speed of roughly one million instructions per second (1 MIPS). It then waits for a fresh jiffy to begin and runs the delay loop, __delay(loops_per_jiffy). If the delay loop outlasts the jiffy, the previous value of loops_per_jiffy (obtained by shifting the current value right by one bit) fixes its most significant bit; otherwise, the function keeps shifting loops_per_jiffy left, probing for its most significant bit. Once the kernel has pinned down the most significant bit, it computes the lower-order bits to fine-tune the precision:

loopbit = loops_per_jiffy;

/* Gradually work on the lower-order bits */
while (lps_precision-- && (loopbit >>= 1)) {
    loops_per_jiffy |= loopbit;
    ticks = jiffies;
    while (ticks == jiffies); /* Wait until the start of the next jiffy */
    ticks = jiffies;

    /* Delay */
    __delay(loops_per_jiffy);

    if (jiffies != ticks)   /* longer than 1 tick */
        loops_per_jiffy &= ~loopbit;
}

The above loop strips bits from loops_per_jiffy whenever the delay crosses a jiffy boundary, converging on the calibrated low-order bits. The calibrated value yields the BogoMIPS figure (which, in fact, is not a scientific measure of processor speed; treat BogoMIPS only as a relative measure of how fast a processor runs). On a 1.6 GHz Pentium M-based laptop, the loop calibration produced a loops_per_jiffy of 2394935, as the boot printout above shows. BogoMIPS is obtained as follows:

BogoMIPS = loops_per_jiffy * (number of jiffies in 1 second) * (instructions per delay-loop iteration) / (1 million)
         = (2394935 * HZ * 2) / 1000000
         = (2394935 * 250 * 2) / 1000000
         = 1197.46 (consistent with the value in the boot printout)

Jiffies, HZ, and loops_per_jiffy are described in more detail in Section 2.4.

2.1.5 Checking HLT Instruction

Because the Linux kernel supports many hardware platforms, the startup code checks for architecture-dependent bugs. One such task is verifying the halt (HLT) instruction.

The HLT instruction of x86 processors puts the CPU into a low-power sleep state that lasts until the next hardware interrupt occurs. The kernel uses HLT when it wants to put the CPU into the idle state (see the cpu_idle() function defined in arch/x86/kernel/process_32.c). For problematic CPUs, the no-hlt command-line argument can disable the HLT instruction. If no-hlt is set, the kernel busy-waits while idle instead of resting the CPU with HLT.

This message is printed when the startup code in init/main.c calls check_bugs(), defined in include/asm-your-arch/bugs.h.

2.1.6 NET: Registered protocol family 2

The Linux socket layer is a uniform interface through which user-space applications access various network protocols. Each protocol registers itself with a unique family number, defined in the include/linux/socket.h file. Family 2 in the message above stands for AF_INET (Internet Protocol).

Another protocol family commonly registered during boot is AF_NETLINK (family 16). Netlink sockets offer a way to communicate between user processes and the kernel. Functions accomplished via netlink sockets include accessing the routing table and the Address Resolution Protocol (ARP) table (the include/linux/netlink.h file gives the full list of uses). Netlink sockets are better suited to such tasks than system calls because they are asynchronous, simpler to implement, and dynamically linkable.

Another protocol family frequently enabled in kernels is AF_UNIX, the UNIX-domain socket family. Programs such as the X Window System use UNIX-domain sockets to communicate between processes on the same system.

2.1.7 Freeing initrd memory: 387k freed

initrd is a memory-resident virtual disk image loaded by the boot loader. After the kernel starts, it is mounted as an initial root filesystem that holds the dynamically loadable modules needed to mount the actual root partition. Because the kernel runs on a wide variety of storage-controller hardware, building every possible disk driver into the base kernel image is not feasible. Instead, the drivers for your system's storage devices are packed into the initrd, which is loaded before the kernel starts and the actual root filesystem is mounted. Use the mkinitrd command to create an initrd image.

The 2.6 kernel provides a feature called initramfs that outdoes initrd in several respects. Whereas the latter emulates a disk (hence the name initial ramdisk, or initrd) and incurs the overhead of the Linux block I/O subsystem, such as buffering, the former essentially behaves like a mounted filesystem, and is thus named initramfs.

Unlike initrd, an initramfs is built on top of the page cache, so it grows and shrinks dynamically with the page cache, reducing its memory consumption. Also, initrd requires your kernel image to include the filesystem used by the initrd (for example, if the initrd is an ext2 filesystem, the kernel must contain the ext2 driver), whereas initramfs needs no filesystem support. Moreover, because initramfs is only a thin layer above the page cache, its code footprint is small.

Users can either pack the initial root filesystem into a compressed cpio archive [1] and pass it to the kernel via the initrd= command-line argument, or build it into the kernel at configuration time through the INITRAMFS_SOURCE option. In the latter case, you may supply the filename of a cpio archive or a directory tree containing the initramfs contents. During boot, the kernel extracts the file into an initramfs root filesystem and, if it finds a top-level /init program, executes it. This method of obtaining the initial root filesystem is especially useful for embedded systems, where system resources are at a premium. You can create an initramfs image with mkinitramfs; see the document Documentation/filesystems/ramfs-rootfs-initramfs.txt for more information.

In this case, we are using the initrd= command-line argument to pass a cpio-compressed archive of the initial root filesystem to the kernel. After extracting the contents of the archive into the root filesystem, the kernel frees the memory the archive occupied (387 KB in this case) and prints the message above. The freed pages are handed back to the kernel for use elsewhere.

During embedded-system development, an initrd or initramfs can sometimes serve as the actual root filesystem on the embedded device.

2.1.8 io scheduler anticipatory registered (default)

The main goal of an I/O scheduler is to increase system throughput by minimizing disk seeks, during which the disk head must move from its current position to the destination of interest, incurring latency. The 2.6 kernel provides four different I/O schedulers: deadline, anticipatory, complete fair queuing, and noop. The kernel message above shows that this example sets anticipatory as the default I/O scheduler.

2.1.9 Setting up standard PCI resources

The next stage of the boot process initializes the I/O buses and peripheral controllers. The kernel probes for PCI hardware by walking the PCI bus and then initializes other I/O subsystems. In Figure 2-3 you can see the boot messages announcing the initialization of the SCSI subsystem, the USB controller, the video chip (part of the 855 north bridge chipset information in this example), the serial port (8250 UART in this case), PS/2 keyboard and mouse, floppy drive, ramdisk, loopback device, IDE controller (part of the ICH4 south bridge chipset in this example), the touchpad, the Ethernet controller (e1000 in this case), and the PCMCIA controller. The identities (IDs) of the I/O devices are annotated in Figure 2-3.

Figure 2-3 Initializing buses and peripheral controllers during the boot process

This book discusses most of these driver subsystems in dedicated chapters. Note that if a driver is dynamically linked into the kernel as a module, some of these messages may appear only after the kernel has booted.

2.1.10 EXT3-fs: mounted filesystem

EXT3 has become the de facto filesystem of Linux. It adds a journaling layer on top of the veteran EXT2 filesystem to enable quick recovery of the filesystem after a crash. The aim is to regain a consistent filesystem without the time-consuming filesystem check (fsck) pass: EXT2 remains the work engine underneath, while the EXT3 layer logs file interactions before the actual disk changes are committed. EXT3 is backward compatible with EXT2, so you can add EXT3 journaling to an existing EXT2 filesystem or revert from EXT3 back to EXT2.

EXT3 starts a kernel helper thread named kjournald (kernel threads are explored in depth in the next chapter) to carry out the journaling. Once EXT3 is operational, the kernel mounts the root filesystem and gets ready for business:

EXT3-fs: mounted filesystem with ordered data mode

kjournald starting. Commit interval 5 seconds

VFS: Mounted root (ext3 filesystem).

2.1.11 INIT: version 2.85 booting

init, the parent of all Linux processes, is the first program to run after the kernel completes the boot sequence. In the final few lines of init/main.c, the kernel searches several locations in its attempt to locate init:

if (ramdisk_execute_command) {
    /* ... */
