Linux Startup Process (repost)
Introductions to the Windows startup process are rare. Anyone who has used Linux, however, will have noticed that a lot of information appears on the screen when the computer starts. In general, we can view this information with the following command:
What exactly does this information mean? The question seems easy to answer: look it up in any Linux reference book and you will find something like "This is kernel startup information ...". But what does "kernel startup information" actually mean?
To understand the internals of Linux, you need a comprehensive understanding of the Linux kernel architecture. I do not intend to explain the whole kernel architecture here; I only want to explain (or try to explain) some of the most basic concepts involved in starting a computer system. The startup process here means everything that happens from pressing the power switch to the appearance of the prompt.
What does startup mean?
In operating system vocabulary, startup (booting) means having the processor execute a sequence of instructions that brings part of the operating system into main memory. During startup, the kernel's data structures are initialized and given initial values, and some processes are created. When a computer is powered on, all hardware devices are in an unpredictable state and memory holds random contents, so starting a computer is a long and complex task. It is also worth knowing that the details of startup depend heavily on the computer architecture.
Attention:
1. A basic understanding of how a computer works internally and of how the kernel operates will be very helpful.
2. All files mentioned in this article refer to Linux kernel 2.4.2-2. Equivalent files can be found in the kernel sources of most Linux systems, although paths and contents vary between kernel versions. Here I am using Red Hat 7.1.
3. The scope of this article is limited to the IBM PC architecture.
BIOS and its Functions
When the computer powers on, memory contains random data, nothing has been initialized, and no operating system is loaded. The startup process begins with a special hardware circuit that raises the logical value of the CPU's RESET pin. Some CPU registers, such as CS (the code segment register, which points to the segment containing program instructions) and EIP (the instruction pointer, which holds the offset of the next instruction to execute within that segment), are then set to fixed values, and the code at physical address 0xfffffff0 is executed. The hardware maps this address to a read-only memory (ROM) chip. The BIOS (Basic Input/Output System) is the program stored in that ROM. It contains a set of interrupt-driven low-level routines that some operating systems call to handle the computer's hardware devices. Microsoft's DOS is one such operating system.
Does Linux use the BIOS that comes with the computer to initialize hardware devices, or does something else do that job? The question is not quite that simple and requires some background. Starting with the 80386, Intel microprocessors perform address translation (from a logical address to a linear address to a physical address) in two different ways, called real mode and protected mode. Real mode exists mainly to keep the processor compatible with older models. All BIOS routines run in real mode, but the Linux kernel runs in protected mode. Therefore, once initialization is complete, Linux no longer uses the BIOS and instead provides its own drivers for all hardware on the computer (unlike DOS).
So when does Linux switch to protected mode, and why can it not simply run in the same mode as the BIOS? BIOS routines use real-mode addresses because those are the only addresses available when the computer is powered on. A real-mode address consists of a segment and an offset, and the corresponding physical address is segment × 16 + offset (that is, segment × 2^4 + offset).
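The real-mode address calculation can be illustrated with a few lines of Python. This is only a sketch of the arithmetic; the function name real_mode_physical is made up for this example.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both be 16-bit values")
    # Shifting the segment left by 4 bits is the same as multiplying by 16.
    return (segment << 4) + offset


# Different segment:offset pairs can name the same physical address.
print(hex(real_mode_physical(0x07C0, 0x0000)))  # 0x7c00
print(hex(real_mode_physical(0x0000, 0x7C00)))  # 0x7c00
print(hex(real_mode_physical(0x9000, 0x0200)))  # 0x90200
```

The last line shows how an address such as 0x00090200, which appears later in the boot sequence, can be reached from real mode.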
So does this mean that Linux never uses the BIOS during startup? No. During the startup phase, the BIOS must be used to load the kernel from a hard disk or another external device.
Let's take a look at the main BIOS operations during startup:
1. The BIOS performs a series of thorough tests on the hardware. This step checks which devices are installed in the system and whether they work properly. It is usually called the power-on self-test (POST). The BIOS version and other hardware information are displayed during this step.
2. The BIOS initializes the hardware. This step is important because it ensures that hardware devices do not conflict over IRQ (interrupt request) lines and I/O ports. When this step is complete, the BIOS displays a table of the installed PCI devices.
3. The BIOS then searches for an operating system to boot. Depending on the BIOS settings, it may boot from a floppy disk, a hard disk, or a CD-ROM.
4. Once a valid device is found, the BIOS copies the contents of its first sector into memory starting at physical address 0x00007c00, then jumps to that address and executes the code just loaded.
So far, all the BIOS work has been done.
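Step 4 can be pictured with a small simulation. The sketch below treats memory as a Python bytearray and the disk as a bytes object. The names load_first_sector, SECTOR_SIZE, and LOAD_ADDRESS are invented for this illustration, and the 512-byte sector size is the conventional PC value rather than something stated above.

```python
SECTOR_SIZE = 512       # conventional PC sector size (assumption of this sketch)
LOAD_ADDRESS = 0x7C00   # physical address where the first sector is placed


def load_first_sector(disk: bytes, memory: bytearray) -> int:
    """Copy the first sector of `disk` into `memory`; return the jump target."""
    sector = disk[:SECTOR_SIZE]
    if len(sector) < SECTOR_SIZE:
        raise ValueError("disk image is smaller than one sector")
    memory[LOAD_ADDRESS:LOAD_ADDRESS + SECTOR_SIZE] = sector
    return LOAD_ADDRESS


memory = bytearray(1024 * 1024)   # 1 MB of simulated RAM, initially zero
disk = bytes(range(256)) * 4      # a fake 1024-byte disk image
entry = load_first_sector(disk, memory)
print(hex(entry))
```

A real BIOS would then transfer control to the returned address; here the function only reports it.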
The Boot Loader and its Functions
The BIOS invokes a special program whose task is to load the operating system kernel into memory. This program is called a boot loader (or bootstrap program). Before going further, let us look at the different ways of starting the system.
1. Start Linux from a floppy disk
When booting from a floppy disk, the instructions stored in the first sector of the floppy are loaded and executed. These instructions then copy the rest of the kernel into memory.
The Linux kernel fits on a single floppy disk because it is compressed to reduce its size. Compression is done when the kernel is compiled, and decompression is done by the boot loader at startup.
When Linux is booted from a floppy disk, the boot loader is quite simple. It is written in assembly language in the file /usr/src/linux-2.4.2/arch/i386/boot/bootsect.S. When the kernel source is compiled to produce a new kernel image, the executable code generated from this file is placed at the beginning of the kernel image. It is therefore easy to create a bootable Linux floppy: simply copy the kernel image to the disk starting at the first sector. When the BIOS loads the first sector of the floppy, it actually copies the boot loader. The boot loader is invoked by the BIOS (a jump to physical address 0x00007c00) and performs the following operations:
(1) It moves itself from address 0x00007c00 to address 0x00090000;
(2) It sets up the real-mode stack, starting from address 0x00003ff4;
(3) It sets up the disk parameter table, which is used by the disk driver provided by the BIOS;
(4) It displays the "Loading" message by calling a BIOS routine;
(5) It calls a BIOS routine to load the kernel's setup() code from the floppy disk and places it in memory starting at address 0x00090200;
(6) It calls a BIOS routine to load the rest of the kernel image from the floppy disk and places it in memory starting at either 0x00010000 (the so-called low address) or 0x00100000 (the so-called high address);
(7) Finally, it jumps to the setup() code.
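To keep the addresses in these steps straight, it can help to put them in a table. The sketch below only rearranges the numbers given in the steps; the dictionary labels and the helper to_segment_offset are made up for this illustration.

```python
from typing import Tuple

# Physical addresses taken from the numbered steps above.
BOOT_LAYOUT = {
    "boot sector as loaded by the BIOS": 0x00007C00,
    "boot sector after moving itself": 0x00090000,
    "setup() code": 0x00090200,
    "rest of kernel (low address)": 0x00010000,
    "rest of kernel (high address)": 0x00100000,
}


def to_segment_offset(physical: int) -> Tuple[int, int]:
    """Express a physical address below 1 MB as one real-mode segment:offset pair."""
    if not 0 <= physical < 0x100000:
        raise ValueError("this sketch only handles addresses below 1 MB")
    return physical >> 4, physical & 0xF


for name, address in BOOT_LAYOUT.items():
    if address < 0x100000:
        segment, offset = to_segment_offset(address)
        print(f"{name:36s} {address:#010x}  {segment:04x}:{offset:04x}")
    else:
        print(f"{name:36s} {address:#010x}  (at the 1 MB boundary)")
```

Note that 0x00090200 minus 0x00090000 is 0x200, or 512 bytes, so the setup() code sits immediately after a region the size of one conventional disk sector.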
2. Start Linux from hard disk
When the system boots from the hard disk, the process is different. The first sector of the hard disk is the MBR (Master Boot Record), which holds the partition table and a small program. This program loads the first sector of the partition containing the operating system to be started. Linux is highly flexible software, so it can replace this small program in the MBR with a program called LILO, which lets the user choose which operating system to start.
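As an illustration of what "a partition table and a small program" means in practice, here is a minimal sketch that reads the partition table out of a 512-byte MBR image. The layout it assumes (boot code in the first 446 bytes, four 16-byte entries, and a 0x55 0xAA signature at the end) is the conventional PC layout and is not described in this article. The function name parse_partition_table and the sample values are hypothetical.

```python
import struct

ENTRY_FORMAT = "<B3sB3sII"   # status, CHS start, type, CHS end, first LBA, sector count
TABLE_OFFSET = 446           # the partition table follows the boot code


def parse_partition_table(mbr: bytes):
    """Return the used partition entries of a conventional PC MBR sector."""
    if len(mbr) != 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for index in range(4):
        status, _chs_first, ptype, _chs_last, lba_start, sectors = struct.unpack_from(
            ENTRY_FORMAT, mbr, TABLE_OFFSET + 16 * index)
        if ptype != 0:   # type 0 marks an unused slot
            entries.append({"active": status == 0x80, "type": ptype,
                            "lba_start": lba_start, "sectors": sectors})
    return entries


# Build a fake MBR with one active partition of type 0x83 (Linux).
fake_mbr = bytearray(512)
fake_mbr[446:462] = struct.pack(ENTRY_FORMAT, 0x80, b"\0\0\0", 0x83, b"\0\0\0", 63, 1000)
fake_mbr[510:512] = b"\x55\xaa"
table = parse_partition_table(bytes(fake_mbr))
print(table)
```

The "active" flag is what the small MBR program normally uses to decide which partition's first sector to load.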
Linux is usually booted from a hard disk, and this requires a different boot loader. On Intel systems, LILO is the most widely used boot loader; other architectures have their own. LILO can be installed either in the MBR or in the boot sector of an active partition (note: the Red Hat Linux installer has a step in which the user chooses whether to install LILO in the MBR or in the boot sector).
Because LILO is too large to fit in the MBR, it is divided into two parts. The MBR (or the boot sector of the disk partition) contains a small bootstrap program, which the BIOS loads into memory starting at address 0x00007c00. This small program moves itself to address 0x0009a000, sets up the real-mode stack, and finally loads the second part of the LILO boot loader (note: the real-mode stack ranges from 0x0009b000 to 0x0009a200).
The second part of LILO reads the list of available operating systems from disk and presents it to the user, who chooses the system to start. Once the user has made a choice, the boot loader loads the content of the corresponding partition into memory and runs it.
When the boot loader is invoked by the BIOS (a jump to physical address 0x00007c00), it performs the following operations, which are the same as those listed for the floppy case:
(1) It moves itself from address 0x00007c00 to address 0x00090000;
(2) It sets up the real-mode stack, starting from address 0x00003ff4;
(3) It sets up the disk parameter table, which is used by the floppy disk driver provided by the BIOS;
(4) It displays the "Loading Linux" message by calling a BIOS routine;
(5) It calls a BIOS routine to load the kernel's setup() code from the floppy disk and places it in memory starting at address 0x00090200;
(6) It calls a BIOS routine to load the rest of the kernel image from the floppy disk and places it in memory starting at either 0x00010000 or 0x00100000;
(7) Finally, it jumps to the setup() code.
Analysis of the Linux Startup Process
Taking Red Hat 9.0 on i386 as an example, this article analyzes the Linux startup process from the moment the user powers on the machine until the command-line prompt appears on the screen. It also introduces the files involved in startup.
Reading the Linux source code is undoubtedly the best way to learn more about Linux. This article also tries to analyze the startup process from the source code perspective, so it touches on some of the relevant Linux source code. The startup code of Linux is written mainly in C, with a small amount of assembly. A large number of shell scripts (mainly bash) are also executed during startup. For ease of reading, the startup process is introduced part by part below. In outline, it proceeds as follows:
When the user powers on the PC, the BIOS runs its power-on self-test and boots from the boot device configured in the BIOS (usually the hard disk). The boot loader installed on that device, LILO or GRUB, then starts booting Linux. Linux first performs the kernel boot and then executes the init program. init calls the rc.sysinit and rc programs; after they complete system initialization and start the services, control returns to init. init then starts mingetty, which opens the terminals so that users can log on. Once a user has logged on and entered the shell, the whole process from power-on to login is complete.
The following describes several key parts one by one:
Part 1: Kernel boot
Red Hat 9.0 can use a boot loader such as LILO or GRUB to boot the Linux system. After the boot loader has done its job, Linux takes over control of the CPU, and the CPU begins executing the Linux kernel image code, which starts the Linux startup process proper. Several assembly programs are used at this stage. They are the files under "arch/i386/boot" in the Linux source tree: bootsect.S, setup.S, and video.S.
bootsect.S is the source code that generates the boot sector. After it is loaded, it jumps directly to the entry point of setup.S. The main function of setup.S is to copy system parameters (including memory and disk parameters returned by the BIOS) to a special area of memory so that code running in protected mode can read them later. In addition, setup.S includes the code in video.S to detect and set the display adapter and display mode. Finally, setup.S switches the system to protected mode and jumps to 0x100000.
So what code is stored at memory address 0x100000, and where does it come from?
Memory address 0x100000 holds the decompressed kernel. Because the kernel provided by Red Hat contains a large number of drivers and features, it is built with "make bzImage", which produces a compressed kernel; in Red Hat the kernel file is usually named vmlinuz. Early in the boot process, head.S in "arch/i386/boot/compressed" calls the decompress_kernel() function defined in misc.c to decompress the vmlinuz kernel to 0x100000.
When the CPU jumps to 0x100000, startup_32 in "arch/i386/kernel/head.S" is executed. This is the entry point of vmlinux, and it then jumps to start_kernel(). start_kernel() is a function defined in "init/main.c". It calls a series of initialization functions to set up the kernel itself and does a lot of work to build the basic Linux kernel environment. Once start_kernel() has executed successfully, the basic Linux kernel environment is in place.
At the end of start_kernel(), the system creates the first kernel thread by calling the init() function, which starts the init process. The init() kernel thread is mainly responsible for initializing peripherals; it calls do_basic_setup() to load and initialize peripheral devices and their drivers, and it completes file system initialization and mounts the root file system.
When do_basic_setup() returns to init(), init() opens the /dev/console device and redirects the three standard files stdin, stdout, and stderr to the console. Finally, it searches the file system for the init program (or for the program specified by the init= kernel command-line parameter) and uses the execve() system call to load and execute it. At that point the init() function ends, and so does the kernel boot stage.
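The search for the init program can be sketched as follows. This is a Python illustration, not kernel code. The fallback list (/sbin/init, /etc/init, /bin/init, /bin/sh) reflects my reading of init/main.c in 2.4-era kernels and should be treated as an assumption, as should the helper names choose_init and init_from_cmdline and the sample command lines.

```python
from typing import Callable, Optional

# Assumed fallback order, based on 2.4-era init/main.c.
FALLBACK_INIT_PATHS = ["/sbin/init", "/etc/init", "/bin/init", "/bin/sh"]


def init_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the value of an init= parameter on the kernel command line, if any."""
    for word in cmdline.split():
        if word.startswith("init="):
            return word[len("init="):]
    return None


def choose_init(cmdline: str, is_executable: Callable[[str], bool]) -> str:
    """Pick the first candidate that can be executed, trying init= first."""
    candidates = []
    requested = init_from_cmdline(cmdline)
    if requested:
        candidates.append(requested)
    candidates.extend(FALLBACK_INIT_PATHS)
    for path in candidates:
        if is_executable(path):   # stands in for a successful execve()
            return path
    raise RuntimeError("No init found")


print(choose_init("ro root=/dev/hda1", lambda p: p == "/sbin/init"))
print(choose_init("ro root=/dev/hda1 init=/bin/bash",
                  lambda p: p in {"/bin/bash", "/sbin/init"}))
```

Passing the existence check in as a function keeps the sketch free of any dependency on the real file system.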
Part 2: Run init
The init process has process ID 1, which shows that init is the ancestor of all other processes in the system. After Linux completes the kernel boot, it starts running the init program. init reads the configuration file /etc/inittab, a plain (non-executable) text file made up of a number of entry lines. On a Red Hat system, the contents of inittab are as follows (comments beginning with "###" were added by the author):
#
# inittab       This file describes how the INIT process should set up
#               the system in a certain run-level.
#
# Author:       Miquel van Smoorenburg,
#               Modified for RHS Linux by Marc Ewing and Donnie Barnes
#
# Default runlevel. The runlevels used by RHS are:
#   0 - halt (Do NOT set initdefault to this)
#   1 - Single user mode
#   2 - Multiuser, without NFS (The same as 3, if you do not have networking)
#   3 - Full multiuser mode
#   4 - unused
#   5 - X11
#   6 - reboot (Do NOT set initdefault to this)
#
### The default runlevel is 5 (initdefault)
id:5:initdefault:
### Run the /etc/rc.d/rc.sysinit script automatically at startup (sysinit)
# System initialization.
si::sysinit:/etc/rc.d/rc.sysinit
l0:0:wait:/etc/rc.d/rc 0
l1:1:wait:/etc/rc.d/rc 1
l2:2:wait:/etc/rc.d/rc 2
l3:3:wait:/etc/rc.d/rc 3
l4:4:wait:/etc/rc.d/rc 4
### At runlevel 5, run the /etc/rc.d/rc script with 5 as its argument; init waits for it to return (wait)
l5:5:wait:/etc/rc.d/rc 5
l6:6:wait:/etc/rc.d/rc 6
### Allow the system to be restarted with CTRL-ALT-DELETE
# Trap CTRL-ALT-DELETE
ca::ctrlaltdel:/sbin/shutdown -t3 -r now
# When our UPS tells us power has failed, assume we have a few minutes
# of power left.  Schedule a shutdown for 2 minutes from now.
# This does, of course, assume you have powerd installed and your
# UPS connected and working correctly.
pf::powerfail:/sbin/shutdown -f -h +2 "Power Failure; System Shutting Down"
# If power was restored before the shutdown kicked in, cancel it.
pr:12345:powerokwait:/sbin/shutdown -c "Power Restored; Shutdown Cancelled"
### At runlevels 2, 3, 4, and 5, run /sbin/mingetty with ttyX as its argument to open terminal ttyX for user login;
### if the process exits, mingetty is run again (respawn)
# Run gettys in standard runlevels
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2
3:2345:respawn:/sbin/mingetty tty3
4:2345:respawn:/sbin/mingetty tty4
5:2345:respawn:/sbin/mingetty tty5
6:2345:respawn:/sbin/mingetty tty6
### At runlevel 5, run xdm to provide a graphical login interface, and run it again when it exits (respawn)
# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon
The inittab file above serves as an example for describing the inittab format. Lines beginning with # are comments. Every other line has the following format:
id:runlevels:action:process
The detailed explanations of the above items are as follows:
1. id
id is the entry identifier, a short string. For getty, mingetty, and other login entries, the id must match the tty number; otherwise the getty program will not work properly.
2. runlevels
runlevels identifies the init runlevels to which the entry applies. Runlevels are normally 0-6 plus S or s. Runlevels 0, 1, and 6 are reserved by the system: 0 is used to halt, 1 to enter single-user mode, and 6 to reboot. S and s mean the same thing, single-user mode; they do not need the inittab file and therefore do not appear in it. In fact, when entering single-user mode, init runs /sbin/sulogin directly on the console (/dev/console). In typical implementations, runlevels 2, 3, 4, and 5 are used for normal operation. On Red Hat, 2 means multi-user mode without NFS, 3 means full multi-user mode (the most commonly used level), 4 is reserved for user customization, and 5 means graphical login mode using xdm. Runlevels 7-9 may also be used, although traditional UNIX systems do not define them. The field may list several runlevels side by side so that the entry matches all of them. For most actions, the entry is executed only when the field matches the current runlevel.
3. action
action describes how the process in the entry is run. Possible values include initdefault, sysinit, boot, bootwait, wait, and respawn:
initdefault is a special action that identifies the default runlevel. When init is started by the kernel, it reads the initdefault entry in inittab and takes its runlevel as the current runlevel. If there is no inittab file, or it contains no initdefault entry, init asks for a runlevel on the console.
Actions such as sysinit, boot, and bootwait run unconditionally when the system starts, regardless of the runlevel field.
The other actions (apart from initdefault) are tied to particular runlevels. Each action is described in detail in the inittab man page.
4. process
process is the program to execute. It may be followed by arguments.
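The four-field format lends itself to a short parsing sketch. The Python below is an illustration under simplifying assumptions, not how init itself is implemented: the names InittabEntry, parse_inittab, default_runlevel, and entries_for_runlevel are invented here, and the selection rule covers only the cases described above (unconditional boot-time actions plus entries whose runlevel field contains the current level).

```python
from typing import List, NamedTuple, Optional


class InittabEntry(NamedTuple):
    id: str
    runlevels: str
    action: str
    process: str


ALWAYS_AT_BOOT = {"sysinit", "boot", "bootwait"}


def parse_inittab(text: str) -> List[InittabEntry]:
    """Parse id:runlevels:action:process lines, skipping comments and blanks."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":", 3)   # the process field may itself contain colons
        if len(fields) == 4:
            entries.append(InittabEntry(*fields))
    return entries


def default_runlevel(entries: List[InittabEntry]) -> Optional[str]:
    """Return the runlevel named by the initdefault entry, if there is one."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None


def entries_for_runlevel(entries: List[InittabEntry], level: str) -> List[InittabEntry]:
    """Select boot-time entries plus those whose runlevel field lists `level`."""
    return [e for e in entries
            if e.action != "initdefault"
            and (e.action in ALWAYS_AT_BOOT or level in e.runlevels)]


SAMPLE = """\
# System initialization.
id:5:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l3:3:wait:/etc/rc.d/rc 3
l5:5:wait:/etc/rc.d/rc 5
1:2345:respawn:/sbin/mingetty tty1
"""

entries = parse_inittab(SAMPLE)
level = default_runlevel(entries)
print(level, [e.id for e in entries_for_runlevel(entries, level)])
```

Entries such as ctrlaltdel or powerfail have an empty runlevel field and are triggered by events, so this simplified selection leaves them out.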
Part 3: System initialization
There is the following line in init's configuration file:
si::sysinit:/etc/rc.d/rc.sysinit
This line runs /etc/rc.d/rc.sysinit, a bash shell script that carries out system initialization. rc.sysinit is an important script that is run at every runlevel. Its main jobs include activating swap partitions, checking disks, loading hardware modules, and other tasks that must be done first.
rc.sysinit has more than 850 lines, but each individual function is fairly simple and commented. Interested readers are encouraged to read the file on their own machines to learn more about system initialization. Because the file is long, it is not listed or described in detail here.
After rc.sysinit has finished, control returns to init, which continues with the next step.
Part 4: Start the daemons for the runlevel
After rc.sysinit has run, control returns to init, which continues with its other actions. Normally the /etc/rc.d/rc program is executed next. Taking runlevel 5 as an example, init executes the following line of the inittab configuration file:
l5:5:wait:/etc/rc.d/rc 5
This line runs /etc/rc.d/rc with 5 as its argument. /etc/rc.d/rc is a shell script that takes 5 as its argument and executes all the rc startup scripts in the /etc/rc.d/rc5.d/ directory. The entries in /etc/rc.d/rc5.d/ are actually symbolic links rather than the real rc startup scripts, which are stored in the /etc/rc.d/init.d/ directory. These rc scripts are all used in a similar way: they generally accept arguments such as start, stop, restart, and status.
The rc startup scripts in /etc/rc.d/rc5.d/ are links whose names begin with K or S. A link beginning with S is run with the start argument. For a link beginning with K, if the corresponding service is already running (indicated by a file under /var/lock/subsys/), the script is run with the stop argument to stop the daemon. This ensures that the relevant daemons are stopped and started as needed when init changes the runlevel.
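The way rc walks the K and S links can be sketched in Python. This is a simplified model under stated assumptions, not the actual shell script: it assumes link names of the form K or S followed by a two-digit priority and the service name, it processes K links before S links, and it skips starting a service that is already running. The function name plan_rc_actions and the sample link names are made up for the example.

```python
from typing import Iterable, List, Set, Tuple


def plan_rc_actions(links: Iterable[str], running: Set[str]) -> List[Tuple[str, str]]:
    """Return (service, argument) pairs in the order a simplified rc would run them."""
    ordered = sorted(links)          # the two-digit priority determines the order
    actions = []
    # K links: stop only services that are currently running
    # (a running service is one with a file under /var/lock/subsys/).
    for name in ordered:
        if name.startswith("K"):
            service = name[3:]       # drop "K" and the two-digit priority
            if service in running:
                actions.append((service, "stop"))
    # S links: start services that are not yet running.
    for name in ordered:
        if name.startswith("S"):
            service = name[3:]
            if service not in running:
                actions.append((service, "start"))
    return actions


links = ["S90crond", "K20nfs", "S10network", "K45named"]
print(plan_rc_actions(links, running={"nfs", "network"}))
```

In this example nfs is stopped because it is running, named is left alone because it is not, network is already up, and crond is started.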
You can use chkconfig, or "System services" in setup, to choose which daemons run at each runlevel. Common daemons include:
amd: automounter daemon, which mounts NFS file systems automatically
apmd: Advanced Power Management daemon
arpwatch: logs and builds a database of the Ethernet address and IP address pairs seen on a LAN interface
autofs: manages the automount process, which is related to NFS and relies on NIS
crond: the daemon for scheduled tasks in Linux
named: DNS server
netfs: mounts NFS, Samba, and NetWare network file systems
network: script that activates the configured network interfaces
nfs: enables the NFS service
portmap: RPC portmap manager, which manages RPC-based connections
sendmail: the Sendmail mail server
smb: Samba file sharing and printing service
syslog: script that starts the syslogd and klogd system logging daemons at boot
xfs: X font server, which provides fonts to local and remote X servers
xinetd: core daemon supporting multiple network services, managing services such as wu-ftpd, sshd, and telnet
Once these daemons have been started, the rc program finishes and control returns to init, which continues with the next step.
Part 5: Set up the terminals
After rc has finished, control returns to init. By now the basic system environment has been set up and the various daemons have been started. init next opens six terminals so that users can log on. You can switch among them by pressing Alt+Fn (n from 1 to 6). The following six lines in inittab define the six terminals:
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2
3:2345:respawn:/sbin/mingetty tty3
4:2345:respawn:/sbin/mingetty tty4
5:2345:respawn:/sbin/mingetty tty5
6:2345:respawn:/sbin/mingetty tty6
As the lines above show, mingetty is run in respawn mode at runlevels 2, 3, 4, and 5. mingetty opens the terminal and sets its mode, and it displays the familiar text login screen. On this screen the user is prompted for a user name, which is then passed as an argument to the login program to verify the user's identity.
Part 6: Log on to the system; startup is complete
Users at runlevel 5 log on through a graphical login screen and, after a successful logon, go directly into KDE, GNOME, or another window environment. This article concentrates on text-mode logon:
When the mingetty login screen appears, we can enter a user name and password to log on to the system.
In Linux, the account verification program is login, which receives the user name from mingetty as an argument. login then examines the user name. If the user is not root and the file /etc/nologin exists, login prints the contents of nologin and exits. This is usually used to keep non-root users from logging on during system maintenance. root may log on only from terminals listed in /etc/securetty; if that file does not exist, root can log on from any terminal. The /etc/usertty file is used to impose additional access restrictions on users; if it does not exist, there are no further restrictions.
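The nologin and securetty rules described above can be summarized in a small decision function. This is a sketch of the logic as described in this article, not the real login source: the function name login_allowed, its parameters, and the refusal message are hypothetical, and the /etc/usertty restrictions are left out.

```python
from typing import Optional, Set, Tuple


def login_allowed(user: str, tty: str,
                  nologin_text: Optional[str],
                  securetty: Optional[Set[str]]) -> Tuple[bool, str]:
    """Apply the /etc/nologin and /etc/securetty rules.

    nologin_text is the content of /etc/nologin, or None if the file is absent.
    securetty is the set of terminals in /etc/securetty, or None if absent.
    """
    if user != "root" and nologin_text is not None:
        return False, nologin_text            # print the file's content and exit
    if user == "root" and securetty is not None and tty not in securetty:
        return False, "root login refused on this terminal"
    return True, ""


print(login_allowed("alice", "tty1", "System maintenance in progress", None))
print(login_allowed("root", "tty1", "System maintenance in progress", {"tty1"}))
```

During maintenance, the first call is refused with the nologin message while root on a listed terminal is still let through.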
After examining the user name, login looks it up in /etc/passwd and /etc/shadow to verify the password and to set up the rest of the account information, such as the home directory and the shell to use. If no home directory is specified, the root directory (/) is used by default. If no shell is specified, /bin/sh is used by default (on Red Hat, /bin/sh is a link to bash).
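A sketch of how the home directory and shell defaults might be applied when reading an /etc/passwd line is shown below. It assumes the usual seven colon-separated fields (name, password, UID, GID, comment, home directory, shell). The default shell constant follows the login(1) manual page as I recall it, which names /bin/sh, and is an assumption of this sketch. The function name account_settings and the sample accounts are hypothetical.

```python
DEFAULT_HOME = "/"
DEFAULT_SHELL = "/bin/sh"   # assumption: the fallback named in login(1)


def account_settings(passwd_line: str) -> dict:
    """Extract account settings from one /etc/passwd line, applying defaults."""
    fields = passwd_line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    name, _password, uid, gid, _comment, home, shell = fields
    return {
        "name": name,
        "uid": int(uid),
        "gid": int(gid),
        "home": home or DEFAULT_HOME,     # empty field: fall back to /
        "shell": shell or DEFAULT_SHELL,  # empty field: fall back to the default shell
    }


print(account_settings("alice:x:500:500:Alice:/home/alice:/bin/bash"))
print(account_settings("bob:x:501:501:::"))
```

The second call shows both defaults being applied to an entry with empty home and shell fields.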
After a successful logon, login prints the last logon information (recorded in /var/log/lastlog) on the terminal and checks whether the user has new mail (in the file named after the user under /usr/spool/mail). It then sets up the environment. For bash, the system first looks for the /etc/profile script and runs it; then, if a .bash_profile file exists in the user's home directory, it is run as well. These files may in turn call other configuration files. Once all the configuration files have been executed and the environment variables are set, the familiar command-line prompt appears, and the entire startup process is complete.
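The order in which a bash login shell reads its startup files can be sketched as below. The article mentions only /etc/profile and .bash_profile; the fallbacks to .bash_login and .profile come from the bash manual and are an addition of this sketch. The function name login_startup_files and the sample paths are made up.

```python
import posixpath
from typing import Callable, List


def login_startup_files(home: str, exists: Callable[[str], bool]) -> List[str]:
    """List the startup files a bash login shell would read, in order."""
    files = []
    if exists("/etc/profile"):
        files.append("/etc/profile")
    # Only the first of these per-user files that exists is read.
    for name in (".bash_profile", ".bash_login", ".profile"):
        path = posixpath.join(home, name)
        if exists(path):
            files.append(path)
            break
    return files


present = {"/etc/profile", "/home/alice/.bash_profile", "/home/alice/.profile"}
print(login_startup_files("/home/alice", lambda p: p in present))
```

Files such as .bashrc are typically pulled in from .bash_profile itself, which is one example of the "other configuration files" mentioned above.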
We hope this analysis of the Linux startup process helps readers who want to learn more about it and encourages further study of how Linux works once it is running.