Overall analysis of GRUB
In general, GRUB can be regarded as a miniature operating system with its own shell, scripting, and file system support. stage1 and stage1.5 act as a boot loader, while stage2 is the operating system itself, although one dedicated to booting other operating systems. For that purpose, stage2 provides built-in commands such as kernel, initrd, and chainloader.
3.1 Two ways for GRUB to boot an operating system
3.1.1 Direct booting
GRUB can boot Linux, FreeBSD, NetBSD, and OpenBSD directly. Other operating systems must be started by chain-loading [6].
In general, GRUB boots an operating system directly in the following steps:
(1) Use the 'root' command to set GRUB's root device to the partition where the operating system image is stored.
(2) Use the 'kernel' command to load the kernel image of the operating system.
(3) If modules are required, use the 'module' command to load them.
(4) Run the 'boot' command.
Linux, FreeBSD, NetBSD, and OpenBSD are all started in the same way: load the kernel image with the 'kernel' command and then run the 'boot' command. If the kernel needs parameters, append them to the 'kernel' command line.
3.1.2 Chain-loading
To start an operating system that GRUB does not support directly (for example, Windows 95), use chain-loading. This normally requires that the operating system and its own boot loader are installed in the same partition.
The main steps are as follows:
(1) Use the 'rootnoverify' command to set GRUB's root device to the partition.
grub> rootnoverify (hd0,0)
(2) Use the 'makeactive' command to set the 'active' flag on the partition.
grub> makeactive
(3) Use the 'chainloader' command to load the boot loader.
grub> chainloader +1
'+1' tells GRUB to read one sector from the start of the partition.
(4) Run the 'boot' command.
3.2 Brief analysis of how GRUB boots an operating system
3.2.1 From power-on to GRUB starting the operating system
(1) The BIOS executes int 0x19, loads the MBR to 0x7c00, and jumps to it. If GRUB is installed to the MBR, the GRUB installer copies stage1 (512 bytes) into the MBR. Depending on the installation, the installer embeds the disk location of either stage1_5 or stage2 in stage1.
(2) stage1 starts executing. It loads stage1_5 or stage2 directly and jumps to it. Either way, stage2 ends up running.
(3) The small operating system stage2 is finally running. It switches the system into protected mode and sets up the C runtime environment (mainly the BSS). It first looks for the configuration file (our menu.lst). If there is no such file, it starts a shell and waits for commands. GRUB's job is then a loop: read a command, parse it, execute it. Since stage2 exists to load other operating systems, once everything is in place, executing the boot command transfers control to the loaded system.
3.2.2 Main boot images of GRUB
GRUB consists of the following boot images: two essential stage files, optional stage files called "stage 1.5", and two network boot images. Here is a brief overview of each.
stage1
This is an essential image for booting GRUB. It is normally embedded in the MBR or in the boot sector of a partition. Because a PC boot sector is 512 bytes, this image is exactly 512 bytes after compilation.
All stage1 does is load stage 2 or stage 1.5 from a local disk. Because of its size limit, stage1 encodes the location of stage 2 or stage 1.5 as a block list, so it never needs to understand any file system.
stage2
This is the core image of GRUB. It does everything except boot itself. It is usually stored in a file system, but this is not required.
e2fs_stage1_5
fat_stage1_5
ffs_stage1_5
jfs_stage1_5
minix_stage1_5
reiserfs_stage1_5
vstafs_stage1_5
xfs_stage1_5
These files are called stage 1.5 because they serve as a bridge between stage1 and stage2: stage1 loads stage1.5, and stage1.5 loads stage2.
The difference between stage1 and stage1.5 is that the former does not understand any file system, while the latter understands one (for example, 'e2fs_stage1_5' understands ext2fs). You can therefore safely move stage2 to another location, even after GRUB has been installed.
nbgrub
This is a network boot image used by an Ethernet boot loader. It is similar to stage2, but it also sets up the network and then loads the configuration file over the network [7].
pxegrub
This is another network boot image.
Apart from its format, it is identical to 'nbgrub'.
4 Analysis of the stage1 module
The stage1 module is the boot module of the whole boot loader and the first module in the transition from power-on to GRUB. Its source file is stage1/stage1.S in the source tree. After compilation it becomes a 512-byte image that is written to cylinder 0, head 0, sector 1 of the hard disk as the disk's master boot sector.
4.1 Analysis of stage1.h
This file mainly defines the constants used in stage1.S.
These constants are analyzed below:
/* GRUB version numbers, defined here so that stage1 can identify them. */
#define COMPAT_VERSION_MAJOR 3
#define COMPAT_VERSION_MINOR 2
#define COMPAT_VERSION ((COMPAT_VERSION_MINOR << 8) \
                        | COMPAT_VERSION_MAJOR)
/* Signature in the last two bytes of the MBR */
#define STAGE1_SIGNATURE 0xaa55
/* Offset of the end of the BPB (BIOS Parameter Block), which describes the low-level parameters of the drive */
#define STAGE1_BPBEND 0x3e
/* Offset of the major version number */
#define STAGE1_VER_MAJ_OFFS 0x3e
/* Offset of the stage1 boot drive */
#define STAGE1_BOOT_DRIVE 0x40
/* Offset of the force-LBA flag */
#define STAGE1_FORCE_LBA 0x41
/* Offset of the stage2 address */
#define STAGE1_STAGE2_ADDRESS 0x42
/* Offset of the stage2 sector */
#define STAGE1_STAGE2_SECTOR 0x44
/* Offset of the stage2 segment */
#define STAGE1_STAGE2_SEGMENT 0x48
/* Offset of the magic number used by Windows NT */
#define STAGE1_WINDOWS_NT_MAGIC 0x1b8
/* Offset of the start of the partition table */
#define STAGE1_PARTSTART 0x1be
/* Offset of the end of the partition table */
#define STAGE1_PARTEND 0x1fe
/* Stack segment of stage1 */
#define STAGE1_STACKSEG 0x2000
/* Disk buffer segment. The disk buffer must be 32 KB long and must not cross a 64 KB boundary. */
#define STAGE1_BUFFERSEG 0x7000
/* Address of the drive parameters */
#define STAGE1_DRP_ADDR 0x7f00
/* Size of the drive parameters */
#define STAGE1_DRP_SIZE 0x42
/* Flag in the BIOS drive number that distinguishes a hard disk from a floppy disk */
#define STAGE1_BIOS_HD_FLAG 0x80
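To make these offsets concrete, the following Python sketch reads the corresponding fields out of a 512-byte boot sector. It is only an illustration: the function names parse_stage1 and build_example_sector are hypothetical, and the field widths (one byte each for the version numbers, the boot drive, and the LBA flag, then a 16-bit address, a 32-bit sector, and a 16-bit segment) are assumptions inferred from the gaps between the offsets.

```python
import struct

# Offsets and signature taken from the stage1.h constants.
STAGE1_VER_MAJ_OFFS = 0x3E
STAGE1_STAGE2_ADDRESS = 0x42
STAGE1_PARTEND = 0x1FE
STAGE1_SIGNATURE = 0xAA55


def parse_stage1(sector):
    """Decode the stage1 header fields from a 512-byte boot sector."""
    if len(sector) != 512:
        raise ValueError("a boot sector must be exactly 512 bytes")
    # Four single-byte fields at 0x3e..0x41 (assumed widths).
    major, minor, boot_drive, force_lba = struct.unpack_from(
        "<4B", sector, STAGE1_VER_MAJ_OFFS)
    # 16-bit address at 0x42, 32-bit sector at 0x44, 16-bit segment at 0x48.
    address, start_sector, segment = struct.unpack_from(
        "<HIH", sector, STAGE1_STAGE2_ADDRESS)
    (signature,) = struct.unpack_from("<H", sector, STAGE1_PARTEND)
    return {
        "compat_version": (minor << 8) | major,
        "boot_drive": boot_drive,
        "force_lba": force_lba,
        "stage2_address": address,
        "stage2_sector": start_sector,
        "stage2_segment": segment,
        "signature_ok": signature == STAGE1_SIGNATURE,
    }


def build_example_sector():
    """Build a made-up sector holding the default values named in the text."""
    buf = bytearray(512)
    struct.pack_into("<4B", buf, STAGE1_VER_MAJ_OFFS, 3, 2, 0xFF, 0)
    struct.pack_into("<HIH", buf, STAGE1_STAGE2_ADDRESS, 0x8000, 1, 0x800)
    struct.pack_into("<H", buf, STAGE1_PARTEND, STAGE1_SIGNATURE)
    return bytes(buf)


if __name__ == "__main__":
    for name, value in parse_stage1(build_example_sector()).items():
        print(name, value)
```

The example sector is synthetic; pointing parse_stage1 at the first 512 bytes of a real disk image would only be meaningful if that disk actually carries a GRUB Legacy stage1.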
4.2 Analysis of stage1.S
The file begins by defining some macros.
#define ABS(x) (x-_start+0x7c00)
This macro computes an absolute address. Because the MBR is loaded at 0x7c00, the absolute address of symbol x is its offset from _start plus 0x7c00. Computing addresses this way makes the code independent of the linker.
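The arithmetic behind ABS can be sketched in a few lines of Python. The function name abs_address and the sample symbol values are hypothetical and serve only to show how a symbol's distance from _start is rebased onto the load address.

```python
def abs_address(symbol, start, load_base=0x7C00):
    """Mirror ABS(x): distance of the symbol from _start plus the load base."""
    return symbol - start + load_base


if __name__ == "__main__":
    # A symbol 0x40 bytes after _start ends up at 0x7c40 once the MBR is loaded.
    print(hex(abs_address(0x1040, 0x1000)))
```

The load base is a parameter because the same formula works for any image whose load address is known in advance.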
#define MSG(x) movw $ABS(x), %si;
This macro loads the address of a string into SI so that the string can be displayed.
Execution starts at the _start entry point, located at CS:IP = 0:0x7c00 in memory. A series of variables is defined next: the sector, head, and cylinder counts and their start positions, as well as the stage1 version number. The boot_drive variable selects the disk from which stage2 is loaded; if it is 0xff, stage2 is loaded from the default boot drive. The stage2 load address is 0x8000, the load segment is 0x800, and the starting sector number is 1. In other words, stage2 starts at cylinder 0, head 0, sector 2 [8].
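The segment 0x800 and the address 0x8000 name the same place in memory, because in real mode a segment value is shifted left by four bits before the offset is added. A minimal Python sketch of that rule follows; the function name linear_address is hypothetical.

```python
def linear_address(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


if __name__ == "__main__":
    print(hex(linear_address(0x0000, 0x7C00)))  # where the BIOS puts the MBR
    print(hex(linear_address(0x0800, 0x0000)))  # stage2 segment 0x800
```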
The real work starts at the real_start entry. First the data segment and stack segment registers are set to 0, and the stage1 stack pointer is set to STAGE1_STACKSEG, that is, 0x2000. Interrupts are then enabled. Next, the code checks whether a boot disk has been forced, that is, whether boot_drive differs from 0xff. If it does, the configured disk number is copied into the DL register. DL is then pushed onto the stack for safekeeping, and the text "GRUB" is displayed on the screen. The code then checks whether the disk is a floppy disk; if so, there is no need to probe for LBA support. Otherwise it checks whether the boot disk supports LBA mode. From here the program splits into two paths, one for LBA mode and one for CHS mode.
LBA stands for Logical Block Addressing [9]. It is an addressing technique for disk devices that uses a logical mapping to specify sectors on the drive. Among the interfaces used by personal computers today, both Enhanced IDE and SCSI use logical block addressing. Traditional hard disk addressing uses physical addressing (physical mapping), in which the actual structure of the disk determines the structure of a block address. When physical addressing was designed, hard disks held only 5, 10, or 20 MB, so the scheme allows at most 1024 cylinders, 16 heads, and 63 sectors. With 512 bytes per sector, physical addressing can reach at most 512 × 63 × 1024 × 16 = 528482304 bytes (528 MB) of disk space. As magnetic storage technology improved and disk capacities grew substantially, this limit forced users to split a hard disk into several partitions, which was inconvenient.
Hardware vendors therefore developed the LBA logical addressing mode. The computer system no longer records the actual physical location of data on the disk; instead, the IDE controller and the BIOS translate the addresses. After translation, the first sector of the first track of the first cylinder is numbered 0, the second sector is numbered 1, and so on. Suppose one track holds 2000 sectors: the 2000th sector is numbered 1999, the first sector of the second track is numbered 2000, and the numbering continues linearly. A hard disk addressed in logical blocks can have up to 16383 cylinders, 16 heads, and 63 sectors per track of 512 bytes each, so the supported disk space is 512 × 63 × 16383 × 16 = 8455200768 bytes (8.4 GB) [10].
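The linear numbering can be written as a small formula. The Python sketch below uses the conventional CHS-to-LBA mapping with the hypothetical function name chs_to_lba; the geometry of 2000 sectors per track is the illustrative figure from the paragraph above, and the head count of 16 is an assumption that does not affect the first cylinder.

```python
def chs_to_lba(cylinder, head, sector, heads=16, sectors_per_track=2000):
    """Map a cylinder/head/sector triple to a linear block number.

    Sectors are numbered from 1 within a track; cylinders, heads, and the
    resulting block number are numbered from 0.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector number out of range for this geometry")
    if not 0 <= head < heads:
        raise ValueError("head number out of range for this geometry")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


if __name__ == "__main__":
    print(chs_to_lba(0, 0, 1))     # first sector of the first track
    print(chs_to_lba(0, 0, 2000))  # last sector of the first track
    print(chs_to_lba(0, 1, 1))     # first sector of the second track
```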
The ATA interface specification defines 28-bit addressing and can therefore support 2^28 × 512 bytes = 137 GB. Unfortunately, the BIOS did not keep pace: it uses 24-bit addressing. The root fix was to change the BIOS support for interrupt 13h, so later BIOSes implement an enhanced version of interrupt 13h. It uses 64-bit addressing for hard disks in one step and can therefore support up to 2^64 × 512 bytes, about 9.4 × 10^21 bytes, which is roughly a trillion times 8.4 GB [11].
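The capacity limits in this discussion all follow from multiplying the number of addressable sectors by the 512-byte sector size. The Python sketch below recomputes them; the function names chs_capacity and lba_capacity are hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector, as assumed throughout the text


def chs_capacity(cylinders, heads, sectors_per_track):
    """Bytes addressable with a given cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE


def lba_capacity(address_bits):
    """Bytes addressable with a linear block address of the given width."""
    return (2 ** address_bits) * SECTOR_SIZE


if __name__ == "__main__":
    print(chs_capacity(1024, 16, 63))   # original physical addressing limit
    print(chs_capacity(16383, 16, 63))  # limit with 16383 cylinders
    print(lba_capacity(28))             # 28-bit ATA addressing
    print(lba_capacity(64))             # 64-bit enhanced INT 13h addressing
```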
To read data in LBA mode, the code first fills in the parameters needed for the upcoming INT 13h call. It uses function 0x42 of INT 13h to read disk contents into memory: AH is set to 0x42 as the function number, DL holds the disk number, and SI holds the offset of a block of disk access information that describes what to read (the number of sectors, the destination buffer, and the starting sector number). The program then invokes BIOS interrupt 13h to read the second sector of the boot disk into the buffer segment STAGE1_BUFFERSEG, which stage1.h defines as 0x7000, so the sector lands in the buffer at segment 0x7000. If the read succeeds, execution jumps to copy_buffer. If it fails, the code falls back to reading in CHS mode.
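The block of information that SI points to for function 0x42 is commonly called a disk address packet. As a rough illustration, the Python sketch below packs the standard 16-byte form of that packet; the function name build_disk_address_packet is hypothetical, and the values in the example (one sector, sector number 1, buffer segment 0x7000) are the ones mentioned in the text.

```python
import struct


def build_disk_address_packet(lba, count, segment, offset=0):
    """Pack a 16-byte disk address packet for INT 13h function 0x42.

    Layout (little-endian): packet size, reserved byte, number of blocks,
    buffer offset, buffer segment, 64-bit starting block number.
    """
    return struct.pack("<BBHHHQ", 0x10, 0, count, offset, segment, lba)


if __name__ == "__main__":
    packet = build_disk_address_packet(lba=1, count=1, segment=0x7000)
    print(packet.hex())
```

This only shows the data layout. Issuing the interrupt itself requires real-mode code such as stage1.S and cannot be done from Python.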
CHS mode differs in that it calls function 0x2 of BIOS interrupt 0x13. AH is set to 0x2 and AL to the number of sectors. Bits 6 and 7 of CL combined with CH form the cylinder number, bits 0-5 of CL are the sector number, DH is the head number, and DL is the drive number (0x80 for the first hard disk, 0x0 for the first floppy drive). ES:BX is the address of the data buffer. Functionally it matches the LBA path described above: it also reads the second sector into the buffer at segment 0x7000 and then jumps to copy_buffer.
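The register packing for function 0x2 is easy to get wrong, so here is a small Python sketch of it. The function name pack_chs_registers is hypothetical, and the code only computes the register values; it does not call the BIOS.

```python
def pack_chs_registers(cylinder, head, sector, drive, count=1):
    """Compute the register values for INT 13h function 0x2 (read sectors)."""
    if not 0 <= cylinder <= 1023:
        raise ValueError("cylinder must fit in 10 bits")
    if not 1 <= sector <= 63:
        raise ValueError("sector must be between 1 and 63")
    if not 0 <= head <= 255:
        raise ValueError("head must fit in 8 bits")
    ch = cylinder & 0xFF                     # low 8 bits of the cylinder
    cl = ((cylinder >> 8) & 0x3) << 6        # high 2 bits go to CL bits 6-7
    cl |= sector & 0x3F                      # sector number in CL bits 0-5
    return {"AH": 0x02, "AL": count, "CH": ch, "CL": cl,
            "DH": head, "DL": drive}


if __name__ == "__main__":
    # Second sector of the first hard disk: cylinder 0, head 0, sector 2.
    print(pack_chs_registers(0, 0, 2, drive=0x80))
```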
Finally, copy_buffer copies the sector that was read to stage2_address, that is, to 0x8000.
4.3 Functions of the stage1 module
Because of its size limit, stage1 does relatively little. The BIOS first loads it to 0x7c00 in memory. It then calls BIOS INT 13h to read the second sector of the boot drive into the buffer at segment 0x7000, and finally calls copy_buffer to copy that sector to 0x8000. The second sector contains the start.S module analyzed below.
5 Analysis of the start module
The previous chapter shows that stage1 does the job required of an MBR. However, GRUB does not load its core directly from stage1; instead, stage1 loads another module to 0x8000. Source code analysis shows that this module is the second one to be analyzed, namely start.S.
5.1 Functional analysis of the start.S module
The file begins by defining some macros.
#ifdef STAGE1_5
# define ABS(x) (x-_start+0x2000)
#else
# define ABS(x) (x-_start+0x8000)
#endif
As can be seen, if STAGE1_5 is defined the program's start address is 0x2000; otherwise it is exactly 0x8000. This confirms that the code stage1 loads into memory is the 512-byte image compiled from start.S. The STAGE1_5 case is not analyzed here and is skipped for now.
The macro "#define MSG(x) movw $ABS(x), %si;" is used to display strings on the screen.
Next comes the program entry point _start. Because start.S is loaded into memory by stage1, its start address is 0x8000, and it still uses the registers and variables left behind by stage1. If STAGE1_5 is defined, "Loading stage1.5" is displayed on the screen; otherwise "Loading stage2" is displayed. The code then reads the number of sectors to load and enters the bootloop loop, which continues until the number of sectors left to read is 0. Inside the loop, the code determines the access mode supported by the disk in the same way as stage1 and branches accordingly: if the disk supports LBA mode, it jumps to lba_mode to read the sectors; otherwise it jumps to chs_mode and reads them in CHS mode. Each read first places the sectors in the buffer at segment 0x7000, and the copy_buffer subroutine then copies the buffer to the destination address, starting at 0x8200.
Like stage1.S, start.S has a data structure that records disk addresses. The difference is that stage1 has only one entry, whereas start.S keeps a list of addresses called the blocklist. Each node in the list describes one run of contiguous sectors.
lastlist:
        .word 0
        .word 0
        . = _start + 0x200 - BOOTSEC_LISTSIZE
        /* 0x200 is added because start.S compiles to a 512-byte image. */
        /* Initialize the first entry of the list with its defaults */
blocklist_default_start:
        .long 2         /* start sector: logical sector 2, the third sector on the disk */
blocklist_default_len:
        /* number of sectors to read */
#ifdef STAGE1_5
        .word 0         /* if STAGE1_5 is defined, read nothing by default */
#else
        .word (STAGE2_SIZE + 511) >> 9  /* read all sectors occupied by stage2 */
#endif
blocklist_default_seg:
#ifdef STAGE1_5
        .word 0x220     /* if STAGE1_5 is defined, load to segment 0x220 */
#else
        .word 0x820     /* if STAGE1_5 is not defined, load to segment 0x820 */
#endif
firstlist:
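To illustrate how such a list might be consumed, the Python sketch below rounds an image size up to whole sectors and walks the entries of a 512-byte start image. Several points are assumptions rather than facts taken from the listing: each entry is taken to be 8 bytes (a 4-byte start sector, a 2-byte length, a 2-byte segment), the list is taken to grow downward from offset 0x200, and an entry with length 0 is taken to end the list. The function names are hypothetical.

```python
import struct

SECTOR_SIZE = 512
BOOTSEC_LISTSIZE = 8  # assumed entry size: 4 + 2 + 2 bytes


def sectors_needed(size_in_bytes):
    """Same rounding as (STAGE2_SIZE + 511) >> 9."""
    return (size_in_bytes + 511) >> 9


def walk_blocklist(image):
    """Yield (start_sector, sector_count, load_address) for each entry."""
    offset = SECTOR_SIZE - BOOTSEC_LISTSIZE
    while offset >= 0:
        start, length, segment = struct.unpack_from("<IHH", image, offset)
        if length == 0:  # assumed terminator: nothing left to read
            break
        yield start, length, segment << 4
        offset -= BOOTSEC_LISTSIZE  # assumed: the list grows downward


def build_example_image(stage2_size):
    """Build a made-up 512-byte image holding only the default entry."""
    image = bytearray(SECTOR_SIZE)
    struct.pack_into("<IHH", image, SECTOR_SIZE - BOOTSEC_LISTSIZE,
                     2, sectors_needed(stage2_size), 0x820)
    return bytes(image)


if __name__ == "__main__":
    for entry in walk_blocklist(build_example_image(100000)):
        print(entry)
```

With the default entry, segment 0x820 corresponds to the load address 0x8200, which matches the destination address given in the text.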
After all the sectors have been read, the program enters the bootit block and jumps to the loaded code: to 0x2200 if STAGE1_5 is defined, or to 0x8200 otherwise.
5.2 Overview of the functions of the start module
The analysis of start.S shows that the start module loads the stage2 or stage1_5 module from disk into memory. stage2 is loaded at 0x8200; stage1_5 is loaded at 0x2200.