XV6 Learning: Lab 1, Booting a PC

Source: Internet
Author: User

Go to the Lab 1 page and download the lab files using git. Then skip the introduction and start with Part 1. This article draws on Jasonleaster's write-up for reference; thanks to him.


Part 1: PC Bootstrap


Following the introduction step by step, the QEMU window appears after running make qemu. (Remember to install QEMU first.)

If make qemu-nox is used instead, no separate QEMU window pops up and the output appears only in your terminal.


1. Some basic knowledge of the 8086 is involved here.


1)

The 8086 address bus (AB) is 20 bits wide, which gives an address space of 2^20 B = 1MB, from 0x00000 to 0xFFFFF.

The 8086 data bus (DB) is only 16 bits wide.


2)

So how can a 20-bit space be addressed with 16-bit registers? The answer is segmented addressing, written segment:offset, where the physical address is (segment << 4) + offset.

When laying out memory for a program, the 8086 divides it into a code segment, a data segment, a stack segment, and an extra segment. Their segment values are held in the 16-bit registers CS, DS, SS, and ES, as listed below.


General-purpose registers:

AX, BX, CX, and DX are called data registers:
AX (Accumulator): accumulator register;
BX (Base): base address register;
CX (Count): count register;
DX (Data): data register;

SP and BP are called pointer registers:
SP (Stack Pointer): stack pointer register;
BP (Base Pointer): base pointer register;

SI and DI are called index registers:
SI (Source Index): source index register;
DI (Destination Index): destination index register;


Control registers:

IP (Instruction Pointer): instruction pointer register;
FLAGS: flags register;


Segment registers:

CS (Code Segment): code segment register;
DS (Data Segment): data segment register;
SS (Stack Segment): stack segment register;
ES (Extra Segment): extra segment register;


For a more detailed introduction, see:

8086 register: Little BMW's dad

x86 Register: http://www.eecg.toronto.edu/~amza/www.mindsec.com/files/x86regs.html


2. 8086 power-on process (real mode):

Each time the power button is pressed, the CPU resets and everything starts over. On the 8086, all registers are 0 after reset except CS = 0xFFFF.
Applying segmented addressing,
CS:IP = 0xFFFF:0x0000, i.e. physical address 0xFFFF0.

So every time the computer starts, execution begins at 0xFFFF0. CS = 0xFFFF and IP = 0x0000 hold while the CPU is initializing itself; once initialization completes, the system is in real mode with CS = 0xF000 and IP = 0xFFF0. (One doubt here: the first instruction is at 0xFFFF0 either way, but I do not know whether that address comes from [FFFF:0000] right after power-on or from [F000:FFF0] after initialization. Judging from the figure below, it should be the latter.)

Figure below



From this output you can conclude a few things:

The IBM PC starts executing at physical address 0x000ffff0, which is at the very top of the 64KB area reserved for the ROM BIOS.
The PC starts executing with CS = 0xf000 and IP = 0xfff0.
The first instruction to be executed is a jmp instruction, which jumps to the segmented address CS = 0xf000 and IP = 0xe05b.


For a detailed discussion of virtual, linear, and physical addresses, see Jasonleaster's article.

If that is still unclear, you can also read: "My understanding of the logical address, linear address, physical address and virtual address".


3. Getting familiar with the GDB si command

Once GDB is attached, the si command steps one machine instruction at a time, much as the s command steps one line of C code. Register values can be inspected with the "i r" command and memory with the x command, as in the following figure:


A single register can be viewed with "i r ax". For the use of the x command, refer to http://m.oschina.net/blog/33839 and follow the format described there.


Part 2: The Boot Loader

First, the concept of a sector is introduced:

Floppy and hard disks for PCs are divided into 512 byte regions called sectors. A sector is the disk's minimum transfer granularity: each read or write operation must be one or more sectors in size and aligned on a sector boundary. If the disk is bootable, the first sector is called the boot sector, since this is where the boot loader code resides. When the BIOS finds a bootable floppy or hard disk, it loads the 512-byte boot sector into memory at physical addresses 0x7c00 through 0x7dff, and then uses a jmp instruction to set the CS:IP to 0000:7c00, passing control to the boot loader. Like the BIOS load address, these addresses are fairly arbitrary, but they are fixed and standardized for PCs.

The ability to boot from a CD-ROM came much later during the evolution of the PC, and as a result the PC architects took the opportunity to rethink the boot process slightly. As a result, the way a modern BIOS boots from a CD-ROM is a bit more complicated (and more powerful). CD-ROMs use a sector size of 2048 bytes instead of 512, and the BIOS can load a much larger boot image from the disk into memory (not just one sector) before transferring control to it.

This course uses the traditional 512-byte sectors.

The relevant code is in the boot/ directory.


First comes boot.S, which is assembly code. It really makes me want to cry; chew through it slowly.

The assembly code in boot.S switches the processor from real mode to protected mode.

Relevant references:

http://blog.csdn.net/misskissc/article/details/16349249

AT&T assembly

http://www.cnblogs.com/MSRA_SE_TEAM/archive/2010/11/29/1891270.html

GDT in detail

GDT and LDT

I get the general idea, but there are many details I do not understand yet; I will leave them for a closer look later.


Then there is main.c, which also makes me want to cry.

The bootmain function is located in boot/main.c; it reads the kernel from the hard disk sectors.


Using GDB to follow the code as it executes gives the following:

The boot code starts executing at 0x7c00.


The code initially executes in 16-bit i8086 mode; after the instruction at 0x7c2d executes, the processor enters 32-bit i386 protected mode.

I really cannot follow the code in boot/main.c and obj/boot.asm right now. I will leave this as a gap and come back to it.


Now try to solve 4 minor problems:


1) At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode?

Line 55 of boot.S performs a jump that enters 32-bit protected mode.


The corresponding point in GDB is as follows:



2) What is the last instruction of the boot loader executed, and what is the first instruction of the kernel it just loaded?

In boot/main.c, the following image shows the last instruction executed in the boot loader.


The key names are ELFHDR and e_entry, where ELFHDR is a pointer to address 0x10000 (cast to struct Elf *).

The memory that ELFHDR points to is filled in by the readseg() function; the data comes from the kernel image on the hard disk.


So from there we look for the location that ELFHDR->e_entry points to, by dumping the headers of the kernel image.

objdump -x ./obj/kern/kernel

The start address shown is 0x10000c; set a breakpoint there and run to it:


This gives the first instruction executed after entering the kernel.

After the jump into the JOS kernel, the first file executed is kern/entry.S. This can be verified in kern/entry.S, where the following code can be found:


The entry symbol in the kernel image points to the starting address of the code in entry.S.
In the disassembly you will see an entry symbol whose value is 0xf010000c. This is the entry address of the kernel in our image, and it does not conflict with the 0x10000c above: 0x10000c is what 0xf010000c becomes after conversion.


This conversion used to be done manually. Compare the same code from the 2009/2010 version (left) with the 2014 version:


The older code contains a manual & conversion, while the 2014 code does not have this cast.

Operating system kernels often like to be linked and run at very high virtual address, such as 0xf0100000, in order to leave the lower part of the processor's virtual address space for user programs. The reason for this arrangement will become clearer in lab 2.

Many machines don't have any physical memory at address 0xf0100000, so we can't count on being able to store the kernel there. Instead, we will use the processor's memory management hardware to map virtual address 0xf0100000 (the link address at which the kernel code expects to run) to physical address 0x00100000 (where the boot loader loaded the kernel into physical memory). This way, although the kernel's virtual address is high enough to leave plenty of address space for user processes, it will be loaded in physical memory at the 1MB point in the PC's RAM, just above the BIOS ROM. This approach requires that the PC have at least a few megabytes of physical memory (so that physical address 0x00100000 works), but this is likely to be true of any PC built after about 1990.

Since the hardware maps 0xf0100000 to 0x100000, 0xf010000c likewise maps to 0x10000c. In essence, the manual conversion has been replaced by a direct conversion in hardware.

The disassembly output shows the following:


VMA stands for virtual memory address, while LMA stands for load memory address.

The start-up information seen earlier shows the same thing:



3) Where is the first instruction of the kernel?

From the analysis above, the physical address of the first instruction of the kernel is 0x10000c.


4) How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?

It determines this from the information stored in the ELF file itself; see the ELF header in the figure above.


Exercise 4.

Read the code in pointers.c until you can follow it.


Lab 1 gives the following hints on the points that matter:

Here are a few specific points you read about in K&R Chapter 5 that are worth remembering for the following exercise and for future labs.

1) If int *p = (int*)100, then (int)p + 1 and (int)(p + 1) are different numbers: the first is 101 but the second is 104. When adding an integer to a pointer, as in the second case, the integer is implicitly multiplied by the size of the object the pointer points to.
2) p[i] is defined to be the same as *(p+i), referring to the i'th object in the memory pointed to by p. The above rule for addition helps this definition work when the objects are larger than one byte.
3) &p[i] is the same as (p+i), yielding the address of the i'th object in the memory pointed to by p.


There are two main points to note (my own summary at the time):

1) When c is an array name, 3[c] is the same as c[3], because both actually mean *(3 + c). This way of writing it is really showy, though.

2)

c = (int *) ((char *) c + 1);
*c = 500;
The result of this statement is unexpected: the write through c modifies bits 9 through 32 of one int and then overwrites the lower 8 bits of the next one, so the result looks very strange.

This statement affects the values of a[1] and a[2].

Initially, a[1] and a[2] hold (in binary notation):

00000000000000000000000110010000 00000000000000000000000100101101

I thought they would turn into:

00000000000000000000000000000001 11110100000000000000000100101101

The actual result is:

00000000000000011111010010010000 00000000000000000000000100000000
I did not understand this at first. After discussing it with a friend, it turned out that the cause is the byte storage order.

As is well known, array elements are stored from low addresses to high addresses. In the same way, the 4 bytes of an int are stored with the low-order byte at the low address and the high-order byte at the high address, and the basic unit of storage is the byte. (Within a byte we keep the familiar notation, with the high bit on the left and the low bit on the right.)

Begin:

Address  0xbfffe5f4  0xbfffe5f5  0xbfffe5f6  0xbfffe5f7
a[1]     0x90        0x01        0x00        0x00
Address  0xbfffe5f8  0xbfffe5f9  0xbfffe5fa  0xbfffe5fb
a[2]     0x2d        0x01        0x00        0x00
c currently points to 0xbfffe5f5. After *c = 500 executes, the 4 bytes starting at that address are modified: 500 (0x000001f4) is stored as a 4-byte int, low-order byte first, as shown in the table:

Address  0xbfffe5f4  0xbfffe5f5  0xbfffe5f6  0xbfffe5f7
a[1]     0x90        0xf4        0x01        0x00
Address  0xbfffe5f8  0xbfffe5f9  0xbfffe5fa  0xbfffe5fb
a[2]     0x00        0x01        0x00        0x00
This matches the observed result.




I changed the link address in boot/Makefrag from 0x7C00 to 0x7C01 and ran again; it fails. With a breakpoint at 0x7C01, execution gets stuck at the stage of reading the boot sector: the QEMU window keeps flashing as the boot sector is read over and over. With a breakpoint at 0x7C00, the program continues to run but executes the wrong code and ends up trapped in an infinite loop.



We know that the boot loader starts at 0x7c00 and that the kernel's entry address is 0x10000c.

The GDB content is as follows:


Comparing the two: when execution reaches 0x7c00, the memory at 0x100000 is all zeros; when execution reaches 0x10000c, it contains data, although I cannot make sense of it directly.

For now, take it to be a sequence of instructions and compare it with obj/kern/kernel.asm.


This confirms that the kernel code starts at 0x100000, consistent with the linker script kern/kernel.ld. Note that the first instruction the kernel executes is at 0x10000c; it is not clear to me what is stored in the bytes before it or what they are used for.



Part 3: The Kernel



The debug results are as follows:


Before this instruction, the contents at the two addresses differ; afterwards they are the same. The reason is that paging has not been set up yet at that point: the high kernel addresses are not mapped to the kernel's physical addresses, and only the low addresses are valid. After paging is turned on, both virtual addresses refer to the same physical region through the static mapping table (kern/entrypgdir.c).


The mov $relocated,%eax in the figure is the first instruction after paging is turned on. Although the image on the right shows the instruction at the high address 0xf0100028, the address actually being executed is the low address 0x00100028.


Read through kern/printf.c, lib/printfmt.c, and kern/console.c, and make sure you understand their relationship. It will become clear in later labs why printfmt.c is located in the separate lib directory.

The first problem I ran into was what va_start(va, last), va_arg(va, type), and va_end(va) do. See Slvher's column.


The exercise is to complete the "%o" part of the code in lib/printfmt.c by following the pattern of the existing cases, as follows:


Then solve the following problems:

1) Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?

kern/printf.c contains several character-output functions; the cputchar() function they call is defined in kern/console.c. cputchar() outputs one character to the console.


2) Explain the following from console.c:


Note that memmove copies n bytes, where n is given by the third argument, from the address in the second argument to the address in the first. This code checks whether the screen output buffer is full. If it is, every line is moved up by one, overwriting the first line; the for loop then fills the now-empty last line with ' ' (spaces), and crt_pos is reduced by CRT_COLS.


3) For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC's calling convention on the x86.

Trace the execution of the following code step by step (pay special attention to the test codes I have slightly modified):

int x = one, y = x, z = 4;
cprintf("x%o, y%x, z%d\n", x, y, z);
i. In the call to cprintf(), to what does fmt point? To what does ap point?

fmt points to the first argument of the cprintf() function, that is, to the start of the string "x%d, y%x, z%d\n". ap is a va_list variable: after va_start(ap, fmt) executes, ap points to the first variadic argument, x; it is then passed to vcprintf() and advances to y and z in turn. Note, however, that these 3 arguments live on the function's stack, at addresses different from those of the variables defined earlier. After vcprintf() returns and va_end(ap) is executed, ap becomes NULL.


ii. List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its arguments.

This could have been simulated in my head, but for a closer look I traced it step by step.

At first I did not even know how to test the code. Later I noticed that the output appears in the QEMU window, so I used grep to search for "Welcome to the JOS kernel monitor!", found the source in kern/monitor.c, and added my test code right after it. Then I went to obj/kern/kernel.asm to find the generated assembly code and the right entry address; being a beginner, this took me a long time. After setting a breakpoint at what seemed to be the right place, I found that the code had already executed (I do not know why; I expect to understand later). With no better option, I moved the breakpoint earlier bit by bit. It took half a day, and my own clumsiness nearly made me cry.

Now start the trace:

We have reached the position we want; everything is ready.
