The layout, especially the pictures, is better on my blog:
http://notelzg.github.io/2016/06/29/%E6%B7%B1%E5%85%A5%E7%90%86%E8%A7%A3%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%B3%BB% e7%bb%9f%e6%80%bb%e7%bb%93/
1. Hello World
Let's start with the Hello World program:
#include <stdio.h>

int main()
{
    printf("hello, world! \n");
    return 0;
}
Let's take a look at what's going on between the source code and the executable, and the results of running the output:
2. Compilation phase
2.1 Preprocessing
gcc -E test.c -o test.i    or    gcc -E test.c
The first command writes the preprocessed version of test.c to the file test.i. Open test.i and take a look, and you'll see.
The second command prints the preprocessed code directly in the command-line window. The -E option tells GCC to stop after preprocessing
and output the preprocessing result. In this case, preprocessing simply inserts the contents of the stdio.h file into test.c.
That's all.
2.2 Compiling to assembly code
Compile the test.i file and generate the assembly code:
gcc -S test.i -o test.s
The -S option tells GCC to stop once the assembly code has been generated, and -o names the output assembly file.
To generate 32-bit assembly on a 64-bit machine, add -m32: gcc -m32 -S test.i -o test.s. The -m32 flag
indicates that a 32-bit program is generated. Now let's take a look at this simplest piece of assembly code.
The first thing to notice in the assembly code is the symbol labels, which are resolved during linking and replaced with virtual addresses (more on this later).
This is also where we can see how the variables we talk about so often are actually stored and passed. We know that
a process is a program in execution, that is, the state of a program at run time, and it consists of many parts. Let's start with the process's
user stack, which is used to save and pass temporary variables. The structure of the stack is generic: a single-threaded process has only one stack, but the process
has many functions. To keep the variables of each function apart, the part of the stack that each function occupies
is called a stack frame. Let's take a look:
	.file	"test.c"                ## declares the name of the source file
	.section	.rodata              ## marks read-only data
.LC0:                               ## label for the string, which is read-only
	.string	"hello, world! "
	.text                           ## .text stores the compiled program code
	.globl	main                    ## main is global
	.type	main, @function         ## main is a function
main:                               ## the main function starts here
.LFB0:
	.cfi_startproc
	pushl	%ebp                    ## save the caller's %ebp on the stack; %ebp is used below, so its old value must be restored before the function returns
	.cfi_def_cfa_offset 8
	.cfi_offset 5, -8
	movl	%esp, %ebp              ## set %ebp to the bottom of the new stack frame
	.cfi_def_cfa_register 5
	andl	$-16, %esp              ## align %esp down to a 16-byte boundary
	subl	$16, %esp               ## subtract 16 from %esp to open up stack space
	movl	$.LC0, (%esp)           ## put the address of the string at the top of the stack
	call	puts                    ## the compiler replaced printf with puts; its argument is read from the top of the stack
	movl	$0, %eax                ## set the return value to 0
	leave                           ## prepare to return; equivalent to: movl %ebp, %esp (point the stack top at the frame bottom); popl %ebp (restore the value of %ebp)
	.cfi_restore 5
	.cfi_def_cfa 4, 4
	ret                             ## return to the caller
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
	.section	.note.GNU-stack,"",@progbits
2.3 Assembly
The assembly step takes the test.s generated in the previous step and translates the assembly code into machine instructions, that is, binary 0/1 code.
gcc -c -m32 test.s -o test.o
objdump -d test.o    // disassemble the object file
Look at the disassembly below and compare it with the assembly code above and with the executable shown later. You will find that the addresses have changed,
while the rest is basically the same.
test.o:     file format elf32-i386

Disassembly of section .text:

00000000 <main>:
   0:	55                   	push   %ebp
   1:	89 e5                	mov    %esp,%ebp
   3:	83 e4 f0             	and    $0xfffffff0,%esp
   6:	83 ec 10             	sub    $0x10,%esp
   9:	c7 04 24 00 00 00 00 	movl   $0x0,(%esp)
  10:	e8 fc ff ff ff       	call   11 <main+0x11>
  15:	b8 00 00 00 00       	mov    $0x0,%eax
  1a:	c9                   	leave
  1b:	c3                   	ret
2.4 Linking
This step deals with the library function we used, printf. It shows up in our assembly code (as puts), but none of the previous files contains the actual
implementation of printf. Linking resolves the symbols in test.o (the ones that appeared in test.s) and replaces them with the addresses of the corresponding functions in virtual memory.
gcc -m32 test.o -o test
objdump -d test    // disassemble the executable
After linking, the executable contains a lot more content, so we only look at the contents of the main function.
As you can see, it is similar to the assembly, but the decimal numbers from before now appear as hexadecimal two's complement values (for example, $-16 becomes $0xfffffff0).
We know that the computer stores signed integers in two's complement, and all integer calculations are carried out on that basis,
so studying how numbers are represented is very important, both for integers and for floating-point numbers. Pay special attention to the representation
and arithmetic of floating-point numbers, which are very different from integers. Because floating-point precision is limited, results are rounded, and rounding
may produce different results on different hardware. So never rely on testing floating-point numbers for exact equality; only
greater-than and less-than comparisons are dependable. Testing for equality is possible, but often does not give the result you expect. If you really need to compare for equality, you can
convert the numbers to strings, truncate them to the desired number of digits, and then compare.

In the instruction just before the call we can see
movl $0x80484d0, (%esp), which stores an address at the top of the stack. I think this should be the virtual address of the "hello, world" string,
and the string can be found through that virtual address. In effect, calling the printf function copies the string from its source address in memory
into the memory of the display, which then shows it on our screen. Because memory can now be accessed through DMA (direct memory access),
the CPU does not need to fetch the string into registers and then store it into the display's memory. The CPU only has to issue a command, and
the string can be sent to the display's memory directly. When the transfer is over, an interrupt notifies the CPU, and the CPU carries on with the subsequent operations.
If you look at the disassembly of several executables, you will find that their code starts at the same address, that is, it is laid out from a fixed address.
The address here is not an address in physical memory but a virtual address, so each process appears to have
exclusive use of memory, which makes the linker's work easier. The MMU (memory management unit) on the CPU translates virtual addresses into physical
addresses, so we work with virtual addresses instead of physical addresses. As an analogy, we are all familiar with books and their tables of contents: through the table of contents we can find
the specific content. Modern operating systems divide memory into pages (Intel's page size is generally 4KB/4MB), which gives us
a page table. The corresponding physical address can be found through the page table and the offset within the page. Each process has its own virtual address space: on a 32-bit
system it is 4GB, and on a 64-bit system it is 2^64 bytes, which is far too large to use in full. The virtual address space is divided into system space and user space. On a 32-bit system it is generally
2G for the system and 2G for the user (this is the Windows default; 32-bit Linux typically gives 1G to the kernel and 3G to the user). For details see these URLs:
https://msdn.microsoft.com/zh-cn/library/windows/hardware/hh439648(v=vs.85).aspx
http://blog.csdn.net/tennysonsky/article/details/45092229
Let's take a look at the memory layout on Linux,
so we can understand why the addresses inside the executable file look the way they do:
0804841d <main>:
 804841d:	55                   	push   %ebp
 804841e:	89 e5                	mov    %esp,%ebp
 8048420:	83 e4 f0             	and    $0xfffffff0,%esp
 8048423:	83 ec 10             	sub    $0x10,%esp
 8048426:	c7 04 24 d0 84 04 08 	movl   $0x80484d0,(%esp)
 804842d:	e8 be fe ff ff       	call   80482f0 <puts@plt>
 8048432:	b8 00 00 00 00       	mov    $0x0,%eax
 8048437:	c9                   	leave
 8048438:	c3                   	ret
 8048439:	66 90                	xchg   %ax,%ax
 804843b:	66 90                	xchg   %ax,%ax
 804843d:	66 90                	xchg   %ax,%ax
 804843f:	90                   	nop
3. Operation phase
./test    // run the test executable
3.1 Loading
We type ./test in the terminal. The terminal is really a shell, which is itself a program, and it is through this program that
our test is loaded and run. The shell first loads our test executable into memory and then executes it. Specifically,
a virtual address space is allocated. The virtual address space has a corresponding data structure, which is initialized. This is where the process
is allocated resources, which really means memory. If resources are sufficient, the process is added to the ready queue; otherwise the pointer to its structure is added
to the wait queue, where it waits until resources are sufficient. Once ready, it waits to be picked by the CPU, which is the so-called CPU scheduling.
Process handling also involves synchronization, asynchrony, sharing, semaphores, interrupts, deadlock, and other issues, which you can
learn about from an operating systems textbook. Here let's just talk about how a child process and its parent share files. We all know that a parent process can create a child process.
Linux provides the fork() system function, which creates a child process by directly copying the parent's virtual address
space structures. The process ID is of course unique. File descriptors opened by the parent are shared with the child, but
the reference count of the underlying open file increases by 1, and the file is only really closed when that reference count drops to 0. So the child process must
close the corresponding file descriptors before it ends, to avoid leaking resources.
3.2 Run
The shell loads the program into memory, which creates the process, but loading only sets up the virtual address space. When the CPU needs specific data,
it has to fetch it from memory through virtual memory. Of course, we now have several levels of cache (L1, L2, and L3),
and these caches speed up CPU processing. If we write code with good temporal and spatial locality, our program
will run faster.
4. Talking about program optimization
Starting from the code compiled from our source, we can see how a program works on the CPU by reading the assembly code. By understanding
how the CPU executes assembly code, we come to understand the limitations of the code we write. General program optimizations include loop unrolling, turning recursion into loops, and
replacing conditional branches with conditional moves. These are all derived from analyzing how the assembly instructions execute, and we will find that the program really does improve a lot.
Of course, in most cases I cannot feel the difference, since my programs handle too little data, but in large projects this really can greatly improve
a program's performance. So if you want to improve the performance of your C/C++ programs, this book should be of some help.
5. Talking about thread concurrency
Concurrency greatly improves the speed at which our processes run and makes sharing files convenient, but it also brings some drawbacks. For example, changes to global
variables can become inconsistent, because processes and threads are scheduled by the CPU, and the order in which they run is neither sequential nor regular. For this reason,
semaphore mechanisms are needed to handle synchronization problems, to protect critical variables or functions, and to allocate resources. The classic producer-consumer model and the readers-writers
problem illustrate these issues well. Today's processors are mostly multicore, enabling parallel computing that greatly improves performance and efficiency.
6. Talking about network programming
The main topic here is the socket. On Linux, a socket is a file, and a socket connection is represented by a file descriptor.
We connect through the socket, and when the connection succeeds we get back a file descriptor. By reading and writing that file descriptor we achieve
the transmission of information over the network. The RIO package is used to handle network I/O. The websites we normally browse, the pages we see in the browser, and the files we download
are all transmitted through sockets. With this knowledge I can write a web server of my own and deepen my understanding of network programming.
Final summary
After reading this book, I have an overall understanding of the whole process of running a program, especially pointers, assembly, the user stack, and the heap, things we come across all the time
and which I now understand clearly. I also have my own general understanding of how the whole operating system works, and I think that by reading this book I have built a certain foundation.
Of course, this book is only a stepping stone into the computer world, but it is an extremely important one. After all, what other book can cover assembly in a single chapter?
That is the great thing about this book, and it is suitable for readers at every stage. Of course, I feel it deserves another careful read, and after that the related areas need to be studied in depth.
As the saying goes, the teacher leads you to the door, but practice depends on the individual, and a classic book is a good teacher. I hope we all have something to gain.
Notes on Computer Systems: A Programmer's Perspective (In-Depth Understanding of Computer Systems)