During the winter vacation, I was fortunate to read Computer Systems: A Programmer's Perspective, the masterpiece by Randal E. Bryant and David R. O'Hallaron of Carnegie Mellon University (CMU), published in Chinese under a title that translates as "In-Depth Understanding of Computer Systems". The book is the textbook for a CMU course called "Introduction to Computer Systems". Unfortunately, domestic universities rarely seem to offer such a course. What kind of course is an introduction to computer systems? It touches on computer organization, assembly language, operating systems, compiler principles, network programming, and more. In other words, it spans many areas of CS theory, and its purpose is to give a programmer a big-picture understanding of the computer system: what each piece of knowledge is for and how it connects to the others. With that in place, the later study of each specific subject starts from a clearer picture. Without it, you study a subject in a daze, learn it half-heartedly, feel relieved when the course is over, and only when you finally need the material does it dawn on you, and you regret having dozed through so many classes.
Enough complaining; back to the subject. Over the holiday I mainly studied Chapter 3 of the book, "Machine-Level Representation of Programs", which is the part about assembly language. This series records what I learned along with some of my own thoughts. The assembly in the book is based on IA32 (Intel 32-bit architecture) and is written in AT&T format: register names carry a "%" prefix, such as %esp; instruction names are lowercase, such as mov; and the last letter of an instruction (the assembly-code suffix) indicates the operand size (b: byte, w: word, l: double word; a word is 16 bits). The other common format, used by Intel and on Windows, has no "%" prefix before register names and usually writes instruction names in uppercase. The programming environment is Ubuntu 14.04, and the compiler is GCC.
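To make the suffix rule concrete, here is a minimal C sketch. The helper names `suffix_for_size` and `check_ia32_sizes` are made up for this post and are not from the book; the type sizes assume IA32-style widths (char 1 byte, short 2 bytes, int 4 bytes). The comments also show one register move written in both formats.

```c
#include <assert.h>
#include <stddef.h>

/* Map an operand size in bytes to the AT&T instruction suffix.
 * Returns '?' for sizes not covered by the b/w/l suffixes. */
char suffix_for_size(size_t bytes)
{
    switch (bytes) {
    case 1:  return 'b';  /* byte:        e.g. movb %al,  (%edx) */
    case 2:  return 'w';  /* word:        e.g. movw %ax,  (%edx) */
    case 4:  return 'l';  /* double word: e.g. movl %eax, (%edx) */
    default: return '?';
    }
}

/* The same register-to-register move (copy esp into ebp) in both formats:
 *   AT&T : movl %esp, %ebp    source first, destination second
 *   Intel: mov  ebp, esp      destination first, no % prefix, no size suffix
 */

/* Assumption: char/short/int are 1/2/4 bytes, as they are on IA32. */
void check_ia32_sizes(void)
{
    assert(suffix_for_size(sizeof(char))  == 'b');
    assert(suffix_for_size(sizeof(short)) == 'w');
    assert(suffix_for_size(sizeof(int))   == 'l');
}
```

Called from a `main`, `suffix_for_size(sizeof(int))` gives 'l' under those assumptions, which is why so many of the instructions that operate on `int` values end in `l`.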
The languages we meet in work and study are generally high-level ones, such as C/C++ or Java. Assembly language, by contrast, is something most people shy away from. It is so closely tied to the hardware that it is hard to learn without understanding computer organization and architecture, and many people stay away from assembly simply because they cannot be bothered to learn those principles first. Yet assembly is an important language. The point of learning it is not necessarily to write it; at the very least it gives you a better understanding of the computer system and of how software is implemented at the bottom. A senior figure in the CS field once said that every problem in CS can be solved by adding an intermediate layer. An intermediate layer encapsulates and abstracts the layer beneath it and offers a friendly interface to the layer above. Most of the time we do not need to deal with what lies underneath, but sometimes we have to go down a layer to see where the problem is.
A computer's hardware consists of the processor (CPU), main memory, I/O controllers, and I/O devices. The processor in turn consists of the arithmetic logic unit (ALU), the control unit, and the register file. To run a program, the processor's control unit directs an I/O device such as the hard disk to load the program and its data into main memory; the processor then reads instructions from memory, and those instructions control the transfer of data between main memory and the registers and the computations performed by the ALU. The register file is a small, high-speed store inside the processor that exchanges data with it very quickly, while main memory is slower. Assembly language mostly operates on main memory and registers, because they are the entities that hold data, and the actual computation is always done by the ALU. Because of virtual memory and paging, the memory addresses we use are virtual addresses. To assembly language, memory is abstracted as one large byte array: every byte has its own address, and that address gives direct access to the data stored in that byte. This virtual address is what we know as a pointer in C. Translating a virtual address into a physical address is the job of the memory management hardware, working with the operating system; in essence it is just a mapping. (It suddenly struck me that mapping, a concept from high school mathematics, is one of the most widely used ideas around. I remember wondering back then what on earth this mapping thing was good for. Hash tables use it too.)
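The "memory is a byte array, a pointer is an address" view can be tried directly in C. The sketch below is only an illustration; `byte_at` and `address_step` are names invented for this post.

```c
#include <assert.h>
#include <stddef.h>

/* View any object as the byte array that assembly code sees:
 * return the byte stored at offset i from the object's address. */
unsigned char byte_at(const void *obj, size_t size, size_t i)
{
    assert(i < size);  /* stay inside the object */
    const unsigned char *bytes = (const unsigned char *)obj;
    return bytes[i];
}

/* A pointer is just an address: stepping an int pointer by one element
 * moves the address forward by sizeof(int) bytes. */
size_t address_step(const int *p)
{
    return (size_t)((const char *)(p + 1) - (const char *)p);
}
```

For an `int x`, `address_step(&x)` equals `sizeof(int)`. On a little-endian machine such as IA32, `byte_at(&v, sizeof v, 0)` for `int v = 1` is 1, because the least significant byte is stored at the lowest address.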
Broadly speaking, programming languages developed in several stages: first came assembly languages tied to the instruction sets of particular machines; then Fortran and C, which abstract over assembly; then C++, C#, and Java, after object-oriented concepts were introduced; and over the last 20 years or so, as hardware resources improved, a variety of dynamic languages such as Python, Ruby, PHP, JavaScript, Perl, and Lisp have become widespread, friendlier to programmers but less efficient. Each is built layer upon layer. As we all know, data in a computer is encoded in binary; the machine knows only two digits, 0 and 1. So all program code eventually has to be translated into binary code. (The machine-level representation of information is related to information theory and coding.) At the first stage, assembly written for a particular instruction set architecture (ISA) is translated directly into machine code (binary code) by the assembler. Later, C/C++ code has to be compiled into executable code by a compiler, and during compilation the compiler first translates the C/C++ code into assembly code, which the assembler then translates into machine code. Java and the various scripting languages can be compiled into a specific intermediate binary code and then run by an interpreter, which is itself usually written in C/C++. An interpreter is an intermediate layer; a famous example is the Java Virtual Machine (JVM).
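The idea of an interpreter as an intermediate layer written in C can be sketched with a toy stack machine. The opcodes below are hypothetical and have nothing to do with real JVM bytecode; the point is only that the "binary code" is data and a C loop gives it meaning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine (not real JVM bytecode). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 16

/* Interpret a bytecode program and return the value on top of the
 * stack when OP_HALT is reached. */
int run(const int *code)
{
    int stack[STACK_MAX];
    size_t sp = 0;  /* number of values currently on the stack */
    size_t pc = 0;  /* index of the next instruction */

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:
            assert(sp < STACK_MAX);
            stack[sp++] = code[pc++];  /* the operand follows the opcode */
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}
```

The program `{OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT}` computes (2 + 3) * 4, so `run` returns 20. The machine underneath only ever executes the compiled C loop; the toy program lives one layer above it.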
The book teaches assembly language by explaining its basic syntax and by translating C into assembly code. Along the way, the reader studies the assembly code generated by GCC and the assembly code disassembled from object files (files with the .o suffix, an intermediate binary format on Unix/Linux systems) to understand how a high-level language such as C/C++ is implemented in assembly, and to study the compiler's optimizations at the same time. The content breaks down roughly as follows:
1> IA32 registers and how they are accessed;
2> data transfer instructions;
3> the three types of operands and the addressing modes;
4> arithmetic and logic operations;
5> condition codes and control;
6> procedures and the implementation of the program stack;
7> allocation and access of arrays;
8> heterogeneous data structures and pointers;
9> a summary of the correspondence between assembly language and C under GCC compilation.
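The workflow described before the list, reading GCC's generated assembly and disassembling an object file, can be tried on a small function like the one below. The file name `array_sum.c` and the function are made up for illustration. `-S`, `-c`, `-O1`, `-m32`, and `-DNDEBUG` are standard GCC options and `objdump -d` is the standard binutils disassembler, but `-m32` assumes 32-bit support is installed on a 64-bit system, and the exact instructions you see will depend on the GCC version and flags.

```c
#include <assert.h>

/* Sum the first n elements of a. Small enough that its assembly is easy
 * to read, yet it exercises a loop (condition codes and control) and
 * array access.
 *
 * Assuming this is saved as array_sum.c:
 *   gcc -m32 -O1 -DNDEBUG -S array_sum.c   # writes array_sum.s (AT&T-format assembly)
 *   gcc -m32 -O1 -DNDEBUG -c array_sum.c   # writes array_sum.o (object file)
 *   objdump -d array_sum.o                 # disassembles the object file
 * -DNDEBUG compiles the assert away so the listing stays short.
 */
int array_sum(const int *a, int n)
{
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Comparing the `.s` file with the `objdump` output side by side is a quick way to see that the object file holds the same instructions, just encoded as bytes.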
<Assembly Language Series> Computer Hardware Systems and Assembly