Deep understanding of computer systems

Source: Internet
Author: User

Chapter 1 Introduction
1. The test x < y cannot be replaced with x - y < 0, because the subtraction x - y can overflow.
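A minimal C++ sketch of why the rewrite fails. The helper names (wrapped_sub, less_direct, less_by_sub) are made up for illustration, and 32-bit wraparound is emulated with unsigned arithmetic because signed overflow is undefined behavior in C++.

```cpp
#include <cstdint>

// Emulates what 32-bit two's-complement hardware computes for x - y.
// The subtraction is done on unsigned values, which wrap modulo 2^32.
int32_t wrapped_sub(int32_t x, int32_t y) {
    uint32_t d = static_cast<uint32_t>(x) - static_cast<uint32_t>(y);
    if (d <= 0x7FFFFFFFu) {
        return static_cast<int32_t>(d);
    }
    // Bit patterns with the top bit set stand for the value d - 2^32.
    return static_cast<int32_t>(static_cast<int64_t>(d) - 4294967296LL);
}

// The comparison as written in the source program.
bool less_direct(int32_t x, int32_t y) { return x < y; }

// The tempting but incorrect rewrite.
bool less_by_sub(int32_t x, int32_t y) { return wrapped_sub(x, y) < 0; }
```

With x = INT32_MIN and y = 1, less_direct is true, but the wrapped difference is INT32_MAX, so less_by_sub is false.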

Chapter 2 A Tour of Computer Systems
1. The only thing that distinguishes different data objects is the context in which the data is viewed.
2. Caches are implemented with static random-access memory (SRAM). The L1 cache is located on the processor chip, while the L2 cache is located on the motherboard and connected to the chip through a dedicated cache bus.
3. In the virtual address space of a process, the run-time heap immediately follows the code and data areas. The code and data areas are fixed in size once the process starts running. The heap, in contrast, expands and contracts dynamically at run time as a result of calls to C standard library functions such as malloc and free. The stack can also expand and contract. The top quarter of the address space is reserved for the kernel.

Chapter 3 Representing and Manipulating Information
1. Because the representation has limited precision, floating-point operations are not associative. As a rule of thumb, combine the values of smallest magnitude first.
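A small sketch of the grouping effect, assuming IEEE 754 double arithmetic. The function names are illustrative only.

```cpp
// Adds three values with the two possible groupings.
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }
```

For a = 3.14, b = 1e20, c = -1e20, sum_left returns 0.0 because 3.14 is lost when it is added to 1e20, while sum_right returns 3.14 because the two large values cancel first.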
2. The word size indicates the nominal size of integer and pointer data. The most important system parameter determined by the word size is the maximum size of the virtual address space.
3. A float is generally 4 bytes and a double is generally 8 bytes. A pointer generally uses the full word size: 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine.
4. On almost all machines, a multibyte object is stored as a contiguous sequence of bytes, and the address of the object is the smallest address among the bytes used.
5. Binary code is rarely portable across different combinations of machine and operating system.
6. The expression ~0 yields a mask of all ones regardless of the machine's word size. On a 32-bit machine the same mask can be written as 0xFFFFFFFF, but such code is not portable.
7. Almost all compiler/machine combinations use an arithmetic right shift for signed data, that is, they fill the vacated bits on the left with copies of the sign bit.
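The mask and shift behavior can be checked with a short sketch. The function names are made up here, and the signed result relies on the arithmetic right shift that nearly all compilers provide (it is only guaranteed by the language from C++20 on).

```cpp
#include <climits>

// ~0u is all ones for whatever width unsigned has on this machine.
unsigned all_ones_mask() { return ~0u; }

// Right shift of a signed value: the sign bit is replicated (arithmetic shift)
// on virtually every compiler/machine combination.
int shift_right_signed(int x, int k) { return x >> k; }

// Right shift of an unsigned value: zeros are shifted in (logical shift).
unsigned shift_right_unsigned(unsigned x, int k) { return x >> k; }
```

For example, shift_right_signed(-8, 1) gives -4, whereas shift_right_unsigned(~0u, 1) clears the top bit.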
8. Both C and C++ support signed and unsigned numbers, whereas Java supports only signed numbers.
9. The file <limits.h> in the C library defines a set of constants that delimit the ranges of the integer data types on the machine for which the compiler runs, such as INT_MAX, INT_MIN, and UINT_MAX.
10. A cast does not change the bit representation of its operand; it only changes how those bits are interpreted as a number:
int x = -1;
unsigned ux = (unsigned) x;
Here ux = 0xFFFFFFFF (assuming a 32-bit int).
11. Rule 1: When a signed number is mapped to its unsigned counterpart, a negative number is converted to a large positive number, while a nonnegative number stays unchanged.
Rule 2: For small numbers (< 2^(w-1)), the conversion from unsigned to signed preserves the value; for large numbers (>= 2^(w-1)), the number is converted to a negative value.
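The two rules can be written out for w = 32 as a sketch. The names t2u and u2t are chosen here for illustration, and 64-bit arithmetic is used so that the intermediate values never overflow.

```cpp
#include <cstdint>

const int64_t TWO_POW_32 = 4294967296LL;  // 2^w for w = 32

// Rule 1: signed to unsigned. Negative x maps to x + 2^w, others are unchanged.
uint32_t t2u(int32_t x) {
    int64_t v = x;
    if (v < 0) {
        v += TWO_POW_32;
    }
    return static_cast<uint32_t>(v);
}

// Rule 2: unsigned to signed. Values below 2^(w-1) are unchanged,
// larger values map to u - 2^w, which is negative.
int32_t u2t(uint32_t u) {
    int64_t v = u;
    if (v >= TWO_POW_32 / 2) {
        v -= TWO_POW_32;
    }
    return static_cast<int32_t>(v);
}
```

t2u(-1) yields 4294967295, the same value a plain cast to a 32-bit unsigned type produces, and u2t(2147483648u) yields INT32_MIN.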
12. Unsigned integer addition (w-bit operands, 0 <= x, y < 2^w):
x +u y = x + y,          if x + y < 2^w
x +u y = x + y - 2^w,    if x + y >= 2^w
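A sketch of the unsigned case for w = 32. The function names are illustrative; the overflow test relies on the fact that a wrapped sum is always smaller than either operand.

```cpp
#include <cstdint>

// Unsigned addition wraps modulo 2^32.
uint32_t uadd(uint32_t x, uint32_t y) {
    return static_cast<uint32_t>(x + y);
}

// The true sum reached 2^32 exactly when the wrapped sum is smaller than x.
bool uadd_overflows(uint32_t x, uint32_t y) {
    return uadd(x, y) < x;
}
```

For instance, uadd(4000000000u, 500000000u) returns 4500000000 - 2^32 = 205032704.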
13. Signed (two's-complement) integer addition (w-bit operands, -2^(w-1) <= x, y < 2^(w-1)):
x +t y = x + y - 2^w,    if x + y >= 2^(w-1)               (positive overflow)
x +t y = x + y,          if -2^(w-1) <= x + y < 2^(w-1)    (normal)
x +t y = x + y + 2^w,    if x + y < -2^(w-1)               (negative overflow)
14. Two's-complement negation:
-t x = -2^(w-1),    if x = -2^(w-1)
-t x = -x,          if x > -2^(w-1)
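The case analysis for two's-complement addition and negation can be sketched for w = 32 as follows. The names Overflow, classify_add, tadd, and tneg are assumptions made for this example, and the true sum is computed in 64 bits so that the C++ code itself never overflows.

```cpp
#include <cstdint>

enum class Overflow { None, Positive, Negative };

const int64_t TWO_POW_32 = 4294967296LL;  // 2^w for w = 32

// Decides which of the three cases applies to x + y.
Overflow classify_add(int32_t x, int32_t y) {
    int64_t sum = static_cast<int64_t>(x) + y;
    if (sum > INT32_MAX) return Overflow::Positive;
    if (sum < INT32_MIN) return Overflow::Negative;
    return Overflow::None;
}

// Two's-complement addition: correct the true sum by 2^w on overflow.
int32_t tadd(int32_t x, int32_t y) {
    int64_t sum = static_cast<int64_t>(x) + y;
    switch (classify_add(x, y)) {
        case Overflow::Positive: sum -= TWO_POW_32; break;
        case Overflow::Negative: sum += TWO_POW_32; break;
        case Overflow::None: break;
    }
    return static_cast<int32_t>(sum);
}

// Two's-complement negation: the minimum value is its own negation.
int32_t tneg(int32_t x) {
    return x == INT32_MIN ? INT32_MIN : -x;
}
```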
15. In the single-precision floating-point format (float in C), the s, exp, and frac fields are 1, 8, and 23 bits wide, giving a 32-bit representation. In the double-precision format (double in C), the s, exp, and frac fields are 1, 11, and 52 bits wide, giving a 64-bit representation.
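The single-precision layout can be inspected with a short sketch, assuming float is the 32-bit IEEE 754 format. FloatFields and decompose are hypothetical names introduced for the example.

```cpp
#include <cstdint>
#include <cstring>

struct FloatFields {
    uint32_t s;     // 1-bit sign
    uint32_t exp;   // 8-bit biased exponent
    uint32_t frac;  // 23-bit fraction
};

// Copies the bits of f into an integer and splits them into the three fields.
FloatFields decompose(float f) {
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

decompose(1.0f) gives s = 0, exp = 127 (the bias), and frac = 0; decompose(1.5f) sets only the top fraction bit, 0x400000.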
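16. The default floating-point rounding rule is round-to-even, also called round-to-nearest: a value is rounded to the nearest representable value, and a value exactly halfway between two candidates is rounded to the one whose least significant digit is even.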
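Round-to-even can be observed at the integer level with std::nearbyint, which follows the current rounding mode. This sketch assumes the program has not changed the default mode (round to nearest, ties to even); the wrapper names are illustrative.

```cpp
#include <cmath>

// Rounds to an integer using the current rounding mode (ties to even by default).
double round_half_even(double x) { return std::nearbyint(x); }

// For contrast: std::round always rounds halfway cases away from zero.
double round_half_away(double x) { return std::round(x); }
```

With the default mode, round_half_even maps 0.5 to 0, 1.5 to 2, and 2.5 to 2, whereas round_half_away maps 2.5 to 3.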
17. Floating-point addition is not associative. It does satisfy the following monotonicity property: if a >= b, then x + a >= x + b for any values of a, b, and x other than NaN. Floating-point multiplication satisfies an analogous monotonicity property.
18. A floating-point number is negated simply by inverting its sign bit, so for float f, f == -(-f) holds (NaN aside).
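19. Consider the following code:
#include <iostream>
using namespace std;

int main()
{
    double x = 1.3;
    double y = 0.4;
    // None of 1.3, 0.4, and 1.7 is exactly representable in binary,
    // so the rounded sum need not equal the rounded constant 1.7.
    if (x + y != 1.7)
        cout << "addition failed?" << endl;
    return 0;
}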

The program prints "addition failed?", that is, x + y != 1.7. The reason is that a double holds an approximation of the value rather than exactly 1.7.

A correct comparison looks like this:
#include <iostream>
using namespace std;

const double epsilon = 0.000001;

// Treats x and y as equal when they differ by less than epsilon.
bool about_equal(double x, double y)
{
    return (x < y + epsilon) && (x > y - epsilon);
}

int main()
{
    cout << "1.3 + 0.4 == 1.7: "
         << (1.3 + 0.4 == 1.7) << endl;
    cout << "about_equal(1.3 + 0.4, 1.7): "
         << about_equal(1.3 + 0.4, 1.7) << endl;
    return 0;
}

Chapter 4 Machine-Level Representation of Programs
1. Linux uses flat addressing, in which the programmer views the entire memory space as one large array of bytes.
2. In IA32, the two operands of a data movement instruction cannot both refer to memory locations.
3. By convention, any function that returns an integer or pointer value does so by placing the result in register %eax.
4. The shift amount is given either as an immediate or in the single-byte register element %cl.
5. The first, second, and third arguments of a function are stored in memory at offsets 8, 12, and 16 from the address in %ebp.
6. When C code is compiled, all loops are translated into do-while form.
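The translation of a while loop into do-while form can be sketched by hand. The factorial functions below are an illustrative example, not compiler output: the second version adds an initial test that skips the loop when the condition is false on entry, then runs the body in do-while form.

```cpp
// Original form: the condition is tested before every iteration.
int fact_while(int n) {
    int result = 1;
    while (n > 1) {
        result *= n;
        n--;
    }
    return result;
}

// Do-while form: one guard test up front, then test at the bottom of the loop.
int fact_do_while(int n) {
    int result = 1;
    if (n <= 1) {
        goto done;  // condition false on entry: skip the loop entirely
    }
    do {
        result *= n;
        n--;
    } while (n > 1);
done:
    return result;
}
```

Both functions return the same value for every n, including the cases n = 0 and n = 1 where the body never runs.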
7. By convention, registers %eax, %edx, and %ecx are classified as caller-save registers, and the other three registers (%ebx, %esi, %edi) are classified as callee-save registers.
8. In IA32, the only way to get the value of the program counter into an integer register is the following sequence:
call next
next:
popl %eax    # %eax now holds the address of the popl instruction

9. A union is declared with the keyword union. It allows a single object to be referenced as several different types. The total size of a union equals the size of its largest field.
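A quick size comparison between a union and a struct with the same fields. Value and Record are hypothetical types for this sketch, and the exact sizes assume a common ABI where double is the largest and most strictly aligned of the three fields.

```cpp
#include <cstddef>

// All fields share the same storage, so the size is that of the largest field.
union Value {
    char c;
    int i;
    double v;
};

// Each field gets its own storage (plus padding), so the size is at least the sum.
struct Record {
    char c;
    int i;
    double v;
};

std::size_t union_size()  { return sizeof(Value); }
std::size_t struct_size() { return sizeof(Record); }
```

Here union_size() equals sizeof(double), while struct_size() is larger because the three fields are laid out one after another.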

Chapter 5 Optimizing Program Performance
1. Code motion: identify a computation that is performed multiple times (for example, inside a loop) but whose result never changes, and move it to an earlier point in the code so that it is not evaluated repeatedly.
2. Eliminate unnecessary memory references. For example, if a loop repeatedly updates the value a pointer refers to, accumulate the result in a local variable instead and store that variable through the pointer once, after the loop.
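Both optimizations can be shown on one small loop. This is an illustrative sketch: the function names and the space-counting task are invented for the example.

```cpp
#include <cstddef>
#include <cstring>

// Baseline: strlen is re-evaluated on every iteration, and *dest is read and
// written in memory each time a space is found.
void count_spaces_naive(const char* s, int* dest) {
    *dest = 0;
    for (std::size_t i = 0; i < std::strlen(s); i++) {
        if (s[i] == ' ') {
            (*dest)++;
        }
    }
}

// Tuned: the length is computed once (code motion) and the count is kept in
// a local variable that is stored through the pointer only at the end.
void count_spaces_tuned(const char* s, int* dest) {
    const std::size_t len = std::strlen(s);
    int count = 0;
    for (std::size_t i = 0; i < len; i++) {
        if (s[i] == ' ') {
            count++;
        }
    }
    *dest = count;
}
```

Both versions produce the same count; the tuned one avoids the repeated strlen call and the per-iteration memory traffic.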
3. Superscalar: the processor can perform multiple operations on every clock cycle, and can do so out of order.
4. On an IA32 processor, all floating-point operations are performed in an extended 80-bit precision, and the floating-point registers store values in this format. A value is converted to the 32-bit or 64-bit format only when it is written from a register to memory.
5. Profiling (on UNIX): compile with the -pg option, run the program so that it generates the file gmon.out, and then run gprof on the result.
6. Memory aliasing and procedure calls severely limit the compiler's ability to perform many optimizations.
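The classic illustration of memory aliasing is a pair of functions that look equivalent but differ when both pointers refer to the same location. The names twiddle1 and twiddle2 are used here for illustration.

```cpp
// Adds *yp to *xp twice, reading *yp from memory each time.
void twiddle1(long* xp, long* yp) {
    *xp += *yp;
    *xp += *yp;
}

// Looks like a cheaper equivalent: one read of *yp, one update of *xp.
void twiddle2(long* xp, long* yp) {
    *xp += 2 * *yp;
}
```

When xp and yp point to different variables the two functions agree. When they alias, twiddle1 quadruples the value while twiddle2 triples it, so a compiler that cannot rule out aliasing must not rewrite one into the other.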

Chapter 6 The Memory Hierarchy
1. If the data your program needs is held in a CPU register, it can be accessed in zero cycles during the execution of the instruction.
2. SRAM stores each bit in a bistable memory cell, and each cell is implemented with a six-transistor circuit.
3. DRAM stores each bit as charge on a capacitor. Unlike SRAM, a DRAM cell is very sensitive to disturbance: once the capacitor voltage is disturbed, it never recovers. Various sources of leakage current cause a DRAM cell to lose its charge within about 10 to 100 milliseconds.
4. A d × w DRAM stores a total of dw bits of information, organized as d supercells of w bits each.
5. The access time for a disk sector consists of the seek time, the rotational latency, and the transfer time.
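As a rough sketch, the three components of the sector access time can be estimated from a drive's parameters. The model below is an assumption layered on the note: average rotational latency is taken as half a rotation, and transfer time as one rotation divided by the number of sectors per track. The function names and the sample figures are hypothetical.

```cpp
// Average rotational latency in milliseconds: half of one full rotation.
double avg_rotation_ms(double rpm) {
    return 0.5 * (60.0 / rpm) * 1000.0;
}

// Time in milliseconds for one sector to pass under the head.
double avg_transfer_ms(double rpm, double sectors_per_track) {
    return (60.0 / rpm) / sectors_per_track * 1000.0;
}

// Estimated access time: seek time + rotational latency + transfer time.
double access_time_ms(double seek_ms, double rpm, double sectors_per_track) {
    return seek_ms + avg_rotation_ms(rpm) + avg_transfer_ms(rpm, sectors_per_track);
}
```

For a hypothetical 7200 RPM disk with a 9 ms average seek and 400 sectors per track, the estimate is about 13.19 ms, dominated by the seek time and the rotational latency.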
6. The I/O bridge connects the system bus, the memory bus, and the I/O bus.
7. A program that repeatedly references the same variable has good temporal locality. For instruction fetches, loops have good temporal and spatial locality; the smaller the loop body and the more iterations, the better the locality. For a program with a stride-k reference pattern, the smaller the stride, the better the spatial locality. A program with a stride-1 reference pattern has good spatial locality, while a program that hops around memory with large strides has poor spatial locality.
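The stride rule can be illustrated with two traversals of the same row-major matrix. This is a sketch: the matrix is assumed to be stored flat in a std::vector with rows * cols elements, and the function names are made up.

```cpp
#include <cstddef>
#include <vector>

// Row by row: consecutive accesses touch adjacent elements (stride 1).
long sum_stride_1(const std::vector<int>& a, std::size_t rows, std::size_t cols) {
    long sum = 0;
    for (std::size_t i = 0; i < rows; i++) {
        for (std::size_t j = 0; j < cols; j++) {
            sum += a[i * cols + j];
        }
    }
    return sum;
}

// Column by column: consecutive accesses are cols elements apart (stride cols).
long sum_stride_n(const std::vector<int>& a, std::size_t rows, std::size_t cols) {
    long sum = 0;
    for (std::size_t j = 0; j < cols; j++) {
        for (std::size_t i = 0; i < rows; i++) {
            sum += a[i * cols + j];
        }
    }
    return sum;
}
```

Both functions compute the same sum; only the order of memory accesses differs, which is what determines how well each one uses the cache on large matrices.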
8. Caches include direct-mapped caches (E = 1) and set-associative caches (1

Chapter 7 Linking
1. A C source file plays the role of a module. Any global variable or function declared with the static attribute is private to that module. Similarly, any global variable or function declared without the static attribute is public and can be accessed by other modules. It is good programming practice to protect variables and functions with the static attribute wherever possible.
2. When the linker resolves global symbols that are defined in multiple places, functions and initialized global variables are strong symbols, and uninitialized global variables are weak symbols. Rule 1: multiple strong symbols are not allowed. Rule 2: given one strong symbol and multiple weak symbols, choose the strong symbol. Rule 3: given multiple weak symbols, choose any one of them.
3. Relocation merges the input modules and assigns a run-time address to each symbol.
4. An executable file is fully linked (relocated), so it no longer needs .rel sections.
5. Loading: copy the program into memory and run it.
6. Shared libraries are a modern innovation designed to address the drawbacks of static libraries. A shared library is an object module that, at run time, can be loaded at an arbitrary memory address and linked with a program in memory. This process is called dynamic linking and is performed by the dynamic linker. A major purpose of shared libraries is to let multiple running processes share the same library code in memory, which saves memory.

Appendix:
1. In C++, the maximum size of an array is max(int32). Can this limit be exceeded on a 32-bit machine? No. At the assembly level, C++ array accesses take the form base address + offset register, and the offset register on a 32-bit machine is only 32 bits wide, so the limit cannot be exceeded.
2. When a class is small but a large number of objects must be created (on the order of thousands or millions), is it more efficient to allocate them on the heap with new or to instantiate them directly? What determines the upper limit in each case? Direct instantiation is more efficient: it only pushes and pops the program stack, which is much faster than heap allocation, because the heap must also maintain a chain recording the start and end addresses of each allocation. The maximum stack size is fixed by the stack length set at build time, whereas the heap size depends on how much memory the operating system can provide. In other words, stack space is much smaller than heap space, and the system uses the stack as well: when a function is called, for example, argument passing and return values go through the stack. If a large object is placed on the stack and the stack space runs out, the program either hangs or exits. To instantiate a large number of small objects, it is best to request one large block of memory at program start and then allocate the small objects from that block. This memory pool technique is more efficient, especially when the small objects have a fixed size.
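A minimal sketch of the memory pool idea for fixed-size objects. FixedPool and its members are hypothetical names; the sketch assumes single-threaded use and that the vector's storage is suitably aligned for the objects placed in it, which holds for ordinary types with the default allocator.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// One up-front allocation; free slots are chained into a singly linked list.
class FixedPool {
    struct Node { Node* next; };

    // Slots must be able to hold a Node and keep every slot aligned.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    std::size_t slot_size_;
    std::vector<unsigned char> storage_;
    Node* free_head_;

public:
    FixedPool(std::size_t slot_size, std::size_t slot_count)
        : slot_size_(round_up(slot_size)),
          storage_(slot_size_ * slot_count),
          free_head_(nullptr) {
        // Push slots in reverse so that the first allocation returns slot 0.
        for (std::size_t i = slot_count; i-- > 0;) {
            release(storage_.data() + i * slot_size_);
        }
    }

    // Pops a slot off the free list; returns nullptr when the pool is empty.
    void* allocate() {
        Node* n = free_head_;
        if (n == nullptr) return nullptr;
        free_head_ = n->next;
        return n;
    }

    // Pushes a slot back onto the free list.
    void release(void* p) {
        free_head_ = new (p) Node{free_head_};
    }
};
```

A caller would construct its objects in the returned slots with placement new and hand the slots back with release once the objects are destroyed. Allocation and release are both constant-time pointer updates, which is where the gain over a general-purpose heap comes from.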
3. How do arrays compare with STL containers in performance? For plain element access, arrays are more efficient (in both time and space), but once insertion, deletion, searching, and sorting are involved, arrays are far less efficient than STL containers.
