In early August, I started reading the book "In-Depth Understanding of Computer Systems". I began with the English edition and finished it in about a month, but I knew there was still a lot I did not understand, so I borrowed the Chinese edition. Now I can say that I have actually read this book and can write some reading notes.
Reading notes on "In-Depth Understanding of Computer Systems"
P39-40
Convert binary to an unsigned number: B2U_w(x) = Σ x_i * 2^i (0 <= i <= w-1)
B2U_4([1011]) = 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = 11
Convert binary to a two's-complement number: B2T_w(x) = -x_(w-1) * 2^(w-1) + Σ x_i * 2^i (0 <= i <= w-2)
B2T_4([1011]) = -1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = -5
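The two definitions above can be checked with a small C sketch. The function names b2u and b2t and the choice to pass the bit vector as the low w bits of an integer are my own assumptions, not code from the book.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low w bits of `bits` as unsigned: the sum of x_i * 2^i. */
uint64_t b2u(uint32_t bits, int w)
{
    assert(w >= 1 && w <= 32);
    uint64_t sum = 0;
    for (int i = 0; i < w; i++)
        sum += (uint64_t)((bits >> i) & 1u) << i;
    return sum;
}

/* Interpret the low w bits as two's complement: bit w-1 has weight -2^(w-1). */
int64_t b2t(uint32_t bits, int w)
{
    assert(w >= 1 && w <= 32);
    int64_t sum = 0;
    for (int i = 0; i < w - 1; i++)
        sum += (int64_t)((bits >> i) & 1u) << i;
    if ((bits >> (w - 1)) & 1u)
        sum -= (int64_t)1 << (w - 1);
    return sum;
}
```

With the bit pattern [1011] (0xB) and w = 4, b2u gives 11 and b2t gives -5, matching the worked values above.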
P45-46
Conversion between unsigned and two's complement
T2U_w(x) = x + 2^w (x < 0)
= x (x >= 0)
U2T_w(u) = u (u < 2^(w-1))
= u - 2^w (u >= 2^(w-1))
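As a rough illustration of the two conversion formulas, here is a sketch that works on 64-bit values so that any width up to 32 can be tried. The names t2u and u2t are hypothetical helpers of mine.

```c
#include <assert.h>
#include <stdint.h>

/* T2U_w: add 2^w to negative values, leave nonnegative values unchanged. */
int64_t t2u(int64_t x, int w)
{
    assert(w >= 1 && w <= 32);
    return x < 0 ? x + ((int64_t)1 << w) : x;
}

/* U2T_w: subtract 2^w from values at or above 2^(w-1). */
int64_t u2t(int64_t u, int w)
{
    assert(w >= 1 && w <= 32);
    return u >= ((int64_t)1 << (w - 1)) ? u - ((int64_t)1 << w) : u;
}
```

For w = 4, t2u(-5, 4) is 11 and u2t(11, 4) is -5, the same pair of values as the [1011] example on P39-40.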
P52
Truncating an unsigned number to k bits: B2U_k([x_(k-1), x_(k-2), ..., x_0]) = B2U_w([x_(w-1), ..., x_0]) mod 2^k
Truncating a two's-complement number to k bits: B2T_k([x_(k-1), x_(k-2), ..., x_0]) = U2T_k(B2U_w([x_(w-1), ..., x_0]) mod 2^k)
P55
Unsigned addition: x +(u,w) y = x + y (x + y < 2^w)
= x + y - 2^w (2^w <= x + y < 2^(w+1))
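A small sketch of w-bit unsigned addition following the two cases above. The names uadd_w and uadd_overflowed are mine; the overflow test (the result is smaller than x) is the usual way to detect the second case.

```c
#include <assert.h>
#include <stdint.h>

/* x +(u,w) y: the true sum reduced modulo 2^w. Both inputs must fit in w bits. */
uint64_t uadd_w(uint64_t x, uint64_t y, int w)
{
    assert(w >= 1 && w <= 32);
    uint64_t limit = (uint64_t)1 << w;      /* 2^w */
    assert(x < limit && y < limit);
    uint64_t sum = x + y;
    return sum < limit ? sum : sum - limit;
}

/* Overflow happened exactly when the w-bit result is less than x. */
int uadd_overflowed(uint64_t x, uint64_t y, int w)
{
    return uadd_w(x, y, w) < x;
}
```

With w = 4, 9 + 12 = 21 does not fit, so the result is 21 - 16 = 5.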
P58
Two's-complement addition: x +(t,w) y = x + y - 2^w (2^(w-1) <= x + y), positive overflow
= x + y (-2^(w-1) <= x + y < 2^(w-1)), normal
= x + y + 2^w (x + y < -2^(w-1)), negative overflow
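The three cases can be sketched the same way. tadd_w is a hypothetical helper that does the arithmetic in 64 bits, so the true sum x + y is always representable before it is adjusted.

```c
#include <assert.h>
#include <stdint.h>

/* x +(t,w) y for w-bit two's-complement inputs. */
int64_t tadd_w(int64_t x, int64_t y, int w)
{
    assert(w >= 1 && w <= 32);
    int64_t half = (int64_t)1 << (w - 1);   /* 2^(w-1) */
    assert(x >= -half && x < half && y >= -half && y < half);
    int64_t sum = x + y;
    if (sum >= half)
        return sum - 2 * half;              /* positive overflow: subtract 2^w */
    if (sum < -half)
        return sum + 2 * half;              /* negative overflow: add 2^w */
    return sum;                             /* normal case */
}
```

With w = 4, 5 + 5 overflows positively to -6, and -8 + -5 overflows negatively to 3.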
P64
Multiplying by a constant with shifts: to compute x * K, write K in binary. If K contains a run of consecutive 1s from bit position n down to bit position m (n >= m), the effect of these bits on the product can be computed in either of two forms.
Form A: (x << n) + (x << (n-1)) + ... + (x << m)
Form B: (x << (n+1)) - (x << m)
With these forms, x * K can be computed without a multiply instruction.
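To make the two forms concrete, here is a sketch for K = 14, which is 1110 in binary, so n = 3 and m = 1. The function names are my own, and I use unsigned arithmetic so that shifts and wraparound are well defined in C.

```c
#include <stdint.h>

/* Form A for K = 14: one shift per 1 bit in the run. */
uint32_t times14_form_a(uint32_t x)
{
    return (x << 3) + (x << 2) + (x << 1);
}

/* Form B for K = 14: (x << (n+1)) - (x << m), two shifts and one subtraction. */
uint32_t times14_form_b(uint32_t x)
{
    return (x << 4) - (x << 1);
}
```

Both functions agree with x * 14 modulo 2^32. Form B needs fewer operations when the run of 1s is long.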
P333
Code motion: this class of optimization identifies a computation that is performed multiple times but whose result does not change, and moves it to an earlier part of the code that is not evaluated as often.
For example: for (int i = 0; i < strlen(s); i++). Here strlen(s) should be hoisted out of the loop and saved in a variable length.
After rewriting: int length = strlen(s); for (int i = 0; i < length; i++)
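A complete before-and-after pair might look like the sketch below. The space-counting task and both function names are just an illustration I made up. A compiler may or may not hoist the call by itself, so doing it by hand is the safe option.

```c
#include <stddef.h>
#include <string.h>

/* Before: strlen(s) is re-evaluated in the loop test on every iteration. */
size_t count_spaces_slow(const char *s)
{
    size_t count = 0;
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == ' ')
            count++;
    return count;
}

/* After: the loop-invariant strlen(s) is computed once, before the loop. */
size_t count_spaces_fast(const char *s)
{
    size_t count = 0;
    size_t length = strlen(s);
    for (size_t i = 0; i < length; i++)
        if (s[i] == ' ')
            count++;
    return count;
}
```

Both versions return the same count. The first one can take time quadratic in the string length because each strlen call walks the whole string.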
P338
Loop unrolling: a program transformation that reduces the number of iterations of a loop by increasing the number of elements computed in each iteration.
For example:
for (i = 0; i < limit; i += 2)
acc = (acc OP data[i]) OP data[i + 1];
The loop above is unrolled by a factor of k = 2; any elements left over at the end must be handled separately.
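Here is a minimal sketch of 2x unrolling with the leftover elements handled in a second loop. I assume OP is addition and the data is an array of long. sum_unrolled2 is a hypothetical name.

```c
#include <stddef.h>

/* Sum an array with the loop unrolled by a factor of 2. */
long sum_unrolled2(const long *data, size_t n)
{
    long acc = 0;
    size_t i = 0;
    /* Main loop: two elements per iteration, runs while a full pair remains. */
    for (; i + 1 < n; i += 2)
        acc = (acc + data[i]) + data[i + 1];
    /* Tail loop: at most one leftover element when n is odd. */
    for (; i < n; i++)
        acc += data[i];
    return acc;
}
```

The condition i + 1 < n keeps data[i + 1] in bounds and also handles n = 0 correctly.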
P351
Two-way parallelism: for a combining operation that is associative and commutative, the computation can be split into two or more parts whose results are combined at the end, which improves performance.
For example:
for (i = 0; i < limit; i += 2)
{
acc0 = acc0 OP data[i];
acc1 = acc1 OP data[i + 1];
}
...
*dest = acc0 OP acc1;
The code above uses 2x loop unrolling together with two-way parallelism.
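The same idea as a compilable sketch, again assuming OP is addition on long values. sum_2x2 is my own name for it.

```c
#include <stddef.h>

/* 2x unrolling with two accumulators: even indices go to acc0, odd to acc1. */
long sum_2x2(const long *data, size_t n)
{
    long acc0 = 0, acc1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += data[i];        /* these two updates do not depend on each other */
        acc1 += data[i + 1];
    }
    for (; i < n; i++)          /* leftover element when n is odd */
        acc0 += data[i];
    return acc0 + acc1;         /* combine the partial results at the end */
}
```

Because the two accumulators form independent dependency chains, the processor can overlap their additions. For floating-point data the result may differ slightly from the sequential sum, since floating-point addition is not associative.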
P354
Reassociation transformation: a small change to the code that alters how the operations are combined can reduce the number of operations on the critical path of the computation, giving better performance through better use of the pipelining capabilities of the functional units.
For example:
acc = (acc OP data[i]) OP data[i + 1]; // before
acc = acc OP (data[i] OP data[i + 1]); // after
The only difference is the placement of the parentheses, but the performance is very different.
P359
The critical path gives a basic lower bound on the time required to execute a program. If a program contains a chain of data dependencies and the sum of all latencies along the chain is T, the program needs at least T cycles to execute.
The throughput limits of the functional units also impose a lower bound on execution time. If a program requires N computations of some operation, and the microprocessor has only m functional units capable of performing it, each with an issue time of i, then the program needs at least N * i / m cycles.
P369
Performance improvement techniques
A-High-level design: choose appropriate algorithms and data structures
B-Basic coding principles
Eliminate excessive function calls, and move computations out of loops when possible
Eliminate unnecessary memory references: introduce temporary variables to hold intermediate results, and store a result in an array or global variable only when the final value has been computed
C-Low-level optimizations
Unroll loops to reduce overhead and enable further optimizations
Find ways to increase instruction-level parallelism, using techniques such as multiple accumulators and reassociation
Rewrite conditional operations in a functional style so that they can be compiled to conditional data transfers
P374
Amdahl's Law: when we speed up one part of a system, the effect on overall system performance depends on both how significant that part was and how much it sped up.
Suppose that part accounts for a fraction α of the application's total execution time and that its performance is improved by a factor of k.
Speedup S = 1 / [(1 - α) + α/k]
Therefore, to speed up the entire system significantly, we must improve the speed of a very large fraction of the overall system.
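The formula is easy to try out numerically. The helper below is my own sketch: alpha is the fraction of time spent in the improved part and k is how many times faster that part becomes.

```c
#include <assert.h>

/* Overall speedup S = 1 / ((1 - alpha) + alpha / k). */
double amdahl_speedup(double alpha, double k)
{
    assert(alpha >= 0.0 && alpha <= 1.0 && k > 0.0);
    return 1.0 / ((1.0 - alpha) + alpha / k);
}
```

For example, speeding up a part that takes 60% of the time (alpha = 0.6) by a factor of k = 3 gives 1 / (0.4 + 0.2), about 1.67 overall. Even as k grows without bound, the speedup with alpha = 0.6 can never exceed 1 / 0.4 = 2.5.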
P402-403
Locality: temporal locality (the same location is referenced repeatedly) and spatial locality (nearby locations are referenced).
A stride-1 reference pattern is a sequential reference pattern. In general, spatial locality decreases as the stride increases.
Rules for evaluating locality:
Programs that repeatedly reference the same variables have good temporal locality.
For programs with stride-k reference patterns, the smaller the stride, the better the spatial locality. Programs with stride-1 reference patterns have good spatial locality; programs that hop around memory with large strides have poor spatial locality.
For instruction fetches, loops have good temporal and spatial locality. The more iterations a loop has, the better the locality.
P433
Use the following techniques to exploit locality:
Focus your attention on the inner loops, where the bulk of the computation and memory accesses occur.
Maximize spatial locality by reading data objects sequentially, with stride 1, in the order they are stored in memory.
Maximize temporal locality by using a data object as often as possible once it has been read from memory.
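A common way to see the difference between stride-1 and large-stride access is to sum a two-dimensional array in two loop orders. This is only a sketch: the 64 x 64 size and the function names are arbitrary choices of mine.

```c
#include <stddef.h>

enum { ROWS = 64, COLS = 64 };

/* Stride-1: the inner loop walks along a row, matching C's row-major layout. */
long sum_row_major(int a[ROWS][COLS])
{
    long sum = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}

/* Stride-COLS: the inner loop jumps a whole row ahead on every access. */
long sum_col_major(int a[ROWS][COLS])
{
    long sum = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}
```

Both functions compute the same sum, and the local variable sum gives both good temporal locality. Only the first has good spatial locality, which tends to matter once the array is larger than the cache.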
P455
Functions and initialized global variables are strong symbols; uninitialized global variables are weak symbols.
The UNIX linker uses the following rules to handle symbols with multiple definitions:
Rule 1: Multiple strong symbols with the same name are not allowed.
Rule 2: Given one strong symbol and multiple weak symbols, choose the strong symbol.
Rule 3: Given multiple weak symbols, choose any one of them.
P492-522
Process Functions
pid_t getpid(void) // return the PID of the calling process
pid_t getppid(void) // return the PID of the calling process's parent
void exit(int status) // terminate the process with exit status status
pid_t fork(void) // the parent creates a child process; fork returns the child's PID in the parent and 0 in the child
pid_t waitpid(pid_t pid, int *status, int options) // the parent waits for a child to terminate or stop; pid > 0 waits for that single child, pid = -1 waits for any child
pid_t wait(int *status) // wait for any child to terminate; equivalent to waitpid(-1, status, 0)
unsigned int sleep(unsigned int secs) // suspend the process for the specified number of seconds
int pause(void) // put the calling process to sleep until it receives a signal
int execve(const char *filename, const char *argv[], const char *envp[]) // load and run a new program
char *getenv(const char *name) // search the environment array for a string name=value and return a pointer to value
int setenv(const char *name, const char *newvalue, int overwrite) // set name to newvalue; an existing value is replaced only if overwrite is nonzero
void unsetenv(const char *name) // delete name=value from the environment
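To show how fork and waitpid fit together, here is a minimal sketch for a POSIX system. run_child_and_reap is a hypothetical helper of mine. The child uses _exit so that it does not flush any stdio buffers inherited from the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`; return the exit status the parent
 * reaps, or -1 on any error. */
int run_child_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                       /* fork failed */
    if (pid == 0)
        _exit(code);                     /* child: fork returned 0 */

    int status;
    if (waitpid(pid, &status, 0) != pid) /* parent: fork returned the child's PID */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child_and_reap(7) returns 7 in the parent: the status is passed from the child's exit to the parent's waitpid.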
Signals
pid_t getpgrp(void) // return the process group ID of the calling process (each process belongs to exactly one process group)
int setpgid(pid_t pid, pid_t pgid) // change the process group of the calling process or of another process
int kill(pid_t pid, int sig) // send the signal sig to the process pid (which may terminate it)
unsigned int alarm(unsigned int secs) // arrange for a SIGALRM signal to be sent to the calling process in secs seconds
sighandler_t signal(int signum, sighandler_t handler) // install handler as the handler for signal signum
int sigaction(int signum, struct sigaction *act, struct sigaction *oldact) // examine or change the action taken for signal signum
Nonlocal jumps
int setjmp(jmp_buf env) // save the current calling environment in the env buffer for later use by longjmp
int sigsetjmp(sigjmp_buf env, int savesigs) // version for use with signal handlers
void longjmp(jmp_buf env, int retval) // restore the calling environment from the env buffer
void siglongjmp(sigjmp_buf env, int retval) // version for use with signal handlers
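One typical use of setjmp and longjmp is to return from a deeply nested call straight to an error handler. The sketch below uses names of my own (deep_callee, run_with_recovery) and carries the error code in a separate variable, because setjmp itself should only appear in a simple test.

```c
#include <setjmp.h>

static jmp_buf env;
static volatile int last_error = 0;

/* Pretend this is many calls deep; on error it jumps back to setjmp. */
static void deep_callee(int error_code)
{
    if (error_code != 0) {
        last_error = error_code;
        longjmp(env, 1);        /* makes setjmp return a second time, with 1 */
    }
}

/* Returns 0 if the callee finished normally, otherwise its error code. */
int run_with_recovery(int error_code)
{
    if (setjmp(env) != 0)       /* nonzero: we arrived here via longjmp */
        return last_error;
    deep_callee(error_code);    /* first return of setjmp was 0 */
    return 0;
}
```

setjmp is called once but can return more than once: first with 0, and then with the retval passed to longjmp.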
P582-585
Common memory-related errors in C programs
A-Dereferencing bad pointers
Example: scanf("%d", val); // should be &val
B-Reading uninitialized memory
For example, assuming that heap memory is initialized to zero:
int *y = (int *)malloc(n * sizeof(int));
In fact malloc does not initialize the memory; use calloc to get zeroed memory.
C-Allowing stack buffer overflows
Example: char buf[64]; gets(buf); // gets does not check the length of the input
D-Assuming that pointers and the objects they point to are the same size
For example: int **A = (int **)malloc(n * sizeof(int)); // should be sizeof(int *)
... A[i] = (int *)malloc(m * sizeof(int));
The intent is to create an array of n pointers, each pointing to an array of m ints, but the code actually allocates an array of ints.
E-Off-by-one errors
For example: for (int i = 0; i <= m; i++) // check whether <= should be <
F-Referencing a pointer instead of the object it points to
For example: binheap[0] = binheap[*size - 1];
*size--; // should be (*size)--
G-Misunderstanding pointer arithmetic
For example, if p is a pointer to int:
p += sizeof(int); // should be p++
H-Referencing nonexistent variables
Example: int val;
return &val; // once the function returns, the local variable is no longer valid
I-Referencing data in free heap blocks
Example: free(x);
y = x; // x now points to a freed block, so any later read through y is invalid
J-Introducing memory leaks
For example: int *x = (int *)malloc(n * sizeof(int)); // never freed
P597-608
UNIX system I/O
int open(char *filename, int flags, mode_t mode) // open an existing file or create a new one
int close(int fd) // close the file
ssize_t read(int fd, void *buf, size_t n) // copy at most n bytes from the current file position of descriptor fd into memory at buf
ssize_t write(int fd, const void *buf, size_t n) // copy at most n bytes from memory at buf to the current file position of descriptor fd
int stat(const char *filename, struct stat *buf) // read the file's metadata
int fstat(int fd, struct stat *buf) // same as stat, but takes a file descriptor instead of a file name
int dup2(int oldfd, int newfd) // I/O redirection: copy descriptor table entry oldfd to descriptor table entry newfd
P606
The Unix kernel uses three related data structures to represent open files.
Descriptor table: each process has its own separate descriptor table, whose entries are indexed by the process's open file descriptors.
File table: the set of open files is represented by a file table that is shared by all processes.
v-node table: like the file table, the v-node table is shared by all processes. Each entry contains most of the information in the stat structure.
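The difference between two descriptors with separate file table entries and two descriptors sharing one entry can be observed through the file position. This sketch assumes a POSIX system with a writable /tmp. The helper name is mine, and it also uses mkstemp, lseek and unlink, which are real calls but not in the list above.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Write "ab" to a temporary file, then read one byte through each of two
 * descriptors. If redirect is nonzero, fd2 is first made to share fd1's
 * file table entry with dup2. Returns 0 on success, -1 on error. */
int read_two_chars(int redirect, char out[2])
{
    char path[] = "/tmp/fdnotesXXXXXX";
    int fd1 = mkstemp(path);               /* creates and opens a unique file */
    if (fd1 < 0)
        return -1;
    int ok = write(fd1, "ab", 2) == 2 && lseek(fd1, 0, SEEK_SET) == 0;
    int fd2 = open(path, O_RDONLY);        /* a second, separate file table entry */
    ok = ok && fd2 >= 0;
    if (ok && redirect)
        ok = dup2(fd1, fd2) == fd2;        /* fd2 now refers to fd1's entry */
    ok = ok && read(fd1, &out[0], 1) == 1 && read(fd2, &out[1], 1) == 1;
    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    unlink(path);
    return ok ? 0 : -1;
}
```

Without dup2, each descriptor has its own position, so both reads return 'a'. After dup2 the position is shared, so the second read returns 'b'.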
P619-629
UNIX network programming functions
unsigned long int htonl(unsigned long int hostlong) // convert a 32-bit integer from host byte order to network byte order
unsigned short int htons(unsigned short int hostshort) // convert a 16-bit integer from host byte order to network byte order
unsigned long int ntohl(unsigned long int netlong) // convert a 32-bit integer from network byte order to host byte order
unsigned short int ntohs(unsigned short int netshort) // convert a 16-bit integer from network byte order to host byte order
int inet_aton(const char *cp, struct in_addr *inp) // convert the dotted-decimal string cp to an IP address in network byte order (inp)
char *inet_ntoa(struct in_addr in) // the inverse operation; the structure itself is passed, not a pointer to it
struct hostent *gethostbyname(const char *name) // return the host entry associated with the domain name
struct hostent *gethostbyaddr(const char *addr, int len, 0) // return the host entry associated with the IP address addr
int socket(int domain, int type, int protocol) // create a socket descriptor
int connect(int sockfd, struct sockaddr *serv_addr, int addrlen) // try to establish an Internet connection with the server at socket address serv_addr
int bind(int sockfd, struct sockaddr *my_addr, int addrlen) // tell the kernel to associate the server's socket address in my_addr with the socket descriptor sockfd
int listen(int sockfd, int backlog) // convert sockfd from an active socket to a listening socket that can accept connection requests from clients
int accept(int listenfd, struct sockaddr *addr, int *addrlen) // the server waits for a connection request from a client
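The byte-order and address-conversion functions can be tried without opening any socket. In this sketch, parse_ipv4 is a hypothetical wrapper of mine around inet_aton and ntohl.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Parse dotted-decimal text and store the address in host byte order.
 * Returns 1 on success and 0 if the text is not a valid address. */
int parse_ipv4(const char *text, uint32_t *host_order)
{
    struct in_addr addr;
    if (inet_aton(text, &addr) == 0)   /* inet_aton returns 0 on bad input */
        return 0;
    *host_order = ntohl(addr.s_addr);  /* s_addr is in network byte order */
    return 1;
}
```

For "128.2.194.242" the host-order value is 0x8002c2f2 on any machine, because ntohl undoes whatever byte swapping the platform needs.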
P659-660
Thread Functions
int pthread_create(pthread_t *tid, pthread_attr_t *attr, func *f, void *arg) // create a new thread
pthread_t pthread_self(void) // return the ID of the calling thread
void pthread_exit(void *thread_return) // terminate the calling thread; if the main thread calls it, it waits for all peer threads to terminate before the process ends
int pthread_cancel(pthread_t tid) // request termination of the thread tid
int pthread_join(pthread_t tid, void **thread_return) // wait for the thread tid to terminate
int pthread_detach(pthread_t tid) // detach the joinable thread tid; a thread can detach itself with pthread_detach(pthread_self())
int pthread_once(pthread_once_t *once_control, void (*init_routine)(void)) // initialize the state associated with a thread routine
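A minimal sketch of creating a thread and collecting its return value with pthread_join. The squaring task and the function names are made up for illustration; on many systems it must be compiled with -pthread.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread routine: squares the long that arg points to and returns arg. */
static void *square(void *arg)
{
    long *value = arg;
    *value = *value * *value;
    return value;
}

/* Run square() in a peer thread and wait for it; returns -1 on error. */
long square_in_thread(long n)
{
    pthread_t tid;
    void *result = NULL;
    if (pthread_create(&tid, NULL, square, &n) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)   /* blocks until the thread terminates */
        return -1;
    return *(long *)result;
}
```

Passing &n is safe here only because square_in_thread joins the thread before n goes out of scope.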
Postscript:
All I can say is that this book may not be quite as amazing as its reputation suggests, though perhaps that is because I am still a beginner. Still, it explains things very clearly and offers a lot of programming advice. In my view it blends elements of computer organization and operating systems, but it obviously cannot cover both subjects in full, so it serves only as groundwork. For further study, you should read more specialized books, such as Operating Systems: Internals and Design Principles and Computer Architecture.
I don't know whether I came to this book too late, but since I can still learn something from it, I am glad to have read it. Reading more books, and reading good books, is something every programmer has to learn.