Logical right shift versus arithmetic right shift:
Logical right shift: the high-order bits are filled with 0.
Arithmetic right shift: the high-order bits are filled with the value of the most significant bit (the sign bit).
In Java: x>>k is an arithmetic right shift, x>>>k is a logical right shift.
What happens if the shift amount k is at least the word size w (for example, k >= 32 on a 32-bit machine)? On many machines the shift amount actually used is k mod w. Java guarantees this behavior; in C it is undefined, so do not rely on it.
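The two kinds of right shift can be sketched in C on 32-bit patterns. This is only an illustration: the names srl32 and sra32 are made up for this note, and the arithmetic version is built by hand from the logical one so that it does not depend on how a particular compiler shifts negative signed values.

```c
#include <stdint.h>

/* Logical right shift: vacated high-order bits are filled with 0. */
uint32_t srl32(uint32_t x, unsigned k) {
    k %= 32;  /* mirror the "k mod w" rule; shifting by >= 32 is undefined in C */
    return x >> k;
}

/* Arithmetic right shift on a 32-bit pattern: vacated high-order bits are
 * filled with copies of the original most significant bit. */
uint32_t sra32(uint32_t x, unsigned k) {
    k %= 32;
    uint32_t shifted = x >> k;
    if ((x & 0x80000000u) != 0 && k > 0) {
        /* The sign bit was 1, so set the top k bits to 1. */
        shifted |= (uint32_t)~(UINT32_MAX >> k);
    }
    return shifted;
}
```

For the pattern 0xFFFFFFF0 shifted by 4, the logical shift gives 0x0FFFFFFF while the arithmetic shift gives 0xFFFFFFFF.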
Exercise 2.11 Building on the inplace_swap function from Exercise 2.10, you decide to write code that reverses the elements of an array by swapping elements from opposite ends of the array, working toward the middle. You write the following function:
void reverse_array(int a[], int cnt) {
    int first, last;
    for (first = 0, last = cnt-1;
         first <= last;
         first++, last--)
        inplace_swap(&a[first], &a[last]);
}
When you apply this function to an array containing the elements 1, 2, 3, and 4, the array becomes 4, 3, 2, and 1, as expected. However, when you apply it to an array containing 1, 2, 3, 4, and 5, you are surprised to see that the array becomes 5, 4, 0, 2, and 1. In fact, the code works correctly for all arrays of even length, but it sets the middle element to 0 whenever the array has odd length.
A: For an array of odd length 2k+1, first and last are both equal to k in the final iteration.
B: In that iteration inplace_swap is called with x and y pointing to the same address, which the XOR swap cannot handle. First, the one-line form of the XOR swap is undefined behavior; see http://c-faq.com/expr/xorswapexpr.html
Second, even when the behavior is well defined (for example, when the steps are evaluated in a strictly defined order), the same problem remains: the first step sets the location that both x and y point to to 0, and it stays 0.
C: Change the loop to for (first = 0, last = cnt-1; first < last; first++, last--).
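A minimal sketch of the corrected version follows. The body of inplace_swap is not shown in these notes; the three-step XOR form below is the one assumed from Exercise 2.10.

```c
/* XOR swap (assumed form of inplace_swap from Exercise 2.10).
 * Only valid when x and y point to different objects. */
void inplace_swap(int *x, int *y) {
    *y = *x ^ *y;
    *x = *x ^ *y;
    *y = *x ^ *y;
}

/* Reverse a[0..cnt-1] in place. The test first < last stops before the
 * middle element of an odd-length array, so it is never swapped with itself. */
void reverse_array(int a[], int cnt) {
    int first, last;
    for (first = 0, last = cnt - 1; first < last; first++, last--)
        inplace_swap(&a[first], &a[last]);
}
```

With this change the middle element of an odd-length array is simply left where it is.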
Exercise 2.25 Consider the following code, which attempts to compute the sum of all the elements in array a, where the number of elements is given by the parameter length.
/* WARNING: This is buggy code */
float sum_elements(float a[], unsigned length) {
    int i;
    float result = 0;

    for (i = 0; i <= length-1; i++)
        result += a[i];
    return result;
}
When run with the argument length equal to 0, this code should return 0.0. In practice, however, it encounters a memory error. Explain why this happens and how to fix the code.
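Answer: length is unsigned, so when length is 0 the expression length-1 wraps around to the largest unsigned value (UMax). The comparison i <= length-1 is then carried out as an unsigned comparison and is always true, so the loop reads past the end of the array. Use i < length instead.
Mixing signed and unsigned values is also a problem: when length is larger than the largest int, a signed i cannot cover the whole array and the result is wrong.
So instead:
unsigned i;
for (i = 0; i < length; ++i)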
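Putting the two changes together gives the following sketch of the repaired function. The const qualifier on the array parameter is an addition of this note, not part of the exercise.

```c
/* Sum the first `length` elements of a. With an unsigned index and the
 * test i < length, a length of 0 skips the loop and returns 0.0. */
float sum_elements(const float a[], unsigned length) {
    unsigned i;
    float result = 0;

    for (i = 0; i < length; ++i)
        result += a[i];
    return result;
}
```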
****************************************************************
Note: the byte sizes given below are based on the IA32 instruction set architecture.
Machine-level representation of the program:
An IA32 central processing unit contains 8 registers, each storing a 32-bit value. They are used to store integer data and pointers.
%eax, %ecx, %edx, %ebx, %esi, %edi, %esp (stack pointer), %ebp (frame pointer)
The first six are general-purpose registers. Note that the save and restore conventions for %eax, %ecx, and %edx differ from those for %ebx, %edi, and %esi.
Types of operand:
1. Immediate: a constant, written as $ followed by an integer in standard C notation.
2. Register: the contents of a register. For a double word this can be any of the eight registers; for a word, a register such as %ax; for a byte, a register such as %al.
3. Memory reference: accesses a memory location at a computed address. Think of memory as one large array, and write M_b[Addr] for a reference to the b-byte value stored in memory starting at address Addr.
Example: if %eax holds 0x100 and memory address 0x100 holds 0xFF, then the operand %eax has the value 0x100 and the operand (%eax) has the value 0xFF.
(The other addressing modes are not discussed in detail here.)
***********************************************************************
Data transfer instructions: the two operands of a move instruction cannot both be memory locations.
mov: copies the value of the source operand to the destination operand.
The source operand is an immediate value, or a value stored in a register or in memory.
The destination operand specifies a location, either a register or a memory address.
The mov class has three kinds of instructions:
1. movb, movw, movl: move a byte, a word, and a double word, respectively.
2. movsbw, movsbl, movswl: sign-extend a byte to a word, a byte to a double word, and a word to a double word.
3. movzbw, movzbl, movzwl: the same three moves with zero extension.
The related instructions pushl S and popl D are explained below:
pushl S (push double word):
R[%esp] <- R[%esp]-4; // decrease the stack pointer by 4 (the stack grows toward lower addresses)
M[R[%esp]] <- S; // store the data at the new top of the stack
So a pushl instruction is equivalent to two instructions, a subl on the stack pointer followed by a movl of the value: first decrease the stack pointer by 4, then write the new value to the new top-of-stack address. For example, pushl %ebp is equivalent to:
subl $4,%esp // first subtract 4 from the value in %esp
movl %ebp,(%esp) // store the value of %ebp at the memory location that %esp points to
popl D (pop double word):
D <- M[R[%esp]]; // read the data from the top of the stack
R[%esp] <- R[%esp]+4; // increase the stack pointer by 4
(Both movs and movz copy a smaller source to a larger destination.)
In every case, %esp points to the top of the stack.
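The effect of pushl and popl can be modeled with a toy machine in C. Everything here (the toy_machine type, the 64-byte memory, the function names) is hypothetical and only mirrors the two-step descriptions above; esp is an offset into the toy memory rather than a real address.

```c
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64 };

typedef struct {
    uint8_t  mem[STACK_BYTES]; /* toy memory holding the stack */
    uint32_t esp;              /* offset of the current top of the stack */
} toy_machine;

/* An empty stack starts at the highest address and grows downward. */
void toy_init(toy_machine *m) {
    memset(m->mem, 0, sizeof m->mem);
    m->esp = STACK_BYTES;
}

/* pushl S: R[%esp] <- R[%esp]-4, then M[R[%esp]] <- S */
void toy_pushl(toy_machine *m, uint32_t s) {
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &s, 4);
}

/* popl D: D <- M[R[%esp]], then R[%esp] <- R[%esp]+4 */
uint32_t toy_popl(toy_machine *m) {
    uint32_t d;
    memcpy(&d, &m->mem[m->esp], 4);
    m->esp += 4;
    return d;
}
```

Pushing two double words lowers esp by 8, and popping returns them in reverse order. The sketch does no overflow or underflow checking.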
*****************************************************************
The stack plays an important role in procedure calls. In IA32, the program stack is stored in a region of memory. The stack is usually drawn upside down: it grows downward, and the top of the stack has the lowest address. The stack pointer %esp holds the address of the top element of the stack. (See the previous section.)
The stack lives in the same memory as the program code and other program data, so a program can access any location within the stack using the standard memory addressing methods. For example, assuming the top element of the stack is a double word, movl 4(%esp),%edx copies the second double word from the stack into register %edx.
**************************
C language code:
int change(int *xp, int y) {
    int x = *xp;
    *xp = y;
    return x;
}
**********************
Assembly code:
// xp at %ebp+8, y at %ebp+12
movl 8(%ebp),%edx // get xp and store it in %edx
movl (%edx),%eax // get *xp and store it in %eax (the return value x)
movl 12(%ebp),%ecx // get y and store it in %ecx
movl %ecx,(%edx) // store y at *xp
***********************
Arithmetic operations:
leal: load effective address.
leal S,D means D <- &S.
However, it is often used for simple arithmetic as well as for address calculations.
If %eax holds x and %ecx holds y, leal (%eax,%ecx,4),D sets D to x+4y.
Other instructions: inc D (D <- D+1); dec D (D <- D-1);
neg D (D <- -D); not D (D <- ~D);
add (addition), sub (subtraction), imul (multiplication), xor (exclusive-or), or, and, sal (left shift), shl (left shift, same as sal),
sar (arithmetic right shift), shr (logical right shift).
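The value that leal computes can be written out as plain arithmetic. The helper names below are hypothetical; the sketch only assumes the general operand form disp(base,index,scale) and 32-bit wraparound.

```c
#include <stdint.h>

/* Value stored by leal disp(base,index,scale),D:
 * base + index*scale + disp, computed modulo 2^32. No memory is read. */
uint32_t lea_value(uint32_t base, uint32_t index, uint32_t scale, uint32_t disp) {
    return base + index * scale + disp;
}

/* leal (%eax,%ecx,4),D with %eax = x and %ecx = y gives x + 4y. */
uint32_t x_plus_4y(uint32_t x, uint32_t y) {
    return lea_value(x, y, 4, 0);
}
```

For x = 5 and y = 3 the result is 17.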
***********************************************************
PS: Program memory contains the executable machine code of the program, some information required by the operating system, a run-time stack for managing procedure calls and returns, and blocks of memory allocated by the user (for example with the malloc library function). Program memory is addressed using virtual addresses. At any given time, only limited subranges of virtual addresses are considered valid. Although the 32-bit addresses of IA32 can address 4GB, a typical program accesses only a few megabytes. The operating system manages this virtual address space and translates virtual addresses into physical addresses in the actual processor memory.
A single machine instruction performs only a very basic operation: simple arithmetic, a data transfer between memory and registers, or a conditional branch to a new instruction address. The compiler must generate sequences of such instructions to implement program constructs such as expression evaluation, loops, and procedure calls and returns.
Control: determines the order in which operations are executed based on the results of tests. (Machine code is stored in program memory.)
A jump changes the order in which a sequence of machine instructions is executed.
Condition codes: in addition to the integer registers, the CPU maintains a set of single-bit condition code registers that describe attributes of the most recent arithmetic or logical operation. These registers can be tested to perform conditional branches.
The arithmetic and logical instructions described in the previous section (add, shr, and so on) set the condition codes and also change the value in the destination operand; leal is the exception, since it sets no condition codes. In addition, two instruction classes set the condition codes without changing any other register. They are the cmp and test instructions:
cmp S2,S1: based on S1-S2 (compare; behaves like sub)
test S2,S1: based on S1&S2 (test; behaves like and)
They are used only to set the condition codes.
Special usage: test %eax,%eax checks whether %eax is negative, zero, or positive.
Accessing the condition codes: the condition codes are usually not read directly. A set instruction sets a single byte to 0 or 1 according to some combination of the condition codes. The destination is either one of the eight single-byte register elements (such as %al or %ah) or a single-byte memory location.
Example: after comparing a and b, place the result of a<b in %eax.
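As a sketch of that comparison, here is the C function together with an instruction sequence a compiler might plausibly emit for it. The assembly in the comment is an assumption for illustration (argument offsets and register choices vary), not output copied from a compiler.

```c
/* Returns 1 if a < b (signed comparison), 0 otherwise.
 *
 * Possible IA32 code, assuming a at %ebp+8 and b at %ebp+12:
 *   movl   8(%ebp),%edx     # get a
 *   cmpl   12(%ebp),%edx    # compare a:b, sets condition codes from a - b
 *   setl   %al              # %al <- 1 if a < b, else 0
 *   movzbl %al,%eax         # zero-extend the byte into %eax
 */
int less_than(int a, int b) {
    return a < b;
}
```

The set instruction writes only one byte, which is why the zero-extending move is needed to fill the rest of %eax.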
*************************************
Jump instructions and their encoding:
movl $0,%eax
jmp .L1
movl (%eax),%edx
.L1:
popl %edx
The jmp .L1 instruction on the second line causes execution to skip the movl instruction on the third line and continue at popl. When the object-code file is generated, the assembler determines the addresses of all labeled instructions and encodes each jump target (the address of the destination instruction) as part of the jump instruction.
How jump targets are encoded:
1. PC-relative: the encoding is the difference between the address of the target instruction and the address of the instruction immediately following the jump.
These offsets can be 1, 2, or 4 bytes.
2. Absolute address: four bytes directly specify the target.
With PC-relative addressing, the value of the program counter is the address of the instruction following the jump, not the address of the jump itself. Adding the offset encoded in the jump instruction to the program counter gives the target address of the jump.
8048757: 72 E7 je xxxxxxx
8048759: C6 A0 movl $0x1,0x804a010
Target address: 0x8048759 - 25 = 0x8048740 (0xe7 is the one-byte two's-complement representation of -25)
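The PC-relative calculation can be checked with a few lines of C. The function name jump_target is made up for this note; it assumes a one-byte offset and 32-bit addresses.

```c
#include <stdint.h>

/* Target of a PC-relative jump with a one-byte offset:
 * address of the instruction after the jump, plus the sign-extended offset. */
uint32_t jump_target(uint32_t next_instr_addr, uint8_t rel8) {
    /* Interpret the byte as an 8-bit two's-complement number. */
    int32_t disp = (rel8 & 0x80) ? (int32_t)rel8 - 256 : (int32_t)rel8;
    return next_instr_addr + (uint32_t)disp;
}
```

With the next instruction at 0x8048759 and the offset byte 0xe7, this yields 0x8048740.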
***************************************
Condition codes + jump instructions: together they control the execution of the program (control flow).
Depending on the combination of condition codes, a conditional jump either jumps or continues with the next instruction in the code sequence. The names of the conditional jump instructions match those of the set instructions.
Implementing conditional branches and loops: write the goto version of the C code first, then the assembly. This covers do-while, while, and for. A for loop can be converted into a while loop, but watch out for continue: in a naive conversion, continue skips the update step (such as i++), so i never increases and the loop never terminates.
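The goto translation and the continue pitfall can be sketched as follows. The function names and loop bodies are hypothetical examples chosen for this note.

```c
/* A while loop and its goto form, which is close to what the assembly does. */
int sum_while(int n) {
    int i = 0, s = 0;
    while (i < n) {
        s += i;
        i++;
    }
    return s;
}

int sum_goto(int n) {
    int i = 0, s = 0;
    if (!(i < n))
        goto done;      /* initial test: skip the loop entirely */
loop:
    s += i;
    i++;
    if (i < n)
        goto loop;      /* conditional jump back to the loop body */
done:
    return s;
}

/* A for loop that uses continue ... */
int sum_even_for(int n) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        if (i % 2 != 0)
            continue;   /* still runs i++ before the next test */
        s += i;
    }
    return s;
}

/* ... and its while form. A plain continue here would skip i++ and loop
 * forever, so the jump must land on the update step instead. */
int sum_even_while(int n) {
    int s = 0, i = 0;
    while (i < n) {
        if (i % 2 != 0)
            goto update;
        s += i;
update:
        i++;
    }
    return s;
}
```

Both versions of each loop compute the same result; only the control transfer is spelled out differently.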
Conditional move instructions:
A conditional transfer of data is an alternative strategy to a conditional transfer of control. Both outcomes of the conditional operation are computed first, and then one is selected based on whether the condition holds. This works only in limited cases, but when it applies it is well matched to modern processors.
Original C code:
int absdiff(int x, int y) {
    if (x < y)
        return y-x;
    else
        return x-y;
}
Using conditional assignment:
int cmovdiff(int x, int y) {
    int tval = y-x;
    int rval = x-y;
    int test = x < y;
    if (test) rval = tval;
    return rval;
}
The assembly code generated for the conditional-assignment version contains no jump instruction. When a machine reaches a conditional jump (also called a branch), it cannot yet know whether the jump will be taken, so the processor uses sophisticated branch prediction logic to guess. A wrong guess incurs a branch misprediction penalty, so it is best to keep the number of jumps small. A conditional move needs no prediction: the processor reads the source value, checks the condition code, and then either updates the destination register or leaves it unchanged.
************************************
The implementation of conditionals and loops is not discussed further. The switch statement is discussed below:
A switch statement selects among multiple branches based on an integer index value. It is implemented with a jump table.
Jump table: an array in which entry i is the address of the code segment that implements the action the program should take when the switch index equals i. The program uses the switch index to perform an array reference into the jump table and so determine the jump target.
The key instruction is jmp *.L7(,%eax,4), an indirect jump through the table. In the C version, the jump table is declared as an array of 7 elements, each a pointer to a code location; each of those locations implements a different branch of the switch statement.
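Standard C cannot take the address of a code location inside a function, so the sketch below uses an array of function pointers as a stand-in for the jump table. The handlers, the three-entry table, and the name switch_eg are all hypothetical.

```c
/* One handler per case; these play the role of the code segments. */
int handle_0(int x) { return x * 13; }
int handle_1(int x) { return x + 10; }
int handle_2(int x) { return x + 11; }
int handle_default(int x) { (void)x; return 0; }

/* Dispatch on n through a table instead of a chain of comparisons. */
int switch_eg(int x, int n) {
    /* table[i] is the code to run when n == i */
    static int (*const table[])(int) = { handle_0, handle_1, handle_2 };
    unsigned index = (unsigned)n;   /* negative n becomes a huge unsigned value */

    if (index >= 3)                 /* one unsigned test covers n < 0 and n > 2 */
        return handle_default(x);
    return table[index](x);         /* indirect call, like jmp *.L7(,%eax,4) */
}
```

The single unsigned range check followed by an indexed indirect transfer is the same shape as the assembly: the cost of dispatch does not grow with the number of cases.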
************************************************
Procedure calls:
A procedure call involves passing data (in the form of procedure parameters and return values) and control from one part of the code to another. It also involves allocating space for the procedure's local variables on entry and freeing that space on exit. Most machines provide only simple instructions for transferring control to and from a procedure. Passing data and allocating and freeing local variables are done by manipulating the program stack.
Stack frame structure:
The machine uses the stack to pass procedure arguments, store return information, and save registers for later restoration.
What is a stack frame? The portion of the stack allocated for a single procedure call is called a stack frame.
The stack pointer can move while the procedure executes, so most information is accessed relative to the frame pointer. (Pay attention to the return address!)
The stack only stores addresses and data. Do not imagine that instructions live on the stack: instructions are stored in program memory.
Suppose P calls procedure Q. The arguments to Q are placed in P's stack frame. When P calls Q, the return address within P is pushed onto the stack, forming the end of P's stack frame. The return address is where the program should resume execution when it returns from Q.
Procedure Q also uses the stack for local variables that cannot be kept in registers, for these reasons:
1. There are not enough registers to hold all the local variables.
2. Some local variables are arrays or structures and must therefore be accessed by array or structure references.
3. The address operator & is applied to a local variable, so an address must be generated for it.
Transferring control
Like a jump, the call instruction can be direct or indirect.
The effect of the call instruction: 1. push the return address onto the stack; 2. jump to the start of the called procedure.
The return address is the address of the instruction immediately following the call; execution continues from there when the called procedure returns. The ret instruction pops the address from the stack and jumps to it (that is, the program counter is set to the return address and execution continues with the instruction at that address).
Program counter: %eip. Its value is the address of the next instruction to be executed.
Register usage conventions:
The set of program registers is the only resource shared by all procedures. When one procedure (the caller) calls another (the callee), the callee must not overwrite a register value that the caller plans to use later. (My understanding: although the callee has its own stack frame, data still has to be placed in registers to be operated on.) By convention, %eax, %edx, and %ecx are caller-save registers.
%ebx, %esi, and %edi are callee-save registers.
Suppose P computes a value y before calling Q and uses y again after Q returns. If Q overwrites the register holding y, P gets the wrong value. For callee-save registers, this means Q must save their values on the stack before overwriting them and restore them before returning. There are two ways to keep y safe:
1. Before calling Q, P saves y in its own stack frame; when Q returns, P reads y back from the stack.
That is, P saves the value.
2. P keeps y in a callee-save register. If Q wants to use that register, Q must save the register's value in its own stack frame and restore it before returning. That is, Q saves the value.
For details on procedure calls, see page 156 of Computer Systems: A Programmer's Perspective (%ebp and %esp change constantly).
Array allocation and access:
1. A declaration T A[N] has two effects:
It allocates a contiguous region of L*N bytes in memory, where L is the size in bytes of data type T; the starting location is denoted xA. It also introduces the identifier A, which can be used as a pointer to the beginning of the array; the value of this pointer is xA. Array element i is stored at address xA+L*i.
PS: for char *B[8], an array of pointers, the element size is 4 bytes (the size of a pointer on IA32, not the 1 byte of a char).
Suppose E is an array of int and we want E[i], with the address of E stored in %edx and i stored in %ecx.
Use the following instruction: movl (%edx,%ecx,4),%eax
It computes the address xE+4i and reads the value at that memory location into %eax.
PS: C allows arithmetic on pointers; the computed value is scaled by the size of the data type the pointer references.
Example: if A is an array, then *(A+i) is the element A[i] (because A can stand for the starting address of the array).
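The scaling rule can be observed directly in C. The two helper functions below are hypothetical names for this note; they only restate the address formula xA+L*i for an array of int.

```c
#include <stddef.h>

/* Read element i using pointer arithmetic instead of indexing. */
int element_at(const int *e, size_t i) {
    return *(e + i);    /* same as e[i]; the compiler scales i by sizeof(int) */
}

/* Distance in bytes from the start of the array to element i. */
size_t byte_offset(const int *e, size_t i) {
    return (size_t)((const char *)(e + i) - (const char *)e);
}
```

For an int array, byte_offset(e, i) equals i * sizeof(int), which is the 4i in the address xE+4i on IA32.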
Data alignment: why is aligned data preferable?
Suppose a processor always fetches 8 bytes from memory at a time, from an address that must be a multiple of 8. If every value of type double is guaranteed to have an address that is a multiple of 8, the processor can read or write a double with a single memory operation. Otherwise it may need two memory accesses, because the value may be split across two 8-byte blocks. IA32 hardware works correctly regardless of alignment, but aligned data improves efficiency.
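The "one access versus two" argument is simple arithmetic on addresses, sketched below. The function names are hypothetical, and the model assumes the simplified processor described above, which fetches fixed-size blocks at block-aligned addresses.

```c
#include <stdint.h>

/* Nonzero if addr is a multiple of alignment. */
int is_aligned(uint32_t addr, uint32_t alignment) {
    return addr % alignment == 0;
}

/* Number of block-sized fetches needed to read `size` bytes starting at addr. */
unsigned blocks_touched(uint32_t addr, uint32_t size, uint32_t block) {
    uint32_t first = addr / block;               /* block holding the first byte */
    uint32_t last  = (addr + size - 1) / block;  /* block holding the last byte */
    return (unsigned)(last - first + 1);
}
```

An 8-byte double at address 0x1000 lies in one 8-byte block, while the same double at 0x1004 straddles two.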
*****************************
Function pointers:
Pointers can also point to functions, which provides a powerful way to store and pass references to code.
int fun(int x, int *p); // declare a function
int (*fp)(int x, int *p); // declare a function pointer
fp = fun; // assign the function fun to the pointer; writing &fun is not required, much as an array name converts to a pointer
The value of a function pointer is the address of the first instruction in the function's machine code!
Use the pointer to call the function:
int y = 1;
fp(3, &y);
PS: pay attention to the syntax:
int (*fp)(int *x) // fp is a pointer to a function whose parameter has type int * and whose return type is int
int *fp(int *x) // parsed as (int *) fp(int *x): a function prototype for a function fp that takes an int * parameter and returns int *
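A complete version of the function pointer example might look like this. The body of fun and the helper apply are invented for illustration; only the declarations come from the notes above.

```c
/* Hypothetical body: add x to *p and return twice the new value. */
int fun(int x, int *p) {
    *p += x;
    return *p * 2;
}

/* A function that receives code to run as a parameter. */
int apply(int (*fp)(int, int *), int x, int *p) {
    return fp(x, p);    /* call through the pointer, same as (*fp)(x, p) */
}
```

Calling through fp after fp = fun behaves exactly like calling fun directly, and passing fun or &fun to apply is equivalent.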
Personal thoughts: local variables in a procedure call are stored on the stack. To operate on a variable, its value is loaded into a processor register and manipulated there; after the value changes, it is written back to the stack (updating the data on the stack). When the call finishes, the saved state (such as %ebp and %ebx) is popped off the stack, the top of the stack then points to the return address, and execution continues with the instruction at the return address. If a parameter of the called procedure is a pointer, then writing through the pointer changes the data it points to; that data may live in the caller's stack frame or in the callee's own.
Note: registers used to hold temporary values are designated caller-save, so a called function can overwrite them freely (the caller has already saved on the stack whatever it still needs). Other registers are designated callee-save: a function that modifies them must save their values on the stack beforehand and restore them before returning.
************************************************************
Out-of-bounds memory references and buffer overflow! (Finally, a section related to CSP.)
C performs no bounds checking on array references, and local variables are stored on the stack together with state information such as saved register values and return addresses (arrays must be stored on the stack because an address has to be generated for them). A write to an out-of-bounds array element can therefore corrupt the state information stored on the stack. When the program then tries to reload a register or execute a ret instruction using the corrupted state, serious errors occur.
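Since the language does not check bounds, the check has to be written by hand. A minimal sketch, with a hypothetical helper name and return convention:

```c
#include <stddef.h>

/* Store value at a[i] only if i is within the array of length len.
 * Returns 0 on success and -1 if the index is out of bounds. */
int checked_store(int *a, size_t len, size_t i, int value) {
    if (i >= len)
        return -1;      /* refuse the write instead of corrupting nearby memory */
    a[i] = value;
    return 0;
}
```

This only helps when the caller passes the true length; C itself has no way to verify it.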
************************************************
After the extension from 32 bits to 64 bits (x86-64), the number of registers grows to 16, and much changes as a result:
With twice as many registers, many procedures no longer need the stack to store and retrieve procedure information.
For such procedures, the only remaining reason to use the stack is to store the return address.
A function may still need a stack frame for these reasons:
1. There are too many local variables to fit in registers.
2. Some local variables are arrays or structures.
3. The function uses the address operator to compute the address of a local variable (a variable whose address is needed must be stored on the stack; a value kept in a register has no address).
4. The function must pass some arguments on the stack to another function.
5. The function needs to save the state of a callee-save register before modifying it.
Note: in IA32 the stack pointer moves back and forth as values are pushed and popped. In x86-64, the stack frame of a procedure usually has a fixed size, set at the start of the procedure by decreasing the stack pointer (register %rsp; the stack grows toward lower addresses). Data can then be accessed at offsets from the stack pointer, so the frame pointer used in IA32 is no longer needed. (In IA32 the frame pointer provides the fixed reference position.) So in x86-64, storing data on the stack does not move the stack pointer each time, because the stack pointer stays fixed for the duration of the procedure. And when the procedure finishes, there is no need to pop stack elements one by one (to get back to the position of the return address); simply increasing the stack pointer frees the stack space.
*****************************************************