Chapter II Representation and processing of information
First, preface
1. Binary digits are called bits.
2. Three number representations matter most: unsigned encodings, two's-complement encodings (signed), and floating-point encodings (a base-2 version of scientific notation).
3. Floating-point arithmetic has different properties from integer arithmetic. The product of a set of positive numbers is always positive, although overflow yields the special value +∞. Because the representation has finite precision, floating-point arithmetic is not associative. Integer and floating-point arithmetic differ mathematically because they handle the finiteness of their representations differently: an integer representation encodes a comparatively small range of values, but exactly, while a floating-point representation encodes a wide range of values, but only approximately.
Second, information storage
1. Most computers use 8-bit blocks, or bytes, as the smallest addressable unit of memory. A machine-level program views memory as a very large array of bytes, called virtual memory. Every byte of memory is identified by a unique number, its address, and the set of all possible addresses is called the virtual address space.
2. The compiler and run-time system partition this memory space into more manageable units to hold the different program objects, that is, program data, instructions, and control information. The value of a pointer in C (whether it points to an integer, a structure, or some other program object) is the virtual address of the first byte of some block of storage.
3. Hexadecimal notation
A byte consists of 8 bits. In binary notation its value ranges from 00000000_2 to 11111111_2; as a decimal integer it ranges from 0 to 255. Bit patterns are usually written in base 16, or hexadecimal. Hexadecimal ("hex") uses the digits '0' through '9' and the letters 'A' through 'F' to represent the 16 possible values. In hexadecimal, the value of a byte ranges from 00_16 to FF_16.
In C, a numeric constant that begins with 0x or 0X is interpreted as hexadecimal. The letters 'A' through 'F' may be written in uppercase, lowercase, or a mix of the two.
A common task when working with machine-level programs is converting bit patterns by hand between decimal, binary, and hexadecimal.
When a value x is a power of 2, that is, x = 2^n, its binary representation is a 1 followed by n zeros. Writing n in the form i + 4j, where 0≤i≤3, x in hexadecimal is a leading digit of 1 (i = 0), 2 (i = 1), 4 (i = 2), or 8 (i = 3), followed by j hexadecimal zeros. For example, 2048 = 2^11 and 11 = 3 + 4·2, so 2048 is 0x800.
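The power-of-2 rule can be sketched in C. This is only an illustration: the function name pow2_to_hex and its buffer convention are made up for this note, not taken from the textbook.

```c
/* Write 2^n in hexadecimal into buf, using n = i + 4j:
   a leading digit of 1, 2, 4, or 8 (for i = 0..3) followed by j zeros.
   buf must have room for at least n/4 + 4 bytes.
   (pow2_to_hex is a hypothetical helper name.) */
void pow2_to_hex(unsigned n, char *buf)
{
    static const char lead[] = { '1', '2', '4', '8' };
    unsigned i = n % 4;   /* selects the leading hex digit */
    unsigned j = n / 4;   /* number of trailing hex zeros */
    char *p = buf;

    *p++ = '0';
    *p++ = 'x';
    *p++ = lead[i];
    while (j-- > 0)
        *p++ = '0';
    *p = '\0';
}
```

For n = 11 this writes the string 0x800, matching 2048 = 2^11.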
4. Every computer has a word size, which indicates the nominal size of integer and pointer data. Because a virtual address is encoded by such a word, the most important system parameter determined by the word size is the maximum size of the virtual address space. For a machine with a w-bit word size, virtual addresses range from 0 to 2^w - 1, giving the program access to at most 2^w bytes.
5. Data size
The C data type char represents a single byte. Although the name comes from its use for storing a single character of a text string, it can also store integer values.
The C data type int can be preceded by the qualifiers short, long, and the more recent long long, providing integer representations of various sizes.
The exact number of bytes depends on the machine and the compiler. Typically, a short integer is allocated 2 bytes and an unqualified int 4 bytes. A long integer uses the full word size of the machine. The long long integer data type, introduced in ISO C99, allows the full range of 64-bit integers.
Floating-point numbers come in two formats: single precision (declared in C as float) and double precision (declared in C as double). These formats use 4 and 8 bytes, respectively.
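Since the sizes depend on the machine and compiler, the simplest check is to ask the compiler with sizeof. A small sketch follows; the struct and function names are invented for this note.

```c
#include <stddef.h>
#include <stdio.h>

/* Byte sizes of the basic C types on the machine compiling this code.
   (type_sizes and get_type_sizes are hypothetical names.) */
struct type_sizes {
    size_t char_size, short_size, int_size, long_size, long_long_size;
    size_t float_size, double_size, pointer_size;
};

struct type_sizes get_type_sizes(void)
{
    struct type_sizes s = {
        sizeof(char), sizeof(short), sizeof(int), sizeof(long),
        sizeof(long long), sizeof(float), sizeof(double), sizeof(void *)
    };
    return s;
}

/* Print the sizes; the output differs between 32-bit and 64-bit systems. */
void print_type_sizes(void)
{
    struct type_sizes s = get_type_sizes();
    printf("char %zu, short %zu, int %zu, long %zu, long long %zu\n",
           s.char_size, s.short_size, s.int_size, s.long_size,
           s.long_long_size);
    printf("float %zu, double %zu, pointer %zu\n",
           s.float_size, s.double_size, s.pointer_size);
}
```

Only sizeof(char) == 1 is fixed by the language; the other values are whatever the platform chooses, with the pointer size following the word size.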
6. Addressing and byte order
For a program object that spans multiple bytes, two conventions must be established: what the address of the object is, and how its bytes are ordered in memory.
Some machines store an object in memory ordered from the least significant byte to the most significant, while others store it from the most significant byte to the least. The former convention, with the least significant byte first, is called little endian; most Intel-compatible machines follow it. The latter convention, with the most significant byte first, is called big endian; most machines from IBM and Sun Microsystems follow it.
Byte order becomes visible in a few cases. The first is networking: code written for networked applications must follow established byte-ordering conventions, so that the sending machine converts its internal representation to the network standard and the receiving machine converts the network standard back to its own internal representation.
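One way to see what such a conversion involves is to write the bytes out explicitly with shifts, which works the same on any host byte order. This sketch assumes the network standard is big endian (as it is for the Internet protocols); the names put_be32 and get_be32 are made up here. POSIX systems provide htonl and ntohl for the same purpose.

```c
#include <stdint.h>

/* Store a 32-bit value as 4 bytes in big-endian order,
   independent of the byte order of the host machine.
   (put_be32 / get_be32 are hypothetical helper names.) */
void put_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);   /* most significant byte first */
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;           /* least significant byte last */
}

/* Rebuild the 32-bit value from 4 big-endian bytes. */
uint32_t get_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```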
The second case is reading byte sequences that represent integer data, which often happens when inspecting machine-level programs. A disassembler is a tool that determines the instruction sequence represented by an executable program file.
When reading machine-level program representations generated for little-endian machines, bytes often appear in reverse order. The natural way to write a byte sequence puts the lowest-numbered byte on the left and the highest on the right, which is the opposite of the usual way of writing numbers, with the most significant digit on the left and the least significant on the right.
The third case where byte order becomes visible is when a program circumvents the normal type system. In C, a cast allows an object to be referenced according to a data type different from the one with which it was created.
Printing the bytes of an object this way shows that Linux 32, Windows, and Linux 64 are little-endian machines, while the Sun machine is big-endian.
A multi-byte object is stored as a contiguous sequence of bytes, and the address of the object is the smallest address of the bytes used.
Linux 32, Windows, and Sun machines use 4-byte addresses, while Linux 64 uses 8-byte addresses.
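The cast technique can be sketched as follows: an object's address is reinterpreted as a pointer to unsigned char, and the bytes are read in increasing address order. The helper names here are illustrative, in the spirit of the textbook's byte-printing routine rather than a copy of it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Print the bytes of any object in hex, lowest address first. */
void show_bytes(const void *start, size_t len)
{
    const unsigned char *p = start;   /* view the object as raw bytes */
    for (size_t i = 0; i < len; i++)
        printf(" %.2x", p[i]);
    printf("\n");
}

/* Return 1 on a little-endian machine, 0 on a big-endian one. */
int is_little_endian(void)
{
    uint32_t x = 1;
    /* On little endian the least significant byte (0x01) comes first. */
    return *(const unsigned char *)&x == 1;
}
```

Calling show_bytes on a 32-bit variable holding 0x01234567 lists the bytes as 67 45 23 01 on a little-endian machine and as 01 23 45 67 on a big-endian one.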
In ASCII, the code for the decimal digit x is 0x3x, and the terminating null byte of a string is 0x00. Any system using ASCII as its character code gives the same result, independent of byte ordering and word size conventions. As a consequence, text data is more platform independent than binary data.
The Java programming language uses Unicode to represent strings. Libraries supporting Unicode are also available for C.
7. Boolean algebra
Binary values are at the core of how computers encode, store, and manipulate information.
The simplest Boolean algebra is defined over the two-element set {0, 1}.
Claude Shannon (1916-2001), who later founded the field of information theory, first made the connection between Boolean algebra and digital logic.
Boolean operations extend to bit vectors, which are strings of 0s and 1s of some fixed length w. An operation on bit vectors is defined by applying it to each pair of corresponding elements of the arguments.
One useful application of bit vectors is to represent finite sets.
The Boolean operations | and & correspond to set union and intersection, respectively, and ~ corresponds to set complement.
For example, a bit-vector mask can selectively enable or disable signals: a 1 in bit position i means that signal i is enabled, so the mask represents the set of enabled signals.
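A minimal sketch of sets as bit vectors, assuming a universe of the eight elements 0 through 7 stored in one byte. The type and function names are invented for this note.

```c
#include <stdint.h>

/* A subset of {0, ..., 7}: bit i is 1 exactly when i is in the set.
   (bitset8 and the set_* helpers are hypothetical names.) */
typedef uint8_t bitset8;

bitset8 set_union(bitset8 a, bitset8 b)        { return a | b; }
bitset8 set_intersection(bitset8 a, bitset8 b) { return a & b; }
bitset8 set_complement(bitset8 a)              { return (bitset8)~a; }

/* Membership test: returns 1 if i (0..7) is in the set, else 0. */
int set_contains(bitset8 a, unsigned i)        { return (a >> i) & 1; }
```

With a = 0x69 (the set {0, 3, 5, 6}) and b = 0x55 (the set {0, 2, 4, 6}), the intersection is 0x41, which is the set {0, 6}.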
8. Bit-level operations in C
The best way to determine the result of a bit-level expression is to expand the hexadecimal arguments into binary, perform the operation in binary, and then convert back to hexadecimal.
A common use of bit-level operations is to implement masking, where a mask is a bit pattern that indicates a selected set of bits within a word.
The mask 0xFF (with the least significant 8 bits set to 1) selects the low-order byte of a word.
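A few masking operations built on 0xFF, as a sketch with made-up function names and a 32-bit word assumed:

```c
#include <stdint.h>

/* Keep only the low-order byte; all other bits become 0. */
uint32_t low_byte(uint32_t x)       { return x & 0xFFu; }

/* Clear the low-order byte and keep the rest (mask is ~0xFF). */
uint32_t clear_low_byte(uint32_t x) { return x & ~(uint32_t)0xFF; }

/* Set the low-order byte to all 1s and keep the rest. */
uint32_t set_low_byte(uint32_t x)   { return x | 0xFFu; }
```

For x = 0x89ABCDEF these give 0x000000EF, 0x89ABCD00, and 0x89ABCDFF.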
9. Logical operations in C
10. Shift operations in C
Machines generally support two forms of right shift: logical and arithmetic. A logical right shift by k fills the left end with k zeros; an arithmetic right shift by k fills the left end with k copies of the most significant bit.
The C standard does not precisely define which type of right shift to use. For unsigned data (integer objects declared with the qualifier unsigned), right shifts must be logical. For signed data (the default), either arithmetic or logical right shifts may be used.
In practice, almost all compiler/machine combinations use arithmetic right shifts for signed data, and many programmers assume this to be the case.
Java, by contrast, defines right shifts precisely. The expression x >> k shifts x arithmetically by k positions, while x >>> k shifts it logically.
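The two kinds of right shift can be written in C without depending on what the compiler does for negative signed values. This is a sketch with invented names; it assumes 0 ≤ k < 32 and relies on the identity that an arithmetic shift of a negative x equals the complement of a shift of ~x, which is nonnegative.

```c
#include <stdint.h>

/* Logical right shift: guaranteed for unsigned operands. */
uint32_t shr_logical(uint32_t x, unsigned k)
{
    return x >> k;
}

/* Arithmetic right shift that never shifts a negative value,
   so it does not depend on implementation-defined behavior. */
int32_t shr_arithmetic(int32_t x, unsigned k)
{
    if (x >= 0)
        return x >> k;          /* zeros shifted in, same as logical */
    return ~(~x >> k);          /* ~x >= 0; complement restores the 1s */
}
```

For example, shr_arithmetic(-8, 1) gives -4, while shr_logical applied to the same bit pattern 0xFFFFFFF8 gives 0x7FFFFFFC.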
Third, integer representation
1. Integer data types
C supports a variety of integral data types, that is, types representing a finite range of integers.
To use the long long type from C99, compile with gcc -std=c99.
2. Unsigned encodings (pp. 39-44)
Suppose an integer data type has w bits. We write a bit vector either as x→, to denote the whole vector, or as [x_(w-1), x_(w-2), ..., x_0], to denote the individual bits within the vector. Interpreting x→ as a number written in binary notation gives the unsigned interpretation of x→.
The unsigned binary representation has the important property that every integer between 0 and 2^w - 1 has a unique encoding as a w-bit value; the mapping is a bijection.
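The unsigned interpretation is the sum of x_i · 2^i over all bit positions i. A direct sketch of that sum, with a made-up function name and the bits passed as an array indexed by position:

```c
#include <stdint.h>

/* Unsigned value of the bit vector bits[0..w-1], where bits[i] is x_i
   (0 or 1) and bit i has weight 2^i. Requires w <= 64.
   (b2u is an illustrative name.) */
uint64_t b2u(const int *bits, unsigned w)
{
    uint64_t value = 0;
    for (unsigned i = 0; i < w; i++)
        if (bits[i])
            value += (uint64_t)1 << i;
    return value;
}
```

The 4-bit vector [1011], stored as bits[] = {1, 1, 0, 1} because index 0 is the least significant bit, has the value 8 + 2 + 1 = 11.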
3. Conversions between signed and unsigned (pp. 44-47)
C allows casting between different numeric data types. From a purely mathematical point of view, one could imagine conventions in which converting a negative number to unsigned yields 0, or converting an unsigned number too large for two's complement yields TMax. Most C implementations do neither.
Instead, when converting between signed and unsigned numbers of the same word size, the underlying bit representation stays the same; only the interpretation of the bits changes.
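Keeping the bits unchanged means that a negative value x turns into x + 2^w when read as unsigned. The sketch below computes that arithmetically for w = 32 and places it next to a plain cast so the two can be compared; the function names are invented for this note.

```c
#include <stdint.h>

/* Signed-to-unsigned conversion done arithmetically for w = 32:
   negative values have 2^32 added, nonnegative values are unchanged. */
uint32_t t2u32(int32_t x)
{
    int64_t wide = x;                 /* 64 bits so x + 2^32 fits */
    if (wide < 0)
        wide += (int64_t)1 << 32;
    return (uint32_t)wide;
}

/* The same conversion as C performs it with a cast. */
uint32_t cast_to_unsigned(int32_t x)
{
    return (uint32_t)x;
}
```

Both functions map -1 to 4294967295 (0xFFFFFFFF), the bit pattern of all 1s.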
4. Signed and unsigned numbers in C (pp. 47-49)
5. Expanding the bit representation of a number
To convert an unsigned number to a larger data type, simply add leading zeros to the representation; this is called zero extension. To convert a two's-complement number to a larger data type, perform sign extension, adding copies of the most significant bit to the representation.
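A sketch of both extensions from 16 to 32 bits. The casts let the compiler do the work, and the last function spells out sign extension at the bit level; all names are illustrative.

```c
#include <stdint.h>

/* Zero extension: the upper 16 bits become 0. */
uint32_t zero_extend16(uint16_t x) { return (uint32_t)x; }

/* Sign extension via a cast: the numeric value is preserved. */
int32_t sign_extend16(int16_t x)   { return (int32_t)x; }

/* Sign extension on raw bit patterns: copy bit 15 into bits 16..31. */
uint32_t sign_extend16_bits(uint16_t bits)
{
    uint32_t r = bits;
    if (bits & 0x8000u)       /* most significant bit of the 16-bit value */
        r |= 0xFFFF0000u;
    return r;
}
```

The 16-bit pattern 0xCFC7 is 53191 as unsigned and -12345 as two's complement; zero extension gives 0x0000CFC7 and sign extension gives 0xFFFFCFC7.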
6. Truncating numbers (p. 51)
Suppose that, rather than extending a value with extra bits, we reduce the number of bits representing it. Truncating a w-bit number x→ = [x_(w-1), x_(w-2), ..., x_0] to a k-bit number drops the high-order w - k bits, giving the bit vector [x_(k-1), x_(k-2), ..., x_0]. Truncating a number can alter its value, which is a form of overflow.
Note: Implicit conversion of signed numbers to unsigned leads to some nonintuitive behavior. Such behavior often causes program errors, and errors involving the nuances of implicit conversion are hard to spot. Because the conversion happens without any explicit indication in the code, programmers often overlook its effects.
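Both points can be tried out in a few lines. The sketch assumes 32-bit words and 1 ≤ k ≤ 31, and the function names are made up. The last function shows the classic surprise: in a comparison between int and unsigned, the int operand is silently converted to unsigned.

```c
#include <stdint.h>

/* Keep the low k bits of x, i.e. x mod 2^k, read as unsigned. */
uint32_t truncate_unsigned(uint32_t x, unsigned k)
{
    return x & (((uint32_t)1 << k) - 1);
}

/* Keep the low k bits of x and read them as a k-bit
   two's-complement number (bit k-1 becomes the sign bit). */
int32_t truncate_signed(uint32_t x, unsigned k)
{
    int64_t low = truncate_unsigned(x, k);
    if (low >> (k - 1))               /* new sign bit is set */
        low -= (int64_t)1 << k;
    return (int32_t)low;
}

/* Mixed comparison: a is converted to unsigned before comparing,
   so a negative a behaves like a very large number. */
int less_than_mixed(int a, unsigned b)
{
    return a < b;
}
```

Truncating 53191 to 16 bits and reading the result as signed gives -12345, and less_than_mixed(-1, 0u) returns 0 even though -1 < 0 mathematically.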
7. Two's-complement encodings
The most common computer representation of signed numbers is two's complement. In this encoding, the most significant bit of the word is interpreted as having negative weight.
The representable range is [-2^(w-1), 2^(w-1) - 1]. Every number within this range has a unique encoding as a w-bit two's-complement number; the mapping is a bijection.
Note:
Two's complement exploits the fixed length of a register to simplify arithmetic. Think of a 12-hour clock: turning the hand back 1 hour is equivalent to turning it forward 11 hours, so 12 - 1 gives the same reading as 12 + 11. In the same way, two's complement turns subtraction into addition, so a single adder can carry out both operations.
The two's-complement range is asymmetric: |TMin| = |TMax| + 1, so TMin has no positive counterpart. This leads to some peculiar properties of two's-complement arithmetic and can be the source of subtle program bugs. The asymmetry arises because half of the bit patterns (those with the sign bit set to 1) represent negative numbers, and half (those with the sign bit set to 0) represent nonnegative numbers. Since 0 is nonnegative, one fewer positive number can be represented than negative.
The maximum unsigned value is just over twice the maximum two's-complement value: UMax_w = 2·TMax_w + 1. All bit patterns that denote negative numbers in two's-complement notation become positive values in the unsigned representation.
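The negative weight of the top bit and the relations between TMin, TMax, and UMax can be checked with a short sketch. The names are illustrative and w is assumed to be between 1 and 32.

```c
#include <stdint.h>

/* Two's-complement value of bits[0..w-1], where bits[i] is x_i.
   Bits 0..w-2 have weight 2^i; bit w-1 has weight -2^(w-1). */
int64_t b2t(const int *bits, unsigned w)
{
    int64_t value = 0;
    for (unsigned i = 0; i + 1 < w; i++)
        if (bits[i])
            value += (int64_t)1 << i;
    if (bits[w - 1])
        value -= (int64_t)1 << (w - 1);
    return value;
}

/* Extreme values for a w-bit word. */
int64_t  tmax_w(unsigned w) { return ((int64_t)1 << (w - 1)) - 1; }
int64_t  tmin_w(unsigned w) { return -((int64_t)1 << (w - 1)); }
uint64_t umax_w(unsigned w) { return ((uint64_t)1 << w) - 1; }
```

The 4-bit vector [1011] (bits[] = {1, 1, 0, 1}) has the value -8 + 2 + 1 = -5, and for w = 8 the three extremes are 127, -128, and 255.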

8. Other representations of signed numbers
 Ones' complement: the same as two's complement, except that the most significant bit has weight -(2^(w-1) - 1) rather than -2^(w-1).
 Sign-magnitude: the most significant bit is a sign bit that determines whether the remaining bits are given negative or positive weight.
Fourth, integer arithmetic
1. Unsigned addition
Consider two nonnegative integers x and y such that 0 ≤ x, y ≤ 2^w - 1. Each can be represented as a w-bit unsigned number. Their sum, however, has the possible range 0 ≤ x + y ≤ 2^(w+1) - 2, so representing it may require w + 1 bits. This continued "word size inflation" means that no bound can be placed on the word size needed to fully represent the results of arithmetic operations.
Unsigned arithmetic can be viewed as a form of modular arithmetic. Unsigned addition is equivalent to computing the sum modulo 2^w. This value can be computed by simply discarding the high-order bit of the (w + 1)-bit representation of x + y.
Overflow: an arithmetic operation overflows when the full integer result cannot fit within the word size limits of the data type.
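A sketch of unsigned addition for w = 32, with invented function names. The overflow test uses the fact that a wrapped sum is always smaller than either operand.

```c
#include <stdint.h>

/* Sum modulo 2^32, computed by forming the exact 33-bit sum
   in 64 bits and discarding everything above bit 31. */
uint32_t uadd_mod(uint32_t x, uint32_t y)
{
    uint64_t exact = (uint64_t)x + y;
    return (uint32_t)(exact & 0xFFFFFFFFu);
}

/* Returns 1 if x + y fits in 32 bits, 0 if it overflows.
   After a wraparound the sum is x + y - 2^32, which is below x. */
int uadd_ok(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;      /* wraps modulo 2^32 */
    return sum >= x;
}
```

For example, 0xFFFFFFFF + 2 overflows and wraps around to 1.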
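2. Two's-complement addition
3. Two's-complement negation
Every number x in the range -2^(w-1) ≤ x < 2^(w-1) has an additive inverse under two's-complement addition. For x in this range, the two's-complement negation of x is defined as follows: it is -2^(w-1) when x = -2^(w-1) (that is, TMin negates to itself), and -x otherwise.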
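Negation can be sketched on 32-bit patterns using unsigned arithmetic, which sidesteps the undefined behavior of negating the most negative int directly. The function name is made up; the result is the same as complementing the bits and adding 1.

```c
#include <stdint.h>

/* Two's-complement negation of a 32-bit pattern.
   0 - bits wraps modulo 2^32, which equals ~bits + 1. */
uint32_t tneg_bits(uint32_t bits)
{
    return (uint32_t)0 - bits;
}
```

The pattern 0x80000000 (the most negative value, -2^31) is its own negation, while 0x00000005 negates to 0xFFFFFFFB, the pattern for -5.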
4. Unsigned multiplication
Integers x and y in the range 0 ≤ x, y ≤ 2^w - 1 can be represented as w-bit unsigned numbers, but their product x·y can range from 0 to (2^w - 1)^2 = 2^(2w) - 2^(w+1) + 1, which may require as many as 2w bits to represent. Unsigned multiplication in C is instead defined to yield the w-bit value given by the low-order w bits of the 2w-bit integer product. This can be seen as equivalent to computing the product modulo 2^w.
Therefore, the result of the w-bit unsigned multiplication operation is (x·y) mod 2^w.
5. Two's-complement multiplication
Signed multiplication in C is generally performed by truncating the 2w-bit product to w bits.
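Truncating multiplication for w = 32 can be sketched by forming the exact 64-bit product and keeping its low half. The names are illustrative. The low 32 bits come out the same whether the operand patterns are read as unsigned or as two's complement.

```c
#include <stdint.h>

/* Low 32 bits of the exact 64-bit product: (x * y) mod 2^32. */
uint32_t umul_trunc(uint32_t x, uint32_t y)
{
    uint64_t full = (uint64_t)x * y;   /* exact 2w-bit product */
    return (uint32_t)full;             /* discard the high 32 bits */
}

/* Returns 1 if the exact product fits in 32 bits, else 0. */
int umul_ok(uint32_t x, uint32_t y)
{
    return (uint64_t)x * y <= UINT32_MAX;
}
```

For example, 0x10000 times 0x10000 is exactly 2^32, so the truncated result is 0.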
6. Multiplying by constants
Compilers apply an important optimization that tries to replace multiplication by a constant factor with a combination of shift and addition operations. The constant is split into a sum of powers of 2, each term is computed with a left shift, and the results are added together. Similarly, for nonnegative numbers, an arithmetic right shift by k bits is the same as dividing by 2^k.
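As a sketch of the idea, here is multiplication by the constant 14 written two ways by hand. The constant and the function names are picked for illustration; a compiler chooses its own decomposition.

```c
#include <stdint.h>

/* 14 = 2^3 + 2^2 + 2^1: three shifts and two additions. */
uint32_t times14_add(uint32_t x)
{
    return (x << 3) + (x << 2) + (x << 1);
}

/* 14 = 2^4 - 2^1: two shifts and one subtraction. */
uint32_t times14_sub(uint32_t x)
{
    return (x << 4) - (x << 1);
}
```

Both versions agree with x * 14 modulo 2^32, including when the product overflows.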
For a negative number, an arithmetic right shift rounds the quotient down rather than toward zero. We can correct this improper rounding by "biasing" the value before shifting. The technique relies on the property that, for integers x and y with y > 0, ⌈x/y⌉ = ⌊(x + y - 1)/y⌋. For example, when x = -30 and y = 4, we have x + y - 1 = -27, and ⌈-30/4⌉ = -7 = ⌊-27/4⌋. When x = -32 and y = 4, we have x + y - 1 = -29, and ⌈-32/4⌉ = -8 = ⌊-29/4⌋. To see that the property holds in general, write x = ky + r, where 0 ≤ r < y. Then (x + y - 1)/y = k + (r + y - 1)/y, so ⌊(x + y - 1)/y⌋ = k + ⌊(r + y - 1)/y⌋. The latter term equals 0 when r = 0 and 1 when r > 0. That is, by adding a bias of y - 1 to x and then rounding the division down, we get k when y divides x and k + 1 otherwise. Therefore, for x < 0, adding 2^k - 1 to x before the right shift gives a correctly rounded result.
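The biasing trick as a sketch, for 32-bit values and 0 ≤ k < 31. The function name is made up. The code assumes that >> on a negative int32_t is an arithmetic shift, which the C standard leaves implementation-defined but which nearly every compiler provides.

```c
#include <stdint.h>

/* Divide x by 2^k, rounding toward zero like C's / operator.
   ASSUMPTION: >> on a negative signed value shifts arithmetically. */
int32_t div_pow2(int32_t x, unsigned k)
{
    /* Bias of 2^k - 1 is added only for negative x. */
    int32_t bias = (x < 0) ? (int32_t)(((uint32_t)1 << k) - 1) : 0;
    return (x + bias) >> k;
}
```

With x = -30 and k = 2 the bias is 3, so the shift is applied to -27 and yields -7, matching -30 / 4 in C.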
Fifth, floating-point numbers
A floating-point representation encodes rational numbers of the form V = x × 2^y. It is useful for computations involving very large numbers (|V| >> 0), numbers very close to 0 (|V| << 1), and more generally as an approximation to real arithmetic.
1. Fractional binary numbers
2. IEEE floating-point representation
The IEEE floating-point standard represents a number in the form V = (-1)^s × M × 2^E:
 Sign: s determines whether the number is negative (s = 1) or positive (s = 0); the interpretation of the sign bit for the numeric value 0 is handled as a special case.
 Significand: M is a fractional binary number that ranges either from 1 to 2 - ε or from 0 to 1 - ε.
 Exponent: E weights the value by a power of 2 (possibly negative).
The bit representation of a floating-point number is divided into three fields that encode these values:
 The single sign bit s directly encodes the sign s.
 The k-bit exponent field exp = e_(k-1) ... e_1 e_0 encodes the exponent E.
 The n-bit fraction field frac = f_(n-1) ... f_1 f_0 encodes the significand M, but the value encoded also depends on whether the exponent field equals 0.
Given a bit representation, the value encoded falls into one of three cases, depending on the value of exp:
 Case 1: Normalized values
This is the most common case. It occurs when the bit pattern of exp is neither all 0s (numeric value 0) nor all 1s (numeric value 255 for single precision, 2047 for double precision). In this case, the exponent field is interpreted as a signed integer in biased form. That is, the exponent value is E = e - Bias, where e is the unsigned number with bit representation e_(k-1) ... e_1 e_0 and Bias is a bias value equal to 2^(k-1) - 1 (127 for single precision, 1023 for double precision). This yields exponent ranges of -126 to +127 for single precision and -1022 to +1023 for double precision. The significand is M = 1 + f, where f is the value of the fraction field, so the leading 1 is implied rather than stored.
 Case 2: Denormalized values
When the exponent field is all 0s, the number represented is in denormalized form. In this case, the exponent value is E = 1 - Bias, and the significand value is M = f, that is, the value of the fraction field without an implied leading 1. Setting the exponent to 1 - Bias rather than simply -Bias may seem counterintuitive, but it provides a smooth transition from denormalized to normalized values.
Denormalized numbers serve two purposes:
First, they provide a way to represent the value 0, since a normalized number always has M ≥ 1 and so cannot represent 0.
Second, they represent numbers that are very close to 0.0. They provide a property known as gradual underflow, in which the possible numeric values are spaced evenly near 0.0.
 Case 3: Special values
These occur when the exponent field is all 1s. When the fraction field is all 0s, the resulting value represents infinity, either +∞ when s = 0 or -∞ when s = 1. Infinity can represent the result of an overflow, as when we multiply two very large numbers or divide by zero. When the fraction field is nonzero, the resulting value is NaN ("not a number").
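The three fields of a single-precision number can be pulled apart with shifts and masks. This sketch assumes that float is the 32-bit IEEE single-precision format (1 sign bit, 8 exponent bits, 23 fraction bits), which holds on common machines but is not required by C; the struct and function names are invented here.

```c
#include <stdint.h>
#include <string.h>

struct float_fields {
    unsigned sign;    /* 1 bit */
    unsigned exp;     /* 8-bit exponent field (biased, Bias = 127) */
    uint32_t frac;    /* 23-bit fraction field */
};

enum float_kind { FLOAT_NORMALIZED, FLOAT_DENORMALIZED, FLOAT_SPECIAL };

/* Split a float into its fields. ASSUMES 32-bit IEEE single precision. */
struct float_fields float_decode(float v)
{
    uint32_t bits;
    struct float_fields f;
    memcpy(&bits, &v, sizeof bits);    /* copy the raw bit pattern */
    f.sign = bits >> 31;
    f.exp  = (bits >> 23) & 0xFF;
    f.frac = bits & 0x7FFFFF;
    return f;
}

/* Decide which of the three cases a value falls into. */
enum float_kind float_classify(float v)
{
    struct float_fields f = float_decode(v);
    if (f.exp == 0)
        return FLOAT_DENORMALIZED;     /* includes +0.0 and -0.0 */
    if (f.exp == 0xFF)
        return FLOAT_SPECIAL;          /* infinity or NaN */
    return FLOAT_NORMALIZED;
}
```

For 1.0f the fields are sign 0, exp 127, frac 0, so E = 127 - 127 = 0 and M = 1 + 0, giving V = 1 × 2^0.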
3. Rounding
Rounding: Because the representation limits both range and precision, floating-point arithmetic can only approximate real arithmetic. For a value x, we therefore want a systematic method of finding the "closest" matching value x' that can be represented in the desired floating-point format.
A key problem is deciding the direction to round a value that lies halfway between two possibilities. An alternative approach is to maintain lower and upper bounds on the actual number. For example, we can determine representable values x- and x+ such that x is guaranteed to lie between them: x- ≤ x ≤ x+.
The IEEE floating-point format defines four different rounding modes.
 The default mode (round-to-even, also called round-to-nearest) finds the closest match, while the other three can be used to compute upper and lower bounds.
 The other three modes produce guaranteed bounds on the actual value, which is useful in some numerical applications. Round-toward-zero rounds positive numbers down and negative numbers up, giving a value x^ such that |x^| ≤ |x|. Round-down rounds both positive and negative numbers down, giving a value x- such that x- ≤ x. Round-up rounds both positive and negative numbers up, giving a value x+ such that x ≤ x+.
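The default mode can be observed by converting integers that need more than 24 significant bits to float. Above 2^24 only every second integer is representable, so an odd integer lies exactly halfway between two candidates. This sketch assumes IEEE single precision and that the default rounding mode is in effect; the function name is made up.

```c
/* Convert an integer to float. When n is not representable the
   conversion must round. ASSUMES IEEE single precision with the
   default round-to-nearest, ties-to-even mode. */
float to_float(long n)
{
    volatile float f = (float)n;   /* volatile forces a true float value */
    return f;
}
```

Under these assumptions 16777217 (2^24 + 1) rounds down to 16777216, while 16777219 rounds up to 16777220; in both cases the tie goes to the neighbor whose significand ends in a 0 bit.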
4. Floating-point operations
The IEEE standard specifies a simple rule for determining the result of an arithmetic operation such as addition or multiplication. Viewing floating-point values x and y as real numbers, and given an operation ⊙ defined over the reals, the computation should yield Round(x ⊙ y), the result of applying rounding to the exact result of the real operation. When one of the arguments is a special value (such as -0, ∞, or NaN), the standard specifies conventions that attempt to be reasonable. For example, 1/-0 is defined to yield -∞, while 1/+0 is defined to yield +∞.
 Floating-point addition is not associative; this is the most important group property that it lacks.
 Floating-point addition satisfies the following monotonicity property: if a ≥ b, then x + a ≥ x + b for any values of a, b, and x other than NaN. Unsigned and two's-complement addition do not obey this property of real (and integer) addition.
 For any values of a, b, and c other than NaN, floating-point multiplication satisfies the following monotonicity properties: if a ≥ b and c ≥ 0, then a × c ≥ b × c; if a ≥ b and c ≤ 0, then a × c ≤ b × c.
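The lack of associativity shows up with just three numbers. The sketch adds them in two different groupings in single precision, assuming IEEE float arithmetic; the function names are invented, and the volatile temporaries force each partial sum to be rounded to float.

```c
/* (a + b) + c, with the partial sum rounded to float. */
float add_left(float a, float b, float c)
{
    volatile float t = a + b;
    return t + c;
}

/* a + (b + c), with the partial sum rounded to float. */
float add_right(float a, float b, float c)
{
    volatile float t = b + c;
    return a + t;
}
```

With a = 3.14, b = 1e10, and c = -1e10, the left grouping gives 0.0, because 3.14 is lost when it is added to 1e10, while the right grouping gives 3.14.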
5. Floating point in C
All versions of C provide two different floating-point data types: float and double. On machines that support IEEE floating point, these data types correspond to single-precision and double-precision floating point.
Newer versions of C, including ISO C99, include a third floating-point data type, long double. For many machines and compilers, this data type is equivalent to double. For Intel-compatible machines, however, GCC implements this data type using an 80-bit "extended precision" format, providing a much larger range and precision than the standard 64-bit format.
============ Problems encountered ============
My understanding of the formulas is still not good enough. Although the teacher said the formulas can be skipped, much of the content cannot be understood without them, and the exercises also become troublesome. So I spent a lot of time on the formulas, yet in the end my understanding is still not as solid as the simple grasp of number systems I had from the earlier assembly and introduction-to-computers courses. Many parts of the code also need to be constructed from scratch; without the answers, I think I might not know how to write them.
Information Security System Design Fundamentals: third-week study summary