Binary fractions
Decimal (base-10) representation: each digit d_i is weighted by a power of 10, with digits to the left of the decimal point weighted by 10^0, 10^1, ... and digits to the right weighted by 10^-1, 10^-2, ...
Binary (base-2) representation: each bit b_i is weighted by a power of 2 in the same way, with bits to the right of the binary point weighted by 2^-1, 2^-2, ...
You can see that shifting the binary point of a binary fraction one position to the left is equivalent to dividing the value by 2, and shifting it one position to the right multiplies the value by 2.
However, a binary fraction can only exactly represent numbers that can be written in the form x*2^y; all other values can only be approximated.
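As a quick illustration (not from the book), the C snippet below shows that 0.1 is one of the values with no exact x*2^y form, so the stored double is only an approximation:

```c
#include <stdio.h>

int main(void) {
    /* 0.1 cannot be written as x * 2^y, so the nearest double is stored instead. */
    double d = 0.1;
    printf("%.20f\n", d);              /* 0.10000000000000000555... on IEEE doubles */
    printf("%d\n", 0.1 + 0.2 == 0.3);  /* 0: the accumulated approximations differ  */
    return 0;
}
```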
IEEE Floating point representation
The IEEE standard represents floating-point numbers in a way similar to scientific notation: each floating-point number is written as V = (-1)^s * M * 2^E.
- s: the sign bit
- M: the significand (mantissa), a binary fraction whose value lies in [1, 2) for normalized values and in [0, 1) for denormalized values
- E: the exponent, which weights the value by a (possibly negative) power of 2
The bit representation of a floating-point number is divided into three fields (a small extraction sketch follows the list):
- 1 sign bit s
- a k-bit exponent field exp, which encodes E
- an n-bit fraction field frac, which encodes M
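As a rough sketch (the names s, exp, and frac simply mirror the field names above), the three fields of a single-precision float can be pulled apart in C like this; memcpy is used as a portable way to reinterpret the bits:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    float f = -6.5f;                        /* -1.101 x 2^2 in binary   */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);         /* copy the raw bit pattern */

    unsigned s    = bits >> 31;             /* 1 sign bit               */
    unsigned exp  = (bits >> 23) & 0xFF;    /* k = 8 exponent bits      */
    unsigned frac = bits & 0x7FFFFF;        /* n = 23 fraction bits     */

    printf("s=%u exp=%u frac=0x%06X\n", s, exp, frac);  /* s=1 exp=129 frac=0x500000 */
    return 0;
}
```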
Floating-point encodings fall into three cases:
- normalized values
- denormalized values
- special values (infinity and NaN, not a number)
Normalized values. The most common case: the bit pattern of the exponent field is neither all 0s nor all 1s.
As stated above, E is the exponent of 2, so E can be positive or negative. The exponent field stores E in a biased (excess) encoding, which amounts to representing a signed number with an unsigned bit pattern.
Aside: a biased encoding simply shifts the range of values. Negative numbers written in two's complement are not intuitive to compare: given 010101 and 101011 it is easy to think the latter is larger. A biased code adds a fixed offset to the w-bit value so that the result can be read as an unsigned number, and then two biased codes can be compared directly by size. This is why the exponent field uses a biased encoding.
When decoding, the offset (the bias) is subtracted from the unsigned value stored in the field, so that an unsigned field can represent a signed exponent.
bias = 2^(k-1) - 1, where k is the number of exponent bits. The bias for single precision is 127, and for double precision it is 1023. With this bias, the exponent E of a normalized single-precision value ranges over [-126, 127]. So where did -127 and 128 go? The all-0s exponent pattern (which would give -127) is reserved for numbers close to 0 (denormalized values), and the all-1s pattern (which would give 128) is reserved for infinity and NaN (overflow).
So for normalized values the exponent is E = e - bias, where e is the unsigned number represented by the exponent field.
In addition, the significand of a normalized number is M = 1 + f, where f is the fraction field read as the binary fraction 0.f(n-1)...f(1)f(0): the leading 1 is implied, is not stored, and is simply assumed to be there when the value is computed. Why do this? To get one extra bit of precision. Since we can always adjust the exponent so that the significand has the form 1.01010101..., there is no need to record the leading 1; all of the fraction field's bits record digits after the binary point, so the representation is one bit more precise.
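Putting the normalized-case formulas together, here is a small illustrative C sketch (not from the book) that decodes a float by hand using e, E = e - bias, and M = 1 + f, then rebuilds V = (-1)^s * M * 2^E:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <math.h>

int main(void) {
    float f = 3.75f;                         /* 1.111 x 2^1 in binary                 */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    unsigned s    = bits >> 31;
    unsigned e    = (bits >> 23) & 0xFF;     /* unsigned value of the exponent field  */
    unsigned frac = bits & 0x7FFFFF;

    int    E = (int)e - 127;                 /* E = e - bias, bias = 127              */
    double M = 1.0 + frac / 8388608.0;       /* M = 1 + f, where f = frac / 2^23      */
    double V = (s ? -1.0 : 1.0) * ldexp(M, E);  /* V = (-1)^s * M * 2^E               */

    printf("e=%u E=%d M=%f V=%f\n", e, E, M, V); /* e=128 E=1 M=1.875000 V=3.750000   */
    return 0;
}
```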
Benefits of using a biased encoding for the exponent field:
- easy comparison (see the sketch below)
- it can represent special values
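The "easy comparison" point can be seen directly: because the biased exponent is an unsigned field sitting above the fraction bits, the raw bit patterns of two positive floats order the same way as the values themselves. A minimal sketch (valid for finite, non-negative floats only):

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static uint32_t to_bits(float f) {
    uint32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}

int main(void) {
    float a = 1.5f, b = 1000.25f;
    printf("a < b as floats:    %d\n", a < b);                    /* 1 */
    printf("a < b as bit words: %d\n", to_bits(a) < to_bits(b));  /* 1: same ordering */
    return 0;
}
```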
Denormalized values. The exponent field is all 0s, and the significand no longer has an implied leading 1 (M = f). Denormalized values have two uses:
- Provides a way to represent 0
Since normalized numbers always have an implied leading 1 in the significand, a normalized number can never represent 0.
- Provides a way to represent values very close to 0
Note that for denormalized values the exponent is E = 1 - bias, not E = -bias as the normalized formula would give.
The extra 1 is there so that denormalized values transition smoothly into normalized values.
For example, consider an 8-bit floating-point format with k = 4 exponent bits and n = 3 fraction bits. The bias is 2^(4-1) - 1 = 7.
Because a normalized number gains an implied 1 at the front of its significand, which effectively shifts it up by one position, while a denormalized number has no such 1, the denormalized exponent is bumped up by 1 (E = 1 - bias) so that the two ranges line up and connect smoothly.
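A small C check (assuming an IEEE-compliant float, and using nextafterf from <math.h>) shows the smooth hand-off: the largest denormalized float sits one representable step below FLT_MIN, the smallest normalized float:

```c
#include <stdio.h>
#include <float.h>
#include <math.h>

int main(void) {
    float smallest_norm  = FLT_MIN;                    /* 2^-126, exponent field = 1       */
    float largest_denorm = nextafterf(FLT_MIN, 0.0f);  /* one step toward 0: exp field = 0 */

    printf("smallest normalized:  %g\n", smallest_norm);
    printf("largest denormalized: %g\n", largest_denorm);
    printf("gap between them:     %g\n", smallest_norm - largest_denorm);  /* 2^-149 */
    return 0;
}
```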
Special Values
Special values occur when the exponent field is all 1s. When the fraction field is all 0s, the value is infinity: sign bit s = 0 gives positive infinity, s = 1 gives negative infinity. When the fraction field is nonzero, the value is NaN (not a number).
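A brief C illustration (assuming IEEE behaviour for division by zero) of how the special values behave; isinf and isnan come from <math.h>:

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    float zero    = 0.0f;
    float pos_inf = 1.0f / zero;     /* exp all 1s, frac all 0s, s = 0 */
    float neg_inf = -1.0f / zero;    /* exp all 1s, frac all 0s, s = 1 */
    float not_num = zero / zero;     /* exp all 1s, frac nonzero       */

    printf("%f %f %f\n", pos_inf, neg_inf, not_num);  /* inf -inf nan with a typical libc */
    printf("isinf: %d %d  isnan: %d\n", isinf(pos_inf), isinf(neg_inf), isnan(not_num));
    printf("NaN == NaN: %d\n", not_num == not_num);   /* 0: NaN never compares equal      */
    return 0;
}
```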
Rounding
IEEE defines four rounding modes:
- Round-to-even (round-to-nearest): mostly like ordinary rounding, except that values exactly halfway (.5) round toward the even number, e.g. 4.5 --> 4, 3.5 --> 4
- Round toward zero
- Round down (toward negative infinity)
- Round up (toward positive infinity)
Apart from round-to-even, the other three modes produce definite bounds on the true value: the result is guaranteed to lie on a known side of the original value, which makes them useful for computing upper and lower bounds.
Round-to-even is used to avoid statistical bias: always rounding halfway cases in the same direction would skew the average of a large set of rounded values.
There's no concrete implementation in the book.
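As a rough sketch of what the modes do (not from the book), the C library exposes them through <fenv.h>; this assumes the platform defines all four FE_* constants and may need -lm when linking. The volatile variables keep the compiler from pre-computing rint under the default mode:

```c
#include <stdio.h>
#include <fenv.h>
#include <math.h>

/* rint() rounds to an integer using the current rounding mode. */
static void show(int mode, const char *name) {
    volatile double a = 2.5, b = -2.5;
    fesetround(mode);
    printf("%-12s  2.5 -> %4.1f   -2.5 -> %4.1f\n", name, rint(a), rint(b));
}

int main(void) {
    show(FE_TONEAREST,  "to-nearest");   /*  2.0  -2.0  (ties round to even) */
    show(FE_TOWARDZERO, "toward-zero");  /*  2.0  -2.0                       */
    show(FE_DOWNWARD,   "downward");     /*  2.0  -3.0                       */
    show(FE_UPWARD,     "upward");       /*  3.0  -2.0                       */
    fesetround(FE_TONEAREST);            /* restore the default mode         */
    return 0;
}
```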
Floating-point arithmetic
Floating-point addition is not associative: (a+b)+c != a+(b+c). This matters for compiler optimization: given x=a+b+c; y=b+c+d;, a compiler would like to rewrite it as t=b+c; x=a+t; y=t+d;, but because addition is not associative this can change the result, so it is not a safe default optimization.
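A short single-precision demonstration in C: adding 3.14 to the large number first loses it entirely, while grouping the large numbers together preserves it.

```c
#include <stdio.h>

int main(void) {
    float a = 3.14f, b = 1e10f, c = -1e10f;

    float left  = (a + b) + c;   /* 3.14 is absorbed by 1e10, then cancelled: 0.0 */
    float right = a + (b + c);   /* b + c is exactly 0, so the 3.14 survives      */

    printf("(a+b)+c = %f\n", left);    /* 0.000000 */
    printf("a+(b+c) = %f\n", right);   /* 3.140000 */
    return 0;
}
```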
Example
Convert 12345 [11000000111001] to an IEEE single-precision floating-point number:
- Shift the binary point 13 places to the left: 12345 = 1.1000000111001 x 2^13
- Drop the leading 1 and append ten 0s to build the 23-bit fraction field: [10000001110010000000000]
- To build the exponent field, add the bias 127 to 13, giving 140 [10001100]
- Prepend the sign bit 0
- Finally: [0 10001100 10000001110010000000000] (checked in the sketch below)
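Grouping those 32 bits into hex gives 0x4640E400 (my own regrouping of the pattern above, not a value from the source). A quick C sketch to check the result in both directions:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    uint32_t bits = 0x4640E400;        /* 0 10001100 10000001110010000000000 */
    float f;
    memcpy(&f, &bits, sizeof f);
    printf("0x%08X -> %f\n", (unsigned)bits, f);   /* 12345.000000 */

    float g = 12345.0f;                 /* and back again: reinterpret as raw bits */
    memcpy(&bits, &g, sizeof bits);
    printf("%f -> 0x%08X\n", g, (unsigned)bits);   /* 0x4640E400   */
    return 0;
}
```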
"In-depth understanding of computer systems" 2.4 Floating point numbers