Representation of floating point data stored in memory
A real number is stored in memory as a standard floating point number made up of a sign bit, an exponent, and a mantissa. The precision of the number depends on the number of mantissa bits. For example, the float type stores 23 mantissa bits (because the leading bit of a normalized number is always 1, it does not need to be stored, so the actual precision is 24 bits, as explained below), and the double type stores 52 mantissa bits.
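The 24-bit precision of float can be checked with a small experiment. The following is a minimal Python sketch, not part of the original text; the helper name round_to_float32 is made up for illustration, and it relies on the standard struct module to round a value to 32-bit storage.

```python
import struct

def round_to_float32(x):
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 2**24 fits in 24 significant bits; 2**24 + 1 would need 25.
print(round_to_float32(16777216.0))  # 16777216.0
print(round_to_float32(16777217.0))  # 16777216.0, the lowest bit is lost
```

The second value comes back changed because 16777217 needs one more significant bit than a float can hold.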
Floating point notation is similar to scientific notation. The decimal point of any number can be moved by adjusting the exponent. For example, 23.45 can be written as 2.345 * 10 ^ 1.
Floating point representation generally takes the form r = m * 2 ^ e (r: real number, m: mantissa, e: exponent).
The 32 bits of a float are divided into three parts:
x xxxxxxxx xxxxxxxxxxxxxxxxxxxxxxx
sign (1 bit) exponent (8 bits) mantissa (23 bits)
A floating point number of the double type has: sign (1 bit), exponent (11 bits), and mantissa (52 bits)
Sign: the sign of the real number. "+": 0, "-": 1
Exponent E: the stored bits are a biased (offset) code E in the range 0 to 255, and the true exponent is e = E - 127 (for the double type, e = E - 1023). A positive e means the binary point was moved e places to the left when the number was normalized; a negative e means it was moved to the right. The biases are 127 = 2 ^ 7 - 1 and 1023 = 2 ^ 10 - 1.
Mantissa M: the significant digits of the number. The leading 1 of a normalized number is not stored.
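The three fields can be pulled out of a stored value with shifts and masks. The sketch below is an illustration in Python, not code from the original text; the function names float32_fields and float64_fields are assumptions, and the struct module is used only to obtain the raw bits.

```python
import struct

def float32_fields(x):
    """Return (sign, biased exponent E, mantissa bits) of x stored as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                  # 1 bit
    biased_exponent = (bits >> 23) & 0xFF   # 8 bits
    mantissa = bits & 0x7FFFFF         # 23 bits
    return sign, biased_exponent, mantissa

def float64_fields(x):
    """Return (sign, biased exponent E, mantissa bits) of x stored as a 64-bit double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # 1 bit
    biased_exponent = (bits >> 52) & 0x7FF  # 11 bits
    mantissa = bits & ((1 << 52) - 1)  # 52 bits
    return sign, biased_exponent, mantissa

sign, E, m = float32_fields(-6.0)      # -6.0 = -1.1 (binary) * 2 ^ 2
print(sign, E, E - 127, format(m, "023b"))
```

For -6.0 this gives sign 1, E = 129, e = 2, and a mantissa whose first bit is 1 and the rest 0.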
Example: converting the float value 0.5 into a 32-bit binary floating point number.
0.5 in binary is 0.1, which in binary scientific notation is 1.0 * 2 ^ (-1): the binary point is shifted one place to the right, so e = -1. Then E = e + 127 = 126, and the binary code of E is 01111110. Removing the leading 1 from the "integer" part of 1.0 leaves a fraction of 0, which is padded with zeros to 23 bits to form the mantissa.
Therefore, the 32-bit binary floating point number of 0.5 is
0 01111110 00000000000000000000000
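This result can be checked by printing the bits that are actually stored. The following Python sketch is only an illustration; the helper float32_bit_string is a made-up name, and it prints the three fields separated by spaces as in the text above.

```python
import struct

def float32_bit_string(x):
    """Format x, stored as a 32-bit float, as 'sign exponent mantissa' bit groups."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(bits, "032b")
    return f"{s[0]} {s[1:9]} {s[9:]}"

print(float32_bit_string(0.5))  # 0 01111110 00000000000000000000000
```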
Converting binary back to decimal
Example: converting the 32-bit binary floating point number 0 10000010 00010000000000000000000 to a decimal floating point number.
The sign bit is 0, which indicates that the number is positive. The exponent field is 10000010, so E = 130 and e = E - 127 = 3, meaning the binary point was shifted three places to the left during normalization. The mantissa 0001 becomes 1.0001 once the implicit 1 is restored to the "integer" part. Shifting the point back three places to the right gives the original binary number 1000.1, which is 8.5 in decimal.
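The same decoding steps can be written out by hand. This is a minimal Python sketch under stated assumptions: decode_float32 is a hypothetical helper, and it only handles normalized numbers (E between 1 and 254), since the special cases E = 0 and E = 255 are not covered in this text.

```python
def decode_float32(bit_string):
    """Decode a normalized 32-bit float given as a string of 0s and 1s (spaces allowed)."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = int(bits[0])
    E = int(bits[1:9], 2)          # biased exponent
    fraction = int(bits[9:], 2)    # the 23 stored mantissa bits
    if E == 0 or E == 255:
        raise ValueError("only normalized numbers are handled in this sketch")
    e = E - 127                    # true exponent
    significand = 1 + fraction / 2**23   # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0**e

print(decode_float32("0 10000010 00010000000000000000000"))  # 8.5
```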