All data is stored in memory in binary form. For example, the short value 1156 is 00000100 10000100 in binary. On an Intel CPU it is stored as 10000100 (low address) 00000100 (high address), because the Intel architecture is little-endian. But how are floating-point numbers stored in memory? Today, mainstream C/C++ compilers use the standard floating-point format defined by IEEE (IEEE 754), which is a binary form of scientific notation.
In binary scientific notation, a value S = m * 2^n is stored in three parts: sign bit + exponent (n, also called the order code) + mantissa (m, also called the tail number). A float occupies 32 bits: 1 sign bit, 8 exponent bits and 23 mantissa bits. A double occupies 64 bits: 1 sign bit, 11 exponent bits and 52 mantissa bits.
         Sign bit   Exponent   Mantissa
float    31         30-23      22-0
double   63         62-52      51-0
Sign bit: 0 indicates positive, 1 indicates negative
Exponent: the exponent is stored in biased (offset) form. Because the true exponent can be positive or negative, a fixed bias is added before it is stored: 127 for float and 1023 for double. The all-zeros and all-ones exponent fields are reserved for special values (zero, denormalized numbers, infinity and NaN), so the true exponent of a normalized float ranges from -126 to 127, and that of a normalized double from -1022 to 1023. For example, for float data, if the true exponent is 2, the stored exponent is 2 + 127 = 129.
Mantissa: the significant digits, of which only the fraction part (the binary digits after the binary point) is stored. Because the integer part of m is always 1 for a normalized number, this 1 is not stored.
The following is an example:
Convert the float value 125.5 to the standard floating-point format.
The binary representation of 125 is 1111101, and the fractional part 0.5 is 0.1 in binary, so 125.5 is 1111101.1. Because the integer part of the mantissa must be 1, this is written as 1.1111011 * 2^6. The true exponent is 6, so the stored exponent is 6 + 127 = 133, which is 10000101. For the mantissa, drop the integer part 1 to get 1111011, then pad with zeros on the right to 23 bits: 11110110000000000000000.
The binary representation is
0 10000101 11110110000000000000000, and on a little-endian machine it is stored in memory as:
00000000 low address
00000000
11111011
01000010 high address
Conversely, to work out the value of a floating-point number from its binary form, take for example 0 10000101 11110110000000000000000.
The sign bit is 0, so the number is positive. The stored exponent is 133, so the true exponent is 133 - 127 = 6. The stored mantissa is 11110110000000000000000, so the actual mantissa is 1.1111011. The value is therefore
1.1111011 * 2^6: moving the binary point 6 places to the right gives 1111101.1. 1111101 is 125 in decimal, and 0.1 is 1 * 2^(-1) = 0.5, so the value is 125.5.
Similarly, convert the float value 0.5 to binary format.
The binary form of 0.5 is 0.1. Because the integer part must be 1, it is written as 1.0 * 2^(-1) (the binary point moves one place to the right). The stored exponent is -1 + 127 = 126, which is 01111110. Dropping the integer part of the mantissa 1.0 leaves 0, which is padded with zeros to 23 bits: 00000000000000000000000. The binary representation is
0 01111110 00000000000000000000000
From the above analysis, we can see that the maximum value of float data is 1.11111111111111111111111 * 2^127, which is approximately 3.4 * 10^38.
Double data is handled in the same way, except that the exponent is 11 bits, the bias is 1023, and the mantissa is 52 bits.
/* Test the memory storage mode of floating point data: 2011.10.2 */
#include <cstdio>

int main(int argc, char *argv[])
{
    float a = 125.5f;
    // View the four bytes of the float through a char pointer, lowest address first.
    char *p = (char *)&a;
    printf("%d\n", *p);
    printf("%d\n", *(p + 1));
    printf("%d\n", *(p + 2));
    printf("%d\n", *(p + 3));
    return 0;
}
Output result:
0
0
-5
66
As shown above, float 125.5 is stored in memory as:
00000000 low address
00000000
11111011
01000010 high address
Therefore, the bytes pointed to by p and p + 1 both hold 00000000, which is 0 in decimal.
The byte pointed to by p + 2 holds 11111011. Because p is a char pointer, and char is a signed type with this compiler (the signedness of plain char is implementation-defined), the sign bit 1 marks a negative number. Negative integers are stored in memory in two's complement, so the true value is -5.
The byte pointed to by p + 3 holds 01000010, a positive number whose value is 66. The output of the program therefore confirms the analysis above.
In this example, the two's complement of a positive number is the number itself, while the two's complement of a negative number is obtained by inverting the magnitude bits and adding 1, with the sign bit left unchanged.