In C++, float is a 32-bit type and double is a 64-bit type. The two are stored differently in memory and can express different precision. Mainstream C++ compilers today follow the floating-point format defined by the IEEE (IEEE 754) for float and double operations.
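The sizes and the IEEE claim can be checked on a given compiler. The following is a minimal sketch; the helper names float_size_bits, double_size_bits and follows_ieee754 are made up for illustration, and the result depends on the platform rather than being guaranteed by the C++ language itself.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <limits>

// Width of each type in bits on the current platform.
constexpr std::size_t float_size_bits()  { return sizeof(float) * CHAR_BIT; }
constexpr std::size_t double_size_bits() { return sizeof(double) * CHAR_BIT; }

// True when the compiler reports IEEE 754 (IEC 60559) behaviour for both types.
constexpr bool follows_ieee754() {
    return std::numeric_limits<float>::is_iec559 &&
           std::numeric_limits<double>::is_iec559;
}
```

On a typical x86 desktop compiler these report 32, 64 and true, but an unusual target may differ.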
In memory, both float and double are divided into three main parts:
(1) Sign bit: 0 for a positive number, 1 for a negative number
(2) Exponent: stores the exponent of the binary scientific notation, using a biased (offset) encoding
(3) Mantissa: stores the fractional part of the significand
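The in-memory layout of the two types, from the highest bit to the lowest, is:
float: 1 sign bit, 8 exponent bits, 23 mantissa bits
double: 1 sign bit, 11 exponent bits, 52 mantissa bits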
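The three parts can be pulled out of a float with a few shifts and masks. This is a sketch that assumes a 32-bit IEEE 754 float; FloatFields and split_float are hypothetical names, not something from a library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct FloatFields {
    std::uint32_t sign;      // 1 bit: 0 positive, 1 negative
    std::uint32_t exponent;  // 8 bits, as stored (still biased)
    std::uint32_t mantissa;  // 23 bits, as stored
};

FloatFields split_float(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // copy the raw bytes, no conversion
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

std::memcpy is used here instead of a pointer cast because it is the portable way to reinterpret the bytes of an object.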
Take the float 9.125 as an example. In decimal scientific notation it is written as 9.125*10^0, but a computer only knows 0 and 1, so it uses binary scientific notation instead:
The binary representation of 9 is 1001
The binary representation of 0.125 is 0.001
So 9.125 is 1001.001 in binary, which in binary scientific notation is 1.001001*2^3
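The conversion above can be reproduced in code. The sketch below uses hypothetical helpers: to_binary writes a non-negative value as a binary string by repeated halving and doubling, and to_binary_scientific uses std::frexp to find the 1.xxx*2^n form of a nonzero value.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Binary digits of a non-negative value, with frac_bits digits after the point.
std::string to_binary(double value, int frac_bits) {
    unsigned long long whole = static_cast<unsigned long long>(value);
    double frac = value - static_cast<double>(whole);
    std::string out;
    if (whole == 0) out = "0";
    while (whole > 0) {  // integer part: collect remainders of division by 2
        out.insert(out.begin(), static_cast<char>('0' + (whole & 1u)));
        whole >>= 1;
    }
    out += '.';
    for (int i = 0; i < frac_bits; ++i) {  // fraction part: multiply by 2
        frac *= 2.0;
        if (frac >= 1.0) { out += '1'; frac -= 1.0; } else { out += '0'; }
    }
    return out;
}

struct BinaryScientific {
    double significand;  // in [1, 2) for a positive input
    int exponent;        // power of two
};

// Assumes value is nonzero and finite.
BinaryScientific to_binary_scientific(double value) {
    int e = 0;
    double m = std::frexp(value, &e);  // value == m * 2^e with 0.5 <= |m| < 1
    return BinaryScientific{m * 2.0, e - 1};
}
```

For 9.125 the significand comes out as 1.140625, which is 1.001001 in binary, with exponent 3.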
In a computer, any nonzero normalized number can be written in the form 1.xxxxxx*2^n,
where xxxxxx is the mantissa part and n is the exponent part.
Because the highest bit is always 1 for every number written in this form, it is not actually stored. This is why the 23-bit mantissa of float can express 24 bits of precision, and the 52-bit mantissa of double can express 53 bits of precision.
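To make the hidden bit concrete, the sketch below puts the leading 1 back in front of the stored 23 bits. full_significand is a hypothetical helper and is only meaningful for normalized values on a platform with 32-bit IEEE 754 floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Returns the 24-bit significand of a normalized float:
// the implicit leading 1 followed by the 23 stored mantissa bits.
std::uint32_t full_significand(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    std::uint32_t stored = bits & 0x7FFFFFu;  // the 23 bits that are really in memory
    return (1u << 23) | stored;               // restore the bit that was not stored
}
```

For 9.125f this gives 0x920000, which is 1001 0010 0000 0000 0000 0000 in binary, that is, the digits of 1.001001 padded with zeros. The standard library reports the same counts through std::numeric_limits<float>::digits (24) and std::numeric_limits<double>::digits (53).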
How many decimal digits can a float hold accurately? Anyone who has studied C will say that float is accurate to about 6 digits, but where does that figure come from? Here is a short explanation:
One decimal digit carries log2(10), about 3.32, bits of information (the decimal digit 9, for instance, is 1001 in binary and already needs 4 bits). A float has 24 bits of precision, so it can hold about 24/3.32 = 7.2 decimal digits, of which 6 are guaranteed to survive a round trip from decimal text to float and back. Note that these are significant digits counted from the first nonzero digit, not digits after the decimal point. Similarly, a double with 53 bits of precision can hold about 53/3.32 = 15.9 decimal digits, of which 15 are guaranteed.
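The digit counts can be checked against the standard library. In this sketch, decimal_digits is a hypothetical helper that converts a number of binary digits into the equivalent number of decimal digits by multiplying by log10(2); std::numeric_limits<T>::digits10 is the count of significant decimal digits the library guarantees.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Approximate number of decimal digits carried by binary_digits bits.
double decimal_digits(int binary_digits) {
    return binary_digits * std::log10(2.0);
}

// Guaranteed significant decimal digits, as reported by the implementation.
constexpr int guaranteed_float_digits()  { return std::numeric_limits<float>::digits10; }
constexpr int guaranteed_double_digits() { return std::numeric_limits<double>::digits10; }
```

With IEEE 754 types, decimal_digits(24) is roughly 7.22 and decimal_digits(53) roughly 15.95, while the guaranteed figures are 6 and 15 significant digits.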
For the float type, the exponent field is 8 bits wide. It is stored in biased form: the bias added to the exponent before storing is 127, not 0, so the stored values 0~255 correspond to exponents -127~128. The two extreme stored values are reserved (0 for zero and denormalized numbers, 255 for infinity and NaN), so normalized numbers use exponents -126~127. For example, for 9.125 above the binary exponent is 3, so the value actually stored is 127+3=130 (130 in binary is 10000010).
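The bias arithmetic can be sketched as follows, assuming a 32-bit IEEE 754 float. kFloatBias, stored_exponent and actual_exponent are names invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr int kFloatBias = 127;  // bias of the 8-bit exponent field

// The exponent field exactly as it sits in memory (0..255).
int stored_exponent(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return static_cast<int>((bits >> 23) & 0xFFu);
}

// The power of two it stands for; valid for normalized values only.
int actual_exponent(float value) {
    return stored_exponent(value) - kFloatBias;
}
```

For 9.125f the stored field is 130 and the actual exponent is 3; for 1.0f the stored field is exactly the bias, 127.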
Finally, following the storage structure of float described above, 9.125 is actually stored in the computer as:
0 10000010 00100100000000000000000 (sign, exponent, mantissa)
Grouped into bytes and converted to hexadecimal, this is: 01000001 00010010 00000000 00000000, i.e. 41 12 00 00
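The 32-bit pattern can be read back directly to confirm the hand calculation. This sketch assumes a 32-bit IEEE 754 float; bit_pattern and to_hex are hypothetical helpers.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// The raw 32 bits of a float as an unsigned integer.
std::uint32_t bit_pattern(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Eight uppercase hexadecimal digits, highest byte first.
std::string to_hex(std::uint32_t bits) {
    char buffer[9];  // 8 digits plus the terminating '\0'
    std::snprintf(buffer, sizeof buffer, "%08" PRIX32, bits);
    return std::string(buffer);
}
```

For 9.125f the pattern is 0x41120000, which to_hex renders as "41120000". For comparison, 0x41100000 is the pattern of 9.0f, whose mantissa lacks the bit contributed by 0.125.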
In fact, x86 computers use little-endian storage: the low address holds the low-order byte and the high address holds the high-order byte.
So in memory, from the lowest address to the highest, the data is stored as: 00 00 12 41
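The byte order can be observed by copying the float into an array of bytes. The sketch below uses hypothetical helpers memory_bytes and is_little_endian; the order it reports depends on the machine it runs on.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// The bytes of a float in the order they sit in memory (index 0 = lowest address).
std::array<unsigned char, sizeof(float)> memory_bytes(float value) {
    std::array<unsigned char, sizeof(float)> bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}

// True when the lowest address holds the lowest-order byte of an integer.
bool is_little_endian() {
    std::uint32_t one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}
```

On a little-endian machine such as x86, memory_bytes(9.125f) is 00 00 12 41; on a big-endian machine the same call would give 41 12 00 00.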
Storage for double works in the same way as float: only the number of bits in each field differs, and the principle is the same.
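As a sketch of the same idea for double, the fields can be split with wider masks. The field widths used here (1 sign bit, 11 exponent bits with a bias of 1023, 52 mantissa bits) come from IEEE 754 rather than from the text above, and DoubleFields and split_double are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct DoubleFields {
    std::uint64_t sign;      // 1 bit
    std::uint64_t exponent;  // 11 bits, as stored (bias 1023)
    std::uint64_t mantissa;  // 52 bits, as stored
};

DoubleFields split_double(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return DoubleFields{bits >> 63, (bits >> 52) & 0x7FFu, bits & 0xFFFFFFFFFFFFFull};
}
```

For 9.125 the stored exponent is 1023+3=1026 and the mantissa again begins with 001001, only followed by more zeros than in the float case.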
http://blog.csdn.net/qingtingchen1987/article/details/7719259
Storage structure for float and double in C/