Original address: http://blog.csdn.net/u012843100/article/details/60885763
Today, while studying core Python programming, I came across an interesting thing about decimal floating-point numbers.
>>> 0.1
0.1000000000000001
Why is that? One explanation goes like this:
This is because most C-language double-precision implementations follow the IEEE 754 specification, in which 52 bits are used for the fraction. A floating-point value therefore carries only 52 bits of precision, and a value like 0.1 must be cut off somewhere in its binary expansion. The binary representation of 0.1 is 0.11001100110011...×2^-3, and its closest binary approximation is 0.0001100110011..., that is, 1/16 + 1/32 + 1/256 + ...
It means that after rounding, the value becomes 0.1000000000000001. That paragraph probably leaves a lot of people confused.
I tried this in Python 3 and Python 2.7 and the problem no longer shows up, because those versions print the shortest string that round-trips back to the same float, but the question is still worth understanding.
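To see what's really stored, a small Python sketch (my own addition) exposes the hidden digits:

```python
from decimal import Decimal

# Modern Python prints the shortest string that round-trips, so 0.1
# looks exact, but the stored binary value is not exactly 0.1.
shortest = repr(0.1)
more_digits = f"{0.1:.20f}"
exact = Decimal(0.1)   # the exact value of the stored double

print(shortest)      # 0.1
print(more_digits)   # 0.10000000000000000555
print(exact)         # 0.1000000000000000055511151231257827021181583404541015625
```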
First, a piece of C# code (proof that I'm really a .NET programmer at heart):
float f = 2.15f;
double d = f;
Console.WriteLine(d.ToString("0.00000000000"));
d = 2.15d;
Console.WriteLine(d.ToString("0.00000000000"));
Console.ReadKey();
Can you guess whether the two lines print the same result? They don't; if you don't believe it, try it yourself.
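The same experiment can be sketched in Python rather than C#; round-tripping 2.15 through the struct module's 32-bit float format plays the role of the float-to-double assignment:

```python
import struct

# Round-trip 2.15 through a 32-bit float: this mimics assigning a
# C# float to a double before printing it.
f32 = struct.unpack('f', struct.pack('f', 2.15))[0]

print(f"{f32:.11f}")   # 2.15000009537 -- the value after single precision
print(f"{2.15:.11f}")  # 2.15000000000 -- the value kept in double precision
```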
To explain the 0.1 output above, we need to go step by step.
The first step: convert 0.1 into binary.
We all know how to convert an integer to binary, so I won't repeat that, but some readers may not know how to convert the fractional part. No problem, here's a link: how to convert fractions between decimal and binary.
So the binary of decimal 0.1 is 0.00011001100110011... At this point, the earlier sentence, "The binary representation of 0.1 is 0.11001100110011...×2^-3", should make sense: shifting the point three places normalizes the same bits.
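The multiply-by-2 method can be sketched as a small Python helper (frac_to_bin is a hypothetical name of mine):

```python
def frac_to_bin(x, bits=20):
    """Convert the fractional part of x to a binary string by
    repeatedly multiplying by 2 and taking the integer part."""
    out = []
    for _ in range(bits):
        x *= 2
        digit = int(x)      # the integer part, 0 or 1, is the next bit
        out.append(str(digit))
        x -= digit
    return "0." + "".join(out)

print(frac_to_bin(0.1))     # 0.00011001100110011001
print(frac_to_bin(0.5, 4))  # 0.1000
```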
The second step: a rough understanding of IEEE 754.
The IEEE 754 standard is the IEEE Standard for Floating-Point Arithmetic. It specifies interchange formats, arithmetic formats, and operations for binary and decimal floating-point numbers in computer programming environments.
Reference:
Baidu Encyclopedia IEEE 754
Wikipedia IEEE 754
A blog post by Nanyi (the following section draws on it).
According to the international standard IEEE 754, any binary floating-point number V can be expressed in the following form:

V = (-1)^s × M × 2^E

(1) (-1)^s is the sign bit: when s = 0, V is positive; when s = 1, V is negative.
(2) M is the significand, with 1 ≤ M < 2.
(3) 2^E is the exponent part.
For example, decimal 5.0 written in binary is 101.0, which is 1.01×2^2. Matching the form of V above gives s = 0, M = 1.01, E = 2.
Decimal -5.0 written in binary is -101.0, which is -1.01×2^2. So s = 1, M = 1.01, E = 2.
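If Python is handy, float.hex() and math.frexp show this decomposition directly; a quick sketch:

```python
import math

# float.hex() prints the significand in 1.xxx form and the exponent directly:
# 5.0 = 1.01 (binary) x 2^2, and 0.01 (binary) = 0.25 = 0x.4, hence "0x1.4...p+2".
print((5.0).hex())    # 0x1.4000000000000p+2
print((-5.0).hex())   # -0x1.4000000000000p+2

# math.frexp gives an alternative decomposition, normalized to [0.5, 1)
m, e = math.frexp(5.0)   # 5.0 == 0.625 * 2**3
print(m, e)
```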
IEEE 754 stipulates that for a 32-bit floating-point number, the highest bit is the sign bit s, the next 8 bits are the exponent E, and the remaining 23 bits hold the significand M.
For a 64-bit floating-point number, the highest bit is the sign bit s, the next 11 bits are the exponent E, and the remaining 52 bits hold the significand M.
IEEE 754 makes some special provisions for the significand M and the exponent E.
As mentioned, 1 ≤ M < 2, so M can always be written as 1.xxxxxx, where xxxxxx is the fractional part. IEEE 754 stipulates that when M is stored in the computer, the leading digit is assumed to be 1 and is discarded, so only the xxxxxx part is saved. For example, to store 1.01, only 01 is saved, and the leading 1 is added back when the value is read. The point of this is to gain one bit of precision: taking the 32-bit float as an example, only 23 bits are left for M, but with the implicit leading 1 this effectively stores 24 significant bits.
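As a sketch, the three fields of a 32-bit float can be extracted with struct and bit masks (fields32 is my own helper name):

```python
import struct

def fields32(x):
    """Split the 32-bit float encoding of x into (sign, exponent, fraction)."""
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

s, e, m = fields32(5.0)
# 5.0 = +1.01 (binary) x 2^2: sign 0, stored exponent 2 + 127 = 129,
# and the fraction is "01" padded to 23 bits -- the leading 1 is implicit.
print(s, e, f"{m:023b}")
```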
As for index E, the situation is more complicated.
First, E is stored as an unsigned integer. This means that if E is 8 bits, its range is 0 to 255; if E is 11 bits, its range is 0 to 2047. But we know the exponent in scientific notation can be negative, so IEEE 754 stipulates that a bias must be subtracted from the stored value to obtain the true exponent: 127 for an 8-bit E, and 1023 for an 11-bit E.
For example, the exponent of 2^10 is 10, so when it is stored as a 32-bit floating-point number, the saved value must be 10 + 127 = 137, i.e. 10001001 in binary.
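This stored exponent can be checked in Python; a minimal sketch:

```python
import struct

# read back the stored 8-bit exponent field of 2**10 as a 32-bit float
bits = struct.unpack('>I', struct.pack('>f', 2.0 ** 10))[0]
stored_e = (bits >> 23) & 0xFF
print(stored_e, f"{stored_e:08b}")   # 137, i.e. 10 + 127 = 0b10001001
```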
The stored exponent E then falls into three cases:
(1) E is neither all 0s nor all 1s. The float is interpreted by the rule above: subtract 127 (or 1023) from the stored exponent to get the true exponent, and prepend the implicit leading 1 to the significand M.
(2) E is all 0s. The true exponent is 1 - 127 (or 1 - 1023), and the significand M no longer gets an implicit leading 1; it is read as 0.xxxxxx instead. This is how ±0 and the subnormal numbers very close to 0 are represented.
(3) E is all 1s. If the significand M is all 0s, the value is ±infinity (the sign given by s); if M is nonzero, the value is NaN (not a number).
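These three cases can be observed by reading the exponent field of a 64-bit float; a sketch (exp_field64 is my own helper name):

```python
import math
import struct

def exp_field64(x):
    """The stored 11-bit exponent field of x as a 64-bit float."""
    bits = struct.unpack('>Q', struct.pack('>d', x))[0]
    return (bits >> 52) & 0x7FF

print(exp_field64(1.0))       # normal number: 0 + 1023 = 1023
print(exp_field64(5e-324))    # smallest subnormal: exponent field all 0
print(exp_field64(math.inf))  # all 1 (2047) with zero fraction
print(exp_field64(math.nan))  # all 1 (2047) with nonzero fraction
```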
The third step: represent the decimal 0.1 as a 32-bit floating-point number.
The binary of 0.1 is 1.100110011001100110011001...*2^-4
According to the formula
The sign bit s is 0. Dropping the leading 1, the fractional part is 1001 1001 1001 1001 1001 1001 100... and must fit into 23 bits; because the first dropped bit is a 1, the fraction rounds up to 100 1100 1100 1100 1100 1101. The exponent is E = -4 + 127 = 123, which in binary is 0111 1011.
So s = 0, M = 100 1100 1100 1100 1100 1101, E = 0111 1011.
Written in the sign-exponent-fraction order that IEEE 754 actually uses, the 32-bit pattern is
0 01111011 10011001100110011001101
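This bit pattern can be verified in Python; a minimal sketch (note that the significand ends in ...1101 because the last kept bit is rounded up, not plainly truncated):

```python
import struct

# the actual 32-bit encoding of 0.1
bits = struct.unpack('>I', struct.pack('>f', 0.1))[0]
print(f"{bits:032b}")   # 0 01111011 10011001100110011001101 (without spaces)
print(hex(bits))        # 0x3dcccccd
```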
The fourth step: forget the previous steps for a moment and convert the binary back to decimal.
How do we get back to decimal? Converting decimal to binary multiplied by 2 to produce each bit; going back, each 1 bit contributes the corresponding negative power of 2, and we add them all up.
Take the binary of 0.1,
0.0001100110011...
Converted back to decimal, that is approximately
1×2^-4 + 1×2^-5 + 1×2^-8 + 1×2^-9 + 1×2^-12 + ...
If you have the patience you can sum this yourself; I don't. But notice that something is off: summing a truncated series can only give a value slightly below 0.1, so how could the result come out larger than the original? Truncation alone cannot be the reason. The answer is in step 3: the last retained bit is rounded up because the dropped tail begins with a 1, so the stored value is actually slightly larger than 0.1, about 0.10000000000000000555, and an interpreter printing that many significant digits shows it as 0.1000000000000001.
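The summation can be sketched with Decimal arithmetic (truncated_sum is my own helper); it confirms that the truncated series stays just below 0.1, while the stored double sits just above it:

```python
from decimal import Decimal, getcontext

getcontext().prec = 60   # enough digits to hold the small powers of 2 exactly

def truncated_sum(nbits):
    """Sum 2**-i over the 1-bits of 0.1's binary expansion, up to nbits bits."""
    total = Decimal(0)
    for i in range(4, nbits + 1):
        if i % 4 in (0, 1):          # bits 4, 5, 8, 9, 12, 13, ... are 1
            total += Decimal(2) ** -i
    return total

print(truncated_sum(55))   # just below 0.1: truncation always undershoots
print(Decimal(0.1))        # the stored double is slightly above 0.1
```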
So what about the C# code? Why do the two lines print different results? A single-precision float keeps 23 fraction bits while a double keeps 52, so 2.15 rounded to 23 bits and then widened to a double is not the same value as 2.15 rounded directly to 52 bits. That's as far as I can help here.
(Repost) From Python printing 0.1 as 0.1000000000000001: a talk about floating-point binary