IEEE-754 floating-point conversion

Source: Internet
Author: User

I asked a friend how floating-point numbers are represented in Java and C/C++. I didn't expect him to write a whole blog post to answer my question. (Thank you, slimzhao.)

After reading his post I learned a lot. Here I quote and summarize it, partly so that I remember it myself.

The parts of the floating-point format:
For float: 32 bits in total, where bit 31 is the MSB (most significant bit) and bit 0 is the LSB (least significant bit).
Bit 31 is the sign bit, and the next eight bits are the exponent. The exponent field is read as an unsigned number; subtracting 127 from it gives the base-2 exponent. The last 23 bits are the fraction.
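The layout described above can be checked from Java. The following is a minimal sketch, assuming the standard Float.floatToIntBits method is used to get at the raw bits; the class name FloatFields and its helper methods are made up for illustration.

```java
public class FloatFields {
    // Sign bit (bit 31): 0 for positive, 1 for negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Raw 8-bit exponent field (bits 30..23), read as an unsigned number.
    static int rawExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // 23-bit fraction field (bits 22..0).
    static int fraction(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    // The 23 fraction bits as a zero-padded binary string.
    static String fractionBits(float f) {
        String s = Integer.toBinaryString(fraction(f));
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < 23; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        float f = 0.9f;
        System.out.println("sign     = " + sign(f));
        System.out.println("exponent = " + rawExponent(f)
                + " - 127 = " + (rawExponent(f) - 127));
        System.out.println("fraction = " + fractionBits(f));
    }
}
```

For 0.9f the raw exponent field is 126, so the base-2 exponent is 126 - 127 = -1.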
 The fraction part is as follows:
11001100110011001100110
This is the fraction part of 0.9f, 23 bits in total. It represents a base-2 fraction; the 1 before the binary point is implicit, which is a small trick that gains one extra bit of precision. So the entire number above is:
1 + (1/2 + 1/4) + (1/32 + 1/64) + (1/512 + 1/1024) + 1/(1024*8) + 1/(1024*16) + 1/(1024*16*8) + 1/(1024*16*16) + 1/(1024*16*16*8) + 1/(1024*16*16*16)
The formula above can be pasted directly into the Windows Calculator; remember to click "=" after pasting, otherwise the last operation is not performed.
The result is 1.7999999523162841796875, and the exponent is -1 (the exponent field is (01111110)2 = (126)10, and 126 - 127 = -1), so the result is divided by 2, giving 0.89999997615814208984375.
This value has 23 digits after the decimal point, so set the printf precision to that many decimal places:
printf("%.23f\n", 2.0f - 1.1f);
The last 7 digits printed may not match the theoretically exact result, because printf in ANSI C has an upper limit on its accuracy.
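The arithmetic above can be reproduced exactly in Java. This is a sketch under stated assumptions: the class ExactFloat and its method names are hypothetical, and it only handles normal (finite, non-zero, non-denormal) floats. It rebuilds the value as (1 + fraction / 2^23) * 2^(exponent - 127) with java.math.BigDecimal, which does the division by powers of two without rounding.

```java
import java.math.BigDecimal;

public class ExactFloat {
    // The significand 1.fraction of a normal float, computed exactly.
    static BigDecimal significand(float f) {
        int fraction = Float.floatToIntBits(f) & 0x7FFFFF;
        // Dividing by 2^23 always terminates in decimal, so divide() is exact.
        return BigDecimal.ONE.add(
                new BigDecimal(fraction).divide(new BigDecimal(1 << 23)));
    }

    // Exact value of a normal float: significand * 2^(exponent - 127), with sign.
    static BigDecimal exactValue(float f) {
        int bits = Float.floatToIntBits(f);
        int exponent = ((bits >>> 23) & 0xFF) - 127;
        BigDecimal two = new BigDecimal(2);
        BigDecimal value = exponent >= 0
                ? significand(f).multiply(two.pow(exponent))
                : significand(f).divide(two.pow(-exponent));
        return bits < 0 ? value.negate() : value;
    }

    public static void main(String[] args) {
        float f = 2.0f - 1.1f;
        System.out.println(significand(f).toPlainString());
        System.out.println(exactValue(f).toPlainString());
        // new BigDecimal(double) is also exact, and float-to-double widening is exact.
        System.out.println(new BigDecimal((double) f).toPlainString());
    }
}
```

The significand comes out as 1.7999999523162841796875 and the full value as 0.89999997615814208984375, the same numbers obtained by hand above.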

Now we know the internal representation of a floating-point number in C/C++. When no precision is given to printf, the default is 6 digits after the decimal point, with rounding, so the value prints as 0.900000, that is, 0.9. Java gained a printf() method in Java SE 5.0, and its effect is the same as in C.
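The default-precision behaviour can be seen from Java as well. A minimal sketch, assuming String.format with the %f conversion (the same formatter that backs printf in Java SE 5.0 and later); the class name DefaultPrecision is made up, and Locale.ROOT is passed only so the decimal separator is always a dot.

```java
import java.util.Locale;

public class DefaultPrecision {
    // Formats with the default %f precision: 6 digits after the decimal point.
    static String defaultFormat(float f) {
        return String.format(Locale.ROOT, "%f", f);
    }

    public static void main(String[] args) {
        // The stored value is slightly below 0.9, but rounding to 6 places hides that.
        System.out.println(defaultFormat(2.0f - 1.1f)); // prints 0.900000
    }
}
```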
Java's println, however, behaves as described in the JDK API documentation for Double.toString(double).
(Java's println had no earlier standard to follow, so it was free to choose a higher default precision on its own.)

public static String toString(double d)
Returns a string representation of the double argument. All characters mentioned below are ASCII characters.

  • If the argument is NaN, the result is the string "NaN".
  • Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative, the first character of the result is '-' ('\u002D'); if the sign is positive, no sign character appears in the result. As for the magnitude m:
    • If m is infinity, it is represented by the characters "Infinity"; thus, positive infinity produces "Infinity" and negative infinity produces "-Infinity".
    • If m is zero, it is represented by the characters "0.0"; thus, negative zero produces "-0.0" and positive zero produces "0.0".
    • If m is greater than or equal to 10^-3 but less than 10^7, it is represented as the integer part of m, in decimal form with no leading zeros, followed by '.' ('\u002E'), followed by one or more decimal digits representing the fractional part of m.
    • If m is less than 10^-3 or greater than or equal to 10^7, it is represented in so-called "computerized scientific notation". Let n be the unique integer such that 10^n <= m < 10^(n+1); then let a be the mathematically exact quotient of m and 10^n, so that 1 <= a < 10. The magnitude is then represented as the integer part of a, as a single decimal digit, followed by '.' ('\u002E'), followed by decimal digits representing the fractional part of a, followed by the letter 'E' ('\u0045'), followed by a representation of n as a decimal integer, as produced by the method Integer.toString(int).

How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or, if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.

To create localized string representations of a floating-point value, use subclasses of NumberFormat.

Parameters:
d - the double value to convert.
Returns:
a string representation of the argument.
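A few concrete calls may make the rules above easier to follow. This is only an illustrative sketch; the class name ToStringCases and the helper show are made up, and the values were chosen to hit each rule once.

```java
public class ToStringCases {
    // Thin wrapper so each case reads the same way.
    static String show(double d) {
        return Double.toString(d);
    }

    public static void main(String[] args) {
        System.out.println(show(Double.NaN));               // NaN
        System.out.println(show(Double.NEGATIVE_INFINITY)); // -Infinity
        System.out.println(show(-0.0));                     // -0.0
        System.out.println(show(0.001));                    // 0.001   (m >= 10^-3)
        System.out.println(show(0.0001));                   // 1.0E-4  (m < 10^-3)
        System.out.println(show(9999999.0));                // 9999999.0 (m < 10^7)
        System.out.println(show(1.0E7));                    // 1.0E7   (m >= 10^7)
        // Only as many digits as are needed to tell the value from its neighbours:
        System.out.println(show(0.9));                      // 0.9
        System.out.println(show(2.0 - 1.1));                // 0.8999999999999999
    }
}
```

The last two lines show why println in Java can reveal a difference that printf's default 6-digit rounding hides: 2.0 - 1.1 computed in double is not the same double as 0.9, so it needs more digits to be distinguished.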

The lesson for me this time is to dig further into the JDK documentation instead of stopping at PrintStream.println(double d). Take a look:
follow it to print(double) and you will see that it leads to Double.toString(double).
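One way to confirm this without reading the JDK source is to capture what print(double) writes and compare it with Double.toString(double). A small sketch, assuming an in-memory PrintStream stands in for System.out; the class name PrintCheck and the method printed are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class PrintCheck {
    // Returns exactly the characters that PrintStream.print(double) writes.
    static String printed(double d) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer);
        out.print(d);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        double d = 2.0 - 1.1;
        System.out.println(printed(d).equals(Double.toString(d))); // true
    }
}
```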

My friend may not be very familiar with Java, so he does not know the JDK well. He is, however, a master of C/C++.

Finally, we provide you with a URL about the IEEE-754 floating-point conversion.
