Java currently follows the ieee-developed floating-point notation for float,double operations. This structure is a scientific notation, denoted by the symbol, exponent, and mantissa, and the base is set to 2--, which means that a floating-point number is represented as the mantissa multiplied by 2 exponent and then added to the Symbol.
Let's look at a Java code:
public class Floattobinary {
public static void main (string[] Args) {
Float f1=8.5f;
System.out.println ("f1 Bottom Data (decimal):" +float.floattointbits (f1));
int Int1=float.floattointbits (f1);
System.out.println ("f1 Bottom Data (binary):" +integer.tobinarystring (int1));
}
}
Printing results:
F1 underlying data (decimal): 1091043328
F1 Bottom Data (binary): 1000001000010000000000000000000
We know that the float and double respectively occupy 32 and 64 bits in memory, see Below:
32
|
sign bit |
order |
Mantissa |
Length |
1 |
8 |
23 | TD width= "valign=" Top ">
Double |
1 |
11 |
52 |
64 |
IEEE floating-point numbers represent the standard:
V = ( -1) sxmx2e
E = E-bias
Where bias represents the offset, the offset of float isbias=2 k -1-1=2 8-1-1=127,double is bias=2 10-1=1023
When a floating-point number is stored in a computer, it is divided into three parts according to the binary scientific notation: the sign bit, the exponent part and the part of the tail number. As shown in the Following:
650) this.width=650; "src=" http://s5.51cto.com/wyfs02/M02/87/C8/wKioL1fh6-LwibpzAAB7GmVzFbY425.png "title=" F04fae9b-64c5-4dfb-9080-d910d399fe48.png "alt=" wkiol1fh6-lwibpzaab7gmvzfby425.png "/>
storage, according to the highest level of storage symbol bit, the secondary high storage index portion, low storage tail part of the order of Storage. The storage is arranged as Follows:
650) this.width=650, "name=" image_operate_35261385456315487 "src="/e/u261/themes/default/images/spacer.gif "alt=" Single-precision and Double-precision floating-point data types detailed "title=" single-precision and double-precision floating-point data types in detail "style=" border:1px solid RGB (221,221,221); Background-image:url (""); background-position:50% 50%;background-repeat:no-repeat; "/>
The memory distribution of the float type is as Follows:
650) this.width=650; "src=" http://s3.51cto.com/wyfs02/M01/87/C8/wKioL1fh7IOxcr2lAACOyyaIdFc017.png "title=" Ec849936-ffd8-4998-8943-4bfe29a3cab7.png "alt=" wkiol1fh7ioxcr2laacoyyaidfc017.png "/>
The memory distribution of the double type is as Follows:
650) this.width=650; "src=" http://s5.51cto.com/wyfs02/M02/87/CC/wKiom1fh7LPBdQnTAACjj8hs_DA254.png "title=" 9efbc998-f0b4-4404-aefa-829f0bf122c9.png "alt=" wkiom1fh7lpbdqntaacjj8hs_da254.png "/>
Encoding Rules
Note: Non-statute floating point number is mainly used to enlarge the floating-point representation range near the value of 0, the absolute value of Non-statute floating-point numbers is less than the absolute value of the Statute floating point, that is, the former is closer to 0 on the real axis, which can improve the calculation accuracy near 0. General c, C + + The value ranges for float and double are defined in terms of the statute floating point numbers, as is the case with the MSDN documentation and related materials, but some compilers implement non-protocol floating-point numbers according to the ansi/ieee STD 754-1985 standard, which is illustrated at the end of this article.
Sign Bit: 0 for positive number, 1 for negative number;
Exponential part: The offset of float to 2^8-1,double is 2^11-1;
Tail Part: The value after the decimal point in the actual tail part, The statute floating-point number is indicated by the standard binary scientific notation, its Mantissa range is [up], the tail part of the Non-statute floating-point number is in the range (0,1).
The above theory can be seen everywhere, this is just the definition of IEEE754, let's actually use how it expresses decimals:
① a binary process with a single precision of 8.5f.
First 8.5 is positive so the sign bit is 0;
It is then converted into binary, 1*2^3+0*2^2+0*2^1+0*2^0 (integer part). (decimal point) 1*2^-1 Fractional part simplified to 1000.1
To change the binary number into the form of (1.f) *2^ (exponent), where exponent is the exponent is 1.0001*2^3.
Then we get the order code for e=3+127=130, which means that the binary is 10000010.
The remaining decimal number is 0001, and we are up to 23 bits 00010000000000000000000.
This conforms to the structure symbol bit 0 order code 10000010 mantissa 00010000000000000000000
Then let's take a look at 8.5 stored in memory 01000001000010000000000000000000
Because the Java.lang.Integer.toBinaryString () method returns a string representation of an integer parameter, an unsigned integer with a base of 2, so at the beginning of the program printing results we add a 0, which is the same as the result we have Calculated.
There are many on the net on the conversion of float, double, here is just I involved in this piece, and then I went to learn a bit, in fact, I just started to see the theoretical knowledge of dizzy, and later saw others with detailed explanation of the case, and then look back to the theory, found that it is not so Difficult.
This article is from the "attack on the program ape" blog, please be sure to keep this source http://zangyanan.blog.51cto.com/11610700/1854836
float, double underlying operation of Java Foundation