**I. Loss of precision in floating-point calculations**

Many programmers, even experienced ones, are not familiar with this problem: no matter what programming language you use, exact calculations with floating-point data may produce results with small errors. Take a look at the following example.

The following Java fragment shows a wrong result when floating-point data is used for a precise calculation (the surrounding class is omitted).

    double a = (1.2 - 0.4) / 0.1;
    System.out.println(a);

If you think the output of this program is "8", you are wrong. In fact, the output is "7.999999999999999". So here is the question: where did it go wrong?

This kind of problem is not uncommon when floating-point data is used for exact calculations. Let's call it "loss of precision". If you modify the program above, you will find some interesting phenomena:

1. If you replace the expression in parentheses with a literal, such as "0.8/0.1" or "1.1/0.1", the problem does not seem to appear.

2. As a second test, compare the results of "0.8/1.1" and "(1.2-0.4)/1.1". The former gives "0.7272727272727273" (the last digit is rounded up) and the latter gives "0.7272727272727272" (not rounded up). One might speculate that precision is lost once a calculation has taken place.

3. That seems close to the truth, but the third test may discourage you. Compare "(2.4-0.1)/0.1", "(1.2-0.1)/0.1" and "(1.1-0.1)/0.1": the first gives "22.999999999999996", the second "10.999999999999998" and the third "10.0". This seems to overturn the idea completely.

4. You may not be ready to give up, because the third expression above is special: its parenthesized part evaluates to exactly "1.0". So compare "(2.4-0.2)/0.1" and "(2.4-0.3)/0.1": the former gives "21.999999999999996" and the latter "21.0". Congratulations, you can finally give up this boring test.

Finally, we can also overturn the hypothesis from the first test: evaluating "2.3/0.1" directly gives "22.999999999999996", so precision is lost there too. In other words, the hypothesis that precision is lost only after a prior calculation does not hold.
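The tests above can be reproduced with a few lines of Java. This is only a minimal sketch: the class name `PrecisionLossDemo` and the helper `firstExample` are made up for illustration, and the printed values are expected to match the ones quoted above.

```java
public class PrecisionLossDemo {
    // Evaluates the expression from the first example with plain double arithmetic.
    static double firstExample() {
        return (1.2 - 0.4) / 0.1;
    }

    public static void main(String[] args) {
        System.out.println(firstExample());      // the text above reports 7.999999999999999
        System.out.println(0.8 / 0.1);           // test 1: literal instead of a subtraction
        System.out.println((2.4 - 0.1) / 0.1);   // test 3
        System.out.println((1.1 - 0.1) / 0.1);   // test 3
        System.out.println(2.3 / 0.1);           // final test: no prior subtraction at all
        // Comparing with the "obvious" decimal answer fails for the first example.
        System.out.println(firstExample() == 8.0);
    }
}
```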

**II. Why loss of precision occurs**

So why is there a loss of precision? After reading up on the subject I found a few clues. What follows is my own humble opinion, for reference only.

First, we have to start from the computer itself. A computer works only with binary data. Whatever programming language and compiler we use, the source program must be translated into binary machine code before the computer can run it, and the decimal literals in it must be converted to binary as well. Take 2.4 from the example above: it is a decimal fraction, and its binary representation is not exactly 2.4. The closest value a double can hold is roughly 2.3999999999999999. The reason is that a floating-point number is stored as a sign, an exponent and a mantissa of fixed length, and 2.4, like most decimal fractions, has no finite binary expansion, which should be understandable if you know how to convert fractions between binary and decimal. Once such a number takes part in a calculation, the result is rounded again to the nearest representable value, so the outcome of converting back to decimal becomes hard to predict and cannot be undone. We have reason to believe that this is where the loss of precision occurs. As for why some floating-point calculations give exact-looking results, it is presumably because the computed binary value happens to convert back to the expected decimal. A single floating-point value that has not been calculated with is printed as expected, for example:

    double d = 2.4;
    System.out.println(d);

The output is 2.4, not 2.3999999999999999, because Java prints the shortest decimal string that uniquely identifies the stored double. In other words, a floating-point number is displayed as expected in decimal when it has not taken part in any calculation. This supports my idea that once a floating-point number participates in a calculation, the result of converting between binary and decimal becomes hard to predict and irreversible.
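To see the value that is really stored for 2.4, one option is to pass the double to the `BigDecimal(double)` constructor, which keeps the exact binary value. The sketch below assumes nothing beyond the standard library; the class name `ExactValueDemo` and the helper `exactDecimal` are illustrative only.

```java
import java.math.BigDecimal;

public class ExactValueDemo {
    // new BigDecimal(double) preserves the exact binary value held by the double.
    static String exactDecimal(double d) {
        return new BigDecimal(d).toString();
    }

    // True when the stored double is strictly smaller than the decimal text.
    static boolean storedBelow(double d, String decimalText) {
        return new BigDecimal(d).compareTo(new BigDecimal(decimalText)) < 0;
    }

    public static void main(String[] args) {
        double d = 2.4;
        System.out.println(d);                       // shortest identifying decimal: 2.4
        System.out.println(exactDecimal(d));         // the stored value, slightly below 2.4
        System.out.println(storedBelow(d, "2.4"));   // true
    }
}
```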

In fact, floating-point numbers are suited to scientific calculations, not to exact ones. Here's a tip: since float and double are used to denote numbers with a decimal point, why do we call them floating-point numbers rather than "decimals" or "real numbers"? Because these numbers are stored in a form similar to scientific notation. When a number like 50.534 is converted into scientific notation as 5.0534e1, its decimal point moves to a new position (that is, it floats). Floating-point numbers are therefore meant for scientific calculations, and they are a poor fit for exact calculations.
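The "scientific notation" used by a Java double is binary rather than decimal: the value is a fraction multiplied by a power of two. The following sketch pulls a positive double apart and puts it back together; the class and method names are hypothetical, and the reconstruction shown only covers positive normal values.

```java
public class FloatLayoutDemo {
    // Unbiased binary exponent stored in the IEEE 754 bit pattern.
    static int binaryExponent(double d) {
        return Math.getExponent(d);
    }

    // The 52 explicitly stored fraction (mantissa) bits.
    static long fractionBits(double d) {
        return Double.doubleToLongBits(d) & 0x000FFFFFFFFFFFFFL;
    }

    // Rebuilds a positive normal double as (1 + fraction / 2^52) * 2^exponent.
    static double rebuild(double d) {
        double significand = 1.0 + fractionBits(d) / Math.pow(2, 52);
        return significand * Math.pow(2, binaryExponent(d));
    }

    public static void main(String[] args) {
        double d = 50.534;
        System.out.println(binaryExponent(d));                    // 5, because 32 <= 50.534 < 64
        System.out.println(Long.toBinaryString(fractionBits(d))); // stored fraction bits
        System.out.println(rebuild(d) == d);                      // true
    }
}
```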

**III. How to use floating-point numbers for accurate calculation**

Can floating-point numbers be used for exact calculations? Not directly, but the problem can be solved with a few techniques. Because the results of floating-point calculations are very close to the correct results, you probably think of rounding the result to get the correct answer. That's a good idea.

So how do we round? You might think of the round method in the Math class, but it cannot be told how many decimal places to keep. If we want to keep two decimal places, we can only do something like this:

    public double round(double value) {
        return Math.round(value * 100) / 100.0;
    }

If this gave the right results, we could live with it and look for ways to improve it. Unfortunately, the code above does not work properly: if you pass 4.015 to this method, it returns 4.01 instead of 4.02.
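Printing the intermediate product shows why the method fails for 4.015. This is a small sketch using a hypothetical class `NaiveRoundDemo` with a static copy of the method above, renamed `round2` for illustration.

```java
public class NaiveRoundDemo {
    // Static copy of the two-decimal rounding method shown above.
    static double round2(double value) {
        return Math.round(value * 100) / 100.0;
    }

    public static void main(String[] args) {
        // The product lands just below 401.5, so Math.round goes down to 401.
        System.out.println(4.015 * 100);
        System.out.println(Math.round(4.015 * 100));   // 401
        System.out.println(round2(4.015));             // 4.01
    }
}
```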

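java.text.DecimalFormat does not solve this problem either. Take a look at the following example:

    System.out.println(new java.text.DecimalFormat("0.00").format(4.025));

Its output here is 4.02, not 4.03. DecimalFormat rounds with RoundingMode.HALF_EVEN by default, and the exact result for inputs this close to a tie may vary between JDK versions.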

Is there no way out? Of course there is. A solution is given in the book "Effective Java". The book also points out that float and double should only be used for scientific or engineering calculations, and that exact calculations such as commercial computing should use java.math.BigDecimal.

The BigDecimal class has several constructors; we only care about those that are useful for the exact calculation of floating-point data.
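As a quick illustration of what BigDecimal offers, the sketch below rounds the two problem values from above, starting from their decimal text rather than from a double. The class `DecimalRoundDemo` and the helper `roundHalfUp` are names invented for this example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalRoundDemo {
    // Rounds a decimal string half-up to the given number of decimal places.
    static String roundHalfUp(String decimalText, int scale) {
        return new BigDecimal(decimalText)
                .setScale(scale, RoundingMode.HALF_UP)
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(roundHalfUp("4.015", 2));   // 4.02
        System.out.println(roundHalfUp("4.025", 2));   // 4.03
    }
}
```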

    BigDecimal(double value) // converts a double to a BigDecimal

The idea is simple: first convert the double data to BigDecimal with the BigDecimal(double value) constructor, then calculate exactly. After the calculation we can post-process the result, for example by rounding it. Finally, the BigDecimal result is converted back to a double.

The idea is correct, but if you read the API documentation for BigDecimal closely, you will learn that for exact calculations the BigDecimal must be constructed from a String, not from a double. So we turn to another constructor of the BigDecimal class, BigDecimal(String value), which lets us do exact calculations correctly.

    BigDecimal(String value) // converts a String to a BigDecimal
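The difference between the two constructors is easy to see with 0.1. The sketch below uses hypothetical helper names (`fromDouble`, `fromString`) in a made-up class `ConstructorDemo`; it also prints `BigDecimal.valueOf(double)`, which is documented to go through `Double.toString` as well.

```java
import java.math.BigDecimal;

public class ConstructorDemo {
    // Keeps the exact binary value of the double, including its representation error.
    static BigDecimal fromDouble(double d) {
        return new BigDecimal(d);
    }

    // Goes through the decimal text, so 0.1 becomes exactly 0.1.
    static BigDecimal fromString(double d) {
        return new BigDecimal(Double.toString(d));
    }

    public static void main(String[] args) {
        System.out.println(fromDouble(0.1));           // a long value slightly above 0.1
        System.out.println(fromString(0.1));           // 0.1
        System.out.println(BigDecimal.valueOf(0.1));   // 0.1
        System.out.println(fromString(0.1).add(fromString(0.2)));   // 0.3
        System.out.println(fromDouble(0.1).add(fromDouble(0.2)));   // not exactly 0.3
    }
}
```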

So here's the problem. To add two floating-point numbers, we need to convert both to String, construct a BigDecimal from each with BigDecimal(String value), call the add method on one of them with the other as the argument, and then convert the BigDecimal result back to a floating-point number. Can you tolerate such a cumbersome process every time you calculate with floating-point data? At least I can't. The best approach is to write a class that hides the tedious conversions, so that whenever we need to calculate with floating-point data we just call that class. An expert has already published such a utility class online, Arith, which performs these conversions. It provides the following static methods for addition, subtraction, multiplication, division and rounding of floating-point data:

    public static double add(double v1, double v2)
    public static double sub(double v1, double v2)
    public static double mul(double v1, double v2)
    public static double div(double v1, double v2)
    public static double div(double v1, double v2, int scale)
    public static double round(double v, int scale)

The source code of Arith is attached below. Save and compile it, then import the Arith class into your own program to use the static methods above for exact floating-point calculation.
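Before the full listing, here is how the conversion steps described above fit together on the failing examples from earlier. This is a condensed, self-contained sketch under the name `ArithSketch` (a hypothetical class, not the published Arith); it covers only subtraction, division and rounding.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ArithSketch {
    // Converts a double to BigDecimal through its decimal text.
    static BigDecimal dec(double v) {
        return new BigDecimal(Double.toString(v));
    }

    static double sub(double v1, double v2) {
        return dec(v1).subtract(dec(v2)).doubleValue();
    }

    // Divides and rounds the quotient half-up to `scale` decimal places.
    static double div(double v1, double v2, int scale) {
        return dec(v1).divide(dec(v2), scale, RoundingMode.HALF_UP).doubleValue();
    }

    // Rounds half-up to `scale` decimal places.
    static double round(double v, int scale) {
        return dec(v).setScale(scale, RoundingMode.HALF_UP).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(div(sub(1.2, 0.4), 0.1, 10));   // 8.0
        System.out.println(round(4.015, 2));               // 4.02
    }
}
```

Going through `Double.toString` means the helper works on the decimal text a reader sees for each double, which is the design choice the constructor discussion above argues for.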

**Appendix: Arith Source Code**

    import java.math.BigDecimal;
    import java.math.RoundingMode;

    /**
     * Because Java's primitive types cannot operate exactly on floating-point
     * numbers, this utility class provides exact floating-point arithmetic:
     * addition, subtraction, multiplication, division and rounding.
     */
    public final class Arith {

        // Default scale (number of decimal places) for division
        private static final int DEF_DIV_SCALE = 10;

        // This class cannot be instantiated
        private Arith() {
        }

        /**
         * Provides exact addition.
         * @param v1 the augend
         * @param v2 the addend
         * @return the sum of the two parameters
         */
        public static double add(double v1, double v2) {
            BigDecimal b1 = new BigDecimal(Double.toString(v1));
            BigDecimal b2 = new BigDecimal(Double.toString(v2));
            return b1.add(b2).doubleValue();
        }

        /**
         * Provides exact subtraction.
         * @param v1 the minuend
         * @param v2 the subtrahend
         * @return the difference of the two parameters
         */
        public static double sub(double v1, double v2) {
            BigDecimal b1 = new BigDecimal(Double.toString(v1));
            BigDecimal b2 = new BigDecimal(Double.toString(v2));
            return b1.subtract(b2).doubleValue();
        }

        /**
         * Provides exact multiplication.
         * @param v1 the multiplicand
         * @param v2 the multiplier
         * @return the product of the two parameters
         */
        public static double mul(double v1, double v2) {
            BigDecimal b1 = new BigDecimal(Double.toString(v1));
            BigDecimal b2 = new BigDecimal(Double.toString(v2));
            return b1.multiply(b2).doubleValue();
        }

        /**
         * Provides (relatively) exact division. When the quotient does not
         * terminate, it is kept to 10 decimal places and the remaining digits
         * are rounded half-up.
         * @param v1 the dividend
         * @param v2 the divisor
         * @return the quotient of the two parameters
         */
        public static double div(double v1, double v2) {
            return div(v1, v2, DEF_DIV_SCALE);
        }

        /**
         * Provides (relatively) exact division. When the quotient does not
         * terminate, the scale parameter gives the precision to keep and the
         * remaining digits are rounded half-up.
         * @param v1 the dividend
         * @param v2 the divisor
         * @param scale the number of decimal places to keep
         * @return the quotient of the two parameters
         */
        public static double div(double v1, double v2, int scale) {
            if (scale < 0) {
                throw new IllegalArgumentException(
                    "The scale must be a non-negative integer");
            }
            BigDecimal b1 = new BigDecimal(Double.toString(v1));
            BigDecimal b2 = new BigDecimal(Double.toString(v2));
            return b1.divide(b2, scale, RoundingMode.HALF_UP).doubleValue();
        }

        /**
         * Provides exact rounding to a given number of decimal places.
         * @param v the number to round
         * @param scale the number of decimal places to keep
         * @return the rounded result
         */
        public static double round(double v, int scale) {
            if (scale < 0) {
                throw new IllegalArgumentException(
                    "The scale must be a non-negative integer");
            }
            BigDecimal b = new BigDecimal(Double.toString(v));
            // Equivalent to dividing by one at the given scale
            return b.setScale(scale, RoundingMode.HALF_UP).doubleValue();
        }
    }

The problem of accurate calculation of floating-point data float and double in Java