Problem: I need to write a function that converts floating-point data to string data. Input parameter: a floating-point value Var = 8974564.53; expected output: string result = "8974564.5300000000" (retaining 10 digits after the decimal point). But in actual coding, no matter what method is used, the result comes out as "8974564.5299999993".
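A minimal sketch that reproduces the symptom (the variable name var and the use of printf here are illustrative choices, not part of the original task):

#include <stdio.h>

int main()
{
    double var = 8974564.53;
    // Ask for 10 digits after the decimal point: what comes out is the
    // nearest value the machine actually stored, not 8974564.5300000000.
    printf("%.10f\n", var);   // typically prints 8974564.5299999993
    return 0;
}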
Cause: Most decimal fractions cannot be written exactly in binary, so the floating-point numbers a computer can represent are only a finite set of real numbers. When a value such as 8974564.53 cannot be expressed exactly as a binary number within the available precision, the computer replaces it with the closest binary number it can represent. In this example, the computer cannot represent 8974564.53 exactly, so from the finite set of values it can express it picks 8974564.5299999993 (in fact the stored number is not exactly that either: at the chosen precision it is 8974564.52999999932944774627685546875, which shows up as 8974564.5299999993 once more digits after the decimal point are requested than the format can support).
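To see the substitution directly, you can print the stored value with extra digits and look at its representable neighbours. A sketch, assuming an IEEE 754 double and the standard nextafter function; the exact digits shown depend on the C library:

#include <stdio.h>
#include <math.h>

int main()
{
    double var = 8974564.53;
    // Enough digits to show the value the computer actually stored;
    // common implementations print 8974564.52999999932944774627685546875.
    printf("stored value: %.29f\n", var);
    // The adjacent values the double format can represent exactly.
    printf("next smaller: %.10f\n", nextafter(var, 0.0));
    printf("next larger : %.10f\n", nextafter(var, HUGE_VAL));
    return 0;
}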
Solution: choose the number of decimal digits to retain based on what the floating-point representation can actually express.
As defined by IEEE, a single-precision floating-point number is represented with 1 sign bit, 8 exponent bits, and a 23-bit fraction part;
as defined by IEEE, a double-precision floating-point number is represented with 1 sign bit, 11 exponent bits, and a 52-bit fraction part.
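For reference, a sketch showing how these fields can be read out of a concrete double; it assumes an IEEE 754 double with sizeof(double) == sizeof(unsigned long long) == 8, and uses memcpy only to view the bit pattern:

#include <stdio.h>
#include <string.h>

int main()
{
    double var = 8974564.53;
    unsigned long long bits;
    memcpy(&bits, &var, sizeof bits);                        // raw IEEE 754 bit pattern

    unsigned long long sign     = bits >> 63;                // 1 sign bit
    unsigned long long exponent = (bits >> 52) & 0x7FF;      // 11 exponent bits, biased by 1023
    unsigned long long fraction = bits & ((1ULL << 52) - 1); // 52 fraction bits

    printf("sign = %llu, exponent = %lld, fraction = 0x%013llx\n",
           sign, (long long)exponent - 1023, fraction);      // exponent prints 23 for this value
    return 0;
}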
From this layout you can roughly estimate the deviation of the substitute number the computer selects to store a given floating-point value:
Taking the double-precision floating-point number 8974564.53 as an example, the binary form of its integer part is 100010001111000011100100, which means the exponent of the floating-point number is 23, so the error of its fractional part is at most pow(0.5, 52 - 23) = 0.000000001862645149230957 ≈ 0.0000000019. A test shows the gap between the two adjacent double-precision values that can be represented exactly: 8974564.5300000012 - 8974564.5299999993 = 0.0000000019, which agrees exactly with the calculation above.
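The same arithmetic can be checked in code. This sketch compares the step size predicted from the 52-bit fraction and exponent 23 with the gap measured between adjacent representable doubles (nextafter is from <math.h>):

#include <stdio.h>
#include <math.h>

int main()
{
    double var = 8974564.53;
    // Predicted spacing: 52 fraction bits minus the exponent 23 of this value.
    double predicted = pow(0.5, 52 - 23);
    // Measured spacing: distance to the next representable double above var.
    double measured = nextafter(var, HUGE_VAL) - var;
    printf("predicted: %.25f\n", predicted);   // 0.0000000018626451492309570
    printf("measured : %.25f\n", measured);    // identical
    return 0;
}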
It can be concluded that, to represent the double-precision floating-point number 8974564.53 accurately, the number of digits retained after the decimal point must not exceed 8.
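A quick check: printing with 8 digits after the decimal point recovers the intended value, while one more digit already exposes the stored neighbour:

#include <stdio.h>

int main()
{
    double var = 8974564.53;
    printf("%.8f\n", var);   // 8974564.53000000  -- within the exactly representable precision
    printf("%.9f\n", var);   // 8974564.529999999 -- one digit too many shows the stored value
    return 0;
}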
The calculation above is captured in the function getprecision below. (PS: this function is not meant for production use; it is only a reference for actual coding.)
#include <stdio.h>
#include <limits.h>
#include <math.h>
#include <assert.h>

#define FLOAT_DECIMAL_DIGITS  23 // single-precision float: 23 fraction bits, 8 exponent bits
#define DOUBLE_DECIMAL_DIGITS 52 // double-precision float: 52 fraction bits, 11 exponent bits

template <typename Type>
int getpowerbinary(Type number)
{
    // If the integer part of the floating-point number exceeds what a long int can hold,
    // the fractional part clearly cannot be represented exactly, and usually neither can
    // the integer part.
    assert(number <= LONG_MAX);
    if (number >= LONG_MAX)
        return -1;
    // Count the number of binary digits in the integer part.
    long int power, integer = static_cast<long int>(number);
    for (power = 0; (integer >> power) > 0; ++power)
        ;
    return power;
}

int getdecimaldigits(float number)
{
    return FLOAT_DECIMAL_DIGITS + 1 - getpowerbinary(number);
}

int getdecimaldigits(double number)
{
    return DOUBLE_DECIMAL_DIGITS + 1 - getpowerbinary(number);
}

// Return the largest number of decimal digits the floating-point value can represent exactly.
// Warning: this function only works when the integer part of number is smaller than the
// maximum value a long int can represent.
template <typename Type>
int getprecision(Type number)
{
    double error = pow(0.5, getdecimaldigits(number));
    return static_cast<int>(-log10(error));
}
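A possible usage sketch, assuming the definitions above are in the same source file; the %.*f form lets printf take the digit count computed by getprecision at run time:

int main()
{
    double var = 8974564.53;
    int digits = getprecision(var);              // yields 8 for this value
    printf("getprecision(8974564.53) = %d\n", digits);
    printf("%.*f\n", digits, var);               // 8974564.53000000
    return 0;
}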