A floating-point number is usually expressed as N = S*R^j, where S is the mantissa (it can be negative), j is the exponent (the order code, which can also be negative), and R is the base, which in a computer is taken as a power of 2 (usually 2 itself).
To improve accuracy, floating-point numbers are normalized. For example, n = 11.0101 (binary) is normalized to n = 0.110101*2^10, where the exponent 10 is also written in binary (decimal 2). Normalization gives the floating-point representation its highest possible precision, because the most significant bit of the mantissa is nonzero.
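As an illustration (Python is used here only as a convenient way to check the arithmetic, it is not part of the original text), math.frexp decomposes a float into exactly this normalized form, with the mantissa in [0.5, 1), i.e. starting with 0.1 in binary:

```python
import math

# math.frexp returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1,
# i.e. the mantissa normalized to the 0.1xxxxx... form described above.
x = 0b110101 / 2**4          # 11.0101 in binary = 3.3125 in decimal
m, e = math.frexp(x)
print(m, e)                  # 0.828125 2  ->  0.110101 (binary) * 2^2
print(m * 2**e == x)         # True: the decomposition is exact
```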
In the machine, a floating-point number consists of two parts, the exponent (order code) and the mantissa, and the base is 2.
The exponent field is made up of the exponent sign and the exponent value; the mantissa field is made up of the number sign and the mantissa value.
The exponent is an integer; its sign and its bit count m together determine the range of representable floating-point numbers and the actual position of the binary point. The mantissa is a fraction, and its bit count determines the precision of the floating-point number.
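As a concrete illustration of the exponent/mantissa split, the sketch below extracts the sign, exponent and mantissa fields of an IEEE 754 double, which uses an 11-bit biased exponent and a 52-bit mantissa fraction; the details (exponent bias, hidden leading 1) differ from the simplified textbook format described above, so treat it only as an analogy:

```python
import struct

def fields(x: float):
    # Reinterpret the 64-bit IEEE 754 double as an unsigned integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign     = bits >> 63              # 1 bit:   sign of the number
    exponent = (bits >> 52) & 0x7FF    # 11 bits: biased exponent ("order code")
    mantissa = bits & ((1 << 52) - 1)  # 52 bits: fraction part of the mantissa
    return sign, exponent, mantissa

# 0.828125 = 0.110101 in binary = 1.10101 * 2^-1,
# so the sign is 0, the biased exponent is 1022, and the fraction bits start with 10101.
print(fields(0.828125))
```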
There is also the concept of machine zero: when the mantissa of a floating-point number is 0, or when its exponent is less than or equal to the smallest representable exponent, the machine treats the number as 0; such a value is called machine zero.
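A quick way to see machine zero in practice, again using IEEE 754 doubles (which also add gradual underflow via subnormal numbers, a refinement the textbook description omits): once a result falls below the smallest representable value, it comes back as 0.0:

```python
# 5e-324 is the smallest positive value an IEEE 754 double can hold (a subnormal).
print(5e-324)                 # 5e-324
print(5e-324 / 2)             # 0.0 -- the result falls below the representable range: machine zero
print(1e-200 * 1e-200)        # 0.0 -- the true result, 1e-400, is far below the range as well
print(1e-200 * 1e-200 == 0)   # True
```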
The steps for adding or subtracting two floating-point numbers in a computer are:
1. Exponent alignment, so that the binary points of the two numbers line up; this means making the two exponents equal. It is the same thing we do in ordinary arithmetic: x*100 + y*1000 = (0.1x)*1000 + y*1000. In the computer, the smaller exponent is always adjusted toward the larger one, so when the two values differ greatly, the mantissa of the smaller number must be shifted right during alignment and its low-order bits are lost. A very simple example: given 0.10100*2^100 and 0.10000*2^1000 (exponents written in binary), the smaller exponent must be increased by 100 (binary), and because the mantissa holds only 5 bits, the first number becomes 0.00001*2^1000, losing precision. (A code sketch of the whole procedure appears after this list.)
2. Mantissa addition. Continuing the example above, after alignment the operands are 0.00001*2^1000 and 0.10000*2^1000, and their sum is 0.10001*2^1000.
3. Normalization: shift the mantissa so that its most significant bit is 1. Here the mantissa already starts with 1, so no shift is needed.
4. Rounding: to improve accuracy, account for the bits that were lost when the mantissa was shifted right.
5. Overflow check: test whether the resulting exponent has gone outside the representable range.
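The following minimal sketch runs the alignment, mantissa-addition and normalization steps on the toy 5-bit-mantissa format used in the example above. The function name fp_add and the (m, e) integer encoding are my own choices for illustration, and rounding and overflow checking are deliberately left as comments:

```python
MANT_BITS = 5  # mantissa width used in the article's example

def fp_add(m1, e1, m2, e2):
    """Add two toy floats (m, e) meaning (m / 2**MANT_BITS) * 2**e,
    where m is a MANT_BITS-bit unsigned integer. Illustrative sketch only."""
    # Step 1: exponent alignment -- shift the mantissa of the smaller exponent right.
    if e1 < e2:
        m1 >>= (e2 - e1)          # low-order bits are discarded here: loss of precision
        e1 = e2
    else:
        m2 >>= (e1 - e2)
        e2 = e1
    # Step 2: add the mantissas.
    m = m1 + m2
    e = e1
    # Step 3: normalize -- if the sum overflowed the mantissa width, shift right once.
    if m >> MANT_BITS:
        m >>= 1                   # step 4 (rounding) is omitted: we simply truncate
        e += 1
    # Step 5 (overflow check) would compare e against the exponent range; omitted here.
    return m, e

# The example from step 1: 0.10100 * 2^4  and  0.10000 * 2^8  (exponents in decimal here).
# Prints (17, 8): mantissa 0b10001, exponent 8, i.e. 0.10001 * 2^8, matching step 2.
print(fp_add(0b10100, 4, 0b10000, 8))
```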
This is precisely why floating-point addition loses precision. Subtraction works the same way: the loss of accuracy is caused by exponent alignment.
Why does adding a very small number to a very large floating-point number not give the expected result?
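A short demonstration of the answer just given: when the smaller number's mantissa is shifted right during exponent alignment, all of its bits can be shifted away, so the addition has no effect at all. The values below assume standard IEEE 754 doubles (53 significant bits):

```python
big   = 2.0 ** 53     # the mantissa of a double carries 53 significant bits
small = 1.0

print(big + small == big)   # True  -- 'small' is shifted out completely during alignment
print((big + small) - big)  # 0.0, not 1.0
print(big + 2.0 == big)     # False -- 2.0 still fits after the shift, so it survives
```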