Basic Learning: float value range and precision in C
Float Type:
By default, a real-number literal on the right-hand side of the assignment operator is treated as a double. Therefore, use the suffix f or F to initialize a float variable, as shown in the following example:
float x = 3.5F;
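If the suffix is omitted from the preceding statement, a compilation error occurs because you are trying to store a double value in a float variable. (This holds for Java and C#; a C compiler silently converts the double constant to float.)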
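As a minimal sketch of the suffix rule, the following Java snippet shows the two ways to get a float from a literal. Java is assumed here because it rejects the unsuffixed form at compile time; the class and method names are illustrative only.

```java
// Illustrative sketch: initializing float variables from literals in Java.
class FloatLiteralDemo {
    // Builds a float from a literal carrying the F suffix.
    static float withSuffix() {
        float x = 3.5F;         // F marks the literal as a float
        return x;
    }

    // Builds a float from a double literal through an explicit cast.
    static float withCast() {
        float y = (float) 3.5;  // 3.5 is a double; the cast narrows it
        return y;
    }

    public static void main(String[] args) {
        // float z = 3.5;       // does not compile in Java: double to float needs a cast
        System.out.println(withSuffix());
        System.out.println(withCast());
    }
}
```

Both methods return the same value; the suffix is simply the shorter way to say that the literal is already a float.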
Float value range
A float occupies 4 bytes (32 bits), the same size as an int.
The 32 bits are stored as follows:
1 bit (sign bit), 8 bits (exponent bits), 23 bits (mantissa bits)
Basic expression
(Floating-point) value = (plus or minus sign) mantissa x base ^ exponent
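To make the layout and the basic expression concrete, here is a small Java sketch that splits a float into its sign, exponent, and mantissa fields and then rebuilds the value from them. The class and helper names are hypothetical, and rebuild() only handles normal numbers (not zero, subnormals, infinity, or NaN).

```java
// Illustrative sketch: decoding the three bit fields of a 32-bit float.
class FloatFields {
    // Top bit: 0 for positive, 1 for negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Next 8 bits: the exponent as stored (biased by 127).
    static int exponentField(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Low 23 bits: the stored fraction of the mantissa.
    static int mantissaField(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    // Rebuilds a normal float as sign * (1 + fraction) * 2^(exponent - 127).
    static double rebuild(float f) {
        double significand = 1.0 + mantissaField(f) / (double) (1 << 23);
        double magnitude = Math.scalb(significand, exponentField(f) - 127);
        return sign(f) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        float f = -6.25f;  // -1.5625 x 2^2
        System.out.println(sign(f));           // 1
        System.out.println(exponentField(f));  // 129
        System.out.println(Integer.toHexString(mantissaField(f)));  // 480000
        System.out.println(rebuild(f));        // -6.25
    }
}
```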
Therefore, the exponent of a float ranges from -127 to 128, while the exponent of a double ranges from -1023 to 1024. The exponent field is stored in biased form (the stored value minus 127 for float, minus 1023 for double) rather than in two's complement, and the two extreme values are reserved: the lowest for zero and subnormal numbers, the highest for infinity and NaN. The negative exponents determine the smallest absolute value that a floating-point number can express, and the positive exponents determine the largest absolute value, that is, the value range of floating-point numbers.
The float range is -2^128 to +2^128 (endpoints excluded), that is, about -3.40E+38 to +3.40E+38; the double range is -2^1024 to +2^1024 (endpoints excluded), that is, about -1.79E+308 to +1.79E+308.
Other special representations
1. When both the exponent part and the fraction part are 0, the value is 0. There are +0 and -0 values (determined by the sign bit): 0x00000000 represents positive 0, and 0x80000000 represents negative 0.
2. When the exponent part is all 1s and the fraction part is 0, the value is infinity, either positive or negative: 0x7f800000 represents positive infinity, and 0xff800000 represents negative infinity.
3. When the exponent part is all 1s and the fraction part is not all 0s, the value is NaN, which is divided into QNaN (quiet) and SNaN (signaling). Java also has NaN (Float.NaN).
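The bit patterns listed above can be checked directly. The sketch below uses Java's Float.intBitsToFloat to turn each pattern into a float; the class name and the fromBits helper are illustrative, and 0x7fc00000 is used as one example of a NaN pattern.

```java
// Illustrative sketch: special float values built from raw bit patterns.
class FloatSpecials {
    // Reinterprets a 32-bit pattern as a float.
    static float fromBits(int bits) {
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        System.out.println(fromBits(0x00000000));  // 0.0
        System.out.println(fromBits(0x80000000));  // -0.0
        System.out.println(fromBits(0x7f800000));  // Infinity
        System.out.println(fromBits(0xff800000));  // -Infinity
        System.out.println(fromBits(0x7fc00000));  // NaN

        // +0 and -0 compare equal even though their bits differ.
        System.out.println(0.0f == -0.0f);         // true
        // NaN is not equal to anything, including itself.
        float nan = fromBits(0x7fc00000);
        System.out.println(nan == nan);            // false
    }
}
```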
Conclusion: The positive values that a float can represent range from 2^(-149) to (2 - 2^(-23)) x 2^127, that is, from Float.MIN_VALUE to Float.MAX_VALUE.
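These two bounds can be recomputed from the formulas and compared with Java's constants. This is a sketch only: the class and method names are made up for illustration, and Math.scalb is used so that the powers of two are exact.

```java
// Illustrative sketch: recomputing the float bounds from the formulas.
class FloatBounds {
    // 2^(-149), the smallest positive (subnormal) float.
    static double smallestPositive() {
        return Math.scalb(1.0, -149);
    }

    // (2 - 2^(-23)) x 2^127, the largest finite float.
    static double largestFinite() {
        return Math.scalb(2.0 - Math.scalb(1.0, -23), 127);
    }

    public static void main(String[] args) {
        System.out.println(smallestPositive() == Float.MIN_VALUE);  // true
        System.out.println(largestFinite() == Float.MAX_VALUE);     // true
        // Going past the largest finite value overflows to infinity.
        System.out.println(Float.MAX_VALUE * 2);                    // Infinity
    }
}
```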
Precision
The precision of float and double is determined by the number of mantissa bits. Floating-point numbers are stored in memory in binary scientific notation, and the integer part is always an implicit "1". Because this bit never changes, it is not stored, and the precision is determined by the stored mantissa bits.
float: 2^23 = 8388608, which has seven decimal digits. This means that there can be at most 7 significant digits, but only 6 are guaranteed in every case; that is, float precision is 6 to 7 significant digits.
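The digit limit is easy to observe with whole numbers. The Java sketch below (class and method names are hypothetical) converts an int to float and back: a 7-digit value survives the round trip, a 9-digit value does not, and at 2^24 (23 stored bits plus the implicit leading 1) adding 1 no longer changes the float.

```java
// Illustrative sketch: where float runs out of significant digits.
class FloatPrecision {
    // Converts an int to float and back, exposing any rounding.
    static int roundTrip(int n) {
        return (int) (float) n;
    }

    // At 2^24 the spacing between adjacent floats is 2, so +1 is lost.
    static boolean absorbsOne() {
        return 16777216f + 1f == 16777216f;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1234567));    // 1234567
        System.out.println(roundTrip(123456789));  // 123456792
        System.out.println(absorbsOne());          // true
    }
}
```

In the 9-digit case the first seven digits (1234567) are preserved and the rest are rounded, which matches the 6 to 7 digit figure.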