Float and double range and precision

Source: Internet
Author: User
Tags float range

1. Scope
The range of float and double is determined by the number of bits in the exponent.
float has 8 exponent bits, while double has 11 exponent bits. The bits are distributed as follows:
Float:
1 bit (sign bit) 8 bits (exponent bits) 23 bits (mantissa bits)
Double:
1 bit (sign bit) 11 bits (exponent bits) 52 bits (mantissa bits)
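The exponent is stored in biased (offset) form, not in two's complement: the stored value is the actual exponent plus 127 for float and plus 1023 for double. The all-zeros and all-ones exponent patterns are reserved (for zero and subnormal numbers, and for infinity and NaN), so the exponent of a normal float runs from -126 to +127, and that of a normal double from -1022 to +1023.
The most negative exponent determines the smallest non-zero absolute value that a floating-point number can express (2 ^ -126, about 1.18e-38, for a normal float; 2 ^ -1022, about 2.23e-308, for a normal double). The largest positive exponent determines the largest absolute value, that is, the value range of the floating-point type.
Float range: just inside -2 ^ 128 ~ +2 ^ 128, that is, about -3.40e+38 ~ +3.40e+38 (the largest finite value is (2 - 2 ^ -23) * 2 ^ 127); the double range is just inside -2 ^ 1024 ~ +2 ^ 1024, that is, about -1.79e+308 ~ +1.79e+308 (the largest finite value is (2 - 2 ^ -52) * 2 ^ 1023).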
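
The limits above can be checked against the constants in <float.h>. The following is a minimal sketch in C, assuming a platform where float and double are IEEE 754 single and double precision; the function names are made up for this illustration.

```c
#include <float.h>
#include <math.h>

/* Largest finite float built from the bit layout: (2 - 2^-23) * 2^127. */
double largest_float_from_layout(void) {
    return ldexp(2.0 - ldexp(1.0, -23), 127);
}

/* Largest finite double: (2 - 2^-52) * 2^1023. */
double largest_double_from_layout(void) {
    return ldexp(2.0 - ldexp(1.0, -52), 1023);
}

/* Smallest positive normal float: 2^-126. */
double smallest_normal_float_from_layout(void) {
    return ldexp(1.0, -126);
}

/* Returns 1 if the values derived from the layout match <float.h>.
   Assumption: the platform uses IEEE 754 float and double. */
int layout_matches_float_h(void) {
    return largest_float_from_layout() == (double)FLT_MAX
        && largest_double_from_layout() == DBL_MAX
        && smallest_normal_float_from_layout() == (double)FLT_MIN;
}
```

On such a platform layout_matches_float_h() should return 1, and the largest float is strictly smaller than 2 ^ 128, which is why the range is "just inside" that bound.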

2. Precision
The precision of float and double is determined by the number of bits in the mantissa. Floating-point numbers are stored in memory in binary scientific notation, and for normal numbers the integer part is always an implicit "1". Because it never changes it is not stored, but it still counts as one bit of precision: float has 23 + 1 = 24 significant bits and double has 52 + 1 = 53.
Float: 2 ^ 23 = 8388608, a total of seven digits. This means there can be at most seven significant decimal digits, of which only 6 are guaranteed, that is, the float precision is 6 ~ 7 significant digits;
Double: 2 ^ 52 = 4503599627370496, a total of 16 digits. Likewise, the precision of double is 15 ~ 16 significant digits.


Original article: http://blog.csdn.net/wuna66320/article/details/1691734


In mathematics, and especially in computing with floating-point numbers, there is a basic expression [1] on which the IEEE 754 floating-point representation is built:

Value of floating-point = Significand x Base ^ Exponent, with sign --- F.1

In words:

(floating-point) value = mantissa x base ^ exponent, with a plus or minus sign --- F.2
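
For binary floating point the base is 2, and the standard C functions frexp and ldexp can split a value into these parts and put it back together. Below is a minimal sketch; the Parts type and the function names are invented for illustration, and it assumes a non-zero, finite input.

```c
#include <math.h>

/* sign * significand * 2^exponent, with significand in [1, 2). */
typedef struct {
    int sign;            /* +1 or -1 */
    double significand;  /* 1.xxx in binary */
    int exponent;        /* power of two */
} Parts;

/* Split a non-zero, finite x into the three parts of formula F.1. */
Parts decompose(double x) {
    Parts p;
    int e;
    double m = frexp(x < 0 ? -x : x, &e);  /* m in [0.5, 1), |x| == m * 2^e */
    p.sign = x < 0 ? -1 : 1;
    p.significand = m * 2.0;               /* shift into [1, 2) */
    p.exponent = e - 1;
    return p;
}

/* Rebuild the value from its parts. */
double recompose(Parts p) {
    return p.sign * ldexp(p.significand, p.exponent);
}
```

For example, decompose(8.25) gives sign +1, significand 1.03125 and exponent 3, since 8.25 = 1.03125 * 2 ^ 3.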


Float and double storage

------------------------------------------------------------------

In C and C#, the float and double types are used for floating-point data. A float occupies 32 bits and a double occupies 64 bits. So how is that memory laid out when we declare a variable such as float f = 2.25f? If it were laid out arbitrarily, the world would be a mess. In fact, both float and double follow the IEEE specification for storage: float follows IEEE R32.24 and double follows R64.53.

Both single precision and double precision are divided into three parts in storage:

    1. Sign: 0 indicates positive, 1 indicates negative.
    2. Exponent: stores the exponent of the binary scientific notation, in biased (offset) form.
    3. Mantissa: stores the fractional part of the significand.
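
The three parts listed above can be pulled out of a float with a few shifts and masks. The following is a sketch, assuming a 32-bit IEEE 754 float; FloatFields and split_float are names made up for this example.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t mantissa;  /* 23 bits, the implicit leading 1 is not stored */
} FloatFields;

/* Split a float into its raw sign, exponent and mantissa fields. */
FloatFields split_float(float f) {
    uint32_t bits;
    FloatFields out;
    memcpy(&bits, &f, sizeof bits);       /* copy the bytes, no pointer casts */
    out.sign = bits >> 31;                /* top bit */
    out.exponent = (bits >> 23) & 0xFFu;  /* next 8 bits */
    out.mantissa = bits & 0x7FFFFFu;      /* low 23 bits */
    return out;
}
```

For 2.25f this yields sign 0, exponent field 128 and mantissa 0x100000; memcpy is used instead of a pointer cast to stay within C's aliasing rules.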

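The storage layout of float (original figure not preserved), from the highest bit to the lowest: 1 sign bit, 8 exponent bits, 23 mantissa bits.

The double-precision storage layout is: 1 sign bit, 11 exponent bits, 52 mantissa bits.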

 

Both R32.24 and R64.53 store data in scientific notation. For example, in decimal scientific notation 8.25 is written as 8.25 * 10^0, and 120.5 as 1.205 * 10^2. That is primary-school knowledge, so no more on it. Our dumb computer, however, knows nothing about decimal data. It only knows the digits 0 and 1. So for storage, a number must first be rewritten in binary scientific notation. 8.25 in binary is 1000.01. Surely you can do that conversion yourself? If not, I can't help you. 120.5 in binary is 1111000.1. In binary scientific notation, 1000.01 is 1.00001 * 2^3 and 1111000.1 is 1.1110001 * 2^6.

The binary scientific notation of any normal, non-zero number has the form 1.xxx * 2^n, and the mantissa stores the xxx part. Since the first digit is always 1, why store it? The digit before the binary point is omitted, so the 23 stored mantissa bits give 24 bits of precision. How many decimal digits is that? The binary value of 9 is 1001, so roughly 4 bits are needed per decimal digit, and by that rough count 24 bits make float accurate to about 6 decimal digits (more exactly, 24 * log10(2) is about 7.2, which is why float is usually quoted as 6 ~ 7 significant digits).

As for the exponent, it can be positive or negative. The 8-bit exponent field holds the values 0 to 255; 0 and 255 are reserved, so the exponent of a normal number ranges from -126 to +127. The exponent is therefore stored in biased form: the stored value is the actual exponent + 127. Next, let's look at how 8.25 and 120.5 are really stored in memory. // Note: the bias is used so that stored exponents can be compared as plain unsigned integers.
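
The storage scheme described above (drop the leading 1, add 127 to the exponent) can be sketched in C by assembling the bit pattern by hand and comparing it with what the hardware stores. This assumes a 32-bit IEEE 754 float and only handles normal numbers; pack_float and float_bits are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

/* Assemble a normal float from sign (0 or 1), the actual (unbiased)
   exponent, and the 23 fraction bits that follow the implicit "1.". */
uint32_t pack_float(uint32_t sign, int exponent, uint32_t fraction) {
    uint32_t stored = (uint32_t)(exponent + 127);   /* biased exponent */
    return (sign << 31) | (stored << 23) | (fraction & 0x7FFFFFu);
}

/* Raw bit pattern of a float as the machine stores it. */
uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

8.25 = 1.00001 * 2^3 has the fraction bits 00001 followed by 18 zeros (0x040000), so pack_float(0, 3, 0x040000) gives 0x41040000, the same as float_bits(8.25f). 120.5 = 1.1110001 * 2^6 has the fraction bits 1110001 followed by 16 zeros (0x710000), which gives 0x42F10000.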

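First, 8.25, which in binary scientific notation is 1.00001 * 2^3.

Following the storage method above, the sign bit is 0 (positive), the exponent field is 3 + 127 = 130 = 1000 0010 in binary, and the mantissa is the 00001 after the binary point padded with zeros to 23 bits. So 8.25 is stored as (original figure not preserved):

0 10000010 00001000000000000000000

The single-precision floating-point number 120.5 = 1.1110001 * 2^6 has sign bit 0, exponent field 6 + 127 = 133 = 1000 0101 and mantissa 1110001 followed by 16 zeros, so it is stored as:

0 10000101 11100010000000000000000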

 

So, given a piece of data in memory that you know is a single-precision value, how do you find its decimal value? This is simply the reverse process. For example, take the following memory data: 01000010111100010000000000000000. Split it into fields, 0 10000101 11100010000000000000000, that is, sign bit 0, exponent field 1000 0101 = 133 and mantissa 1110001 followed by zeros.

Using our calculation method, the actual exponent is 133 - 127 = 6 and the significand is 1.1110001, so this data represents 1.1110001 * 2^6 = 1111000.1 in binary = 120.5.
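
The reverse process can also be written down as a small function. This is a sketch for normal single-precision patterns only (it ignores zero, subnormals, infinity and NaN); decode_float_bits is a name chosen for this example.

```c
#include <math.h>
#include <stdint.h>

/* Decode a normal single-precision bit pattern by hand:
   value = (-1)^sign * (1 + fraction / 2^23) * 2^(stored - 127). */
double decode_float_bits(uint32_t bits) {
    uint32_t sign = bits >> 31;
    int stored = (int)((bits >> 23) & 0xFFu);   /* biased exponent field */
    uint32_t fraction = bits & 0x7FFFFFu;       /* 23 stored mantissa bits */
    double significand = 1.0 + ldexp((double)fraction, -23);  /* restore the 1 */
    double magnitude = ldexp(significand, stored - 127);      /* remove the bias */
    return sign ? -magnitude : magnitude;
}
```

The pattern 0 10000101 11100010000000000000000 is 0x42F10000 in hexadecimal, and decode_float_bits(0x42F10000) returns 120.5.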

The storage of double-precision floating-point numbers is similar to single precision. The only differences are the number of bits in the exponent and the mantissa, and the bias, which is 1023 instead of 127. So I will not describe double-precision storage in detail here and will only give the final layout of 120.5 (original figure not preserved). You can work out for yourself why it looks like this:

0 100 0000 0101 1110 0010 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
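
For readers who want to check the double-precision layout themselves, here is a sketch of the same field extraction for a 64-bit value. It assumes an IEEE 754 double; DoubleFields and split_double are names invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t sign;      /* 1 bit */
    uint64_t exponent;  /* 11 bits, stored with a bias of 1023 */
    uint64_t mantissa;  /* 52 bits, the implicit leading 1 is not stored */
} DoubleFields;

/* Split a double into its raw sign, exponent and mantissa fields. */
DoubleFields split_double(double d) {
    uint64_t bits;
    DoubleFields out;
    memcpy(&bits, &d, sizeof bits);
    out.sign = bits >> 63;
    out.exponent = (bits >> 52) & 0x7FFu;
    out.mantissa = bits & ((UINT64_C(1) << 52) - 1);
    return out;
}
```

For 120.5 the exponent field is 6 + 1023 = 1029 and the mantissa is 1110001 followed by 45 zeros, that is, 0xE200000000000.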

 

Now let's use this background to clear up a common puzzle. Look at the following program and observe its output.

    using System;
    float f = 2.2f;
    double d = (double)f;
    Console.WriteLine(d.ToString("0.0000000000000")); // C# output format: 13 decimal places
    f = 2.25f;
    d = (double)f;
    Console.WriteLine(d.ToString("0.0000000000000"));

The output may be confusing. After single-precision 2.2 is converted to double precision and printed to 13 decimal places, it becomes 2.2000000476837, while single-precision 2.25 converted to double precision prints as 2.2500000000000. Why did 2.2 change after conversion while 2.25 did not? Strange, right? In fact, the answer follows from the two storage formats described above.

First, look at how 2.25 is stored in single precision. It is simple: 2.25 = 10.01 in binary = 1.001 * 2^1, so it is stored as 0 1000 0000 001 0000 0000 0000 0000 0000, while its double-precision representation is 0 100 0000 0000 0010 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000, so the value 2.25 does not change during the cast.

Now look at 2.2. To write 2.2 in binary scientific notation, convert the fractional part to binary by repeatedly multiplying by 2 and taking the integer part each time: 0.2 × 2 = 0.4, so the first binary digit after the point is the integer part of 0.4, which is 0; 0.4 × 2 = 0.8, so the second digit is 0; 0.8 × 2 = 1.6, so the third digit is 1; 0.6 × 2 = 1.2, so the fourth digit is 1; 0.2 × 2 = 0.4, so the fifth digit is 0, and we are back where we started. The product never reaches exactly 1.0, so the binary fraction is the infinitely repeating pattern 00110011001100110011... Single-precision data has only 24 bits of mantissa precision, so the pattern must be cut off and rounded, and 2.2 = 1.000110011... * 2^1 is stored in a float as 0 1000 0000 000 1100 1100 1100 1100 1101.
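
The multiply-by-2 procedure can be sketched in C as follows. To keep the sketch itself free of rounding, the fraction is passed as an exact ratio of integers (0.2 is 1/5); the function name and signature are assumptions made for this example.

```c
/* Write up to n binary digits of num/den (0 <= num < den) into out,
   using the multiply-by-2 rule on an exact integer fraction.
   out must have room for n + 1 characters.
   Returns 1 if the expansion terminated within n digits, 0 otherwise. */
int binary_fraction_digits(unsigned num, unsigned den, int n, char *out) {
    int i;
    for (i = 0; i < n; i++) {
        num *= 2;                    /* multiply the fraction by 2 */
        if (num >= den) {            /* integer part is 1 */
            out[i] = '1';
            num -= den;              /* keep only the fractional part */
        } else {                     /* integer part is 0 */
            out[i] = '0';
        }
        if (num == 0) {              /* nothing left: exact representation */
            out[i + 1] = '\0';
            return 1;
        }
    }
    out[n] = '\0';
    return 0;
}
```

Called with 1/5 and n = 12, it fills the buffer with 001100110011 and returns 0, because the pattern never ends. Called with 1/4 (the 0.25 in 2.25), it produces 01 and returns 1, because the expansion is exact.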

 

Stored this way, the value converted back to decimal is not exactly 2.2 but 2.2000000476837158203125, because converting a decimal fraction to binary may be inexact. A double holding 2.2 has the same problem, only with a smaller error, so floating-point representation always carries some error for such values. The conversion from single to double precision is itself exact: it simply carries the float's existing rounding error into a type that can display more digits, which makes the error visible. For decimal values that can be expressed exactly in binary, such as 2.25, there is no such error, and that explains the strange output above.
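
The same experiment can be reproduced in C. The sketch below assumes IEEE 754 float and double and a C99 snprintf; the function names are hypothetical, and the 13 decimal places mirror the format string of the C# program.

```c
#include <stdio.h>

/* Widen a float to double and format it with 13 decimal places.
   buf should hold at least 32 bytes for values like the ones here. */
void widen_and_format(float f, char *buf, size_t size) {
    double d = (double)f;            /* exact: every float value is also a double */
    snprintf(buf, size, "%.13f", d);
}

/* Returns 1 if the decimal literal's double value equals its float value,
   i.e. nothing was lost when it was rounded to float. */
int float_keeps_value(double as_double, float as_float) {
    return (double)as_float == as_double;
}
```

widen_and_format(2.2f, ...) produces 2.2000000476837 and widen_and_format(2.25f, ...) produces 2.2500000000000. Likewise float_keeps_value(2.25, 2.25f) is 1 while float_keeps_value(2.2, 2.2f) is 0, which shows that the error comes from rounding 2.2 to a float, not from the cast to double.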



