Floating-Point Number Storage in Computers

Last Update:2018-12-03 Source: Internet

Author: jillzhang

Contact: jillzhang@126.com

This is an original article. If you repost it, please retain the source and author. Thank you.

In C and C#, the float and double types are used to store floating-point data. A float occupies 32 bits and a double occupies 64 bits. So how are those bits laid out when we declare a variable such as float f = 2.25f? If they were assigned at random, wouldn't the world be a mess? In fact, both float and double follow the IEEE specification (IEEE 754) for storage: float follows the IEEE R32.24 format and double follows R64.53.

Both single precision and double precision are divided into three parts in storage:

Sign: 0 indicates positive, 1 indicates negative.
Exponent: stores the exponent of the binary scientific notation, in offset (biased) form.
Mantissa: stores the digits that follow the binary point.

The storage layout of float is: 1 sign bit, 8 exponent bits, 23 mantissa bits.

The storage layout of double is: 1 sign bit, 11 exponent bits, 52 mantissa bits.
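The three-part split can be sketched in C as follows. This is only an illustration, assuming a platform where float is a 32-bit IEEE single (true on most current systems, but not guaranteed by the C standard). The names FloatFields and split_float are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a 32-bit float. */
typedef struct {
    uint32_t sign;      /* 1 bit: 0 = positive, 1 = negative */
    uint32_t exponent;  /* 8 bits, as stored (offset form) */
    uint32_t mantissa;  /* 23 bits, as stored */
} FloatFields;

/* Copy the float's bytes into an integer and cut out the fields. */
FloatFields split_float(float value)
{
    uint32_t bits;
    FloatFields f;

    assert(sizeof value == sizeof bits);   /* assumes a 32-bit float */
    memcpy(&bits, &value, sizeof bits);

    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For example, split_float(-120.5f) yields sign 1, exponent field 133, and mantissa field 0x710000. memcpy is used rather than a pointer cast because it is the portable way to reinterpret bytes in C.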

R32.24 and R64.53 both store data in scientific notation. In decimal, 8.25 is written as 8.25 × 10^0 and 120.5 as 1.205 × 10^2; I don't need to dwell on this primary-school knowledge. Our dumb computer, however, knows nothing about decimal data. It only knows 0 and 1, so before a number is stored it must first be converted to binary scientific notation.

In binary, 8.25 is 1000.01 (surely you can do that conversion yourself?) and 120.5 is 1111000.1. In binary scientific notation, 1000.01 becomes 1.00001 × 2^3 and 1111000.1 becomes 1.1110001 × 2^6. Any nonzero number can be written as 1.xxx × 2^n, where xxx is the mantissa and the digit before the point is always 1. Because that digit is always 1, it does not need to be stored, so the 23-bit mantissa field actually provides 24 bits of precision.

How precise is 24 bits? The binary value of 9 is 1001, so about 4 bits are needed for one decimal digit, and 24 bits give a float roughly 6 to 7 significant decimal digits (counted over the whole number, not only after the decimal point).

As for the exponent, it can be positive or negative, and the 8-bit exponent field covers the range -127 to 128. The exponent is therefore stored in offset form: the stored value is the actual exponent + 127. (In IEEE 754 the two extreme stored values, 0 and 255, are reserved for zero and subnormal numbers and for infinity and NaN, so ordinary numbers use exponents from -126 to 127.) Next, let's look at how 8.25 and 120.5 are really stored in memory.
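The conversion to binary scientific notation can be tried out with a short C sketch. It is only an illustration: binary_scientific is a hypothetical helper, it handles positive finite values only, and it works on a double so that the digits of the input are not rounded first.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: format a positive, finite x as "1.xxx * 2^e" and
   return e. buf should have room for at least 80 characters. */
int binary_scientific(double x, char *buf, size_t size)
{
    char digits[64];
    int e = 0;
    int n = 0;

    assert(x > 0.0 && x <= DBL_MAX);

    /* Move the binary point until exactly one 1 is left in front of it. */
    while (x >= 2.0) { x /= 2.0; e++; }
    while (x < 1.0)  { x *= 2.0; e--; }

    x -= 1.0;                       /* the leading 1 is implied */
    while (x > 0.0 && n < 52) {     /* a double has at most 52 fraction bits */
        x *= 2.0;
        if (x >= 1.0) { digits[n++] = '1'; x -= 1.0; }
        else          { digits[n++] = '0'; }
    }
    if (n == 0) digits[n++] = '0';  /* e.g. 8.0 becomes 1.0 * 2^3 */
    digits[n] = '\0';

    snprintf(buf, size, "1.%s * 2^%d", digits, e);
    return e;
}
```

With a buffer such as char buf[80], binary_scientific(8.25, buf, sizeof buf) fills buf with "1.00001 * 2^3" and returns 3, and 120.5 gives "1.1110001 * 2^6". Multiplying and dividing by 2 is exact for ordinary binary floating-point values, which is why the loop can recover the digits without rounding.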

First, let's look at 8.25. In binary scientific notation it is 1.00001 × 2^3.

Following the layout above, the sign bit is 0 (positive), the exponent field is 3 + 127 = 130 (10000010 in binary), and the mantissa field is 00001 followed by 18 zeros. So 8.25 is stored as: 0 10000010 00001000000000000000000
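As a cross-check of this encoding step, here is a small C sketch that assembles the bit pattern from the three parts and compares it with what the compiler stores. It assumes float is a 32-bit IEEE single and covers ordinary (normal) numbers only. pack_float and float_from_bits are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: assemble a single-precision bit pattern from a
   sign (0 or 1), the actual exponent, and the 23 stored mantissa bits. */
uint32_t pack_float(uint32_t sign, int exponent, uint32_t mantissa)
{
    assert(sign <= 1 && exponent >= -126 && exponent <= 127);
    assert(mantissa <= 0x7FFFFFu);

    /* The exponent is stored with the +127 offset. */
    return (sign << 31) | ((uint32_t)(exponent + 127) << 23) | mantissa;
}

/* Reinterpret a 32-bit pattern as a float (assumes a 32-bit float). */
float float_from_bits(uint32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

For 8.25 = 1.00001 × 2^3, the mantissa bits 00001 followed by 18 zeros are 0x040000, so pack_float(0, 3, 0x040000) gives 0x41040000, and float_from_bits turns that pattern back into 8.25f.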

The single-precision floating-point number 120.5 (1.1110001 × 2^6, exponent field 6 + 127 = 133) is stored as: 0 10000101 11100010000000000000000

How do you find the decimal value of a piece of single-precision data in memory? Simply run the process in reverse. For example, suppose the following memory data is given: 01000010111100010000000000000000. First, split it into its fields: 0 10000101 11100010000000000000000, that is, sign bit 0, exponent field 10000101, and mantissa field 1110001 followed by 16 zeros.

The exponent field 10000101 is 133, so the actual exponent is 133 - 127 = 6. Putting back the implied leading 1, the data represents 1.1110001 × 2^6 = 1111000.1 in binary, which is 120.5.
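The reverse process can also be written out by hand in C. The sketch below rebuilds the value from the fields arithmetically instead of reinterpreting memory, so each step is visible. decode_float_bits is a hypothetical helper and handles ordinary (normal) numbers only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuild the value of a normal single-precision
   number from its 32-bit pattern. */
double decode_float_bits(uint32_t bits)
{
    uint32_t sign     = bits >> 31;
    uint32_t stored   = (bits >> 23) & 0xFFu;   /* actual exponent + 127 */
    uint32_t mantissa = bits & 0x7FFFFFu;
    int exponent;
    double value;

    /* Stored exponents 0 and 255 mark special values; not handled here. */
    assert(stored != 0 && stored != 255);

    exponent = (int)stored - 127;
    value = 1.0 + (double)mantissa / 8388608.0;  /* 1.xxx; 8388608 = 2^23 */

    /* Scale by 2^exponent using exact multiplications or divisions by 2. */
    for (; exponent > 0; exponent--) value *= 2.0;
    for (; exponent < 0; exponent++) value /= 2.0;

    return sign ? -value : value;
}
```

For the pattern 0 10000101 11100010000000000000000, which is 0x42F10000 in hexadecimal, decode_float_bits returns 120.5.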

The storage of double-precision floating-point numbers is similar to that of single precision. The only difference is the number of bits in the exponent and mantissa fields (11 and 52, with an exponent offset of 1023), so I will not describe it in detail here. I will only show the final storage of 120.5: 0 10000000101 1110001 followed by 45 zeros. You can think about why this is the case.
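For readers who want to check the double-precision case, the same field split can be sketched for a 64-bit double. This assumes double is a 64-bit IEEE double. DoubleFields and split_double are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a 64-bit double. */
typedef struct {
    uint64_t sign;      /* 1 bit */
    uint64_t exponent;  /* 11 bits, stored as actual exponent + 1023 */
    uint64_t mantissa;  /* 52 bits, leading 1 not stored */
} DoubleFields;

DoubleFields split_double(double value)
{
    uint64_t bits;
    DoubleFields f;

    assert(sizeof value == sizeof bits);   /* assumes a 64-bit double */
    memcpy(&bits, &value, sizeof bits);

    f.sign     = bits >> 63;
    f.exponent = (bits >> 52) & 0x7FFu;
    f.mantissa = bits & 0xFFFFFFFFFFFFFull;   /* low 52 bits */
    return f;
}
```

For 120.5 this gives sign 0, exponent field 1029 (6 + 1023), and mantissa field 0xE200000000000, which is 1110001 followed by 45 zeros.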

Below, I will use this background to resolve a common puzzle. Look at the following program and observe its output.

using System;

float f = 2.2f;
double d = (double)f;   // widen the float to a double
Console.WriteLine(d.ToString("0.0000000000000"));   // 2.2000000476837
f = 2.25f;
d = (double)f;
Console.WriteLine(d.ToString("0.0000000000000"));   // 2.2500000000000

The output may be confusing. After single-precision 2.2 is converted to double precision, it prints, to 13 digits after the decimal point, as 2.2000000476837, while single-precision 2.25 prints as 2.2500000000000. Why does the value of 2.2 change after the conversion while 2.25 does not? Strange, right? In fact, the answer follows from the storage formats introduced above.

First, let's look at 2.25. In binary it is 10.01, that is, 1.001 × 2^1. Its single-precision representation is simple: 0 10000000 00100000000000000000000. Its double-precision representation is: 0 10000000000 001 followed by 49 zeros. The conversion only widens the fields, so the value of 2.25 does not change during the cast.

Now let's look at 2.2. To convert a decimal fraction to a binary fraction, multiply the fraction by 2 and take the integer part as the next digit, then repeat with what remains. 0.2 × 2 = 0.4, so the first digit after the binary point is 0; 0.4 × 2 = 0.8, so the second digit is 0; 0.8 × 2 = 1.6, so the third digit is 1; 0.6 × 2 = 1.2, so the fourth digit is 1; 0.2 × 2 = 0.4, so the fifth digit is 0; and so on. The product never comes out to exactly 1.0, so the process never ends, and the binary fraction is the infinitely repeating pattern 00110011001100110011... A single-precision mantissa can only hold 24 bits of precision, so the pattern has to be cut off and rounded, and 2.2 is stored in a float as: 0 10000000 00011001100110011001101
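The multiply-by-2 steps described above can be sketched in C. To keep the steps exact, this illustration works on a fraction given as two integers (0.2 is passed as 2/10) rather than on a floating-point value, which would already be rounded. fraction_to_binary is a hypothetical helper and is meant for small denominators only.

```c
#include <assert.h>

/* Hypothetical helper: write the first `count` binary digits of the
   fraction numerator/denominator (which must be below 1) into out,
   followed by a terminating '\0'. out needs count + 1 characters. */
void fraction_to_binary(unsigned numerator, unsigned denominator,
                        int count, char *out)
{
    int i;

    assert(denominator > 0 && numerator < denominator);

    for (i = 0; i < count; i++) {
        numerator *= 2;                    /* multiply the fraction by 2 */
        if (numerator >= denominator) {    /* integer part is 1 */
            out[i] = '1';
            numerator -= denominator;      /* keep only the fraction */
        } else {                           /* integer part is 0 */
            out[i] = '0';
        }
    }
    out[count] = '\0';
}
```

Calling fraction_to_binary(2, 10, 16, digits) produces "0011001100110011", the repeating pattern for 0.2, while fraction_to_binary(1, 4, 4, digits) produces "0100", which ends after two digits because 0.25 is exactly 0.01 in binary.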

However, the value this bit pattern represents, converted back to decimal, is not exactly 2.2. Converting a decimal fraction to binary can be inexact, as it is for 2.2. Data of the double type has the same problem, so floating-point representation always carries some error. The conversion from single precision to double precision does not add any new error; it carries the float's value over exactly, so the error already stored in the float becomes visible when more digits are printed. For decimal values that can be expressed exactly in binary, such as 2.25, this error does not exist. That is why the strange output above appears.
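The same experiment can be repeated in C, where float and double behave the same way on common platforms. This is a minimal sketch assuming IEEE single and double formats. widen and show_widening are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Widen a float to double. */
double widen(float f)
{
    return (double)f;
}

void show_widening(void)
{
    printf("%.13f\n", widen(2.2f));    /* prints 2.2000000476837 */
    printf("%.13f\n", widen(2.25f));   /* prints 2.2500000000000 */

    assert(widen(2.25f) == 2.25);      /* exactly representable in binary */
    assert(widen(2.2f) != 2.2);        /* the float was already rounded */
}
```

The gap between widen(2.2f) and the double literal 2.2 is about 4.77e-8, which matches the extra digits seen in the output above.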

This article is the author's original work and is published only on Blog Park. Please indicate the source and author when you reprint it. Thank you.

Note: The following documents were consulted while writing this article:

http://www.msdn.net/library/chs/default.asp?url=/library/CHS/vccore/html/_core_why_floating_point_numbers_may_lose_precision.asp

http://blog.csdn.net/ganxingming/archive/2006/12/19/1449526.asp
