Representation of the floating point type in a computer

During a recent campus recruiting trip, I asked a few students a question about floating-point arithmetic: in the code below, why does the first line print False, while the second and third lines both print True?


Console.WriteLine("1.123f + 1.345f == 2.468f? {0}",
    1.123f + 1.345f == 2.468f); // False

Console.WriteLine("1.123f + 1.344f == 2.467f? {0}",
    1.123f + 1.344f == 2.467f); // True

Console.WriteLine("1.123 + 1.345 == 2.468? {0}",
    1.123 + 1.345 == 2.468); // True


As we know, the integer type (int) occupies 4 bytes, and its value range is -2**31 to 2**31 - 1. Every integer also has a representation in memory. For example, the number 12345 is stored as:

00000000 00000000 00110000 00111001

This follows the usual binary-to-decimal conversion: 12345 = 2**13 + 2**12 + 2**5 + 2**4 + 2**3 + 2**0
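As a quick check of the bit pattern above, here is a minimal C# sketch. The names `IntegerBits`, `ToBits32`, and `SumOfSetBits` are invented for illustration and do not come from the original article.

```csharp
using System;

static class IntegerBits
{
    // Binary text of an int, padded to the full 32 bits
    // (Convert.ToString drops leading zeros).
    public static string ToBits32(int value)
    {
        return Convert.ToString(value, 2).PadLeft(32, '0');
    }

    // Rebuild the value by adding 2**i for every bit i that is set.
    public static int SumOfSetBits(int value)
    {
        int sum = 0;
        for (int i = 0; i < 32; i++)
        {
            if ((value & (1 << i)) != 0)
            {
                sum += 1 << i;
            }
        }
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(ToBits32(12345));              // 00000000000000000011000000111001
        Console.WriteLine(SumOfSetBits(12345) == 12345); // True
    }
}
```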


For the float type, the biggest problem with this scheme is how to store the position of the point in binary, because that position is not fixed. So computer scientists borrowed scientific notation to represent floating-point numbers, because in scientific notation the point is always in the same place: right after the first digit. For example, 12345 in scientific notation is 1.2345 * 10**4, and 0.012345 is 1.2345 * 10**-2.


Written as a formula, scientific notation is N = M * B**E, where M is the significand (mantissa), B is the base, and E is the exponent.


Because a computer can only process 0 and 1, the base B in the formula above is 2 rather than 10 when a floating-point number is stored in a computer. What is kept in memory is the set of parameters of this formula rather than the exact value, so floating-point numbers in a computer are generally approximations. For the float type, the 32 bits in memory are laid out as follows:

seeeeeee emmmmmmm mmmmmmmm mmmmmmmm

Here s is the sign bit: 1 means the floating-point number is negative, and 0 means it is positive. The e bits hold the exponent E of the scientific notation. E can be either positive or negative: 1234.5 in scientific notation is 1.2345 * 10**3, while 0.12345 is 1.2345 * 10**-1. So the rule for E is: take the number represented by eeeeeeee and subtract 2**7 - 1 = 127 (the 7 comes from having 8 bits for the exponent). In this way the 8 exponent bits can represent both negative and positive exponents.


The m bits hold M, the significand of the scientific notation. They are weighted from left to right: the first m represents 2**-1, the second m represents 2**-2, and so on. Because the base of our notation is 2 rather than 10, the digit before the point of a normalized number is always 1, and we can always adjust E to make this so. That leading 1 is therefore omitted from storage and added back when M is calculated. (The exception is an exponent field of all zeros, which encodes zero and subnormal numbers and has no implicit 1.)
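The field layout described above can be sketched in C# as follows. This is only an illustration: the class `FloatFields` and its helper methods are hypothetical names, and the sample value 1.345f is simply borrowed from the opening example.

```csharp
using System;

static class FloatFields
{
    // Reinterpret the 4 bytes of a float as a 32-bit integer.
    public static int ToBits(float value)
    {
        return BitConverter.ToInt32(BitConverter.GetBytes(value), 0);
    }

    // Sign bit: 1 for negative, 0 for positive.
    public static int Sign(float value)
    {
        return (ToBits(value) >> 31) & 0x1;
    }

    // Raw 8-bit exponent field, before the bias is removed.
    public static int ExponentField(float value)
    {
        return (ToBits(value) >> 23) & 0xFF;
    }

    // Raw 23-bit mantissa field, without the implicit leading 1.
    public static int MantissaField(float value)
    {
        return ToBits(value) & 0x7FFFFF;
    }

    static void Main()
    {
        float value = 1.345f;

        int exponent = ExponentField(value) - 127; // remove the bias 2**7 - 1
        // Restore the implicit 1; this is only valid for normalized numbers.
        double significand = 1.0 + MantissaField(value) / 8388608.0; // 8388608 = 2**23

        Console.WriteLine("bits     = 0x{0:X8}", ToBits(value)); // bits     = 0x3FAC28F6
        Console.WriteLine("sign     = {0}", Sign(value));         // sign     = 0
        Console.WriteLine("exponent = {0}", exponent);            // exponent = 0
        // Rebuilding M * 2**E reproduces the stored value exactly.
        Console.WriteLine(significand * Math.Pow(2, exponent) == value); // True
    }
}
```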


For the double data type, the bits are laid out as follows (1 sign bit, 11 exponent bits, and 52 mantissa bits):

seeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
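The same decomposition can be sketched for double. The class `DoubleFields` and its methods are hypothetical names, and the float literal 1.345f widened to double is used here only as a sample value.

```csharp
using System;

static class DoubleFields
{
    // Reinterpret the 8 bytes of a double as a 64-bit integer.
    public static long ToBits(double value)
    {
        return BitConverter.DoubleToInt64Bits(value);
    }

    // Raw 11-bit exponent field, before the bias 2**10 - 1 = 1023 is removed.
    public static int ExponentField(double value)
    {
        return (int)((ToBits(value) >> 52) & 0x7FF);
    }

    // Raw 52-bit mantissa field, without the implicit leading 1.
    public static long MantissaField(double value)
    {
        return ToBits(value) & 0xFFFFFFFFFFFFFL;
    }

    static void Main()
    {
        double value = 1.345f; // float literal widened to double

        Console.WriteLine("bits     = 0x{0:X16}", ToBits(value));         // bits     = 0x3FF5851EC0000000
        Console.WriteLine("exponent = {0}", ExponentField(value) - 1023); // exponent = 0
        Console.WriteLine("mantissa = 0x{0:X13}", MantissaField(value));  // mantissa = 0x5851EC0000000
    }
}
```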

Now let's look at the code above again, rewritten to use double variables:

double a = 1.345f;
double b = 1.123f;
double result = a + b;
double expected = 2.468f;

Console.WriteLine("result == expected? {0}", result == expected);

Debug this program and look at the value of a in memory: it is 3ff5851ec0000000 (for convenience, I changed the program from the beginning of the article). Converting this hexadecimal value to binary with a calculator, and restoring the leading zeros that the calculator omits, gives:

00111111 11110101 10000101 00011110 11000000 00000000 00000000 00000000


The 11 exponent bits are 01111111111 = 1023, and subtracting the double bias 2**10 - 1 = 1023 gives E = 0. When computing M we add back the implicit leading 1, so the 52 mantissa bits (everything after the first 12 bits) actually represent:


1.0 (decimal) + 0.0101 10000101 00011110 11000000 00000000 00000000 00000000 (binary fraction)

= 1.0 + 2**-2 + 2**-4 + 2**-5 + 2**-10 + 2**-12 + 2**-16 + 2**-17 + 2**-18 + 2**-19 + 2**-21 + 2**-22

= 1.0 + 0.25

+ 0.0625

+ 0.03125

+ 0.0009765625

+ 0.000244140625

+ 0.0000152587890625

+ 0.00000762939453125

+ 0.000003814697265625

+ 0.0000019073486328125

+ 0.000000476837158203125

+ 0.0000002384185791015625

= 1.0 + 0.3450000286102294921875

= 1.3450000286102294921875
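The hand calculation above can be reproduced with exact decimal arithmetic. The following is a minimal sketch; `MantissaSum` and `Significand` are hypothetical names, and the use of the decimal type is an assumption made here so that every power of two is added without rounding.

```csharp
using System;

static class MantissaSum
{
    // Exact value of 1 + (mantissa bits read as a binary fraction).
    // Only the first 23 mantissa bits are read: that is all a value widened
    // from float can use, and 2**-23 still fits in decimal without rounding.
    public static decimal Significand(double value)
    {
        long mantissa = BitConverter.DoubleToInt64Bits(value) & 0xFFFFFFFFFFFFFL;
        decimal sum = 1m;    // the implicit leading 1
        decimal weight = 1m;
        for (int k = 1; k <= 23; k++)
        {
            weight /= 2;     // weight is now 2**-k
            if (((mantissa >> (52 - k)) & 1) != 0)
            {
                sum += weight;
            }
        }
        return sum;
    }

    static void Main()
    {
        double a = 1.345f;
        // Compare against the total derived by hand above.
        Console.WriteLine(Significand(a) == 1.3450000286102294921875m); // True
    }
}
```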


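By the same method, b is actually 1.1230000257492065, so a + b is actually 2.468000054359436, while expected in the program above is actually 2.4679999351501465. That is why 1.345f + 1.123f != 2.468f in that program. Note that this result depends on the sum being carried at double (or higher) precision, which the C# specification permits even for float operands. If the sum is instead rounded back to float, it lies exactly halfway between two adjacent float values and round-to-nearest-even yields exactly 2.468f, so the first line of the original code can print True on a runtime that evaluates it in single precision.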
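The values quoted above can be printed with a small program. This is a sketch, with `WidenedSum`, `Sum`, and `Show` as hypothetical names. The last check, which rounds the sum back to float, is an observation added here rather than something from the original article: the exact sum falls halfway between two adjacent float values, and round-to-nearest-even picks the one equal to 2.468f.

```csharp
using System;
using System.Globalization;

static class WidenedSum
{
    // 1.345f + 1.123f, with both operands widened to double first.
    // The addition is exact in double, so no rounding happens here.
    public static double Sum()
    {
        double a = 1.345f;
        double b = 1.123f;
        return a + b;
    }

    // Format a double with 17 significant digits, independent of the OS culture.
    static string Show(double value)
    {
        return value.ToString("G17", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        double result = Sum();
        double expected = 2.468f;

        Console.WriteLine("result   = {0}", Show(result));
        Console.WriteLine("expected = {0}", Show(expected));
        Console.WriteLine(result == expected); // False

        // The explicit cast forces the sum to be rounded to a 32-bit float.
        float rounded = (float)result;
        Console.WriteLine(rounded == 2.468f);  // True
    }
}
```

Whether the one-line float expression from the start of the article behaves like the first or the second comparison depends on the precision the compiler and runtime use for the intermediate sum.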


That 1.344f + 1.123f == 2.467f holds is a coincidence: the sum of the first two values happens to have exactly the same representation in the computer as the third.


The conclusion is that floating-point values should not be compared with ==. Because a floating-point number is an approximation, the only reliable test is whether the absolute difference between the two values is smaller than an acceptable error.
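A minimal sketch of such a comparison follows. The helper `NearlyEqual` and the tolerance 1e-6f are assumptions chosen for illustration; a suitable tolerance depends on the magnitude of the values and on the application.

```csharp
using System;

static class ApproxCompare
{
    // Hypothetical helper: true when a and b differ by less than epsilon.
    public static bool NearlyEqual(float a, float b, float epsilon)
    {
        return Math.Abs(a - b) < epsilon;
    }

    static void Main()
    {
        float sum = 1.123f + 1.345f;

        // Whatever precision the sum was computed in, it is well within
        // 1e-6 of 2.468f, so the tolerant comparison succeeds.
        Console.WriteLine(NearlyEqual(sum, 2.468f, 1e-6f)); // True
    }
}
```

A fixed absolute tolerance like this suits values near 1; for very large or very small values, a tolerance scaled to the operands is usually the better choice.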
