Research on the principles of floating-point numbers

Source: Internet
Author: User

First, why this article was written

Today in the lab I heard a discussion about how floating-point numbers are implemented. Until then I had only studied the relationships between sign-magnitude, ones' complement, two's complement, and offset (excess) code. Questions like this are low-level and easy to overlook. I wanted to ask why, work the problem through properly, and leave a record behind, hence this blog post.

Second, sign-magnitude (original code), ones' complement (inverse code), two's complement, and offset code (shift code)

No complicated formulas to memorize here. The summary below uses the simplest and most understandable language I can manage:
Special note: for a positive number, the sign-magnitude, ones' complement, and two's complement forms are all identical. Zero has two representations in both sign-magnitude and ones' complement, because there zero splits into +0 and -0.

Sign-magnitude (original code):

This is the direct binary representation, with the highest bit as the sign bit: 0 for a positive number, 1 for a negative number. If there are not enough digits, pad the magnitude with leading 0s. Example (8-bit word):
x=+101011, [x] sign-magnitude = 00101011
x=-101011, [x] sign-magnitude = 10101011

Ones' complement (inverse code):

The ones' complement of a positive number is the same as its sign-magnitude form. For a negative number, invert every bit of the sign-magnitude form except the sign bit. Example: x=-101011, [x] sign-magnitude = 10101011, [x] ones' complement = 11010100

Two's complement (complement):

The two's complement of a positive number is the same as its sign-magnitude form. For a negative number, it is the ones' complement plus 1.
Example: x=-101011, [x] sign-magnitude = 10101011, [x] ones' complement = 11010100, [x] two's complement = 11010101
Note: there is also a shortcut. Keep the sign bit unchanged, scan the sign-magnitude form from the lowest bit toward the left until the first 1 is found, keep that 1 and everything to its right, and invert all the remaining bits to its left (except the sign bit).
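The two ways of forming the two's complement (the "complement" above) of a negative number can be compared with a minimal sketch. The function names are made up for illustration, and the input is assumed to be a sign-magnitude bit string such as 10101011.

```python
def invert(bits: str) -> str:
    """Flip every bit in a string of 0s and 1s."""
    return "".join("1" if b == "0" else "0" for b in bits)


def twos_complement_by_rule(sign_magnitude: str) -> str:
    """Invert the magnitude bits and add 1 (the textbook rule)."""
    sign, magnitude = sign_magnitude[0], sign_magnitude[1:]
    if sign == "0":
        return sign_magnitude  # positive numbers are unchanged
    width = len(magnitude)
    value = (int(invert(magnitude), 2) + 1) % (1 << width)
    return sign + format(value, f"0{width}b")


def twos_complement_quick(sign_magnitude: str) -> str:
    """Shortcut: keep the lowest 1 and everything right of it, invert the rest."""
    sign, magnitude = sign_magnitude[0], sign_magnitude[1:]
    if sign == "0" or "1" not in magnitude:
        return sign_magnitude
    lowest_one = magnitude.rindex("1")
    return sign + invert(magnitude[:lowest_one]) + magnitude[lowest_one:]


print(twos_complement_by_rule("10101011"))  # 11010101
print(twos_complement_quick("10101011"))    # 11010101
```

Both functions leave the sign bit alone and only touch the magnitude bits, which mirrors how the rule is stated in the text.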

PS: zero has a unique two's complement representation. With an 8-bit machine word, [0] two's complement = 00000000.

Offset code (shift code), the simplest of all:

Whether the number is positive or negative, just flip the sign bit of its two's complement.

Example: x=-101011, [x] sign-magnitude = 10101011, [x] ones' complement = 11010100, [x] two's complement = 11010101, [x] offset = 01010101
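The four encodings in this section (original code, inverse code, complement, and shift code, that is, sign-magnitude, ones' complement, two's complement, and offset code) can be reproduced with a short sketch. The 8-bit word length and the function names are assumptions chosen to match the running example x = -101011.

```python
WIDTH = 8  # assumed machine word length, as in the examples


def sign_magnitude(x: int) -> str:
    """Sign bit followed by the magnitude, padded with leading zeros."""
    sign = "1" if x < 0 else "0"
    return sign + format(abs(x), f"0{WIDTH - 1}b")


def ones_complement(x: int) -> str:
    """Same as sign-magnitude for x >= 0; otherwise invert all but the sign bit."""
    code = sign_magnitude(x)
    if x >= 0:
        return code
    return code[0] + "".join("1" if b == "0" else "0" for b in code[1:])


def twos_complement(x: int) -> str:
    """Masking to WIDTH bits yields the two's complement bit pattern."""
    return format(x & ((1 << WIDTH) - 1), f"0{WIDTH}b")


def offset_code(x: int) -> str:
    """Adding 2**(WIDTH-1) is the same as flipping the sign bit of two's complement."""
    return format(x + (1 << (WIDTH - 1)), f"0{WIDTH}b")


x = -0b101011
print(sign_magnitude(x))   # 10101011
print(ones_complement(x))  # 11010100
print(twos_complement(x))  # 11010101
print(offset_code(x))      # 01010101
```

The sketch is only meant for values from -127 to 127, the range that all four 8-bit encodings share.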

Third, from fixed point to floating point

Moving from fixed point to floating point is a real leap, and the topic can be treated very simply or in great depth. The integers we normally talk about are fixed-point integers: the radix point is fixed after the last digit. The same value, however, can be stored either as an integer or as a floating-point number. For example, 255 is an integer, while 255.0 is a floating-point number.

What is a floating-point number? To answer that, we have to start from how fractional numbers are represented:

3.1 Representation of floating-point numbers

An arbitrary binary number N in a computer can be written as

N = M × R^E

M: the mantissa, a pure fraction.
E: the exponent of the floating-point number, an integer.
R: the radix (base). For a given machine it is a constant; R is typically 2, 8, or 16.
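As a rough illustration of the mantissa/exponent split with radix 2, Python's math.frexp returns a pure-fraction mantissa and an integer exponent for a given float. The wrapper name decompose is made up for this sketch.

```python
import math


def decompose(x: float):
    """Return (M, E) such that x == M * 2**E, with 0.5 <= |M| < 1 for nonzero x."""
    return math.frexp(x)


m, e = decompose(255.0)
print(m, e)  # 0.99609375 8
assert m * 2 ** e == 255.0
```

Here 255 is written as 0.11111111 (binary) times 2^8, so the mantissa is a pure fraction and the exponent says where the radix point really belongs.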

The mantissa mainly determines the significant digits, and the exponent (also called the order code) mainly determines where the radix point sits.
Exponent: a fixed-point integer that gives the position of the radix point in the data. It determines the range of the floating-point number and is usually stored in two's complement or offset code.
Mantissa: a fixed-point fraction, usually stored in two's complement. It determines the precision of the floating-point number, and its sign is the sign of the whole number.

For a fixed machine word length, a longer exponent field gives a larger range but lower precision, because fewer bits are left for the mantissa.
Compared with a fixed-point number of the same width, a floating-point number covers a much larger range while keeping good relative precision.

An easy-to-understand table:
Layout of the C-language float type

Sign bit (S)    Exponent (E)    Mantissa (M)
1 bit           8 bits          23 bits

The ranges that can be represented look like this:

Example (all values in binary, positive range only):
8-bit fixed-point fraction
0.0000001 to 0.1111111
that is, 1/128 to 127/128

Exponent 2 bits, mantissa 4 bits
can represent 2^(-11) × 0.0001 to 2^(11) × 0.1111 (exponent written in binary, 11 = 3)
that is, 0.0000001 to 111.1

Exponent 3 bits, mantissa 3 bits
can represent 2^(-111) × 0.001 to 2^(111) × 0.111 (exponent written in binary, 111 = 7)
that is, 0.0000000001 to 1110000
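The ranges in this example can be checked with exact fractions. This is a minimal sketch under the assumption that the bit counts exclude sign bits, so an exponent of k bits runs from -(2^k - 1) to +(2^k - 1); the function name positive_range is made up.

```python
from fractions import Fraction


def positive_range(exp_bits: int, man_bits: int):
    """Smallest and largest positive values for the given field widths."""
    max_exp = (1 << exp_bits) - 1           # e.g. 2 bits -> binary 11 = 3
    min_man = Fraction(1, 1 << man_bits)    # 0.0...01 in binary
    max_man = 1 - min_man                   # 0.1...11 in binary
    smallest = min_man / (1 << max_exp)
    largest = max_man * (1 << max_exp)
    return smallest, largest


print(positive_range(0, 7))  # fixed point: 1/128 and 127/128
print(positive_range(2, 4))  # 1/128 and 15/2 (binary 111.1)
print(positive_range(3, 3))  # 1/1024 and 112 (binary 1110000)
```

Moving one bit from the mantissa to the exponent widens the range from 7.5 to 112 at the top, at the cost of one fewer significant bit.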

Note: the range of float and double is determined by the number of exponent bits.
float has an 8-bit exponent and double has an 11-bit exponent. The fields are laid out as follows:
float
1 bit (sign)    8 bits (exponent)     23 bits (mantissa)
double
1 bit (sign)    11 bits (exponent)    52 bits (mantissa)
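The 1/8/23 and 1/11/52 field layouts can be observed directly by packing a value with Python's struct module and slicing the bits. The helper names are made up for this sketch, which assumes the platform's float and double are the IEEE 754 32-bit and 64-bit formats that struct uses.

```python
import struct


def float_fields(x: float):
    """Return (sign, exponent field, mantissa field) of x stored as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def double_fields(x: float):
    """Return (sign, exponent field, mantissa field) of x stored as a 64-bit double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


print(float_fields(1.0))        # (0, 127, 0)
print(double_fields(1.0))       # (0, 1023, 0)
print(float_fields(-0.15625))   # (1, 124, 2097152)
```

The exponent fields printed here are the stored (biased) values, not the true exponents.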

3.2 Normalization of floating-point numbers

Purpose of normalization:
(1) To improve the precision of the data representation
(2) To make the data representation unique

Definition: a base-R mantissa is normalized when its absolute value is greater than or equal to 1/R.

Normalized mantissa in binary sign-magnitude form:
Positive 0.1xxxxxx
Negative 1.1xxxxxx

Normalized mantissa in two's complement form (the highest mantissa bit is the opposite of the sign bit):
Positive 0.1xxxxxx
Negative 1.0xxxxxx
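Normalization for radix 2 can be sketched as repeated shifting: the mantissa is doubled or halved until its absolute value lies in [1/2, 1), and the exponent is adjusted so the value stays the same. The function name normalize and the use of exact fractions are choices made for this illustration.

```python
from fractions import Fraction


def normalize(mantissa: Fraction, exponent: int):
    """Return (M, E) with the same value M * 2**E and 1/2 <= |M| < 1."""
    if mantissa == 0:
        return mantissa, exponent  # zero has no normalized form
    while abs(mantissa) < Fraction(1, 2):
        mantissa *= 2      # shift left by one bit
        exponent -= 1
    while abs(mantissa) >= 1:
        mantissa /= 2      # shift right by one bit
        exponent += 1
    return mantissa, exponent


# 0.00011 (binary) * 2^0 becomes 0.11 (binary) * 2^-3
print(normalize(Fraction(3, 32), 0))  # (Fraction(3, 4), -3)
```

After normalization the leading mantissa bit is always 1, so no mantissa bits are wasted on leading zeros and each value has exactly one representation.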

"32-bit floating-point numbers: IEEE 754 uses an exponent bias of 127"
IEEE 754 specifies the bias as 2^(e-1) - 1, where e is the number of bits in the exponent field. The exponent field of a 32-bit floating-point number has 8 bits, so the bias is 2^(8-1) - 1 = 127. In other words, the standard does not use the conventional offset code for the exponent.
Note that the offset code in IEEE 754 is not the same as the usual offset code. IEEE 754 uses excess-127, that is, the true exponent plus 127, rather than the usual excess-128 (the two's complement with its sign bit flipped).
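The bias formula and its effect on decoding can be sketched as follows. The function names are made up, and the decoder deliberately handles only normal numbers, using the 1.fraction × 2^(stored exponent - 127) form.

```python
def ieee754_bias(exp_bits: int) -> int:
    """Bias for an exponent field of exp_bits bits: 2^(e-1) - 1."""
    return (1 << (exp_bits - 1)) - 1


def decode_binary32(bits: int) -> float:
    """Decode a normal 32-bit pattern (exponent field neither 0 nor 255)."""
    sign = bits >> 31
    stored_exp = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if stored_exp in (0, 255):
        raise ValueError("zero, subnormal, infinity and NaN are not handled here")
    significand = 1 + fraction / (1 << 23)  # implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (stored_exp - ieee754_bias(8))


print(ieee754_bias(8), ieee754_bias(11))  # 127 1023
print(decode_binary32(0x3E200000))        # 0.15625
```

In the second call the stored exponent is 124, so the true exponent is 124 - 127 = -3 and the value is 1.25 × 2^-3.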

Examples 1 to 3: worked conversions, including the reverse direction (the original figures are not available).

3.3 Where the offset code comes from

A very interesting question.
So far the exponent has been described first as offset code and then as offset code minus 1, without any explanation of why. Let's look at that now.
Baidu Encyclopedia describes offset code as follows:

Offset code (also called excess code) is the two's complement with its sign bit flipped. It is generally used for the exponent of floating-point numbers, and it was introduced so that the machine zero of a floating-point number is all 0 bits.

The reasons are:

If the exponent were stored in two's complement, an all-zero exponent field would mean an exponent of 0, so the scale factor would be 2^0 = 1.
The value that should serve as machine zero, however, is the one with the most negative exponent, which is extremely close to zero. Its exponent field is the one that needs to be all 0s.

In more detail:
In this simplified model a floating-point number has the form 1.s × 2^P, so floating-point 0 is really a value extremely close to 0 rather than a value exactly equal to 0. (IEEE 754 itself reserves the all-zero exponent field for exact zero and subnormal numbers.)

The representation of 0 is then 1.0 × 2^(-128), where -128 is the smallest exponent an 8-bit field can represent.

Here is the problem: in two's complement, -128 is stored as 10000000, so the bit pattern of floating-point 0 could not be all 0s. With offset code, -128 is stored as 00000000, which is why offset code is used.

However, the IEEE 754 standard uses excess-127, that is, a true exponent of 0 is stored as 127. The nominal exponent range is therefore -127 to 128 rather than the usual -128 to 127. Because the stored values 0 and 255 are reserved for special cases, normal numbers actually use exponents from -126 to 127.
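The argument for a shifted (offset) exponent can be illustrated by encoding a few 8-bit exponents both ways. This is a sketch with made-up helper names; it also checks that a 32-bit float zero really is stored as all-zero bits.

```python
import struct


def twos_complement_8(e: int) -> int:
    """8-bit two's complement pattern of e, as an unsigned integer."""
    return e & 0xFF


def excess_128(e: int) -> int:
    """8-bit offset (excess-128) pattern of e, as an unsigned integer."""
    return e + 128


exponents = [-128, -1, 0, 1, 127]
print([format(twos_complement_8(e), "08b") for e in exponents])
# ['10000000', '11111111', '00000000', '00000001', '01111111']
print([format(excess_128(e), "08b") for e in exponents])
# ['00000000', '01111111', '10000000', '10000001', '11111111']

# Zero as a 32-bit float is stored as four zero bytes.
print(struct.pack(">f", 0.0).hex())  # 00000000
```

With the offset encoding the smallest exponent maps to all zeros and the patterns grow in the same order as the exponents, so bit patterns can be compared as plain unsigned integers.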

That is basically the end of the article. This should be enough for everyday understanding. For a deeper understanding you will need to consult more material. If anything in this post is wrong, criticism is welcome, and so is discussion.

Appendix:
Valid floating-point literal forms in C++:
(1) Decimal form. It consists of digits and a decimal point, and the decimal point is required. For example: 123. 123.0 .123
(2) Exponential form, such as 123e3. The letter e (or E) must be preceded by a number, and the exponent after it must be an integer.
(3) Normalized exponential form: exactly one non-zero digit before the decimal point, such as 1.2345e8
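These forms can be tried out quickly with Python's float() parser. This is only an approximation of the C++ rules: float() happens to agree on the cases below, but it also accepts strings such as "123", "inf", or "1_0.5" that are not C++ floating-point literals. The helper name parses_as_float is made up.

```python
def parses_as_float(text: str) -> bool:
    """True if Python's float() accepts the text (not a C++ grammar check)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


for literal in ["123.", "123.0", ".123", "123e3", "1.2345e8", "e3", "123e", "1e2.5"]:
    print(literal, parses_as_float(literal))
```

The last three strings are rejected: the first has no number before e, the second has no exponent after e, and the third has a non-integer exponent.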

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
