CSAPP (1): Computer representation of numbers

Source: Internet
Author: User

A computer stores information in bits. The same bit-level representation expresses different information depending on how it is interpreted.

0. XOR in bit-level operations

One bit-level operation deserves special mention: XOR (exclusive or). The meaning of x^y is: for bit i, if x and y have different values at bit i, the result bit is 1. In other words, the result bit is 1 when exactly one of x and y has a 1 at bit i (equivalently, exactly one has a 0); otherwise the result bit is 0. XOR can be rewritten in several equivalent forms: x^y = (x|y) - (x&y); x^y = (x&~y) | (~x&y); (other variations to be added).
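The two XOR identities above can be checked directly in C. This is a minimal sketch; the function names `xor_via_or_and` and `xor_via_masks` are made up for illustration.

```c
#include <stdint.h>

/* XOR as "OR minus AND": the bits set in x|y, minus those set in both. */
uint32_t xor_via_or_and(uint32_t x, uint32_t y) {
    return (x | y) - (x & y);
}

/* XOR as "set in x but not y, or set in y but not x". */
uint32_t xor_via_masks(uint32_t x, uint32_t y) {
    return (x & ~y) | (~x & y);
}
```

The parentheses matter: in C, `-` binds tighter than `|` and `&`, so `x|y - x&y` without parentheses would compute something else.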

I. Unsigned integers and encodings

A bit vector $x = [x_{w-1}, \ldots, x_0]$ used to represent an unsigned integer has the integer value:

$$B2U_w(x) = \sum_{i=0}^{w-1} x_i 2^i$$

B2U stands for "binary to unsigned"; it maps a string of 0s and 1s of length $w$ to a nonnegative integer. The range of nonnegative integers that the bit vector $x$ can represent is $[0, 2^w - 1]$. This encoding is referred to below as "unsigned encoding".
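As an illustration, the B2U sum can be written as a loop. This is a sketch under assumptions: the bit vector is passed as an array of 0/1 ints with `bits[0]` holding $x_0$, and $w$ is at most 32; `b2u` is a hypothetical helper name.

```c
#include <stdint.h>

/* bits[i] holds x_i (0 or 1); bits[0] is the least significant bit.
 * Assumes 1 <= w <= 32 so the result fits in uint32_t. */
uint32_t b2u(const int *bits, int w) {
    uint32_t value = 0;
    for (int i = 0; i < w; i++) {
        if (bits[i]) {
            value += (uint32_t)1 << i;   /* add the weight 2^i */
        }
    }
    return value;
}
```

For the bit vector [1011], stored as `{1, 1, 0, 1}`, this gives 8 + 2 + 1 = 11.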

II. Unsigned encoding vs. two's complement: modular arithmetic

As its name implies, unsigned encoding can only encode unsigned (nonnegative) numbers. So how are signed integers represented? One option is to use the highest bit of the bit vector as the sign bit (0 for positive, 1 for negative) and the remaining bits for the magnitude. This encoding is called sign-magnitude, and for a bit vector $x$ its integer value is:

$$B2S_w(x) = (-1)^{x_{w-1}} \cdot \sum_{i=0}^{w-2} x_i 2^i$$

There are two problems with this:

1) [100] and [000] both represent 0; that is, there is both a positive 0 and a negative 0.

2) When the computer adds or subtracts, the sign bit cannot take part in the arithmetic like the other bits, which makes the hardware more complex.

Two's complement was introduced to solve these two problems, especially problem 2. What is two's complement? It follows from a characteristic of computer arithmetic.

With unsigned encoding, take $w = 3$ and let every bit of $x$ be 1, so that it represents the unsigned integer $2^3 - 1 = 7$. What happens when 1 is added to this value? The integer value becomes 8, whose bit-level representation is [1000]; but $x$ has only 3 bits, so $x$ = [000], whose integer value is 0. This means that integer arithmetic in a computer is modular arithmetic. When the value to be represented exceeds $2^w - 1$, a $2^w$ is dropped automatically; that is, the bit vector of an unsigned integer $x$ is the same as the bit vector of $(x \bmod 2^w)$. A clock is a familiar example: if it shows 3 o'clock and you want to set it to 1 o'clock, you can turn it back 2 hours (subtraction) or forward 10 hours (addition; (3+10) mod 12 = 1).

Taking $w = 3$ as an example: 1 + (-1) = 1 + (-1 mod 8) = 1 + 7 = [001] + [111] = [000] = 0, where the additions are taken mod 8.
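The $w = 3$ example can be simulated by masking ordinary unsigned arithmetic down to three bits. This is only a sketch; `W`, `MASK`, `add_w`, and `neg_w` are names invented here.

```c
#include <stdint.h>

#define W 3                       /* word size used in the example */
#define MASK ((1u << W) - 1u)     /* keeping the low W bits is "mod 2^W" */

/* Add two W-bit unsigned values; the carry out of the top bit is dropped. */
uint32_t add_w(uint32_t x, uint32_t y) {
    return (x + y) & MASK;
}

/* Unsigned value that stands in for -x in a W-bit word: 2^W - x. */
uint32_t neg_w(uint32_t x) {
    return ((1u << W) - x) & MASK;
}
```

With these helpers, `add_w(7, 1)` wraps to 0, and `neg_w(1)` is 7, i.e. [111], so `add_w(1, neg_w(1))` is 0.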

So [111] can be used to represent -1, which lets the sign bit take part in the arithmetic. The key to carrying out signed arithmetic with unsigned arithmetic is to use the mod property to convert a negative integer into an unsigned one: $-x$ (for positive $x$) corresponds to the unsigned integer $2^w - x$. This solves both problems with sign-magnitude:

1) For 0: $-0 = 2^w - 0 = 2^w$, and a $w$-bit vector cannot hold $2^w$ (it wraps around to [00...0]), so there is no separate -0 in two's complement.

2) Since signed numbers are converted to unsigned numbers, the "sign bit" naturally takes part in the arithmetic.

III. Signed integers and two's complement

So what are the mathematical properties of two's complement? A bit vector $x = [x_{w-1}, \ldots, x_0]$ used to represent a signed integer has the integer value:

$$B2T_w(x) = -x_{w-1} 2^{w-1} + \sum_{i=0}^{w-2} x_i 2^i$$

B2T stands for "binary to two's complement"; it maps a string of 0s and 1s of length $w$ to a signed integer. The range of integers that the bit vector $x$ can represent is $[-2^{w-1}, 2^{w-1} - 1]$.
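The B2T definition differs from the unsigned one only in the weight of the top bit. A small sketch, assuming the same array layout as before (`bits[0]` is the least significant bit) and a hypothetical helper name `b2t`:

```c
#include <stdint.h>

/* bits[i] holds x_i (0 or 1); bits[0] is the least significant bit.
 * Assumes 1 <= w <= 32; int64_t is wide enough for every result. */
int64_t b2t(const int *bits, int w) {
    int64_t value = 0;
    for (int i = 0; i < w - 1; i++) {
        if (bits[i]) {
            value += (int64_t)1 << i;        /* positive weight 2^i */
        }
    }
    if (bits[w - 1]) {
        value -= (int64_t)1 << (w - 1);      /* top bit has weight -2^(w-1) */
    }
    return value;
}
```

For $w = 3$, [111] evaluates to -1, [100] to -4 (the minimum), and [011] to 3 (the maximum), matching the stated range.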

IV. Conversions between signed and unsigned numbers

For a given bit vector $x$, the value obtained under unsigned encoding can be completely different from the value obtained under two's complement, yet the bit-level representation of $x$ itself has not changed. This raises a question: if a bit vector $x$ represents the unsigned integer value $u$ and we convert it to a signed integer $t$, what is $t$?

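Converting an unsigned number to a signed number (unsigned to two's complement, U2T) uses the bit vector $x$ as the intermediary, because $x$ does not change. Given an unsigned number $u$ whose bit vector is $x$, comparing the definitions of B2U and B2T gives:

$$U2T_w(u) = B2T_w(x) = B2U_w(x) - x_{w-1} 2^w = u - x_{w-1} 2^w$$

When $u < 2^{w-1}$ we have $x_{w-1} = 0$, and when $u \ge 2^{w-1}$ we have $x_{w-1} = 1$, so:

$$U2T_w(u) = \begin{cases} u, & u < 2^{w-1} \\ u - 2^w, & u \ge 2^{w-1} \end{cases}$$

Similarly, for a signed number $t$:

$$T2U_w(t) = \begin{cases} t + 2^w, & t < 0 \\ t, & t \ge 0 \end{cases}$$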
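The two conversions can also be written arithmetically rather than as casts, which makes the "add or subtract $2^w$" rule explicit. The sketch below fixes $w = 32$ and does the arithmetic in 64 bits; `u2t`, `t2u`, `TWO_W`, and `TWO_W1` are illustrative names.

```c
#include <stdint.h>

#define TWO_W  ((int64_t)1 << 32)   /* 2^w for w = 32 */
#define TWO_W1 ((int64_t)1 << 31)   /* 2^(w-1) */

/* Unsigned to two's complement: subtract 2^w when the top bit is set. */
int32_t u2t(uint32_t u) {
    int64_t v = (int64_t)u;
    if (v >= TWO_W1) {
        v -= TWO_W;
    }
    return (int32_t)v;   /* v is now within int32_t range */
}

/* Two's complement to unsigned: add 2^w to negative values. */
uint32_t t2u(int32_t t) {
    int64_t v = (int64_t)t;
    if (v < 0) {
        v += TWO_W;
    }
    return (uint32_t)v;
}
```

For example, `u2t(4294967295u)` is -1 and `t2u(-1)` is 4294967295, while values below $2^{31}$ pass through unchanged in both directions.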

In C, when signed and unsigned numbers are mixed in an expression, especially in comparisons and arithmetic, the signed operand is implicitly converted to unsigned, which can easily lead to errors or security vulnerabilities. The recommendation is therefore never to use unsigned types for numeric values; use unsigned numbers only as bit vectors.
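A classic instance of this pitfall is comparing -1 with an unsigned 0. The sketch below wraps the two comparisons in functions with made-up names so the results can be inspected; it assumes `int` and `unsigned int` have the same width, as on common platforms.

```c
/* Mixed comparison: s is converted to unsigned before comparing,
 * so -1 becomes the largest unsigned value. */
int less_than_mixed(int s, unsigned int u) {
    return s < u;
}

/* Same comparison with both operands signed. */
int less_than_signed(int s, int t) {
    return s < t;
}
```

`less_than_signed(-1, 0)` returns 1 as expected, but `less_than_mixed(-1, 0u)` returns 0, because -1 is first converted to the unsigned value $2^w - 1$.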

(One thing to note about truncating and extending integers: when a conversion changes both the size and the signedness, the size is changed first, then the signedness.)
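The order of the two steps is observable when a negative `short` is converted to `unsigned int`. The following sketch spells out both orders for contrast; the function names are invented here, and the comments assume a 16-bit `short` and a wider `int`.

```c
/* The direct conversion, as the compiler performs it. */
unsigned int direct(short sx) {
    return (unsigned int)sx;
}

/* Size first (sign-extend to int), then signedness: matches C. */
unsigned int size_then_sign(short sx) {
    return (unsigned int)(int)sx;
}

/* Signedness first, then size (zero-extend): NOT what C does. */
unsigned int sign_then_size(short sx) {
    return (unsigned int)(unsigned short)sx;
}
```

For `sx = -1`, `direct` and `size_then_sign` both yield an all-ones value, whereas `sign_then_size` yields 0xFFFF.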

V. Integer arithmetic (1): addition

Assume that $x$ and $y$ are both $w$-bit vectors. The most important feature of computer integer arithmetic is that it is modular.

For unsigned integers $0 \le x, y < 2^w$, the following formula is easy to obtain:

$$x +^u_w y = \begin{cases} x + y, & x + y < 2^w \\ x + y - 2^w, & 2^w \le x + y < 2^{w+1} \end{cases}$$

When a C program executes, overflow is not signaled as an error. So how do we know whether an overflow occurred? Let $s = x +^u_w y$. If an overflow occurred, then $s = x + y - 2^w$; since $y < 2^w$, it follows that $s < x$ (and, by symmetry, $s < y$). Conversely, if no overflow occurred, $s = x + y \ge x$. (Note: this test applies to unsigned arithmetic only.)
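The unsigned overflow test translates into a one-line check. A minimal sketch; the name `uadd_ok` is chosen here for illustration.

```c
/* Returns 1 if x + y fits in an unsigned int, 0 if it wrapped around. */
int uadd_ok(unsigned int x, unsigned int y) {
    unsigned int s = x + y;   /* computed mod 2^w; wraparound is well defined */
    return s >= x;            /* s < x can only happen after an overflow */
}
```

Unsigned wraparound is defined behavior in C, so it is safe to compute the sum first and inspect it afterwards.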

For two's-complement addition, the derivation is as follows (all two's-complement operations below are derived from the corresponding unsigned operation):

$$x +^t_w y = U2T_w\big(T2U_w(x) +^u_w T2U_w(y)\big) = U2T_w\big((x + y) \bmod 2^w\big)$$

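Let $z = (x + y) \bmod 2^w$. The sum $x + y$ falls into one of two ranges, $[-2^w, 0)$ or $[0, 2^w)$, and by the piecewise definition of U2T, $z$ also falls into one of two ranges, $[0, 2^{w-1})$ or $[2^{w-1}, 2^w)$, so:

$$x +^t_w y = \begin{cases} x + y - 2^w, & 2^{w-1} \le x + y & \text{(positive overflow)} \\ x + y, & -2^{w-1} \le x + y < 2^{w-1} & \text{(normal)} \\ x + y + 2^w, & x + y < -2^{w-1} & \text{(negative overflow)} \end{cases}$$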

When a C program executes, how do we know whether an overflow occurred? If x and y are both negative but the result is nonnegative (including 0), a negative overflow occurred; if x and y are both nonnegative but the result is negative, a positive overflow occurred.
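The sign-based test can be sketched as follows. Because signed overflow is undefined behavior in C, this version performs the addition on unsigned operands and converts back; that conversion is implementation-defined, and the sketch assumes the usual two's-complement wraparound. The name `tadd_ok` is illustrative.

```c
#include <limits.h>

/* Returns 1 if x + y can be computed without two's-complement overflow. */
int tadd_ok(int x, int y) {
    /* Add as unsigned to avoid signed-overflow undefined behavior;
     * assumes converting back to int wraps (two's-complement machines). */
    int s = (int)((unsigned int)x + (unsigned int)y);
    int neg_over = x < 0 && y < 0 && s >= 0;    /* negative overflow */
    int pos_over = x >= 0 && y >= 0 && s < 0;   /* positive overflow */
    return !neg_over && !pos_over;
}
```

For example, `tadd_ok(INT_MAX, 1)` and `tadd_ok(INT_MIN, -1)` both return 0, while operands of opposite sign can never overflow.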

Addition leads naturally to subtraction: x - y = x + (-y), so we often need the additive inverse of a value. For $x \ne -2^{w-1}$, the additive inverse is $-x$. For $x = -2^{w-1}$, the value $-x = 2^{w-1}$ cannot be represented by a $w$-bit vector, so its inverse is itself, because then $x + x = -2^w \equiv 0 \pmod{2^w}$:

$$-^t_w x = \begin{cases} -2^{w-1}, & x = -2^{w-1} \\ -x, & x > -2^{w-1} \end{cases}$$

So how is this computed at the bit level? That is, in two's complement, given $x$ and its $w$-bit vector, what is the bit vector $y$ that corresponds to the value $-x$? Following the two's-complement model, signed numbers can be handled through their unsigned counterparts (see section II). Let $u = T2U_w(x)$ be the unsigned value of the bit vector of $x$. If $x$ is in $[0, 2^{w-1} - 1]$, then $u = x$ and $-x$ is represented by $2^w - x = (2^w - 1) + 1 - u$, so $y$ = [11..11] + [00..01] - $u$. If $x$ is in $[-2^{w-1}, 0)$, then $u = x + 2^w$ and $-x = 2^w - (x + 2^w) = 2^w - u$, which as a $w$-bit vector is [00..00] - $u$. Since [11..11] + [00..01] and [00..00] are equal modulo $2^w$, both cases give the same result, and subtracting $u$ from [11..11] simply flips every bit. The rule is therefore: invert every bit of $x$, then add 1.
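The "invert and add 1" rule is easy to try out. The sketch below works on the unsigned bit pattern to stay clear of signed-overflow undefined behavior and assumes the conversion back to `int` wraps as on two's-complement machines; `negate_bits` is a made-up name.

```c
#include <limits.h>

/* Two's-complement negation at the bit level: flip every bit, add 1. */
int negate_bits(int x) {
    unsigned int u = (unsigned int)x;   /* same bit vector, viewed as unsigned */
    return (int)(~u + 1u);              /* assumes two's-complement conversion */
}
```

`negate_bits(5)` gives -5, and `negate_bits(INT_MIN)` gives `INT_MIN` again, which is the self-inverse case described above.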

VI. Integer arithmetic (2): multiplication

For unsigned multiplication: $x *^u_w y = (x \cdot y) \bmod 2^w$.

For two's-complement multiplication:

$$x *^t_w y = U2T_w\big((x \cdot y) \bmod 2^w\big)$$

Comparing the two, the low-order $w$ bits of the unsigned product and of the two's-complement product are identical.
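The claim that both kinds of multiplication share the same low-order bits can be checked by computing the full 64-bit products and truncating. A sketch for $w = 32$; `umul_low` and `tmul_low` are illustrative names.

```c
#include <stdint.h>

/* Low 32 bits of the unsigned product. */
uint32_t umul_low(uint32_t x, uint32_t y) {
    return (uint32_t)((uint64_t)x * (uint64_t)y);
}

/* Low 32 bits of the signed product, returned as a bit pattern. */
uint32_t tmul_low(int32_t x, int32_t y) {
    int64_t p = (int64_t)x * (int64_t)y;   /* exact: cannot overflow 64 bits */
    return (uint32_t)p;                    /* conversion reduces mod 2^32 */
}
```

Feeding the same bit patterns to both, for instance -3 and 5 versus 4294967293 and 5, produces the same 32-bit result.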

When compiling a multiplication by a constant factor, the compiler tries to replace the multiplication with shifts and additions. For x*k, the constant k is first written as alternating runs of 0s and 1s: [(0...0)(1...1)(0...0)...(1...1)]. For example, 14 can be written as [(0...0)(111)(0)]. Consider one run of consecutive 1s extending from bit position n down to bit position m (for 14, n=3 and m=1). Its contribution to the product can be computed in either of two ways:

1) (x<<n) + (x<<(n-1)) + ... + (x<<m);

2) (x<<(n+1)) - (x<<m);
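For k = 14 (n = 3, m = 1) the two forms can be written out directly. This sketch uses unsigned operands so that all shifts and wraparound are well defined; the function names are invented for illustration.

```c
#include <stdint.h>

/* Form 1: one shift per 1-bit of the constant: 14 = 8 + 4 + 2. */
uint32_t mul14_form1(uint32_t x) {
    return (x << 3) + (x << 2) + (x << 1);
}

/* Form 2: one shift past the run, minus the bottom of the run: 14 = 16 - 2. */
uint32_t mul14_form2(uint32_t x) {
    return (x << 4) - (x << 1);
}
```

Both agree with `x * 14u` for every 32-bit x, including inputs where the product wraps. Form 2 needs only two shifts and one subtraction however long the run of 1s is.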

About shifts: in C, what happens when a w-bit value is shifted by k bits with k >= w? The C standard does not define the result in this case (it is undefined behavior). So what if a 32-bit x needs to be shifted left by 32 bits? One option is (x<<31)<<1: split the shift amount k into several shifts that are each smaller than w.
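One way to apply this splitting idea to an arbitrary shift amount is to peel off chunks of 31 bits until the remainder is small enough. This is just a sketch for 32-bit unsigned values; `shl_wide` is a hypothetical helper.

```c
#include <stdint.h>

/* Left shift of a 32-bit value by any k >= 0, using only shift
 * amounts below the word size. */
uint32_t shl_wide(uint32_t x, unsigned int k) {
    while (k >= 31) {
        x <<= 31;     /* each step shifts by less than 32 bits */
        k -= 31;
    }
    return x << k;    /* here 0 <= k <= 30 */
}
```

With this helper, `shl_wide(1u, 32)` is 0, which is the result one would expect mathematically from shifting every bit out of the word.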

Division by a power of 2 can be replaced by a right shift, but an arithmetic right shift rounds down; for example, shifting -7 right by 1 gives -4. C, however, requires integer division to round toward 0, so -7/2 must equal -3. How can this be achieved? The value can be biased before shifting: (x<0 ? (x + (1<<k) - 1) : x) >> k. Unlike multiplication, this technique of replacing division with a right shift does not generalize to division by arbitrary constants.
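The biasing expression can be wrapped in a function and compared against C's `/` operator. The sketch assumes that `>>` on a negative `int` is an arithmetic shift, which is implementation-defined in C but is what mainstream compilers do, and that 0 <= k < 31; `div_pow2` is a made-up name.

```c
/* Computes x / 2^k with rounding toward zero, using a shift.
 * Assumes arithmetic right shift for negative values and 0 <= k < 31. */
int div_pow2(int x, int k) {
    int bias = (x < 0) ? ((1 << k) - 1) : 0;   /* pushes negatives up before shifting */
    return (x + bias) >> k;
}
```

For x = -7, k = 1: the bias is 1, so (-7 + 1) >> 1 = -3, matching -7/2 in C, whereas the unbiased shift would give -4.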

VII. Floating-point representation

A floating-point representation encodes rational numbers of the form $v = x \times 2^y$. It is useful for computations involving very large numbers ($|v| \gg 0$), numbers very close to 0 ($|v| \ll 1$), and more generally as an approximation to real arithmetic. Binary fractional notation can exactly represent only those numbers that can be written as $x \times 2^y$; all other values can only be approximated.

The IEEE floating-point standard represents a number in the form $v = (-1)^s \times M \times 2^E$:

where $s$ is the sign bit (sign); the significand $M$ (also called the mantissa) is a binary fraction in the range $[1, 2)$ or $[0, 1)$; and the exponent $E$ weights the value by a power of 2. In the bit layout, the highest bit is the sign bit, the middle field is the exponent field (exp), and the last field is the fraction field (frac).

Floating-point values fall into three cases: 1. normalized values, 2. denormalized values, 3. special values.

Case 1: Normalized values

This case applies when the bit pattern of the exponent field (exp) is neither all 0s nor all 1s. The exponent field is then interpreted as a signed integer in biased form: $E = e - Bias$, where $e$ is the unsigned number represented by the exponent field and $Bias = 2^{k-1} - 1$. For $k = 8$ (the number of exponent bits), $E$ therefore ranges from -126 to 127 (because $e$ is neither 0 nor 255).

The fraction field $f$ satisfies $0 \le f < 1$, with the binary point to the left of the most significant bit of the fraction field, and the significand is $M = 1 + f$.

Case 2: Denormalized values

When the exponent field is all 0s, the value is denormalized. In this case the exponent value is $E = 1 - Bias$ and the significand is $M = f$. Denormalized numbers serve two purposes: 1) they provide a way to represent the value 0 (the IEEE standard has both +0 and -0); 2) they represent numbers very close to 0, providing a property called gradual underflow.

Case 3: Special values

Here the exponent field is all 1s. When the fraction field is all 0s, the resulting value is infinity (positive or negative, depending on the sign bit); when the fraction field is nonzero, the resulting value is called "NaN" (not a number).
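The three cases can be told apart by extracting the fields of a single-precision value. The sketch below assumes `float` is a 32-bit IEEE single (1 sign bit, k = 8 exponent bits, n = 23 fraction bits); the type and function names are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;   /* 1 bit */
    uint32_t exp;    /* 8 bits, biased exponent e */
    uint32_t frac;   /* 23 fraction bits */
} float_fields;

/* Build a float from a raw 32-bit pattern. */
float float_from_bits(uint32_t bits) {
    float v;
    memcpy(&v, &bits, sizeof v);   /* copy the bits, no numeric conversion */
    return v;
}

/* Split a float into its three bit fields. */
float_fields split_float(float v) {
    uint32_t bits;
    float_fields f;
    memcpy(&bits, &v, sizeof bits);
    f.sign = bits >> 31;
    f.exp = (bits >> 23) & 0xFFu;
    f.frac = bits & 0x7FFFFFu;
    return f;
}

/* Returns 1, 2 or 3, following the case numbering in the text. */
int float_case(float v) {
    float_fields f = split_float(v);
    if (f.exp == 0) return 2;       /* denormalized, including +0 and -0 */
    if (f.exp == 0xFFu) return 3;   /* infinity or NaN */
    return 1;                       /* normalized */
}
```

As a usage example, -0.15625 is $-1.25 \times 2^{-3}$, so its fields are sign = 1, exp = 127 - 3 = 124, and frac = 0x200000 (the binary fraction .01).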

1. Note that the exponent value of a denormalized number is $E = 1 - Bias$ rather than $-Bias$. Taking single precision as an example, the smallest normalized number has $E = -126$ and $M = 1.0$, while the largest denormalized number has $E = -126$ (which is why $E$ is defined this way for denormalized numbers) and $M = [0.11..11]_2$. It is this definition of $E$ for denormalized numbers that gives a smooth transition from the largest denormalized number to the smallest normalized number.

2. If the bit patterns of nonnegative floating-point numbers are interpreted as unsigned integers, they sort in the same ascending order as the floating-point values they represent. This is not accidental: the IEEE format was designed so that floating-point numbers can be sorted with integer sorting routines. Negative numbers, of course, need extra handling.
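The ordering property in note 2 can be observed by comparing floats and their bit patterns side by side. A sketch assuming a 32-bit IEEE `float`; `float_bits` and `same_order` are hypothetical helpers.

```c
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float, as an unsigned integer. */
uint32_t float_bits(float v) {
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* 1 if comparing the bit patterns as unsigned integers gives the
 * same answer as comparing the floats themselves. */
int same_order(float a, float b) {
    return (a < b) == (float_bits(a) < float_bits(b));
}
```

For nonnegative inputs `same_order` holds across denormalized and normalized values alike. For negative numbers it breaks down: the sign bit makes the pattern of -1.0 larger than that of 1.0, which is the extra handling the note refers to.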

A few conclusions (k exponent bits, n fraction bits):

1. The smallest positive denormalized value (excluding 0): the least significant bit of the fraction is 1 and all other bits are 0, with $M = f = 2^{-n}$ and $E = 2 - 2^{k-1}$;

2. The largest denormalized value: the exponent bits are all 0 and the fraction bits are all 1, with $M = f = 1 - 2^{-n}$ and $E = 2 - 2^{k-1}$;

3. The smallest positive normalized value: the least significant bit of the exponent field is 1, the remaining exponent bits are 0, and the fraction bits are all 0, with $M = 1$ and $E = 2 - 2^{k-1}$;

4. The value 1.0: the most significant bit of the exponent field is 0, the remaining exponent bits are 1, and the fraction bits are all 0, with $M = 1$ and $E = 0$;

5. The largest normalized value: the least significant bit of the exponent field is 0, the remaining exponent bits are 1, and the fraction bits are all 1, with $M = 2 - 2^{-n}$ and $E = 2^{k-1} - 1$.
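For single precision (k = 8, n = 23) these five bit patterns can be assembled by hand and compared with the values the formulas predict. The sketch assumes `float` is a 32-bit IEEE single; `float_from_fields` is a made-up helper.

```c
#include <stdint.h>
#include <string.h>

/* Assemble a float from sign (1 bit), exp (8 bits), frac (23 bits). */
float float_from_fields(uint32_t sign, uint32_t exp, uint32_t frac) {
    uint32_t bits = (sign << 31) | (exp << 23) | frac;
    float v;
    memcpy(&v, &bits, sizeof v);   /* copy the bits, no numeric conversion */
    return v;
}
```

With k = 8 and n = 23 the formulas give: `float_from_fields(0, 0, 1)` is $2^{-23} \times 2^{-126} = 2^{-149}$; `float_from_fields(0, 0, 0x7FFFFF)` is $(1 - 2^{-23}) \times 2^{-126}$; `float_from_fields(0, 1, 0)` is $2^{-126}$; `float_from_fields(0, 127, 0)` is 1.0; and `float_from_fields(0, 254, 0x7FFFFF)` is $(2 - 2^{-23}) \times 2^{127}$.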

VIII. Some notes on floating-point numbers

Rounding: because floating-point numbers can only approximate real arithmetic, we want a systematic way to find the "closest" matching value. IEEE defines four rounding modes: round-to-even, round-toward-zero, round-up, and round-down.

Floating-point addition is not associative. In single precision, (3.14+1e10)-1e10 gives 0.0, while 3.14+(1e10-1e10) gives 3.14. For scientific programmers, the lack of associativity and distributivity is a serious problem; even a seemingly simple task such as writing code to determine whether two lines intersect in three-dimensional space can be a major challenge.
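The associativity example can be reproduced with `float` operands. In this sketch the intermediate sums go through `volatile` variables so that each one is really rounded to single precision, even on platforms that evaluate expressions in wider registers; the function names are invented here.

```c
/* (a + b) + c, with the intermediate sum rounded to float. */
float sum_left(float a, float b, float c) {
    volatile float t = a + b;
    return t + c;
}

/* a + (b + c), with the intermediate sum rounded to float. */
float sum_right(float a, float b, float c) {
    volatile float t = b + c;
    return a + t;
}
```

Near 1e10 adjacent single-precision values are 1024 apart, so 3.14 + 1e10 rounds back to 1e10 and `sum_left(3.14f, 1e10f, -1e10f)` is 0.0, while `sum_right` with the same arguments keeps the 3.14.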

Converting large floating-point numbers to integers is a common source of program errors. In C, the conversion rounds toward 0, and the behavior is undefined when the truncated value does not fit in the integer type.
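A defensive conversion can check the range before casting. This is one possible sketch, assuming a 32-bit `int` (so that the bounds are exactly representable as `double`); `double_to_int_checked` is a hypothetical helper, not a standard function.

```c
#include <limits.h>

/* Stores (int)v in *out and returns 1 if the truncated value fits in
 * an int; returns 0 (leaving *out untouched) otherwise, including NaN. */
int double_to_int_checked(double v, int *out) {
    /* Any v strictly between INT_MIN - 1 and INT_MAX + 1 truncates
     * into range; comparisons with NaN are false, so NaN is rejected. */
    if (!(v > (double)INT_MIN - 1.0 && v < (double)INT_MAX + 1.0)) {
        return 0;
    }
    *out = (int)v;   /* rounds toward zero */
    return 1;
}
```

For in-range inputs the cast simply drops the fractional part: 2.9 becomes 2 and -2.9 becomes -2. A value such as 1e10 is rejected instead of triggering undefined behavior.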
