Why does the signed char type range from -128 to 127?


In C, the signed char type ranges from -128 to 127. Every textbook says so, but no book (and hardly any teacher) tells you why it is -128 to 127. The question looks so easy that nobody bothers to think about it. There is, after all, a formula for the range of a signed integer type: -2^(n-1) to 2^(n-1) - 1, where n is the number of bits the type occupies in memory, so a 32-bit int ranges from -(2^31) to 2^31 - 1, that is, -2147483648 to 2147483647. But why is the absolute value of the most negative number always one more than the largest positive number? Even programmers with a few years of experience are vague about this, because they never thought it through and only remember what the book says. So let's dig into a question that most people ignore.
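As a quick sanity check, here is a minimal C program (assuming the usual 8-bit char and 32-bit int) that prints these ranges straight from <limits.h>:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* For an n-bit signed type the range is -2^(n-1) .. 2^(n-1) - 1. */
    printf("signed char: %d .. %d\n", SCHAR_MIN, SCHAR_MAX); /* -128 .. 127 */
    printf("int:         %d .. %d\n", INT_MIN, INT_MAX);     /* -2147483648 .. 2147483647 */
    return 0;
}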

For unsigned integers it is simple: all the bits represent the value. An unsigned char has 8 bits, so in binary it runs from 0000 0000 to 1111 1111, and 1111 1111 is 255 in decimal, so the range of unsigned char is 0 to 255. As a quick refresher on binary-to-decimal conversion: multiply each bit by its place value 2^(n-1), where n is the bit's position counting from the right, and add up the products. For example: 1111 1111 = 1*2^7 + 1*2^6 + 1*2^5 + 1*2^4 + 1*2^3 + 1*2^2 + 1*2^1 + 1*2^0 = 255.
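The same weighted sum can be spelled out as a short C sketch; the loop below just adds bit_i * 2^i for each of the eight bits of an unsigned char:

#include <stdio.h>

int main(void)
{
    unsigned char x = 0xFF;                  /* binary 1111 1111 */
    unsigned int sum = 0;

    for (int i = 0; i < 8; i++)              /* add bit_i * 2^i for each position */
        sum += ((x >> i) & 1u) * (1u << i);

    printf("%u\n", sum);                     /* prints 255 */
    return 0;
}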


For signed integers, however, the highest bit of the binary pattern indicates the sign rather than part of the value: 0 means positive, 1 means negative. That leaves (n-1) bits for the magnitude. For example, with char a = -1; the sign-magnitude representation is 1000 0001, while +1 is 0000 0001. So for signed char, once the sign bit is set aside, the remaining 7 bits can be at most 111 1111 = 127; putting the sign back, 0111 1111 = +127 and 1111 1111 = -127, so the range apparently ought to be -127 to 127. The same reasoning applies to int. But here is the problem: the textbooks say the range is -128 to 127. Let's take a look at where that extra value comes from.
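To make the sign-magnitude reading concrete, here is a small sketch that decodes an 8-bit pattern by hand (this is only the interpretation described above, not how C actually stores negative numbers):

#include <stdio.h>

/* Read an 8-bit pattern as sign-magnitude: top bit is the sign, low 7 bits the magnitude. */
int sign_magnitude_value(unsigned char bits)
{
    int magnitude = bits & 0x7F;                 /* low 7 bits: 0 .. 127 */
    return (bits & 0x80) ? -magnitude : magnitude;
}

int main(void)
{
    printf("%d\n", sign_magnitude_value(0x81));  /* 1000 0001 -> -1   */
    printf("%d\n", sign_magnitude_value(0x7F));  /* 0111 1111 -> 127  */
    printf("%d\n", sign_magnitude_value(0xFF));  /* 1111 1111 -> -127 */
    return 0;
}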


First, a quick recap of how computers store integers internally. Computers store numbers in binary. For unsigned integers, every bit is part of the value; for signed integers, the highest bit is the sign bit and the remaining bits hold the value.


This sign-magnitude scheme seems reasonable, but it causes a lot of trouble. Adding 1 + 1 works fine:

0000 0001

+ 0000 0001

---------

0000 0010 .................. 2


What about 1 - 1? Since the hardware only adds and never subtracts, it converts this to 1 + (-1):

0000 0001

+ 1000 0001

____________________

1000 0010 ............... -2


1 - 1 = -2? That is obviously wrong. To avoid getting subtraction wrong, the computer pioneers introduced the ones' complement (the "inverse code"). The representation that simply uses the highest bit as the sign, as above, is called sign-magnitude (the "original code"); the binary patterns shown so far are all sign-magnitude. In ones' complement, a positive number is the same as its sign-magnitude form, while a negative number keeps the sign bit and inverts all the other bits. So -1 in sign-magnitude is 1000 0001, and its ones' complement is 1111 1110.


Now let's use the ones' complement to compute 1 + (-1):

0000 0001

+ 1111 1110

--------

1111 1111 ............ in sign-magnitude terms this pattern is 1000 0000, i.e. -0
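Here is a small sketch that simulates this ones' complement scheme (again, not what C does natively): negative values keep the sign bit and invert the other seven bits, and adding the raw patterns for 1 and -1 indeed gives 1111 1111, the pattern for -0:

#include <stdio.h>

/* Encode a value in -127 .. 127 as an 8-bit ones' complement pattern. */
unsigned char ones_complement_encode(int v)
{
    if (v >= 0)
        return (unsigned char)v;                    /* positive: same as sign-magnitude  */
    return (unsigned char)(0x80 | (~(-v) & 0x7F));  /* negative: sign bit + inverted bits */
}

int main(void)
{
    unsigned char a = ones_complement_encode(1);    /* 0000 0001 */
    unsigned char b = ones_complement_encode(-1);   /* 1111 1110 */
    unsigned char sum = (unsigned char)(a + b);
    printf("%02X\n", sum);                          /* FF, i.e. 1111 1111 = -0 */
    return 0;
}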


Ones' complement fixes subtraction, but it creates another problem: -0. Since 0000 0000 already means 0, there is no need for a -0; +0 and -0 are both just 0, and one zero is enough. To get rid of the two zeros, the designers invented the two's complement (the "complement code"), defined as follows: for a positive number (and zero), the two's complement is the same as its sign-magnitude form; for a negative number, the two's complement is its ones' complement plus one. So converting a negative number takes two steps: first take the ones' complement, then add one.


With this rule, the two's complement of -1 is 1111 1111, so 1 + (-1) becomes:

0000 0001

+ 1111 1111

________________

0000 0000 ........................ the full sum is 1 0000 0000, but char is 8 bits, so the carry out of the highest bit is discarded, leaving 0; the result is correct
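This is exactly what a real C implementation does: negative integers are stored in two's complement (universal in practice, and required by the newest C standard), so the addition wraps to zero by itself. A quick check, assuming an 8-bit char:

#include <stdio.h>

int main(void)
{
    signed char one = 1, minus_one = -1;

    printf("bits of -1: %02X\n", (unsigned char)minus_one);  /* FF = 1111 1111 */
    printf("1 + (-1) = %d\n", one + minus_one);              /* 0 */
    return 0;
}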

 

-0: its sign-magnitude code is 1000 0000; the ones' complement is 1111 1111, and adding one gives 1 0000 0000. Because char is eight bits, only the low eight bits, 0000 0000, are kept.
+0: its sign-magnitude code is 0000 0000, and its two's complement is also 0000 0000. So the two zeros of sign-magnitude collapse into the single pattern 0000 0000 in two's complement, which is fitting, since 0 is neither positive nor negative.


So the sign-magnitude codes of signed char cover the numbers from -127 to 127, yet one code, 1000 0000, goes unused. A quick count with permutations confirms this: the patterns 0??????? can represent 2^7 = 128 numbers, exactly 0 to 127, and the patterns 1??????? give another 128, for 256 patterns in total. But -127 to 127 is only 255 values; the extra pattern is the second zero.


Now look at the leftover pattern 1000 0000. Every number from -127 through 127 already has a sign-magnitude code, so what does 1000 0000 mean? -128, of course. But why -128? Some people online say that since -0 is useless, 1000 0000 is simply declared to equal -128; or, put the other way, that -128 has no sign-magnitude code of its own, so the spare -0 pattern is assigned to it. I am not fully convinced by that on its own; the answer should not be so arbitrary, because the sign-magnitude pattern 1000 0000 and the number -128 really are different things.


But why can that pattern be used to compute with -128? Drop the restriction to char (that is, to 8 bits) for a moment and write out the sign-magnitude code of -128: 1 1000 0000, nine bits with the highest bit as the sign. Its ones' complement is 1 0111 1111, and adding one gives the two's complement 1 1000 0000, the same as its sign-magnitude form. Is 1 1000 0000 the same as 1000 0000? Obviously not: the code of -128 is different from the code of -0 (1000 0000). The key is that char has only 8 bits, so storing -128 discards its highest (ninth) sign bit. After that truncation, the remaining eight bits of -128 are identical to those of -0, namely 1000 0000. So the otherwise useless -0 pattern can stand in for the truncated -128, and since the truncation does not affect the result of any operation that stays within the char range (-127 to 127 plus this one extra value), we are entitled to say that 1000 0000 simply is -128.
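This is easy to verify, assuming an 8-bit, two's-complement signed char (the cast below relies on the usual implementation-defined conversion):

#include <stdio.h>

int main(void)
{
    signed char a = (signed char)0x80;   /* store the pattern 1000 0000 */
    printf("%d\n", a);                   /* prints -128 */
    return 0;
}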


For example, -128 + (-1):

1000 0000 ------------------ -128, with its highest (ninth) bit already discarded

+ 1111 1111 ------------------ -1

________________

1 0111 1111 ------------------ char keeps only the low eight bits, 0111 1111 = 127. The result is wrong, but that does not matter: the true result, -129, is outside the char range and cannot be represented anyway.
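The wrap-around can be reproduced like this (a sketch only; the arithmetic is done on the raw 8-bit patterns because signed overflow itself is not well-defined in C):

#include <stdio.h>

int main(void)
{
    unsigned char a = 0x80;                      /* the 8-bit pattern of -128 */
    unsigned char b = 0xFF;                      /* the 8-bit pattern of -1   */
    unsigned char sum = (unsigned char)(a + b);  /* 1 0111 1111 -> low 8 bits 0111 1111 */
    printf("%d\n", (signed char)sum);            /* prints 127, not -129 */
    return 0;
}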


For example, -128 + 127:

1000 0000

+ 0111 1111

--------

1111 1111 --------------- -1, which is correct. That is why 1000 0000 can represent -128.


That is why the range of signed char is -128 to 127 rather than -127 to 127. The same holds for short int, whose range is -32768 to 32767: within 16 bits, the sign-magnitude code of -32768 would need 17 bits, and once the highest bit is discarded the remaining 16 bits coincide with those of -0.
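The same check one size up, assuming a 16-bit, two's-complement short (again relying on the usual implementation-defined conversion for the cast):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    short a = (short)0x8000;                   /* the pattern 1000 0000 0000 0000 */
    printf("%d\n", a);                         /* -32768 */
    printf("%d .. %d\n", SHRT_MIN, SHRT_MAX);  /* -32768 .. 32767 */
    return 0;
}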


One more question: if the highest bit of -128 is dropped in storage, why does printing it still give -128?

char a = -128;    // in memory this is the two's complement 1 1000 0000, but only the low eight bits 1000 0000 are stored, because char is 8 bits
printf("%d", a);  // the highest bit was discarded, so you might expect the sign-magnitude value of 1000 0000, i.e. -0, to be printed; yet the output is -128. Why?

My guess is that this is an internal convention of the machine, much like float, where 23 stored bits give 24 bits of precision because the highest bit of the mantissa is implicitly 1: it is dropped from storage and added back when the value is used.


-128 works on the same principle: when the data bus fetches 1000 0000 from memory, the CPU restores the missing highest bit, turning it back into 1 1000 0000, which can then be converted and printed as -128; otherwise, how could 1000 0000 alone ever come out as -128? Of course, this is only my inference; you would have to ask the CPU designers how it is actually implemented.
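In standard C terms, the mechanism being guessed at here is sign extension: when a signed char is widened to int (which is what happens when it is passed to printf), the sign bit is copied into the new upper bits, so the value stays -128. A small sketch, assuming a 32-bit int:

#include <stdio.h>

int main(void)
{
    signed char a = -128;                      /* stored as 1000 0000 */
    int widened = a;                           /* sign-extended: 0xFFFFFF80 on a 32-bit int */
    printf("%d\n", widened);                   /* -128 */
    printf("%08X\n", (unsigned int)widened);   /* FFFFFF80 */
    return 0;
}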


Let's look at another example:

char a = -129; printf("%d", a); // what will be printed? The result is 127. Why?
The two's complement of -129 is 1 0111 1111; only the low eight bits are stored, i.e. 0111 1111, which is exactly 127. Similarly, -130 truncates to 126, and so on.


We will not go into the formal rules for such out-of-range conversions here.


So:

unsigned char a = -1; if (1 > a) { printf("greater than"); } else { printf("less than"); }

What gets printed? Surprisingly, it is "less than", not "greater than". Why? It comes down to how the value is stored:

a is unsigned, so all eight of its bits store the value and there is no sign bit. The compiler converts -1 into the two's complement pattern 1111 1111, but because the type is unsigned, the computer reads 1111 1111 as an unsigned value, which is 2^8 - 1 = 255. The comparison is therefore equivalent to if (1 > 255), so of course it prints "less than".
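The same example as a complete program, assuming an 8-bit unsigned char; printing a first makes the converted value visible:

#include <stdio.h>

int main(void)
{
    unsigned char a = -1;            /* -1 converted to unsigned char is 255 */
    printf("a = %d\n", a);           /* 255 */
    if (1 > a)
        printf("greater than\n");
    else
        printf("less than\n");       /* this branch runs: 1 > 255 is false */
    return 0;
}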

