1. In the GB encoding, a Chinese character occupies 2 bytes, and the highest bit of the high byte is 1, that is, the high byte is greater than 127. To test this, the byte usually has to be converted to unsigned char first; otherwise, where char is signed, the byte is negative and the comparison never succeeds. For example:
while (*p)
{
    if ((unsigned char)*p > 127)   // first byte of a Chinese character
    {
        p += 2;
    }
    else                           // standard ASCII character
    {
        p += 1;
    }
}
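The loop above can be wrapped into a small, self-contained function. The following is only a sketch: the name count_gb_chars is hypothetical (it does not come from the original text), and the extra check on the second byte is an added assumption so that a truncated string cannot push the pointer past the terminator.

```c
#include <stddef.h>

/* Count the characters in a NUL-terminated string that mixes ASCII with
   two-byte GB-encoded characters. count_gb_chars is a hypothetical helper. */
size_t count_gb_chars(const char *s)
{
    /* Work on unsigned bytes so that values above 127 compare correctly. */
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;

    while (*p) {
        if (*p > 127 && p[1] != '\0') {
            p += 2;   /* high byte plus its following byte */
        } else {
            p += 1;   /* ASCII, or a dangling high byte at the very end */
        }
        ++count;
    }
    return count;
}
```

For instance, a string holding 'a', the two bytes 0xD6 0xD0, and 'b' would be counted as three characters.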
2. What is the difference between char and unsigned char? When should I use char, and when unsigned char?
A: Both are essentially eight bits, that is, one byte. char (where it is signed) treats the highest bit as the sign bit, while unsigned char uses all eight bits for the magnitude, so their ranges differ: -128 to 127 versus 0 to 255.
When should I use unsigned char? If you want to perform bitwise operations on these eight bits, use unsigned char. If you use char to hold the 8 bits, the highest bit is treated as the sign bit, so a value greater than 127 does not fit: it turns negative, and bitwise operations such as a right shift give unexpected results.
Beware of one trap: with unsigned char x = 255, after x++ the value of x wraps around to 0.
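A minimal sketch of both points, assuming 8-bit bytes. The function names are hypothetical and chosen only for illustration. Note that right-shifting a negative signed value is implementation-defined in C; most compilers perform an arithmetic shift that copies the sign bit.

```c
/* Shift a byte right by one bit, treating all eight bits as data. */
unsigned char shr1_unsigned(unsigned char b)
{
    return (unsigned char)(b >> 1);
}

/* The same shift on a signed char. For negative inputs the result is
   implementation-defined (usually the sign bit is copied in). */
int shr1_signed(signed char b)
{
    return b >> 1;
}

/* Incrementing an unsigned char wraps around modulo 256. */
unsigned char increment_wraps(unsigned char x)
{
    x++;
    return x;
}
```

With these helpers, shr1_unsigned(0x80) yields 0x40 and increment_wraps(255) yields 0, while shr1_signed applied to the bit pattern 0x80 typically yields -64 rather than 64.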
char is signed (on most compilers), and unsigned char is unsigned.
If both are used to hold characters, there is no difference, but there is a difference when they are used as integers:
The integer range of char is -128 to 127 (0x80 to 0x7F).
The integer range of unsigned char is 0 to 255 (0x00 to 0xFF).
In most cases, data of the char, signed char, and unsigned char types behave the same way. The differences show up when a single-byte value is assigned to a wider integer type: a signed char is sign-extended, while an unsigned char is zero-extended. Another difference is that when a number between 128 and 255 is assigned to a signed char variable, the compiler must first convert the value, and it may also issue a warning. Assigning values in hexadecimal notation is therefore easier with unsigned char. Depending on the compiler implementation, plain char has the same range and representation as either signed char or unsigned char.
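The sketch below illustrates the paragraph above. The function names are hypothetical. It assumes a two's-complement machine, where converting 0x80 to signed char gives -128 (before C23 that conversion is implementation-defined). CHAR_MIN from <limits.h> is the standard way to find out which behaviour plain char has on a given compiler.

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this compiler, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widen a byte to int through signed char: the sign bit is extended. */
int widen_signed(unsigned char byte)
{
    return (signed char)byte;
}

/* Widen a byte to int through unsigned char: the upper bits become zero. */
int widen_unsigned(unsigned char byte)
{
    return byte;
}
```

Under these assumptions, widen_signed(0x80) returns -128 and widen_unsigned(0x80) returns 128; for bytes up to 0x7F both functions agree.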
3. The same memory: 10010000
Read through a char * (with char signed), it is interpreted as -112.
Read through an unsigned char *, it is interpreted as 144.
Similarly, if the memory content is assigned to an integer variable, the unsigned char value still yields 144, while the char value is sign-extended and remains negative (-112).
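As a rough illustration of reading the same byte through two pointer types, the sketch below uses signed char explicitly so that the result does not depend on how the compiler treats plain char. The function names are hypothetical, and a two's-complement representation is assumed.

```c
/* Interpret the byte at mem as a signed 8-bit value, widened to int. */
int read_as_signed(const void *mem)
{
    return *(const signed char *)mem;
}

/* Interpret the same byte as an unsigned 8-bit value, widened to int. */
int read_as_unsigned(const void *mem)
{
    return *(const unsigned char *)mem;
}
```

Given a byte holding 0x90 (binary 10010000), read_as_signed returns -112 and read_as_unsigned returns 144.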
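4.
#include <stdio.h>

int main()
{
    char x = 0x80;            // -128 where char is signed
    unsigned char y = 0x80;   // 128
    unsigned char z[] = "hello";
    // The casts to unsigned match the %X conversion; the values are promoted to int first.
    printf("x = %d, HEX = %2X, (x>>1) = %d, HEX = %2X.\n", x, (unsigned)x, x >> 1, (unsigned)(x >> 1));
    printf("y = %d, HEX = %2X, (y>>1) = %d, HEX = %2X.\n", y, (unsigned)y, y >> 1, (unsigned)(y >> 1));
    printf("%s = ", (char *)z);
    for (x = 0; x < 4; ++x) printf("%02X ", z[x]);   // first four bytes in hex
    printf("\n");
    return 0;
}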
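5.
#include <iostream>
using namespace std;

int main()
{
    // The literal is presumably meant to contain GB-encoded Chinese text;
    // with ASCII bytes only, the loop below prints nothing.
    unsigned char sz[] = "012254 my future";
    int nhz;
    sz[3] = 100;
    char szchinese[3] = "\0";
    for (int i = 0; sz[i] != '\0'; i++)
    {
        // High byte of a two-byte character; also make sure a second byte exists.
        if (sz[i] > 0x80 && sz[i + 1] != '\0')
        {
            szchinese[0] = sz[i];
            szchinese[1] = sz[i + 1];
            i++;   // skip the second byte
            cout << szchinese << endl;
        }
        // nhz = sz[i];
    }
    return 0;
}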