Example: return the number of Chinese characters in the input string:
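int Getchinesecharactercount (char *pstr)
{
    int retcnt = 0;
    int i = 0;
    while (pstr[i] != 0)
    {
        /* A byte with the highest bit set starts a Chinese character. */
        if (pstr[i] & 0x80)
        {
            retcnt++;
            /* A Chinese character is two bytes, so skip the second byte
               (unless the string ends right after the first one). */
            if (pstr[i + 1] != 0)
                i++;
        }
        i++;
    }
    return retcnt;
}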
The following is collected from:
http://blog.163.com/[email protected]/blog/static/791554782011523103550237/
ord($str) & 0x80 is used to detect Chinese characters. 0x80 corresponds to the binary code 1000 0000; when the highest bit of a byte is 1, the byte belongs to a Chinese character. This Chinese character encoding format is commonly known as the 10 format.
A Chinese character occupies 2 bytes but represents only one character.
"In Windows, the Simplified Chinese character set is encoded with a mix of 1-byte and 2-byte characters. When the high byte is 0x00~0x7F, the character takes one byte; when the high byte is 0x80 or above, it takes 2 bytes."
Note: the values in parentheses below are all binary.
When you find that the value of a byte is greater than 0x7F, it must be part of a Chinese character (pieced together with another byte). How do we determine that it is greater than 0x7F?
The number after 0x7F (1111111) is 0x80 (10000000), so for a byte to be greater than 0x7F its highest bit must be 1. We only need to check whether the highest bit is 1.
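The high-bit check can be put to work when walking a string. The following is only a minimal sketch in C, assuming a well-formed double-byte encoding such as GBK; the function name count_mbcs_chars is hypothetical and does not come from the original post. It counts characters rather than bytes, treating every byte with the highest bit set as the first half of a two-byte character.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters (not bytes) in a string where
 * bytes 0x00-0x7F are single-byte characters and a byte with the highest
 * bit set starts a two-byte character (assumed GBK-like encoding). */
size_t count_mbcs_chars(const char *s)
{
    size_t count = 0;
    size_t i = 0;

    while (s[i] != '\0') {
        if (((unsigned char)s[i] & 0x80) && s[i + 1] != '\0')
            i += 2;   /* first byte plus second byte form one character */
        else
            i += 1;   /* ASCII byte, or a dangling first byte at the end */
        count++;
    }
    return count;
}
```

For instance, a string made of the letter a, one two-byte Chinese character, and the letter b is four bytes long but is counted as three characters.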
Method:
Bitwise AND (a result bit is 1 only when both corresponding bits are 1, otherwise it is 0):
For example, to check whether the third bit of a number is 1, AND the number with 4 (100); to check whether the second bit is 1, AND it with 2 (10).
In the same vein, to check whether the eighth bit is 1, AND it with (10000000), which is 0x80.
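The masks above follow one pattern: bit n (counting from 1 at the lowest position) is tested with the value 1 shifted left by n - 1. A small C sketch of that idea, where is_bit_set is a hypothetical name chosen for illustration and n is assumed to be between 1 and 8:

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when bit number n of value is 1.
 * Bits are numbered from 1 at the lowest position, as in the text,
 * so n = 3 uses mask 4 (100) and n = 8 uses mask 0x80 (10000000).
 * Assumes 1 <= n <= 8. */
bool is_bit_set(unsigned char value, int n)
{
    unsigned char mask = (unsigned char)(1u << (n - 1));
    return (value & mask) != 0;
}
```

With this helper, is_bit_set(byte, 8) is the same test as byte & 0x80.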
Why not use > 0x7F here? It may work in PHP, but in other strongly typed languages the highest bit of a one-byte value is used to indicate a negative number, and a negative number can never be greater than 0x7F (the largest value a signed byte can hold).
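The signedness problem can be seen in a short C sketch. Both function names are hypothetical, and the sketch assumes the usual two's-complement platforms, where a byte such as 0xD6 stored in a signed char becomes a negative number.

```c
#include <stdbool.h>

/* Comparison on a signed byte: a byte with the highest bit set is
 * negative here, so this never returns true, not even for the first
 * byte of a Chinese character. */
bool high_bit_by_compare(signed char c)
{
    return c > 0x7F;
}

/* Bit test: checks the highest bit directly, so the sign of the
 * byte does not matter. */
bool high_bit_by_mask(signed char c)
{
    return (c & 0x80) != 0;
}
```

Some compilers warn that the comparison in the first function is always false, which is exactly the point being made.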
Let me give you an example:
a's ASCII code is 97 (1100001)
A's ASCII code is 65 (1000001)
b's ASCII code is 98 (1100010)
B's ASCII code is 66 (1000010)
Notice the rule: for the letters a-z and A-Z, the sixth bit of a lowercase letter is always 1, so we can use it to determine the case:
Just AND the letter with 0x20 (100000) and test the result:
if (ord($a) & 0x20) {
    // lowercase
}
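The same test can be written in C. This is just a sketch; letter_is_lowercase is a hypothetical name, and the caller is assumed to pass an ASCII letter, since the bit means nothing for digits or punctuation.

```c
#include <stdbool.h>

/* Hypothetical helper: for an ASCII letter (A-Z or a-z), bit 0x20 is
 * set for lowercase and clear for uppercase. The result is meaningless
 * for non-letters, so the caller is assumed to pass a letter. */
bool letter_is_lowercase(char letter)
{
    return (letter & 0x20) != 0;
}
```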
How do we change letters to uppercase? Just change the sixth bit from 1 to 0:
$a = 'a';
$a = chr(ord($a) & (~0x20));
echo $a;
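Applied to a whole string in C, the bit-clearing trick might look like the sketch below. The name ascii_to_upper is hypothetical. Unlike the one-letter example, it first checks that the byte is a lowercase letter, because clearing the bit blindly would also change spaces, digits and punctuation.

```c
#include <stddef.h>

/* Hypothetical helper: uppercases the ASCII letters of s in place by
 * clearing bit 0x20. Bytes outside a-z are left untouched. */
void ascii_to_upper(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] >= 'a' && s[i] <= 'z')
            s[i] = (char)(s[i] & ~0x20);
    }
}
```

Bytes with the highest bit set, such as the two halves of a Chinese character, fall outside a-z and are therefore never modified.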
Searching for Chinese characters in a C++ string