PHP: understanding ord($str) > 0x80

Source: Internet
Author: User
Tags: character set, ord

The test exists to identify double-byte characters. Chinese, Japanese, and Korean characters occupy two bytes, and the highest bit of the byte is 1 (in GB2312 this holds for both bytes; in GBK only the first byte is guaranteed to have it set). An ordinary Western character occupies only one byte, of which seven bits carry the encoding, so its highest bit is 0.
The binary form of 0x80 is 1000 0000: the highest bit is 1, which marks a byte of a Chinese character. The encoding format is commonly referred to as the 10 format. One Chinese character occupies 2 bytes, but it is still only one character.

The GBK simplified Chinese character set encodes each character in either 1 byte or 2 bytes. When the first byte is in the range 0x00~0x7F, the character occupies one byte; when the first byte is 0x80 or above (valid GBK lead bytes are 0x81~0xFE), the character occupies 2 bytes.
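As a rough sketch of the rule above, the following walks a byte string and counts characters. The helper name gbk_char_count() is hypothetical, the input is assumed to be well-formed GBK, and the bytes are written as hex escapes so the example does not depend on the encoding of the source file.

```php
<?php
// Count the characters in a GBK-encoded byte string by walking it byte by byte.
// gbk_char_count() is a hypothetical helper, not a built-in PHP function.
function gbk_char_count(string $bytes): int
{
    $count = 0;
    $i = 0;
    $len = strlen($bytes); // strlen() counts bytes, not characters
    while ($i < $len) {
        // High bit set: treat this byte as the first of a 2-byte character.
        // No validation is done; the input is assumed to be well-formed.
        $i += (ord($bytes[$i]) & 0x80) ? 2 : 1;
        $count++;
    }
    return $count;
}

// "\xD6\xD0\xCE\xC4" are the GBK bytes of two Chinese characters.
$s = "ab\xD6\xD0\xCE\xC4";
echo strlen($s), "\n";         // 6 (bytes)
echo gbk_char_count($s), "\n"; // 4 (characters)
```

For real applications the mbstring extension, for example mb_strlen($s, 'GBK'), is likely the more robust choice; the loop is only meant to show how the high bit drives the decision.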

Note: the numbers in parentheses are written in binary (base 2).
When a byte's value is greater than 0x7F, it must be part of a Chinese character (pieced together with another byte). How do we check that a byte is definitely greater than 0x7F?
The number after 0x7F (1111111) is 0x80 (10000000), so any byte larger than 0x7F must have its highest bit set to 1. We only need to test whether the highest bit is 1.
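A minimal sketch of that check in PHP follows. The function name has_high_bit() is made up for illustration; ord() and printf() with the %b (binary) specifier are standard PHP.

```php
<?php
// has_high_bit() is a hypothetical helper name.
// It reports whether the highest bit of a single byte is 1.
function has_high_bit(string $byte): bool
{
    return (ord($byte) & 0x80) !== 0;
}

printf("%08b\n", 0x7F); // 01111111
printf("%08b\n", 0x80); // 10000000

var_dump(has_high_bit("A"));    // bool(false), ord("A") is 65
var_dump(has_high_bit("\x7F")); // bool(false)
var_dump(has_high_bit("\x80")); // bool(true)
var_dump(has_high_bit("\xD6")); // bool(true)
```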

Judging method:
Use bitwise AND (a result bit is 1 only when the same bit is 1 in both operands, otherwise it is 0):
For example: to determine whether the third bit of a number (counting from the right) is 1, AND it with 4 (100); to determine whether the second bit is 1, AND it with 2 (10).
Similarly, to determine whether the eighth bit is 1, AND it with (10000000), which is 0x80.
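The masks mentioned above (4, 2, 0x80) are all a single 1 shifted left, so the test can be sketched generically. bit_is_set() is a hypothetical helper, and its 1-based position counted from the right is an assumption chosen to match the wording in the text.

```php
<?php
// bit_is_set() is a hypothetical helper. $position counts from 1,
// starting at the rightmost (least significant) bit.
function bit_is_set(int $number, int $position): bool
{
    $mask = 1 << ($position - 1); // position 3 -> 4 (100), position 8 -> 0x80
    return ($number & $mask) !== 0;
}

var_dump(bit_is_set(5, 3));    // 5 is 101, third bit is 1: bool(true)
var_dump(bit_is_set(5, 2));    // second bit is 0: bool(false)
var_dump(bit_is_set(0xD6, 8)); // 0xD6 is 11010110: bool(true)
var_dump(bit_is_set(0x41, 8)); // 0x41 is 01000001: bool(false)
```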

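Why not simply test > 0x7F? In PHP that may be fine, because ord() returns a value from 0 to 255. In other strongly typed languages, however, the highest bit of a 1-byte value can be the sign bit, so such a byte is read as a negative number, and a negative number can never be greater than 0x7F (the largest signed 1-byte integer).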
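The effect can be imitated in PHP itself by reading the same byte once as unsigned and once as a signed char. This is only an illustration of the signed-byte pitfall, not something PHP code normally runs into; unpack() with the 'c' format (signed char) is standard PHP, and the byte 0xD6 is just a sample value.

```php
<?php
$byte = "\xD6"; // sample byte with the highest bit set

$unsigned = ord($byte);            // 214: ord() always returns 0..255
$signed   = unpack('c', $byte)[1]; // -42: the same byte read as a signed char

var_dump($unsigned > 0x7F);         // bool(true)
var_dump($signed > 0x7F);           // bool(false): the comparison misses the byte
var_dump(($unsigned & 0x80) !== 0); // bool(true)
var_dump(($signed & 0x80) !== 0);   // bool(true): the bit test still works
```

The bit test gives the same answer for both readings, which is why it carries over to languages where a byte may be signed.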

Another example:

The ASCII code for a is 97 (1100001)
The ASCII code for A is 65 (1000001)

The ASCII code for b is 98 (1100010)
The ASCII code for B is 66 (1000010)
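The values listed above can be reproduced with a few lines of PHP, using only ord() and printf():

```php
<?php
foreach (['a', 'A', 'b', 'B'] as $letter) {
    printf("%s  %3d  %07b\n", $letter, ord($letter), ord($letter));
}
// a   97  1100001
// A   65  1000001
// b   98  1100010
// B   66  1000010

// XOR shows which bits differ between the two cases of one letter.
printf("%07b\n", ord('a') ^ ord('A')); // 0100000
```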

A rule emerges: for the letters A-Z, the sixth bit of a lowercase letter is always 1, and the sixth bit of an uppercase letter is always 0. We can use this to determine the case.
Just AND the letter's code with 0x20 (100000) and check the result:

The code is as follows:
if (ord($a) & 0x20) {
    // the sixth bit is 1: $a is a lowercase letter (assuming $a is a letter at all)
}
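The 0x20 bit is 1 for lowercase letters, but it is also 1 in many non-letters (a space is 0x20 itself, and the digit 1 is 0x31), so a cautious version first checks that the byte is a letter. The two helper names below are hypothetical; this is a sketch for single-byte ASCII input only.

```php
<?php
// Hypothetical helpers. The 0x20 test is only meaningful for A-Z and a-z,
// so the code first checks that the byte is an ASCII letter.
function is_ascii_letter(string $ch): bool
{
    $code = ord($ch) | 0x20; // fold to lowercase by setting the sixth bit
    return $code >= ord('a') && $code <= ord('z');
}

function is_ascii_lower(string $ch): bool
{
    return is_ascii_letter($ch) && (ord($ch) & 0x20) !== 0;
}

var_dump(is_ascii_lower('a')); // bool(true)
var_dump(is_ascii_lower('A')); // bool(false)
var_dump(is_ascii_lower(' ')); // bool(false), although the 0x20 bit is set
var_dump(is_ascii_lower('1')); // bool(false), although the 0x20 bit is set
```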

How do we change a letter to uppercase? Change the sixth bit from 1 to 0:

The code is as follows:
$a = 'a';
$a = chr(ord($a) & (~0x20)); // clear the 0x20 bit
echo $a; // A
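To apply the same idea to every letter in a string, the bit can be cleared byte by byte. ascii_upper() below is a hypothetical helper written only to illustrate the bit trick; it touches a-z and nothing else, so bytes of double-byte characters pass through unchanged. In practice PHP's built-in strtoupper() is the usual choice.

```php
<?php
// ascii_upper() is a hypothetical helper, shown only to illustrate the bit trick.
function ascii_upper(string $s): string
{
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $code = ord($s[$i]);
        // Clear the 0x20 bit only for a-z; other bytes are left untouched.
        if ($code >= ord('a') && $code <= ord('z')) {
            $s[$i] = chr($code & ~0x20);
        }
    }
    return $s;
}

echo ascii_upper("Hello, world 123"), "\n"; // HELLO, WORLD 123
```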
