GB2312 Chinese character location code, exchange code, and internal code

Source: Internet
Author: User

The GB2312 standard, which China brought into force in 1981, covers 6763 Chinese characters (3755 level-1 and 3008 level-2 characters) plus 682 non-Chinese symbols. It assigns a standard code to every character so that text can be converted and exchanged between computer systems.

The GB2312 standard itself defines only a 94 × 94 two-dimensional table. Each row is a zone (area) and has a zone number; each column is a position within the zone and has a position number. A Chinese character can therefore be found by its zone number and position number. This pair of numbers is what we call the location code.

For example:

Chen (陈), location code 1934: zone number 19, position number 34.

For processing and storage convenience, the zone number and the position number of each Chinese character are each stored in one byte in the computer.
The location code cannot be used directly to transmit Chinese characters, because ASCII reserves 00H-1FH as control codes, and zone or position numbers in that range would conflict with them. Because the existing international standard ISO 2022 has to be followed, 32 (20H) is added to both the zone number and the position number to avoid the conflict. The code obtained after adding 32 is called the exchange code (also known as the national standard code).
Chen, zone number: 19 + 32 = 51
00010011 + 00100000 = 00110011
Chen, position number: 34 + 32 = 66
00100010 + 00100000 = 01000010
That is, 51 66 in decimal, or 3342 in hexadecimal.
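The arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `location_to_exchange` is made up for this example and is not part of any standard library.

```python
def location_to_exchange(zone: int, position: int) -> bytes:
    """Turn a location code (zone, position) into a two-byte exchange code.

    Hypothetical helper for illustration: it adds 32 (0x20) to each number,
    as described in the text.
    """
    if not (1 <= zone <= 94 and 1 <= position <= 94):
        raise ValueError("zone and position must each be in 1..94")
    return bytes([zone + 0x20, position + 0x20])


exchange = location_to_exchange(19, 34)  # Chen, location code 1934
print(list(exchange))   # [51, 66]
print(exchange.hex())   # 3342
```

The range check reflects the 94 × 94 table: any number outside 1..94 is not a valid zone or position.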
Because Chinese and Western (ASCII) characters are generally mixed in text, a Chinese character that is not marked as such would be confused with single-byte ASCII codes. One solution to this problem is to treat a Chinese character as two extended ASCII codes, so that the highest bit of each of the two bytes representing a GB2312 Chinese character is set to 1.
The double-byte Chinese character code with the highest bit of each byte set to 1 is the internal code of the GB2312 Chinese character.
Setting the highest bit of 00110011 to 1 changes its value from 33 to B3 (hexadecimal).
Setting the highest bit of 01000010 to 1 changes its value from 42 to C2 (hexadecimal).
In this way, Chen's internal code is B3C2.
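As a minimal sketch of this step, setting the highest bit is a bitwise OR with 0x80 on each byte. The helper name `exchange_to_internal` is hypothetical; the cross-check relies on the `gb2312` codec that ships with Python.

```python
def exchange_to_internal(exchange: bytes) -> bytes:
    """Set the highest bit of each exchange-code byte (hypothetical helper)."""
    if len(exchange) != 2:
        raise ValueError("expected exactly two bytes")
    return bytes(b | 0x80 for b in exchange)


internal = exchange_to_internal(bytes([0x33, 0x42]))  # Chen's exchange code
print(internal.hex())             # b3c2
print(internal.decode("gb2312"))  # 陈
```

Decoding the result with the built-in codec is a quick way to confirm that the two bytes really are the internal code of the intended character.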
It should be noted that no matter which input method is used to enter a Chinese character, its internal code is always the same.
Converting a Chinese character's internal code back to its location code is simply the same operation in the opposite direction: clear the highest bit of each byte to get the exchange code, then subtract 32 from each byte.
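The reverse conversion might look like the following sketch. The function name `internal_to_location` is an assumption made for this example; the valid byte range A1H to FEH follows from adding 32 and then setting the highest bit on numbers in 1..94.

```python
from typing import Tuple


def internal_to_location(internal: bytes) -> Tuple[int, int]:
    """Return (zone, position) for a two-byte GB2312 internal code.

    Hypothetical helper for illustration only.
    """
    if len(internal) != 2:
        raise ValueError("expected exactly two bytes")
    if not all(0xA1 <= b <= 0xFE for b in internal):
        raise ValueError("each byte must be in 0xA1..0xFE")
    # Clear the highest bit (back to the exchange code), then subtract 32.
    zone = (internal[0] & 0x7F) - 0x20
    position = (internal[1] & 0x7F) - 0x20
    return zone, position


zone, position = internal_to_location("\u9648".encode("gb2312"))  # Chen
print(f"{zone:02d}{position:02d}")  # 1934
```

Clearing the highest bit and subtracting 32 is the same as subtracting A0H from each byte, which is how the conversion is often written in one step.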
