In 1981, the State promulgated the GB2312 standard, which covers 6763 Chinese characters (3755 at level 1 and 3008 at level 2) plus 682 non-Chinese characters. The standard assigns a code to every character so that characters can be converted and exchanged between computers.
The GB2312 standard itself defines only a 94 × 94 two-dimensional table: the rows are area numbers and the columns are position numbers. A Chinese character can therefore be found by its area number and position number. This encoding is what we call the location code.
For example:
Chen (陈), location code 1934. Area: 19, position: 34. For convenient processing and storage, the area number and the position number of each Chinese character are each stored in one byte in the computer.
The location code cannot be used directly for communication, because ASCII reserves 00H-1FH as control codes, so the two would conflict. Since computers were not invented in China, the standard could only follow the international standard ISO 2022 and add 32 (20H) to both the area number and the position number to avoid the conflict. The code obtained after adding 32 is called the exchange code.
Chen, area: 19 + 32 = 51
00010011 + 00100000 = 00110011
Position: 34 + 32 = 66
00100010 + 00100000 = 01000010
That is, 51 66 in decimal, or 3342 in hexadecimal.
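The addition above can be checked with a few lines of Python. This is only a minimal sketch; the function name location_to_exchange is made up for illustration and is not part of any standard library.

```python
def location_to_exchange(area, position):
    """Convert a GB2312 location code (area, position) to its exchange code."""
    if not (1 <= area <= 94 and 1 <= position <= 94):
        raise ValueError("area and position must both be in 1..94")
    # Adding 32 (20H) moves both bytes out of the ASCII control range 00H-1FH.
    return bytes([area + 32, position + 32])


code = location_to_exchange(19, 34)  # Chen, location code 1934
print(list(code))  # decimal byte values: [51, 66]
print(code.hex())  # hexadecimal form: 3342
```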
Because Chinese and Western (ASCII) characters are generally mixed in text, a Chinese character that is not marked in some way would be confused with single-byte ASCII codes. One solution is to treat a Chinese character as two extended ASCII codes, that is, to set the highest bit of both bytes representing a GB2312 Chinese character to 1.
This double-byte code with the highest bit of each byte set to 1 is the internal code of a GB2312 Chinese character.
Setting the highest bit of 00110011 to 1 gives 10110011, which changes 33H to B3H.
Setting the highest bit of 01000010 to 1 gives 11000010, which changes 42H to C2H.
In this way, Chen's internal code is B3C2.
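The same step can be sketched in Python and compared against Python's built-in gb2312 codec. The function name exchange_to_internal is only illustrative.

```python
def exchange_to_internal(exchange):
    """Set the highest bit of each exchange-code byte to get the internal code."""
    return bytes(b | 0x80 for b in exchange)


internal = exchange_to_internal(bytes([0x33, 0x42]))
print(internal.hex().upper())             # B3C2
print(internal == "陈".encode("gb2312"))  # True
```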
It should be noted that no matter what type of input method you use to input Chinese characters, the internal codes of the Chinese characters are the same.
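Converting a Chinese character's internal code back to its location code is simply the same operation in the opposite direction: clear the highest bit of each byte and subtract 32, which amounts to subtracting A0H (160) from each byte.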
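As a rough sketch of the reverse direction, the following Python subtracts A0H from each byte of the internal code. The name internal_to_location is hypothetical, and the range check assumes a plain GB2312 double-byte code.

```python
def internal_to_location(internal):
    """Convert a two-byte GB2312 internal code back to (area, position)."""
    if len(internal) != 2 or not all(0xA1 <= b <= 0xFE for b in internal):
        raise ValueError("expected two bytes in the range A1H-FEH")
    # Subtracting A0H undoes both the high bit (80H) and the +32 offset (20H).
    return internal[0] - 0xA0, internal[1] - 0xA0


print(internal_to_location(bytes([0xB3, 0xC2])))  # (19, 34)
```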