Chinese location code, GB code, in-machine code, input code (external code), glyph code, etc.

Source: Internet
Author: User
Tags: control characters

Location code: To give every Chinese character a nationally uniform code, the national standard arranges the characters in a 94 x 94 grid. Each row is called a zone and each column is called a bit (position); the zone number and the bit number together form the location code. The location code of a Chinese character can be looked up on reference sites. For example, the location code of the character 我 ("I") is 46 50, which places it in zone 46, bit 50.

GB code: location code + 2020H. The GB code is not equal to the location code; it is obtained from the location code by a small conversion. The method is: convert the decimal zone number and bit number to hexadecimal separately. The result differs from the GB code by a fixed offset, so add 20H to the first byte and to the second byte to get the GB code. For example, the location code of the character 保 ("Bao") is 1703D and its GB code is 3123H, obtained as follows: 1703D -> 1103H -> +2020H -> 3123H.

In-machine code (internal code): GB code + 8080H.

Input code (external code): The input code is the code typed on an English keyboard to enter a Chinese character. Hundreds of input codes have been introduced in China, but only about a dozen are in common use. By their main encoding basis they fall into four kinds: sequence codes, sound codes, shape codes, and sound-shape codes. For example, the input code for 保 is "BAO" in full pinyin, "1703" with the location code, and "WKS" in Wubi.

Glyph code: a dot-matrix (lattice) code. To output Chinese characters on a display or printer, each character is designed as a bitmap of dots, and the corresponding dot-matrix code (glyph code) is obtained from it.
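The location code, GB code, and in-machine code relationships described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the function names location_to_gb and gb_to_internal are hypothetical, and the location code is assumed to be passed as separate decimal zone and bit numbers from 1 to 94.

```python
def location_to_gb(zone: int, bit: int) -> int:
    """Return the two-byte GB code for a location code (zone, bit).

    Hypothetical helper: each byte is the zone or bit number plus 20H,
    which is the same as adding 2020H to the two-byte value.
    """
    if not (1 <= zone <= 94 and 1 <= bit <= 94):
        raise ValueError("zone and bit must both be in 1..94")
    return ((zone + 0x20) << 8) | (bit + 0x20)


def gb_to_internal(gb_code: int) -> int:
    """Return the in-machine code: GB code + 8080H."""
    return gb_code + 0x8080


# Worked example from the text: location code 1703.
gb = location_to_gb(17, 3)
internal = gb_to_internal(gb)
print(f"{gb:04X}H")        # 3123H
print(f"{internal:04X}H")  # B1A3H
```

Keeping the zone and bit as separate numbers avoids treating 1703 as a single decimal value, which would give the wrong result if converted to hexadecimal as a whole.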
The font used for display is called the display font. A Chinese character is generally displayed using a 16x16, 24x24, or 48x48 dot matrix. Given the size of the dot matrix, the number of bytes needed to store one Chinese character can be calculated.
Example: With a 16x16 dot matrix, each character has 16 rows of 16 dots. Each dot needs 1 bit, so the 16 dots of one row need 16 bits (that is, 2 bytes). There are 16 rows in total, so it takes 16 rows x 2 bytes/row = 32 bytes; that is, the glyph code of a Chinese character in a 16x16 dot matrix needs 32 bytes.
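The calculation in this example can be sketched in Python for the three dot-matrix sizes mentioned. The function name glyph_bytes is hypothetical, and the sketch assumes 1 bit per dot with each row padded up to a whole number of bytes.

```python
def glyph_bytes(rows: int, dots_per_row: int) -> int:
    """Bytes needed to store one glyph at 1 bit per dot.

    Assumes each row is padded up to a whole number of bytes.
    """
    bytes_per_row = (dots_per_row + 7) // 8
    return rows * bytes_per_row


for size in (16, 24, 48):
    print(f"{size}x{size}: {glyph_bytes(size, size)} bytes")
# 16x16: 32 bytes
# 24x24: 72 bytes
# 48x48: 288 bytes
```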
In general: number of bytes = number of rows x (number of dots per row / 8).

The font used for printing is called the print font. It contains more Chinese characters than the display font, and unlike the display font it does not need to be loaded into memory while working.

Why not use the location code directly as the GB code, instead of adding 2020H?

20H is 32D. The location code is a 94 x 94 table defined by China. The low seven bits of a byte have 128 states (0 to 127). In ASCII, codes 0 to 32 are the control characters and the space, and code 127 is DEL, the delete character, so 34 codes are unavailable: 33 (codes 0 to 32) plus 1 (code 127). 128 minus 34 equals 94, so there are 94 states available for Chinese.
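The counting argument above can be checked with a few lines of Python. This is just an illustrative sketch; the variable names are arbitrary.

```python
# ASCII codes excluded in the argument: 0..32 and 127 (DEL).
excluded = set(range(0, 33)) | {127}
available = [code for code in range(128) if code not in excluded]

print(len(excluded))                 # 34
print(len(available))                # 94
print(available[0], available[-1])   # 33 126

# Adding 20H (32) to a zone or bit number 1..94 lands exactly in that range.
shifted = [n + 0x20 for n in range(1, 95)]
print(shifted == available)          # True
```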
The GB code is actually an exchange code, used for information interchange, and an exchange code must not cause ambiguity. Adding 32 to each coordinate of the 94-row by 94-column table makes the row numbers run from 33 to 126 and the column numbers run from 33 to 126 as well. This does not conflict with ASCII codes 0 to 32.

Why does the in-machine code add 8080H to the GB code, instead of using the GB code directly as the internal code?

English has only 26 letters, so one byte is enough: one byte can represent 2^8 = 256 symbols, which is more than enough. A specification was therefore drawn up abroad that assigns 0-127 (00000000-01111111) to English characters and some symbols; this is the ASCII code. Chinese has far too many characters for 256 codes, so China uses two bytes to represent one Chinese character. For example, the location code of 保 ("Bao") is 1703, so its GB code is 1703D -> 1103H, and 1103H + 2020H = 3123H. However, 31H and 23H also have ASCII meanings: 31H is the digit "1" and 23H is "#" (this can be checked in any ASCII table). If the GB code were used as the internal code and memory held the two bytes 31H and 23H, would they mean the Chinese character 保 or the characters "1#"? That is ambiguous. The solution: since 0-127 is taken by the English characters, use the values above 127. Adding 128 (80H in hexadecimal) to each of the two bytes solves the problem. The in-machine code of 保 becomes 3123H + 8080H = B1A3H (45475 in decimal); open Notepad, hold Alt and type 45475 to check that it is 保. This no longer conflicts with ASCII.
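The ambiguity and its resolution can be illustrated with Python's built-in "ascii" and "gb2312" codecs. This is a minimal sketch; the variable names are arbitrary, and the only assumption is that the gb2312 codec operates on the in-machine (high-bit-set) form of the code.

```python
gb_bytes = bytes([0x31, 0x23])                       # GB code 3123H, byte by byte
internal_bytes = bytes(b + 0x80 for b in gb_bytes)   # add 80H to each byte

# Read as ASCII, the GB-code bytes are indistinguishable from "1#".
print(gb_bytes.decode("ascii"))                # 1#

# The in-machine code has the high bit set in both bytes.
print(internal_bytes.hex().upper())            # B1A3
print(int.from_bytes(internal_bytes, "big"))   # 45475

# Decoding the in-machine bytes with the gb2312 codec gives the character.
char = internal_bytes.decode("gb2312")
print(char == "保")                             # True
```

Because both bytes of the in-machine code are above 127, a decoder can tell them apart from single-byte ASCII characters in the same stream.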
