Basics of Chinese character sets and character encoding

Source: Internet
Author: User
Tags: character set, control characters

"Character" is the general term for all kinds of letters and symbols, including the scripts of individual nations, punctuation marks, graphic symbols, digits, and so on. A character set is a collection of characters. There are many character sets, and each contains a different number of characters. Common ones include the ASCII, GB2312, BIG5, GB18030, and Unicode character sets. For a computer to handle the characters of a given character set accurately, each character must be assigned a code (character encoding) so that the computer can recognize and store the text.
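As a minimal sketch of the point above, the following Python snippet encodes one character under several of the character sets just named and prints the resulting bytes. The sample character "中" and the helper name encode_in are illustrative assumptions, not part of the original text; the codec names are the ones Python's standard library uses.

```python
# Encode the same character under several character sets and compare the bytes.
# The sample character and the helper name are illustrative assumptions.
def encode_in(text, codecs):
    """Return the hex byte sequence of `text` under each named codec."""
    return {name: text.encode(name).hex(" ") for name in codecs}

results = encode_in("中", ["gb2312", "big5", "gb18030", "utf-8"])
for name, hex_bytes in results.items():
    print(f"{name:8} {hex_bytes}")
```

The same character maps to different byte sequences depending on the character set, which is why the encoding has to be known before stored text can be read back correctly.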

Chinese has a very large number of characters, and it is written in two different forms, Simplified Chinese and Traditional Chinese. Computers, however, were originally designed around single-byte English characters. Chinese character encoding is therefore the technical foundation for exchanging Chinese-language information. This article surveys several representative character sets in chronological order and examines the origin, characteristics, and technical details of each.

ASCII Character Set

1. The origin of the name

ASCII (American Standard Code for Information Interchange) is a computer encoding system based on the Latin alphabet.

2. Characteristics

It is mainly used to represent modern English and other Western European languages. It is the most widely used single-byte encoding system today and is equivalent to the international standard ISO/IEC 646.

3. Contents

Control characters: carriage return, backspace, line feed, and so on.

Printable characters: uppercase and lowercase English letters, Arabic numerals, and Western punctuation and symbols.
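The split between control and printable characters can be sketched in Python as follows. The helper name is_control is hypothetical, and the snippet assumes the usual convention that codes 0 to 31 and 127 are control characters while the space (code 32) counts as printable.

```python
# Partition the 128 ASCII codes into control and printable characters.
# Assumption: codes 0-31 and 127 (DEL) are control characters; space is printable.
def is_control(code):
    return code < 32 or code == 127

control = [c for c in range(128) if is_control(c)]
printable = [chr(c) for c in range(128) if not is_control(c)]

print(len(control), "control codes,", len(printable), "printable characters")
print("".join(printable))
```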

4. Technical characteristics

Each character is represented by 7 bits, giving 128 characters in total.
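The arithmetic behind that figure is 2 to the power 7, which equals 128. A small Python sketch, using a hypothetical helper to_7bit, shows the 7-bit pattern of a character and rejects anything outside the ASCII range:

```python
# 7 bits give 2**7 = 128 distinct codes (0 through 127).
ASCII_BITS = 7
ascii_size = 2 ** ASCII_BITS

def to_7bit(ch):
    """Return the 7-bit binary string for an ASCII character (hypothetical helper)."""
    code = ord(ch)
    if code >= ascii_size:
        raise ValueError(f"{ch!r} (code {code}) does not fit in {ASCII_BITS} bits")
    return format(code, "07b")

print(ascii_size)
print(to_7bit("A"))
```

Calling to_7bit("中") raises ValueError, since that character's code is far beyond 127.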

5. ASCII Extended Character Set

A 7-bit coded character set can support only 128 characters. To represent more of the characters commonly used in Europe, ASCII was extended: the extended ASCII character set uses 8 bits per character, for a total of 256 characters.

In addition to the original ASCII characters, the extended set includes table-drawing (box-drawing) symbols, mathematical symbols, Greek letters, and special Latin characters.
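To illustrate, the sketch below decodes a few byte values above 127 in Python. The original text does not name a specific 8-bit code page, so the choice of "cp437" (the classic IBM PC code page, which happens to contain the kinds of symbols listed above) is an assumption; other 8-bit extensions assign the upper 128 codes differently. The helper name decode_byte is also hypothetical.

```python
# Decode single bytes under an 8-bit extension of ASCII.
# Assumption: cp437 stands in for "extended ASCII"; the source names no code page.
CODE_PAGE = "cp437"

def decode_byte(value, codec=CODE_PAGE):
    """Decode one byte value (0-255) to its character under the given codec."""
    return bytes([value]).decode(codec)

print(2 ** 8)  # number of codes available with 8 bits
for value in (0x41, 0xC4, 0xE0, 0xF1):
    print(f"0x{value:02X} -> {decode_byte(value)}")
```

Byte values 0 to 127 decode to the same characters as plain ASCII, so existing ASCII text remains valid under the extension.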
