Learning Python: Character Encodings

Last Update:2015-01-12 Source: Internet

Author: User

Tags coding standards


1. Common encodings

ASCII: http://zh.wikipedia.org/zh-hans/ASCII

Unicode: http://zh.wikipedia.org/zh-hans/Unicode

UTF-8: http://zh.wikipedia.org/zh/utf-8

GBK: http://zh.wikipedia.org/zh/%e6%b1%89%e5%ad%97%e5%86%85%e7%a0%81%e6%89%a9%e5%b1%95%e8%a7%84%e8%8c%83

GB2312: http://zh.wikipedia.org/zh/gb_2312

2. Where the encodings came from

1. Encoding: Inside a computer, all data is stored and processed as binary numbers, because the hardware represents 1 and 0 with high and low voltage levels. Which binary number stands for which symbol is purely a matter of convention, and anyone could define their own mapping (such a mapping is called an encoding). To exchange data without confusion, however, everyone has to follow the same rules. For this reason the ASCII standard was introduced in the United States: it fixes which binary number represents each symbol.
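As a small illustration of the idea that every symbol is stored as an agreed-upon binary number, the following Python 3 sketch shows the bit pattern behind each character of a short string. The helper name to_bits is made up for this example; ord and format are Python built-ins.

```python
# Hypothetical helper: show the agreed-upon number behind each character
# as an 8-bit binary string (only meaningful for code values below 256).
def to_bits(text):
    return [format(ord(ch), "08b") for ch in text]

bits = to_bits("Hi")
print(bits)  # ['01001000', '01101001']
```

Two programs that both use this table will read the same bits back as "Hi"; a program using a different table would not.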

2. ASCII (American Standard Code for Information Interchange): Early computers were developed largely in the United States, so ASCII was drawn up there and was designed to represent modern American English. It covers the 26 basic Latin letters, the Arabic numerals, and English punctuation marks.
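A minimal sketch of what this limitation looks like in practice, assuming Python 3 (where str.encode is the standard way to turn text into bytes): English text encodes to ASCII without trouble, while Chinese text cannot be encoded at all.

```python
# English letters, digits and punctuation all have ASCII codes.
data = "Hello, 2015!".encode("ascii")
first_five = list(data[:5])
print(first_five)  # [72, 101, 108, 108, 111]

# Chinese characters have no ASCII code, so encoding fails.
ascii_failed = False
try:
    "中文".encode("ascii")
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)  # True
```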

3. GB2312: ASCII only meets the information interchange needs of English speakers. Chinese-speaking users had to develop an encoding of their own for the same purpose. GB2312 is such an encoding: it is the national standard character set of the People's Republic of China for Simplified Chinese, and its full name is "Code of Chinese Graphic Character Set for Information Interchange, Primary Set" (信息交换用汉字编码字符集·基本集).
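The following Python 3 sketch, which assumes the gb2312 codec that ships with the standard library, shows that GB2312 stores each Chinese character in two bytes while leaving ASCII characters as single bytes.

```python
text = "中文"
data = text.encode("gb2312")
print(len(data))  # 4, that is two bytes per Chinese character

# Decoding with the same encoding gives the original text back.
round_trip = data.decode("gb2312")
print(round_trip == text)  # True

# ASCII characters keep their single-byte values in GB2312.
ascii_part = "A".encode("gb2312")
print(ascii_part)  # b'A'
```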

4. Unicode: There are more than 200 countries and regions in the world and dozens of commonly used languages, and many countries developed their own encoding standards, for example Shift_JIS in Japan and EUC-KR in South Korea. These national standards inevitably conflict with one another, so text that mixes several languages ends up displayed as garbage. Unicode was created to solve this problem: it brings the characters of all languages into a single character set, so mixing languages no longer produces garbled text. Its most common representation uses two bytes per character (four bytes for very rare characters). Unicode is supported directly by modern operating systems and most programming languages.
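The points above can be sketched in Python 3 as follows. The sketch assumes that the two-byte representation mentioned here corresponds to UTF-16 (the codec name utf-16-le is used so that no byte-order mark is added), and it uses GB2312 bytes read as Shift_JIS as one example of conflicting national standards.

```python
# Every character has a single Unicode number (code point).
code_point = hex(ord("中"))
print(code_point)  # 0x4e2d

# In UTF-16, common characters take 2 bytes and rare ones take 4.
common = len("中".encode("utf-16-le"))
rare = len("\U0001F600".encode("utf-16-le"))
print(common, rare)  # 2 4

# Bytes written with one national encoding but read with another
# do not come back as the original text.
gb_bytes = "中文".encode("gb2312")
garbled = gb_bytes.decode("shift_jis", errors="replace")
print(garbled != "中文")  # True
```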

Since Unicode resolves these conflicts and meets the need to exchange information worldwide, why do we still need an encoding such as UTF-8? See the next item.

5. UTF-8 (8-bit Unicode Transformation Format): If the text is almost entirely English, the two-byte Unicode representation needs about twice the storage of ASCII, which makes it wasteful to store and transmit. To save space, Unicode text can instead be converted to the variable-length UTF-8 encoding. UTF-8 encodes each Unicode character in 1 to 4 bytes depending on the size of its code point (the original design allowed up to 6 bytes, but the current standard, RFC 3629, limits it to 4). Common English letters take 1 byte, Chinese characters usually take 3 bytes, and only rarely used characters take 4 bytes. If the text you want to transmit consists largely of English characters, encoding it as UTF-8 saves space.
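A short Python 3 sketch of the variable-length behaviour described above. The sample characters and their labels are chosen only for illustration, and UTF-16 stands in for the fixed two-byte representation when comparing sizes.

```python
# Illustrative samples: one character from each size class.
samples = {
    "A": "English letter",
    "中": "Chinese character",
    "\U0001F600": "rare character (emoji)",
}
sizes = {label: len(ch.encode("utf-8")) for ch, label in samples.items()}
print(sizes)
# {'English letter': 1, 'Chinese character': 3, 'rare character (emoji)': 4}

# For mostly English text, UTF-8 needs half the bytes of UTF-16.
english = "hello world"
utf8_len = len(english.encode("utf-8"))
utf16_len = len(english.encode("utf-16-le"))
print(utf8_len, utf16_len)  # 11 22
```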


