Character encoding
When the Python interpreter loads the code in a .py file, it has to decode the file's content using a character encoding (ASCII by default in Python 2).
Binary
Example: an ancient beacon has only two states, lit and unlit, so it can convey very little information.
By agreement, 1 lit beacon represents 1-100
2 lit beacons represent 101-1000
3 lit beacons represent 1001-5000
4 lit beacons represent 5001-10000
This is an improvement, but it is still not precise enough.
With binary, where each beacon's lit or unlit state stands for one digit, any number can be represented exactly.
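The beacon idea can be sketched in Python by treating each beacon as one binary digit. This is only an illustration: the function names and the choice of 8 beacons are assumptions, not part of the original notes.

```python
def to_beacons(n, towers=8):
    """Return the lit (1) / unlit (0) pattern that represents n in binary."""
    if not 0 <= n < 2 ** towers:
        raise ValueError("n does not fit in the given number of towers")
    # Zero-padded binary string, one digit per beacon.
    return format(n, "0{}b".format(towers))


def from_beacons(pattern):
    """Read a lit/unlit pattern back as an integer."""
    return int(pattern, 2)


print(to_beacons(100))           # 01100100
print(from_beacons("01100100"))  # 100
```

With 13 beacons, every number up to 8191 gets its own pattern, so 5000 no longer has to be reported as a range.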
Character encoding
A character encoding is a mapping between binary numbers and characters.
ASCII (American Standard Code for Information Interchange)
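A minimal sketch of the ASCII mapping in Python 3: each character is looked up as a number, and that number is shown as 8 binary digits. The helper name ascii_bits is hypothetical.

```python
def ascii_bits(text):
    """Return the 8-bit binary form of each ASCII character in text."""
    # encode("ascii") raises UnicodeEncodeError for non-ASCII characters.
    return [format(byte, "08b") for byte in text.encode("ascii")]


print(ascii_bits("Hi"))  # ['01001000', '01101001']
print(ord("H"))          # 72
print(chr(0b01001000))   # H
```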
GB2312 (1980) contains 7,445 characters in total: 6,763 Chinese characters and 682 other symbols.
GBK 1.0 (1995) contains 21,886 symbols, divided into a Chinese character area and a graphic symbol area. The Chinese character area holds 21,003 characters.
GB18030 (2000) replaced GBK 1.0 as the official national standard. It contains 27,484 Chinese characters and also covers Tibetan, Mongolian, Uyghur and other major minority scripts. PC platforms are required to support GB18030, while embedded products are not, so mobile phones and MP3 players generally support only GB2312.
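The growth from GB2312 to GBK to GB18030 can be observed with the codecs of the same names that ship with Python 3. The sketch below assumes three sample characters chosen for illustration: a common simplified character, a traditional character, and a Tibetan letter. The helper can_encode is hypothetical.

```python
def can_encode(text, codec):
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


samples = ["中", "龍", "ཀ"]  # simplified, traditional, Tibetan
for codec in ("gb2312", "gbk", "gb18030"):
    print(codec, [can_encode(ch, codec) for ch in samples])
```

Each later standard accepts more of the samples than the one before it.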
Clearly, ASCII cannot represent all the characters and symbols in the world, so a new encoding was needed that can represent them all: Unicode.
Unicode (also called the unified code, universal code, or single code) is a character encoding standard used on computers. In its original 16-bit form every character takes 2 bytes; characters outside that range take 4 bytes in UTF-16.
UTF-8 is a variable-length encoding of Unicode: English characters take 1 byte and common Chinese characters take 3 bytes.
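The byte counts can be checked directly in Python 3 by encoding a character and measuring the result. This is a small sketch; byte_length is a hypothetical helper, and "utf-16-le" is used so that no byte-order mark is added to the count.

```python
def byte_length(ch, codec):
    """Number of bytes needed to store ch in the given codec."""
    return len(ch.encode(codec))


for ch in ("A", "中"):
    print(ch, byte_length(ch, "utf-8"), byte_length(ch, "utf-16-le"))
# A 1 2
# 中 3 2
```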
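In summary:
ASCII: 1 byte per character (128 characters in 7 bits; 8-bit extensions allow 256)
--> 1980 GB2312: 7,445 characters
--> 1995 GBK 1.0: 21,886 characters
--> 2000 GB18030: 27,484 Chinese characters
--> Unicode: 2 bytes per character in its original 16-bit form
--> UTF-8: English 1 byte, Chinese 3 bytes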
One of the biggest differences between Python 3 and Python 2 is the default character encoding.
For Python 2 to accept Chinese in a source file, put this on the first line: # -*- coding: utf-8 -*-
Python 3 uses UTF-8 by default.
Example:
name = "名字"
Running this under Python 2 fails, because the default ASCII encoding cannot represent Chinese characters. The file needs to be changed to:
# -*- coding: utf-8 -*-
name = "名字"
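The failure can be imitated in Python 3 by taking the raw bytes of a UTF-8 source line and decoding them as ASCII, which is roughly what Python 2 attempts when no coding line is present. This is a sketch under that assumption; the Chinese literal and the helper try_decode are illustrative, not taken from the original file.

```python
# Bytes of a one-line source file saved as UTF-8 (illustrative content).
source_bytes = 'name = "名字"\n'.encode("utf-8")


def try_decode(raw, encoding):
    """Return the decoded text, or None if raw is not valid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


print(try_decode(source_bytes, "ascii"))  # None: ASCII has no Chinese characters
print(try_decode(source_bytes, "utf-8"))  # the original line, decoded correctly
```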
This article is from the "Tread Footprints" blog; please keep this source attribution: http://zoucuo.blog.51cto.com/962798/1885277
Python Learning note 8-9 (character encoding and binary)