Python Learning Notes 8-9 (Character Encoding and Binary)

Source: Internet
Author: User

Character encoding

When the Python interpreter loads the code in a .py file, it decodes the file's bytes using a source encoding (ASCII by default in Python 2, UTF-8 in Python 3).


Binary

Example: an ancient beacon tower has only two states, lit and unlit, so it can convey very little information.

Suppose the number of fires is agreed on in advance. 1 fire represents 1-100

2 fires represent 101-1000

3 fires represent 1001-5000

4 fires represent 5001-10000

This is an improvement, but it is still not precise enough: the receiver learns only a range, never the exact number.


If binary is introduced, any number can be represented exactly.
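As a rough illustration of the point above, the sketch below treats each binary digit as one beacon that is either lit (1) or unlit (0). It uses only Python built-ins; the sample value 5001 is an arbitrary choice for illustration and does not come from the original notes.

```python
# Each bit plays the role of one beacon: lit (1) or unlit (0).
number = 5001

bits = format(number, "b")      # binary digits without the "0b" prefix
print(bits)                     # 1001110001001
print(len(bits))                # 13 beacons are enough for this value

# Reading the bits back recovers the exact number, not just a range.
restored = int(bits, 2)
print(restored == number)       # True
```

With n beacons, 2**n distinct values can be told apart, so precision depends only on how many bits are used.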


Character encoding

Conversion between binary values and characters

ASCII (American Standard Code for Information Interchange)

GB2312 (1980) contains 7,445 characters in total: 6,763 Chinese characters and 682 other symbols.

GBK 1.0 (1995) contains 21,886 symbols, divided into a Chinese character area and a graphic symbol area; the Chinese character area holds 21,003 characters.

GB18030 (2000) is the official national standard that replaces GBK 1.0. It contains 27,484 Chinese characters and also covers Tibetan, Mongolian, Uyghur and other major minority scripts. PC platforms are now required to support GB18030, while embedded products are not, so mobile phones and MP3 players generally support only GB2312.
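To make the differences between these encodings more concrete, here is a minimal sketch using the codecs that ship with Python ("ascii", "gb2312", "gbk", "gb18030"). The helper name encoded_length and the three sample characters are hypothetical choices made for this example, not part of the original notes.

```python
def encoded_length(text, encoding):
    """Return how many bytes text needs in encoding, or None if it cannot be encoded."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


# "A" is plain ASCII, "中" is a common simplified character,
# "龍" is a traditional character that GB2312 does not include.
samples = ["A", "中", "龍"]

for ch in samples:
    for enc in ("ascii", "gb2312", "gbk", "gb18030"):
        print(ch, enc, encoded_length(ch, enc))
```

A result of None means the character has no representation in that encoding, which is the practical consequence of the smaller character counts listed above.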


Clearly, ASCII cannot represent all the characters and symbols in the world, so a new encoding was needed that can represent all of them: Unicode.

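Unicode (also called Unified Code or Universal Code) is a character encoding used on computers. In its original 16-bit form, every character takes 2 bytes.

UTF-8 is a variable-length encoding of Unicode: English (ASCII) characters take 1 byte, and Chinese characters usually take 3 bytes.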
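The byte counts above can be checked directly in Python 3. The sketch below is only an illustration: it assumes that the 2-byte figure refers to a 16-bit form of Unicode and uses the "utf-16-be" codec to show it, and the two sample characters are chosen arbitrarily.

```python
for ch in ("A", "中"):
    code_point = ord(ch)                 # the character's Unicode code point
    utf8 = ch.encode("utf-8")            # variable-length encoding
    utf16 = ch.encode("utf-16-be")       # big-endian, no byte-order mark
    print(ch, hex(code_point), len(utf8), len(utf16))
    # Decoding with the matching codec returns the original character.
    assert utf8.decode("utf-8") == ch

# A 0x41 1 2
# 中 0x4e2d 3 2
```

The same code point is stored in a different number of bytes depending on the encoding, which is why the encoding must be known before bytes can be read back as text.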

Summarized as follows:

ASCII: 128 characters, 1 byte each

--> 1980 GB2312: 7,445 characters

--> 1995 GBK 1.0: 21,886 symbols

--> 2000 GB18030: 27,484 Chinese characters

--> Unicode: 2 bytes

--> UTF-8: en 1 byte, zh 3 bytes

The biggest difference between Python 3 and Python 2 is the default character encoding.

Python 2 needs an encoding declaration on the first line of the file to support Chinese: # -*- coding: utf-8 -*-

Python 3 uses UTF-8 by default.
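A small sketch of what this default looks like in practice on Python 3. The sample string is an arbitrary choice, and the GBK comparison is added only to show that the same text produces different bytes under a different encoding.

```python
import sys

print(sys.getdefaultencoding())          # utf-8 on Python 3

text = "你好"                             # str: a sequence of Unicode code points
data = text.encode()                     # bytes: UTF-8 is used when no codec is given
print(type(text).__name__, len(text))    # str 2
print(type(data).__name__, len(data))    # bytes 6

# Decoding must use the same encoding that produced the bytes.
print(data.decode("utf-8") == text)      # True

gbk_data = text.encode("gbk")            # same text, different byte sequence
print(len(gbk_data))                     # 4
```

In Python 3 the distinction between str (text) and bytes (encoded data) is explicit, so encode and decode are the points where an encoding has to be chosen.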

This article is from the "Tread Footprints" blog; please keep this source attribution: http://zoucuo.blog.51cto.com/962798/1885264
