Python Day 10: Character Encoding

Source: Internet
Author: User

1. Review:

Software → operating system → hardware

2. Text Editor:

Startup: HDD → memory → run (CPU)

Read file: HDD → memory → read by the CPU

Save file: memory → HDD

3. Python interpreter:

Startup: HDD → memory → run (CPU)

Read file: HDD → memory → read by the CPU

(These two phases are the same as in a text editor; the third phase is different because it involves Python syntax.)

Interpret and execute: in this phase the interpreter allocates new space in memory.

4. Character encoding: as the name implies, the encoding of characters

① The role of character encoding: a character encoding table is the standard that maps characters people can read to binary values the computer can process.
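As a minimal sketch of that mapping, the snippet below looks up the number each character is assigned and prints it as 8 binary digits. The helper name `to_binary` is made up for this illustration, and the 8-bit formatting assumes characters in the ASCII range.

```python
def to_binary(text):
    """Return the 8-bit binary form of each character's code (ASCII range assumed)."""
    return [format(ord(ch), "08b") for ch in text]


bits = to_binary("Hi")
print(bits)  # ['01001000', '01101001']
```

`ord()` gives the number the table assigns to a character, and `chr()` goes the other way.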

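② Different character encoding tables

ASCII

Stores 1 character in 1 byte. One byte has 8 bits, so it can hold 2**8 = 256 values, although standard ASCII defines only 128 of them (7 bits).

GBK

Uses 2 bytes for 1 Chinese character (ASCII characters still take 1 byte); 2 bytes allow at most 2**16 = 65536 combinations.

Unicode (universal code)

Covers the characters of all languages. In the fixed-width 2-byte form (UCS-2) described in these notes, every character takes 2 bytes, which wastes space when saving English characters.

UTF-8

A variable-length encoding of Unicode: 1 byte for English characters, 3 bytes for common Chinese characters (1 to 4 bytes per character overall).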
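The byte counts above can be checked directly. This is a small sketch, not part of the original notes: it uses Python's `utf-16-le` codec as a stand-in for the fixed 2-byte Unicode form, and writes the Chinese character as an escape (`\u4e2d` is 中) so the snippet does not depend on the file's own encoding.

```python
samples = {"english": "A", "chinese": "\u4e2d"}  # "\u4e2d" is the character 中
encodings = ["gbk", "utf-16-le", "utf-8"]

# Number of bytes each sample character needs in each encoding.
sizes = {
    (name, enc): len(ch.encode(enc))
    for name, ch in samples.items()
    for enc in encodings
}

for (name, enc), n in sizes.items():
    print(name, enc, n)

# ASCII has no code for Chinese characters at all.
try:
    samples["chinese"].encode("ascii")
    ascii_can_store_chinese = True
except UnicodeEncodeError:
    ascii_can_store_chinese = False
```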

5. In memory the computer uses Unicode (fast); on the hard disk it uses UTF-8 (small footprint, stable transmission).

6. Saving a file: Unicode in memory → encode → UTF-8 (or another character encoding) on the hard disk

Reading a file: UTF-8 (or another character encoding) on the hard disk → decode → Unicode in memory
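The save and read procedures can be sketched with `open()` and its `encoding` argument. The file name `demo.txt` and the temporary directory are assumptions made only for this example.

```python
import os
import tempfile

text = "hello \u4e2d\u6587"  # "hello " followed by the characters 中文

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical file location

# Save: the in-memory string is encoded to UTF-8 bytes on its way to disk.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# What is actually stored on disk is bytes, not characters.
with open(path, "rb") as f:
    raw = f.read()

# Read: the bytes on disk are decoded back into an in-memory string.
with open(path, "r", encoding="utf-8") as f:
    loaded = f.read()

print(len(text), len(raw))  # 8 12
```

The string has 8 characters but takes 12 bytes on disk, because each of the two Chinese characters needs 3 bytes in UTF-8.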

7. Summary of the above:

① Whatever encoding was used to save (encode) a file must also be used to read (decode) it.

② The default source encoding of the Python 3 interpreter is UTF-8. It can be changed with a declaration at the top of the file: # coding: gbk (or another encoding)

③ The default source encoding of the Python 2 interpreter is ASCII. It can be changed the same way: # coding: utf-8 (or another encoding)
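Point ① of the summary can be illustrated by encoding with one table and decoding with another. This is a sketch with sample data chosen for the example; other byte sequences may decode without an error and silently produce garbled text instead.

```python
text = "\u4e2d\u6587"          # the characters 中文
data = text.encode("gbk")      # "saved" as GBK

# Decoding with the same encoding restores the original text.
same_encoding_ok = data.decode("gbk") == text

# Decoding with a different encoding fails for these bytes.
try:
    data.decode("utf-8")
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

print(same_encoding_ok, mismatch_failed)  # True True
```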

8. The Python interpreter handles strings in the third phase: when execution encounters a string, new memory space is allocated for it.

In Python 3, a string (str) is held in memory as Unicode, whereas a string (str) in Python 2 is the result of an encode, that is, bytes.

9. unicode → encode → bytes

bytes → decode → unicode

10. There are two string types in Python 3:

① str (Unicode; handled automatically by the interpreter)

② bytes (unicode → encode → bytes; controlled by the programmer)
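A short Python 3 sketch of the two types follows. The variable names are chosen only for this example.

```python
text = "abc"                  # str: Unicode text, created by the interpreter
data = text.encode("utf-8")   # bytes: produced explicitly by calling encode()
literal = b"abc"              # bytes can also be written as a literal

first_char = text[0]          # indexing a str gives a 1-character str
first_byte = data[0]          # indexing bytes gives an integer

# The two types cannot be mixed without an explicit encode or decode.
try:
    text + data
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(type(text).__name__, type(data).__name__)  # str bytes
```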

11. There are two string types in Python 2:

① str = bytes (unicode → encode → bytes; handled automatically by the interpreter)

② u'string' (the unicode type, equivalent to str in Python 3)

12. Why bytes are needed:

The most basic signal a computer transmits is binary, which is to say bytes, so data must be converted to bytes before it is transmitted.
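One place this shows up in Python 3 is the standard `socket` module, which accepts only bytes. The sketch below uses `socket.socketpair()` to get two connected sockets inside one process; the message and buffer size are assumptions for the example, and a real program would loop on `recv()` because a stream socket may deliver data in pieces.

```python
import socket

sender, receiver = socket.socketpair()
try:
    message = "hi \u4e2d"  # "hi " followed by the character 中

    # Text must be encoded to bytes before it can be sent.
    sender.sendall(message.encode("utf-8"))
    received = receiver.recv(1024)
    decoded = received.decode("utf-8")

    # Passing a str directly is rejected.
    try:
        sender.sendall(message)
        str_rejected = False
    except TypeError:
        str_rejected = True
finally:
    sender.close()
    receiver.close()

print(decoded == message, str_rejected)  # True True
```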
