Python Basics: Character Encoding in Python

Last Update:2017-05-14 Source: Internet

Author: User

First, some history. Be warned that this article rambles: like the proverbial old lady's foot-binding cloth, it is long and smelly.

Encoding history:

1. A computer can only process numbers, so text has to be converted to numbers before
it can be processed. 8 bits == 1 byte, so the largest number a single byte can represent is 255.

2. Americans invented the computer. In English, every character can be represented by a single byte.
ASCII (one byte per character) is the American standard encoding.

3. When Chinese users started using computers, they needed to represent Chinese characters, so
the GB2312 encoding was defined, which uses two bytes for each Chinese character. Other countries likewise
created encodings for their own languages. There was no common standard, so text read with
the wrong encoding shows up as garbled characters.

4. To unify the standards, Unicode appeared: it brings all languages together into a single set of codes.
Comparison between Unicode and ASCII:
1) The letter A: in ASCII it is decimal 65, binary 0100 0001.
The Chinese character 中 cannot be encoded in ASCII; in Unicode it is decimal 20013, binary 01001110 00101101.
2) So that every character has the same length, A is padded with leading zeros in Unicode: 00000000 0100 0001.
With that, the standard is unified.
5. With a unified standard the garbled-text problem is solved, but a Unicode encoding is longer, and most computer text is English.
If the content is all English, the two-byte Unicode encoding needs twice the storage space of ASCII, and twice as much data has to be transmitted.
How can this be solved?

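6. The answer is to make the encoding variable in length, and that is UTF-8.
In UTF-8 an English letter takes 1 byte, a Chinese character usually 3 bytes, and particularly rare characters 4 bytes (the original design allowed up to 6, but the current standard stops at 4).
This saves both storage space and transmission bandwidth.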

7. That raises the next question: in memory the computer works with Unicode,
so how is text converted to and from UTF-8?

When text needs to be processed, it is loaded into memory, and the encoding used there is Unicode.
When text is sent over the network or stored in a file, UTF-8 is used to save space.
So the two have to be converted into each other.
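
The history above can be checked in a few lines of Python 3. This is a minimal sketch rather than part of the original article: the helper name `describe` and the sample characters are chosen here for illustration, and UTF-16 stands in for the fixed two-byte "Unicode encoding" that the text describes.

```python
# Python 3 sketch: code points, encoded sizes, and the Unicode <-> UTF-8 round trip.

def describe(ch):
    """Return (code point, UTF-16 size in bytes, UTF-8 size in bytes) for one character."""
    return ord(ch), len(ch.encode("utf-16-be")), len(ch.encode("utf-8"))

# 'A' is code point 65: 2 bytes as two-byte Unicode, but only 1 byte in UTF-8.
print(describe("A"))               # (65, 2, 1)

# The character 中 is code point 20013 and takes 3 bytes in UTF-8.
print(describe("中"))              # (20013, 2, 3)
print(format(ord("中"), "016b"))   # 0100111000101101

# In memory the text is a Unicode str; for storage or transfer it becomes UTF-8 bytes.
text = "A中"
stored = text.encode("utf-8")      # str -> bytes
loaded = stored.decode("utf-8")    # bytes -> str
print(stored)                      # b'A\xe4\xb8\xad'
print(loaded == text)              # True
```

For English-only text the UTF-8 form is half the size of the two-byte form, which is the saving that item 6 refers to.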

Encoding conversion in Python 2 and Python 3 on Windows and Linux

Python 2:

In Windows:

1. First check the default encoding on Windows
import sys
sys.getdefaultencoding()
#out: "utf-8"   (note: an unmodified Python 2 reports "ascii" on every platform; "utf-8" is the value Python 3 reports)
2. A string that is all English
s1 = "abc"     # type(s1): str
s2 = u"abc"    # type(s2): unicode
Meaning of u"": the string that follows is stored as a Unicode object
s1.encode("utf8")    # succeeds
s2.encode("utf8")    # succeeds

3. When Chinese is present:
s1 = "你好"     # GB2312-encoded bytes on Windows
s2 = u"你好"
s1.encode("utf8")    # error
s2.encode("utf8")    # succeeds

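Cause of the error:
Text is held in memory as Unicode, but
s1 is not a Unicode object when it is passed in (storing it as Unicode would waste space), and
encode() works by converting a Unicode object into the encoding named in its argument.
Python 2 therefore first tries to decode s1 implicitly with the default encoding, and that step fails on the Chinese bytes.
s2 is already Unicode, so it does not raise an error.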
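
What Python 2 does behind `s1.encode("utf8")` can be imitated in Python 3, where byte strings and text are separate types. This is only a sketch under assumptions: the helper `py2_style_encode` is hypothetical, it assumes the default codec is ASCII, and the sample text 你好 is chosen here for illustration.

```python
def py2_style_encode(raw, target="utf8", default="ascii"):
    """Mimic Python 2's str.encode: implicitly decode the bytes with the
    default codec first, then encode the resulting text to the target."""
    return raw.decode(default).encode(target)

# Pure-ASCII bytes survive the implicit decode, so the call succeeds.
print(py2_style_encode(b"abc"))  # b'abc'

# GB2312 bytes for Chinese text are not valid ASCII, so the implicit decode fails.
raw_gb = "你好".encode("gb2312")
try:
    py2_style_encode(raw_gb)
except UnicodeDecodeError as exc:
    print("error:", exc.reason)
```

The failure is a decode error even though the code asked for an encode, which is why the Python 2 message can look confusing.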

Workaround:
First convert the GB2312-encoded string into a Unicode object,
and then encode that into UTF-8.
s1.decode("gb2312").encode("utf8")    # succeeds; Chinese text on Windows is "gb2312"
The decode("XX") method converts an object encoded as "XX"
into a Unicode object
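
The decode-then-encode workaround can be sketched in Python 3, where `bytes` plays the role of the Python 2 `str`. The function name `transcode` and the sample text are assumptions made for this illustration.

```python
def transcode(raw, source, target="utf-8"):
    """Decode bytes from the source encoding to text, then encode to the target."""
    return raw.decode(source).encode(target)

gb_bytes = "你好".encode("gb2312")       # what a GB2312 console or file would hold
utf8_bytes = transcode(gb_bytes, "gb2312")

print(len(gb_bytes), len(utf8_bytes))    # 4 6
print(utf8_bytes.decode("utf-8"))        # 你好
```

The same two characters take 4 bytes in GB2312 and 6 bytes in UTF-8; only the Unicode text in the middle is common to both.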

Under Linux:

1. First check the default encoding on Linux
import sys
sys.getdefaultencoding()
#out: "ascii"
2. A string that is all English
s1 = "abc"     # type(s1): str
s2 = u"abc"    # type(s2): unicode
Meaning of u"": the string that follows is stored as a Unicode object
s1.encode("utf8")    # succeeds
s2.encode("utf8")    # succeeds
3. When Chinese is present:
s1 = "你好"     # UTF-8 encoded. Why not ASCII, the default on Linux? ASCII cannot represent Chinese,
               # so the bytes must already have been converted to UTF-8.
s2 = u"你好"
s1.encode("utf8")    # error
s2.encode("utf8")    # succeeds
Workaround:
First convert the UTF-8 encoded string into a Unicode object,
and then encode that into UTF-8.
s1.decode("utf8").encode("utf8")    # succeeds; Chinese text on Linux is "utf-8"
This amounts to converting s1 and converting it straight back, since it was UTF-8 to begin with.
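
The Windows and Linux workarounds differ only in the codec passed to decode, and it has to match the encoding the bytes are really in. A small Python 3 sketch of that point follows; the sample text is an assumption, and `errors="replace"` is used only so the wrong decode produces garbled text instead of stopping with an exception.

```python
text = "你好"
utf8_bytes = text.encode("utf-8")   # what a UTF-8 terminal on Linux would hand over

# Right codec: decoding and re-encoding returns the very same bytes.
round_trip = utf8_bytes.decode("utf-8").encode("utf-8")
print(round_trip == utf8_bytes)     # True

# Wrong codec: the same bytes read as GB2312 no longer give the original text.
garbled = utf8_bytes.decode("gb2312", errors="replace")
print(garbled == text)              # False
```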

Python 3:

In Python 3, every str is a Unicode string and can be encoded directly to "utf-8"

In Windows:
1. A string that is all English
s1 = "abc"     # type(s1): str
s2 = u"abc"    # type(s2): str (Python 3 has no separate unicode type)
Meaning of u"": in Python 3 the prefix is accepted but changes nothing, because every str is already Unicode
s1.encode("utf8")    # succeeds
s2.encode("utf8")    # succeeds
2. When Chinese is present:
s1 = "你好"     # Unicode on Windows
s2 = u"你好"    # no need to write the u""; without it Python 3 still treats the string as Unicode
s1.encode("utf8")    # succeeds
s2.encode("utf8")    # succeeds
Under Linux: the same as in Windows
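
The Python 3 behaviour can be confirmed directly. This is a short sketch with an assumed sample string; the `u` prefix is accepted in Python 3.3 and later.

```python
s1 = "你好"
s2 = u"你好"   # the u prefix is still accepted but changes nothing

print(type(s1) is type(s2) is str)   # True
encoded = s1.encode("utf8")
print(type(encoded).__name__)        # bytes
print(encoded.decode("utf8") == s2)  # True

# bytes has no encode method in Python 3, so the Python 2 pitfall of
# encoding an already-encoded string cannot happen silently.
print(hasattr(encoded, "encode"))    # False
```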

Summary: about # -*- coding: utf-8 -*-

The biggest difference between Python 2 and Python 3:
In Python 2, when a file contains Chinese, this line must be added at the top of the file, and string literals must be written with u""
Role:
It tells Python that the file is encoded in UTF-8 and should be read according to that encoding.
The conversion to Unicode is then done internally
Why it does not need to be written in Python 3:
Python 3 reads source files as UTF-8 by default and treats every string as Unicode
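
How Python 3 picks the encoding of a source file can be observed with the standard library's `tokenize.detect_encoding`. The sketch below is illustrative only: the helper name `declared_encoding` and the two sample sources are assumptions, not taken from the article.

```python
import io
import tokenize

def declared_encoding(source_bytes):
    """Return the encoding Python 3 would use to read this source code."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding

with_cookie = b"# -*- coding: gb2312 -*-\nprint('hi')\n"
without_cookie = b"print('hi')\n"

print(declared_encoding(with_cookie))     # gb2312
print(declared_encoding(without_cookie))  # utf-8
```

With no declaration the answer is UTF-8, which is why the line can be left out in Python 3 as long as the file really is saved as UTF-8.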

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
