A summary of solutions to Python encoding problems

Last Update:2018-08-29 Source: Internet

Author: User

This article summarizes several cases that lead to encoding problems and explains them one by one.

Case one: Chinese output is garbled?

# Python version: 2.7.6
>>> string1 = "我爱鱼c工作室"    # Chinese for "I love Fish C Studio"
>>> string1
'\xe6\x88\x91\xe7\x88\xb1\xe9\xb1\xbcc\xe5\xb7\xa5\xe4\xbd\x9c\xe5\xae\xa4'
>>> print string1
我爱鱼c工作室
>>> string2 = "I Love FISHC"
>>> string2
'I Love FISHC'

Q: Why can't I display Chinese strings directly?

Analysis:

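In Python 2.x the default encoding is ASCII, and an ordinary string is a sequence of bytes in which ASCII uses one byte per character. One byte cannot represent the thousands of Chinese characters, so each Chinese character is stored as several bytes. When you evaluate string1 at the prompt, Python shows the raw bytes held in memory, escaping every byte outside the ASCII range; this is not an error. print sends the same bytes to the terminal, which decodes them and displays the Chinese text.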

Solution:

Use Python 3, because its strings are Unicode and its default encoding is UTF-8.
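As a minimal sketch of the Python 3 behavior (the variable names text and encoded are illustrative, and the Chinese literal is taken from the bytes displayed in the Python 2 session above): the text is a str of characters, and encoding it to UTF-8 produces exactly the bytes that the Python 2 prompt displayed.

```python
# Python 3: str holds Unicode characters, bytes holds encoded data.
text = "我爱鱼c工作室"   # Chinese for "I love Fish C Studio"

# Encoding to UTF-8 yields the byte sequence the Python 2 prompt showed.
encoded = text.encode("utf-8")

print(encoded)                   # the escaped bytes, as in the Python 2 session
print(len(text), len(encoded))   # 7 characters, 19 bytes
```

Each Chinese character occupies three bytes in UTF-8 and the letter c occupies one, which is why the byte count is larger than the character count.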

Extended Knowledge:

1. You can obtain the current default encoding as follows:

>>> import sys
>>> sys.getdefaultencoding()
'ascii'

2. Character sets and character encodings explained in detail
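For comparison, the check from item 1 can be run on Python 3. This is only a sketch; the name default_encoding is illustrative.

```python
import sys

# On Python 3 the default string encoding is UTF-8 rather than ASCII.
default_encoding = sys.getdefaultencoding()
print(default_encoding)
```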

Case two: concatenating an ordinary string and a Unicode string raises UnicodeDecodeError

>>> string = "我爱" + u"FISHC"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 0: ordinal not in range(128)

Analysis:

The + operator is concatenating an ordinary string on the left with a Unicode string on the right. When the two types are mixed, Python automatically converts the ordinary string on the left to a Unicode string before concatenating, and it uses the default ASCII codec to do so. The string "我爱" ("I love") is stored as the bytes '\xe6\x88\x91\xe7\x88\xb1', and the first byte, hexadecimal '\xe6', has the value 230. Unicode and ASCII agree for values 0 to 127, so converting those bytes is no problem. A byte with a value above 127 is not valid ASCII and cannot be converted directly to Unicode, so Python raises UnicodeDecodeError.
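The failing conversion can be reproduced explicitly. The following Python 3 sketch (the names raw, failed_at, and decoded are illustrative) decodes the same bytes with the ASCII codec, which is what Python 2 attempts implicitly, and then with UTF-8.

```python
raw = b"\xe6\x88\x91\xe7\x88\xb1"   # the bytes quoted in the analysis above

print(raw[0])   # 230, outside the ASCII range 0..127

try:
    raw.decode("ascii")    # the conversion Python 2 performs before concatenating
    failed_at = None
except UnicodeDecodeError as exc:
    failed_at = exc.start  # index of the first byte that could not be decoded
    print(exc.reason)

# Naming the correct encoding makes the same bytes decode cleanly.
decoded = raw.decode("utf-8")
print(len(decoded))        # 2 characters
```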

Solution:

1. Using Python3

2. Specify the decoding method to convert to Unicode:

>>> string = "I Love". Decode (' utf-8 ') + u "FISHC"
>>> Print String
I love FISHC.

3. Encode the Unicode string part:

>>> string = "I love" + U "FISHC". Encode ("Utf-8")
>>> Print String
I love FISHC.
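The same three solutions can be sketched in Python 3, where bytes and str are distinct types and nothing is converted implicitly. The variable names below are illustrative, not taken from the original article.

```python
raw = "我爱".encode("utf-8")   # bytes, playing the role of the Python 2 ordinary string
suffix = "FISHC"               # str, that is, Unicode text

# Python 3 never decodes implicitly: mixing the two types raises TypeError.
try:
    raw + suffix
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Solution 2: decode the bytes, then concatenate text with text.
as_text = raw.decode("utf-8") + suffix

# Solution 3: encode the text, then concatenate bytes with bytes.
as_bytes = raw + suffix.encode("utf-8")

print(mixed_ok)
print(as_bytes)
```

Because the mismatch is rejected immediately with a TypeError, the error no longer depends on whether the bytes happen to fall inside the ASCII range.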

Extended Knowledge:

Unicode was invented to unify the character encodings of all countries, which is why it is also called the universal code. It assigns a unique code to every character of every language, so a corresponding code can be found in Unicode no matter which country's script is involved. Unicode can therefore serve as an "intermediary" when converting between different encoding systems.

Converting from another encoding system to Unicode is called decoding (decode), and converting from Unicode to another encoding system is called encoding (encode). For example, to convert data from encoding A to encoding B, the process is as follows:

A-encoded -> decode(A) -> Unicode -> encode(B) -> B-encoded
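The conversion chain can be written as a small Python 3 helper. This is a sketch only: transcode is a hypothetical function name, and GB2312 and UTF-8 stand in for encodings A and B.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one encoding to another, using Unicode as the intermediary."""
    text = data.decode(source)    # A-encoded bytes -> Unicode
    return text.encode(target)    # Unicode -> B-encoded bytes


gb_bytes = "我爱".encode("gb2312")                    # sample data in encoding A
utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")   # the same text in encoding B

print(gb_bytes)
print(utf8_bytes)
```

The two byte sequences differ in both content and length, yet they decode to the same Unicode text.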

Case three: the file's encoding differs from Python's default encoding

test.txt is saved in GB2312 encoding and contains the following text (Chinese in the original):

I love fish c studio, really!

The content of test.py is as follows:

f1 = open("test.txt")
print(f1.read())
f1.close()

Running the code raises an error:

>>>
Traceback (most recent call last):
  File "/users/fishc/documents/python/test.py", line 4, in <module>
    print(f1.read())
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/encodings/ascii.py", line +, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128)

Analysis:

If you understood the material above, an encoding problem like this will no longer stump you.

The encoding that open uses when none is given depends on the system (you can check it with locale.getpreferredencoding). Read the error message carefully: the system used ASCII to decode the file's contents and hit a byte it could not decode. Since we know the file is encoded in GB2312, passing encoding="gb2312" when opening the file solves the problem:

f1 = open("test.txt", encoding="gb2312")
print(f1.read())
f1.close()
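The fix can be tried end to end with the following Python 3 sketch. It makes some assumptions: the file is created in a temporary directory rather than next to the script, its content is a short Chinese sample rather than the article's full text, and the variable names are illustrative.

```python
import locale
import os
import tempfile

# Hypothetical setup: create a GB2312-encoded file standing in for test.txt.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w", encoding="gb2312") as f:
    f.write("我爱")

# The locale's preferred encoding; the result depends on the platform.
print(locale.getpreferredencoding(False))

# Inspect the raw bytes: the first one is 0xce, the byte named in the traceback.
with open(path, "rb") as f:
    first_byte = f.read()[0]
print(hex(first_byte))

# Passing the encoding explicitly removes the dependence on the system default.
# The with statement also closes the file, even if read() raises.
with open(path, encoding="gb2312") as f:
    content = f.read()
print(len(content))
```

Using with instead of a separate close call is a design choice that guards against the file being left open when decoding fails.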


This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
