Encoding and decoding Python strings: how to fix garbled text


Last Update:2017-01-18 Source: Internet

Author: User


Why do you get the error "UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)"? This article examines that problem.

Python represents strings internally as Unicode, so encoding conversions normally use Unicode as the intermediate form: a string in some other encoding is first decoded (decode) into Unicode, and the Unicode result is then encoded (encode) into the target encoding.

decode converts a string in some other encoding into Unicode. For example, str1.decode('gb2312') converts the gb2312-encoded string str1 into Unicode.

encode converts a Unicode string into a string in some other encoding. For example, str2.encode('gb2312') converts the Unicode string str2 into gb2312.

Therefore, when converting, first work out which encoding the string is currently in, then decode it into Unicode, and only then encode it into the other encoding.
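The decode-then-encode route can be sketched as a small helper. The article targets Python 2; this sketch is written for Python 3, where bytes plays the role of Python 2's str and str plays the role of unicode. The function name transcode is hypothetical and used only for illustration.

```python
def transcode(data, source_encoding, target_encoding):
    """Convert encoded bytes from one encoding to another via Unicode."""
    text = data.decode(source_encoding)   # encoded bytes -> Unicode text
    return text.encode(target_encoding)   # Unicode text -> encoded bytes


gb_bytes = b"\xd6\xd0\xce\xc4"  # the two characters "中文" encoded as gb2312
utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")
print(utf8_bytes)
```

The source encoding has to be known in advance; the bytes themselves do not say which encoding produced them.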

By default, a plain string literal in the code has the same encoding as the code file itself.

For example: s = '中文'

If this line is in a UTF-8 file, the string is UTF-8 encoded; if it is in a gb2312 file, the string is gb2312 encoded. To convert it in this case, first use decode to turn it into Unicode, then use encode to turn it into the other encoding. When no encoding is specified explicitly, a code file is typically saved in the system's default encoding.
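To see why the file encoding matters, the following sketch encodes the same two characters both ways and compares the resulting bytes. It is written for Python 3 (the article targets Python 2), and the text is spelled with \u escapes so the example does not depend on how the example file itself is saved.

```python
text = "\u4e2d\u6587"  # the two characters "中文"

as_utf8 = text.encode("utf-8")    # what a UTF-8 source file would contain
as_gb = text.encode("gb2312")     # what a gb2312 source file would contain

print(len(as_utf8), as_utf8)
print(len(as_gb), as_gb)

# Each byte sequence decodes back to the same text only with its own codec.
print(as_utf8.decode("utf-8") == as_gb.decode("gb2312"))
```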

If the string is defined like this: s = u'中文'

then the string is explicitly Unicode, Python's internal representation, regardless of the encoding of the code file itself. To convert it, call encode directly with the target encoding; no decode step is needed.

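If a string is already Unicode, calling decode on it is an error: Python 2 first tries to encode it implicitly with the default codec, which fails for non-ASCII characters. So it is usual to check first whether the string is Unicode: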

isinstance(s, unicode)  # True if s is a unicode object

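Likewise, calling encode on a non-Unicode str raises an error when it contains non-ASCII bytes, because Python 2 first tries to decode it implicitly with the default codec.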
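The type check can be wrapped in a helper that only decodes when decoding is actually needed. This is a sketch for Python 3 (the article targets Python 2), so it tests for str where Python 2 code would test for unicode; the name to_text and the utf-8 default are assumptions made for illustration.

```python
def to_text(value, encoding="utf-8"):
    """Return Unicode text, decoding only when value is encoded bytes."""
    if isinstance(value, str):
        # Already Unicode text, so there is nothing to decode.
        return value
    return value.decode(encoding)


print(to_text("\u4e2d\u6587") == to_text(b"\xe4\xb8\xad\xe6\x96\x87"))
```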

How do I get the default encoding for a system?

#!/usr/bin/env python
# coding=utf-8
import sys
print sys.getdefaultencoding()

On English Windows XP, this program prints: ascii

In some IDEs, string output is always garbled or even raises an error. The cause is that the IDE's output console cannot display the string's encoding; it is not a problem in the program itself.
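The interpreter default and the console encoding are two separate settings, and it can help to inspect both. The sketch below is for Python 3, where the default encoding is utf-8 rather than the ascii reported by Python 2 above. The fallback to the default encoding when the output stream reports no encoding is an assumption of this sketch.

```python
import sys

# Encoding used for implicit conversions inside the interpreter.
default_encoding = sys.getdefaultencoding()

# Encoding the output stream expects; may be missing or None when redirected.
console_encoding = getattr(sys.stdout, "encoding", None) or default_encoding

print(default_encoding)
print(console_encoding)
```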

For example, run the following code in UliPad:

s = u"中文"
print s

It fails with: UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128). The reason is that UliPad's console output window on Windows XP writes its output in ASCII (the default encoding on an English system is ASCII), while the string in the code above is Unicode and contains non-ASCII characters, so the output fails.
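The same error can be reproduced without any particular console by encoding the text to ASCII explicitly. This is a sketch for Python 3 (the article targets Python 2); the variable names are illustrative.

```python
text = "\u4e2d\u6587"  # the two characters "中文"
caught = None

try:
    text.encode("ascii")  # neither character has an ASCII representation
except UnicodeEncodeError as exc:
    caught = exc
    print(exc)

# Asking the codec to substitute instead of failing avoids the exception,
# at the cost of losing the original characters.
print(text.encode("ascii", errors="replace"))
```

The position range 0-1 in the message refers to the two characters of the string, both of which fall outside the 128 ASCII code points.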

Replace the last line with: print s.encode('gb2312')

and the two characters "中文" are output correctly.

If the last line is instead changed to: print s.encode('utf8')

the output is \xe4\xb8\xad\xe6\x96\x87, which is what results when the console output window displays a UTF-8 encoded string as ASCII.
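Garbling in general comes from reading bytes with a codec other than the one that wrote them. The Python 3 sketch below shows this with latin-1, chosen here only because it accepts every byte value; it is an assumption for illustration, not the codec UliPad used.

```python
utf8_bytes = "\u4e2d\u6587".encode("utf-8")
print(utf8_bytes)

# Interpreting UTF-8 bytes with the wrong codec yields six unrelated characters.
garbled = utf8_bytes.decode("latin-1")
print(len(garbled))

# The bytes are undamaged, so reversing the wrong step recovers the text.
restored = garbled.encode("latin-1").decode("utf-8")
print(restored == "\u4e2d\u6587")
```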

unicode(str, 'gb2312') is equivalent to str.decode('gb2312'): both convert the gb2312-encoded str into Unicode.

str.__class__ shows the type of str, that is, whether it is an encoded byte string (str) or a unicode object.
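In Python 3 the corresponding pair is str(data, encoding) and data.decode(encoding), and the class of an object again tells encoded bytes apart from Unicode text. A short sketch, assuming Python 3 (the article targets Python 2):

```python
gb_bytes = b"\xd6\xd0\xce\xc4"  # the two characters "中文" encoded as gb2312

via_constructor = str(gb_bytes, "gb2312")   # constructor form
via_method = gb_bytes.decode("gb2312")      # method form

print(via_constructor == via_method)
print(gb_bytes.__class__.__name__)      # encoded data
print(via_method.__class__.__name__)    # Unicode text
```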

After all that theory, here is a cure-all recipe:


#!/usr/bin/env python
# coding=utf-8
s = "中文"

if isinstance(s, unicode):
    # s = u"中文"
    print s.encode('gb2312')
else:
    # s = "中文"
    print s.decode('utf-8').encode('gb2312')
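For readers on Python 3, the same recipe might look like the sketch below. The function name to_gb2312 is hypothetical, and the utf-8 default for the source encoding mirrors the script above, where the file is declared as UTF-8; other inputs need the matching encoding passed in.

```python
def to_gb2312(value, source_encoding="utf-8"):
    """Return value as gb2312 bytes, decoding first only when needed."""
    if isinstance(value, str):
        # Already Unicode text: encode directly.
        return value.encode("gb2312")
    # Encoded bytes: decode to Unicode first, then encode.
    return value.decode(source_encoding).encode("gb2312")


from_text = to_gb2312("\u4e2d\u6587")
from_bytes = to_gb2312(b"\xe4\xb8\xad\xe6\x96\x87")
print(from_text == from_bytes)
```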

This article is an English version of an article originally published in Chinese on aliyun.com.
