Python print Encoding

Last Update:2018-12-03 Source: Internet

Author: User

In Python 2, print automatically encodes the text it outputs, but the write method of a file object does not. As a result, a string that prints correctly may fail, or come out differently, when it is written to a file.
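To make the automatic conversion concrete, here is a minimal sketch that captures the bytes print produces for the same text on streams with different encodings. It assumes Python 3 (the article's own snippets target Python 2), uses an in-memory io.TextIOWrapper as a stand-in for the terminal, and the helper name printed_bytes is hypothetical.

```python
import io


def printed_bytes(text, encoding):
    """Return the bytes print() emits for `text` on a stream using `encoding`."""
    raw = io.BytesIO()
    # newline="\n" disables platform-specific newline translation.
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


# Two Chinese characters, written as escapes so the source file stays ASCII.
sample = u"\u4e2d\u6587"

for name in ("utf-8", "gb18030"):
    # The same text becomes different bytes depending on the stream encoding.
    print(name, printed_bytes(sample, name))
```

The text is identical in both calls; only the encoding attached to the output stream changes the bytes that are written.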
The encoding that print converts to depends on the environment. On Windows XP the text is converted to GBK. On Linux the encoding is chosen according to the locale environment variables, which you can inspect with the locale command. For example, mine are:
[zhaowei@papaya zhaowei]$ locale
LANG=zh_CN
LC_CTYPE="zh_CN"
LC_NUMERIC="zh_CN"
LC_TIME="zh_CN"
LC_COLLATE="zh_CN"
LC_MONETARY="zh_CN"
LC_MESSAGES="zh_CN"
LC_PAPER="zh_CN"
LC_NAME="zh_CN"
LC_ADDRESS="zh_CN"
LC_TELEPHONE="zh_CN"
LC_MEASUREMENT="zh_CN"
LC_IDENTIFICATION="zh_CN"
LC_ALL=
With these settings the encoding is taken to be gb2312. In Python, you can use the locale module to obtain the encoding of the current environment:

import locale

print(locale.getdefaultlocale())
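As a complementary sketch, the following collects the two encodings most relevant to output: the locale's preferred encoding and the encoding attached to standard output. It assumes Python 3; the function name output_encodings is hypothetical, and sys.stdout.encoding may be None when output is redirected on some setups.

```python
import locale
import sys


def output_encodings():
    """Collect the encodings that influence how text is printed."""
    return {
        # Encoding derived from the locale settings (LANG, LC_ALL, ...).
        "locale_preferred": locale.getpreferredencoding(False),
        # Encoding print() actually uses for standard output, if known.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


print(output_encodings())
```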

print automatically encodes the string with this encoding on output. Consider the following example. The string contains a well-known character that does not exist in gb2312, so converting it to gb2312 raises an error.

# -*- coding: gb18030 -*-
import locale
import sys, encodings, encodings.aliases

# a is now a Unicode string. U+9555 is an assumed stand-in for the
# article's original text: a character that GBK has but gb2312 lacks.
a = u'\u9555'

print(a.encode("gb2312"))  # raises UnicodeEncodeError

The code above raises an exception for exactly this reason. However, print a outputs the string directly, provided the encoding of your environment is GBK, gb18030, or UTF-8. If the encoding of your environment is gb2312, that print raises an error as well. So when processing text that comes from elsewhere, it is best to avoid gb2312; for Chinese data, use gb18030 or UTF-8.
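The difference between these codecs can be checked directly. The sketch below tests whether one character can be encoded by each codec; the helper name encodable is hypothetical, and the sample character U+9555 is an assumed example of a character that GBK includes but gb2312 does not.

```python
def encodable(text, codec):
    """Return True if `text` can be encoded with `codec` without error."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# Assumed sample: U+9555, present in GBK and gb18030 but not in gb2312.
sample = u"\u9555"

report = {
    codec: encodable(sample, codec)
    for codec in ("gb2312", "gbk", "gb18030", "utf-8")
}
print(report)
```

Running a check like this over a sample of real input is one way to confirm that a target encoding covers the data before committing to it.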
Writing Unicode data with the write method of a file object also leads to an error. The string must be encoded explicitly first.

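# -*- coding: gb18030 -*-
import locale
import sys, encodings, encodings.aliases

# a is now a Unicode string. U+9555 is an assumed stand-in for the
# article's original text: a character that GBK has but gb2312 lacks.
a = u'\u9555'

# write() does no conversion, so encode explicitly before writing.
# The argument is missing in the original; encoding to gb18030 is
# assumed here, following the recommendation above.
f = open("aaa.txt", "wb")
f.write(a.encode("gb18030"))
f.close()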
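An alternative to encoding by hand is to let the file object do the conversion. The sketch below uses io.open with an explicit encoding argument, which is available in Python 2.6+ and Python 3. The helper names write_text and read_text, the temporary path, and the sample character are all assumptions made for illustration.

```python
import io
import os
import tempfile


def write_text(path, text, encoding="gb18030"):
    """Write Unicode text to `path`, encoding it on the way out."""
    with io.open(path, "w", encoding=encoding) as f:
        f.write(text)


def read_text(path, encoding="gb18030"):
    """Read `path` back, decoding it with the same encoding."""
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


# Assumed sample: U+9555, a character outside gb2312.
sample = u"\u9555"

# A throwaway location is used so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "aaa.txt")
write_text(path, sample)
print(read_text(path) == sample)
```

Stating the encoding at the point where the file is opened keeps the choice visible and avoids depending on the environment's default.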

This article is an English version of an article originally published in Chinese on aliyun.com.
