Python encoding type conversion methods

Last Update:2016-07-23 Source: Internet

Author: User


This article describes how to convert text between encodings in Python 2. The details are as follows:

1: Python and Unicode

To process multilingual text correctly, Python introduced Unicode strings (the unicode type) in version 2.0.

2: print in Python

Although Python processes text internally as Unicode, the terminal only receives traditional byte strings (Python 2 str objects). The print statement cannot send Unicode characters to the terminal directly; it must first encode them to bytes.

When writing to the console, print automatically encodes a unicode object using the console's encoding, while a byte string in any other encoding is output as is. The write method of a file object does not apply this console-based conversion. A string that prints correctly can therefore fail, or come out differently, when it is passed to write.

On Linux, the encoding used for this conversion is taken from the locale environment variables, which the locale command displays. The print statement hands the resulting bytes to the operating system, and the terminal interprets that byte stream according to the system encoding.
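The settings mentioned above can be inspected from Python itself. The following is a minimal sketch, not part of the original article; the function name describe_encodings and the dictionary keys are chosen here for illustration. It uses only standard library calls and is written to run on both Python 2 and Python 3.

```python
from __future__ import print_function

import locale
import sys


def describe_encodings():
    """Return the encodings that influence console output.

    The dictionary keys are names made up for this sketch.
    """
    return {
        # Encoding used for console output; may be None when output is piped.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding derived from the locale environment variables.
        "locale": locale.getpreferredencoding(),
        # Codec Python uses for implicit byte/text conversions.
        "default": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for name, value in sorted(describe_encodings().items()):
        print(name, value)
```

If the "stdout" and "locale" values differ from the encoding of the bytes being printed, garbled output on the terminal is the likely result.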

>>> Str = 'learn python' >>> str '\ xe5 \ xad \ xa6 \ xe4 \ xb9 \ xa0python' # asII encoding >>> print learn python >>> str = u'learn python'> str #### unicode encoding '\ xe5u \ xad \ xa6 \ xe4 \ xb9 \ xa0python'
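Because a file's write method does not follow the console encoding the way print does, one common approach is to choose the file encoding explicitly. The sketch below assumes UTF-8 as the target encoding; the helper names write_text and read_bytes are hypothetical and not from the original article. It relies on io.open, which is available in both Python 2 and Python 3.

```python
from __future__ import print_function

import io
import os
import tempfile

# Same text as in the examples, written with escapes to keep the source ASCII.
TEXT = u"\u5b66\u4e60python"


def write_text(path, text, encoding="utf-8"):
    """Write unicode text to path, encoding it explicitly."""
    # io.open encodes on write, so the result does not depend on the locale.
    with io.open(path, "w", encoding=encoding) as handle:
        handle.write(text)


def read_bytes(path):
    """Return the raw bytes stored in the file."""
    with io.open(path, "rb") as handle:
        return handle.read()


if __name__ == "__main__":
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        write_text(path, TEXT)
        print(repr(read_bytes(path)))
    finally:
        os.remove(path)
```

Reading the file back in binary mode shows the bytes that were actually stored, independent of how the terminal would display them.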

3: decode in Python

decode converts a byte string in another character set to unicode. Only non-ASCII text, such as Chinese characters, needs an explicit conversion.

    >>> s = '学习'
    >>> ustr = s.decode('utf-8')
    >>> ustr
    u'\u5b66\u4e60'

After this conversion the Chinese text is a unicode object that Python can process further. Without an explicit conversion, Python 2 falls back to an implicit conversion with its default codec (ASCII unless the installation changes it), which can raise UnicodeDecodeError or produce garbled characters.
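What happens when the wrong codec is used can be shown with a small helper. This is a sketch under stated assumptions: the function name to_unicode and the choice to fall back to the "replace" error handler are illustrative, not something the original article prescribes.

```python
from __future__ import print_function

# UTF-8 bytes of the two Chinese characters used in the examples.
UTF8_BYTES = b"\xe5\xad\xa6\xe4\xb9\xa0"


def to_unicode(raw, encoding="utf-8"):
    """Decode a byte string, falling back to replacement characters."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Keep going, but mark undecodable input with U+FFFD.
        return raw.decode(encoding, "replace")


if __name__ == "__main__":
    # Correct codec: the two characters are recovered.
    print(repr(to_unicode(UTF8_BYTES)))
    # Wrong codec: the text comes back as replacement characters.
    print(repr(to_unicode(UTF8_BYTES, "ascii")))
```

Whether to replace, ignore, or raise on bad input depends on the application; raising is usually the safer default when data loss matters.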

4: encode in Python

encode converts a unicode string to a byte string in another character set.

    >>> s = '学习'
    >>> ustr = s.decode('utf-8')
    >>> ustr
    u'\u5b66\u4e60'
    >>> ustr.encode('utf-8')
    '\xe5\xad\xa6\xe4\xb9\xa0'
    >>> print ustr.encode('utf-8')
    学习
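Combining decode and encode gives a way to convert bytes from one character set to another, with unicode as the intermediate form. The following is a minimal sketch; the function name transcode and the use of GBK as the second character set are assumptions made for illustration.

```python
from __future__ import print_function


def transcode(raw, source, target):
    """Re-encode a byte string from one character set to another."""
    # Decode to unicode first, then encode to the target character set.
    return raw.decode(source).encode(target)


if __name__ == "__main__":
    utf8_bytes = b"\xe5\xad\xa6\xe4\xb9\xa0"
    gbk_bytes = transcode(utf8_bytes, "utf-8", "gbk")
    print(repr(gbk_bytes))
    # Converting back recovers the original bytes.
    print(transcode(gbk_bytes, "gbk", "utf-8") == utf8_bytes)
```

The round trip only succeeds when both character sets can represent every character in the text; otherwise encode raises UnicodeEncodeError.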

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
