Conversion of character encoding in Python 3

Last Update:2017-05-10 Source: Internet

Author: User

a = '我很好'  # "I am fine"; in Python 3 a str already holds Unicode text

### Unicode -> GB2312
# a is already a Unicode str, so no decode() is needed; encode() produces GB2312 bytes.
unicode_gb2312 = a.encode('gb2312')
print('I am gb2312', unicode_gb2312)  # prints: I am gb2312 b'\xce\xd2\xba\xdc\xba\xc3'

### GB2312 -> UTF-8
# The bytes are GB2312, so decode them to str first (the argument to decode() is the
# encoding the bytes are currently in), then encode the str as UTF-8.
gb2312_utf8 = unicode_gb2312.decode('gb2312').encode('utf-8')
print('I am UTF-8', gb2312_utf8)  # prints: I am UTF-8 b'\xe6\x88\x91\xe5\xbe\x88\xe5\xa5\xbd'

### UTF-8 -> GBK
# The bytes are UTF-8, so decode them to str and then encode the str as GBK.
utf8_gbk = gb2312_utf8.decode('utf-8').encode('gbk')
print('I am gbk', utf8_gbk)  # prints: I am gbk b'\xce\xd2\xba\xdc\xba\xc3'

### GBK -> Unicode
# utf8_gbk holds GBK bytes; decoding them yields a str, so no encode() is needed.
utf8_unicode = utf8_gbk.decode('gbk')
print('I am unicode', utf8_unicode)  # prints: I am unicode 我很好

### Unicode -> GB18030
unicode_gb18030 = utf8_unicode.encode('gb18030')
print('I am gb18030', unicode_gb18030)  # prints: I am gb18030 b'\xce\xd2\xba\xdc\xba\xc3'
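
The conversions above all follow one pattern: decode the bytes using the encoding they are currently in, then encode the resulting str into the target encoding. A minimal sketch of that pattern as a helper is shown here; the function name `transcode` is hypothetical and is not part of the original article or of the standard library.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one character encoding to another by way of str."""
    text = data.decode(source)   # bytes -> str (Unicode text)
    return text.encode(target)   # str -> bytes in the target encoding


gb2312_bytes = '我很好'.encode('gb2312')
utf8_bytes = transcode(gb2312_bytes, 'gb2312', 'utf-8')
gbk_bytes = transcode(utf8_bytes, 'utf-8', 'gbk')
print(utf8_bytes)
print(gbk_bytes)
```

If the source bytes are not valid in the stated encoding, decode() raises UnicodeDecodeError; if the text contains a character the target encoding lacks, encode() raises UnicodeEncodeError. The sketch deliberately lets both propagate rather than guessing at a fallback.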

### Summary
# Bytes cannot be converted directly from one encoding to another: decode them to a Unicode str first, then encode that str into the desired encoding.
# The results for gb2312, gbk, and gb18030 above are identical because each later standard is a backward-compatible extension of the earlier one, so characters already present in GB2312 keep the same byte values.
# GB2312 came first, followed by GBK and then GB18030. The number of supported characters grows in that order, from nearly 7,000 in GB2312 to tens of thousands in GB18030.
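
The growth in character coverage can be checked by attempting to encode characters of increasing rarity with each codec. The sketch below is an illustration under assumptions: the helper `can_encode` and the three sample characters are chosen here and do not come from the original article.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# '我' is a common simplified character, '龍' is a traditional character,
# and '𠀀' (U+20000) lies outside the Basic Multilingual Plane.
for ch in ('我', '龍', '𠀀'):
    support = {enc: can_encode(ch, enc) for enc in ('gb2312', 'gbk', 'gb18030')}
    print(ch, support)
```

With CPython's built-in codecs, the first character should be encodable by all three, the second by gbk and gb18030 only, and the third by gb18030 only, which matches the ordering described in the summary.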

This article is an English translation of an article originally published in Chinese on aliyun.com.
