A first look at Chinese encoding and decoding in Python

Source: Internet
Author: User

A few notes, written down so I don't forget them later:


1. The default encoding in Python (Python 2 here) is ASCII

In [1]: import sys
In [2]: sys.getdefaultencoding()
Out[2]: 'ascii'


2. Set the default encoding in Python

In [1]: import sys
In [2]: reload(sys)
<module 'sys' (built-in)>
In [3]: sys.setdefaultencoding('utf-8')
In [4]: sys.getdefaultencoding()
'utf-8'
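Why the reload(sys) step is there: Python 2's site module removes sys.setdefaultencoding during startup, and reloading sys brings it back. The sketch below only checks whether the function is currently exposed. The helper name default_encoding_is_settable is made up for illustration, and on Python 3 the function does not exist at all, so the check is expected to report False there.

```python
import sys


def default_encoding_is_settable():
    """Hypothetical helper: True only if sys.setdefaultencoding is exposed.

    On a freshly started Python 2 interpreter this is False until
    reload(sys) has been called; on Python 3 the function was removed.
    """
    return hasattr(sys, "setdefaultencoding")


# Report the current default encoding and whether it can be changed.
print(sys.getdefaultencoding())
print(default_encoding_is_settable())
```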


3. The encoding declaration at the top of a Python file, # _*_ coding:utf-8 _*_, does not change Python's default encoding

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
import sys
print sys.getdefaultencoding()

When executed, it still prints ascii.


So what is the encoding declaration at the top of the file actually for?

(1) The declaration is required if the code contains Chinese comments.
(2) More capable editors (such as my Emacs) use the header declaration to decide which encoding to save the source file in.
(3) The interpreter uses the header declaration to decode the source when it initializes a Unicode object such as u"Life is too short", so the header declaration must match the encoding the file is actually stored in.
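Point (3) can be sketched in a few lines. This is an illustration under assumptions, not how the interpreter is implemented: it uses Python 3's tokenize.detect_encoding, which reads the same kind of coding header, and the sample source text and the Chinese phrase (a stand-in meaning "Life is too short") are made up for the example. It decodes the stored bytes once with the declared encoding and once with a different one (GBK).

```python
# -*- coding: utf-8 -*-
import io
import tokenize

# Assumed sample: a tiny source file with a coding header and one
# Unicode literal.
text = u"人生苦短"  # "Life is too short"
source = (
    u"#!/usr/bin/env python\n"
    u"# _*_ coding:utf-8 _*_\n"
    u's = u"' + text + u'"\n'
)

# The bytes as an editor would store them when saving as UTF-8.
stored = source.encode("utf-8")

# Read the declared encoding from the first two lines of the file.
declared, _ = tokenize.detect_encoding(io.BytesIO(stored).readline)

# Decoding with the declared (and actual) encoding restores the text.
matches = stored.decode(declared) == source

# Decoding with an encoding that differs from the storage format does not.
mismatch = stored.decode("gbk", errors="replace") == source

print(declared, matches, mismatch)
```

When the declaration and the storage format agree, the literal survives the round trip; when they differ, the decoded text no longer equals what was written.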

The points above come from this article: http://python.jobbole.com/81244/


Let's do a test:

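#!/usr/bin/env python
# _*_ coding:utf-8 _*_
import sys
print sys.getdefaultencoding()
#reload(sys)
#sys.setdefaultencoding('utf-8')
# NOTE: the two literals below presumably held Chinese (non-ASCII) text in the original post; they are shown translated
# s1 is a unicode object
s1 = u"This is a Test 1"
# s2 is a byte string (str); Python 2 decodes it with the default encoding (ascii) when needed
s2 = "This is a Test 2"
s1.encode('gbk')
s2.encode('gbk')
print s1
print s2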

Output of the test above:

ascii
Traceback (most recent call last):
  File "testunicoding.py", line, in <module>
    s2.encode('gbk')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 0: ordinal not in range(128)

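The problem is s2. It is a byte string, so before encoding it to GBK Python must first decode it to Unicode, and it does that with the default encoding, ASCII. The string contains non-ASCII bytes (0xe8 at position 0), so this implicit decode fails.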
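The failing step can be reproduced on its own by doing the decode explicitly. This is a minimal sketch: the Chinese phrase is an assumed stand-in (roughly "This is a test 2"), chosen because its UTF-8 form starts with byte 0xe8 as in the traceback, and the names raw, first_byte and error are illustrative only.

```python
# -*- coding: utf-8 -*-

# Assumed stand-in for the original Chinese literal.
text = u"这是一个测试2"

# The bytes that a source file saved as UTF-8 would hold for s2.
raw = text.encode("utf-8")

# bytearray gives integer bytes on both Python 2 and Python 3.
first_byte = bytearray(raw)[0]

# Do explicitly what s2.encode('gbk') does implicitly on Python 2 when
# the default encoding is ascii: decode the bytes as ASCII.
try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(hex(first_byte))
print(error)
```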

After changing the default encoding to utf-8:

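#!/usr/bin/env python
# _*_ coding:utf-8 _*_
import sys
print sys.getdefaultencoding()
reload(sys)
sys.setdefaultencoding('utf-8')
print sys.getdefaultencoding()
# NOTE: the two literals below presumably held Chinese (non-ASCII) text in the original post; they are shown translated
# s1 is a unicode object
s1 = u"This is a Test 1"
# s2 is a byte string (str); it is now decoded with the new default encoding (utf-8) when needed
s2 = "This is a Test 2"
s1.encode('gbk')
s2.encode('gbk')
print s1
print s2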


Execution Result:

ascii
utf-8
This is a Test 1
This is a Test 2
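An alternative that does not depend on the process-wide default is to decode byte strings explicitly before encoding them. The sketch below assumes the byte strings are UTF-8 (matching the file header); to_gbk is a hypothetical helper, not something from the original post, and the Chinese phrases are stand-ins for the test strings.

```python
# -*- coding: utf-8 -*-


def to_gbk(value, source_encoding="utf-8"):
    """Hypothetical helper: return GBK bytes for text or encoded bytes.

    Byte strings are first decoded with source_encoding (an assumption
    about how they were stored); text is encoded directly.
    """
    if isinstance(value, bytes):
        value = value.decode(source_encoding)
    return value.encode("gbk")


s1 = u"这是一个测试1"                   # text (a unicode object on Python 2)
s2 = u"这是一个测试2".encode("utf-8")   # UTF-8 bytes, like a Python 2 str literal

gbk1 = to_gbk(s1)
gbk2 = to_gbk(s2)

# Both results are GBK bytes that decode back to the original text.
print(gbk1.decode("gbk") == s1)
print(gbk2.decode("gbk") == s2.decode("utf-8"))
```

Passing the source encoding at the call site keeps the conversion visible and leaves the interpreter's default encoding untouched.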


This article is from the "Learning Notes" blog; please keep this source link when reposting: http://unixman.blog.51cto.com/10163040/1656678
