Python encoding problems: from dull ache to removing the root cause

Source: Internet
Author: User
Reference links

Why is Python encoding so painful?

Python 2.7 manual, str function

Default encoding of Python source files and Python's internal default encoding

1. A source file is treated as ASCII by default. If you do not declare the encoding the file is written in, Python 2 parses it as ASCII, and any UTF-8 encoded non-ASCII character in the source file causes an error, because ASCII cannot represent those bytes.

File test.py, saved as UTF-8 ('好' means "good"):

    a = 'a'
    b = '好'

Running it produces:

    SyntaxError: Non-ASCII character '\xe5' in file test.py on line 2, but no encoding declared; see http: for details

The error says that the character '\xe5' is non-ASCII. '\xe5' is the first byte of the UTF-8 byte string for '好'.

As shown below, on a terminal encoded as UTF-8:

    >>> a = '好'
    >>> a
    '\xe5\xa5\xbd'

So whatever encoding the source file is edited in, be sure to declare it. For example:

    # coding=utf-8  # declare the encoding here first
    # the program code goes here
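To see how a declaration like this is picked up, here is a minimal sketch using Python 3's standard `tokenize.detect_encoding`. It is only an illustration under assumptions: the helper name `declared_encoding` and the two sample sources are made up for this example, and Python 3 falls back to UTF-8 when nothing is declared, whereas the Python 2.7 behavior described in this article falls back to ASCII.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the source encoding that Python 3's tokenizer detects (hypothetical helper)."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# A source with an explicit declaration on its first line, and one without.
with_declaration = b"# coding=gbk\nb = '\xba\xc3'\n"
without_declaration = b"a = 'a'\n"

print(declared_encoding(with_declaration))     # gbk
print(declared_encoding(without_declaration))  # utf-8 (Python 3 default; Python 2 assumed ASCII)
```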

2. Note that when typing code interactively in a terminal (command-line mode), there is no need to declare the encoding. My guess is that in interactive mode Python reads the system's current encoding directly.

For example, under Windows cmd, look at the character '好' on a terminal encoded as GBK:

    >>> a = '好'
    >>> a
    '\xba\xc3'

You can see that in cmd '好' is two bytes, unlike the three bytes on the UTF-8 terminal above.
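The two transcripts can be reproduced side by side. The following is a small Python 3 sketch (not the article's Python 2.7 session) in which text and bytes are separate types, so the encoding step is explicit. The variable names are illustrative.

```python
# The same character serializes to different byte strings
# depending on which encoding is used.
text = "\u597d"  # the character 好

utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

print(utf8_bytes, len(utf8_bytes))  # b'\xe5\xa5\xbd' 3
print(gbk_bytes, len(gbk_bytes))    # b'\xba\xc3' 2
```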

3. Python's internal default encoding is ASCII, which matters when using some functions, such as str and unicode.

A common case: we usually save scripts as UTF-8. If such a script is run in Windows cmd, printed Chinese text is garbled, because cmd's default encoding is GBK and it interprets the UTF-8 byte string as GBK, which naturally produces a mess.
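The garbling can be simulated without a Windows console. This is a hedged Python 3 sketch that decodes UTF-8 bytes as if they were GBK, which is roughly what a GBK terminal does with them. `errors="replace"` is used so that byte sequences that are invalid in GBK become replacement characters instead of raising.

```python
text = "\u597d"  # the character 好
utf8_bytes = text.encode("utf-8")

# Misinterpret the UTF-8 bytes as GBK, as a GBK terminal would.
garbled = utf8_bytes.decode("gbk", errors="replace")

print(ascii(garbled))   # escaped form of the garbled text
print(garbled == text)  # False
```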

So how can we make sure printed output is not garbled, whatever the terminal's encoding?

1. Change the terminal's default encoding.

2. Make the script cater to the terminal. Plan A: save the script as GBK. Plan B: transcode the text wherever it is displayed on the terminal. I will describe plan B.

    # coding=utf-8
    import sys
    a = "好"  # this file is saved as UTF-8; to display correctly in cmd the text must be converted to GBK
    aunicode = a.decode("utf-8")  # decode to unicode first, telling Python that a is a UTF-8 byte string, not an ASCII one
    agbk = aunicode.encode("gbk")  # re-encode the unicode string as GBK
    print agbk
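The decode-then-encode step of plan B can be wrapped in a small function. This is a minimal Python 3 sketch, assuming the input is a byte string whose encoding is known. The name `transcode` is hypothetical and not from the article.

```python
def transcode(data, source_encoding, target_encoding):
    """Decode bytes to text with source_encoding, then re-encode with target_encoding."""
    text = data.decode(source_encoding)
    return text.encode(target_encoding)


utf8_bytes = b"\xe5\xa5\xbd"  # UTF-8 bytes of 好
print(transcode(utf8_bytes, "utf-8", "gbk"))  # b'\xba\xc3'
```

Going through text in the middle is the key point: there is no direct bytes-to-bytes conversion that skips the decode step.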

A side note: could I just call a.encode('gbk') directly? No. encode is meant for unicode strings, so calling it directly on a byte string, as in '我'.encode('gbk'), raises an error.

As follows:

    >>> a = "好"
    >>> a.encode("gbk")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xba in position 0: ordinal not in range(128)
    >>>

See, Python raised a Unicode decoding exception and also mentioned 'ascii'. Why?

Because encode only applies to unicode strings. If it is called on a byte string, Python first decodes the byte string with the default encoding and then encodes the result, that is:

    a.encode('gbk') == a.decode(default_encoding).encode('gbk')

The Unicode decoding exception above is thrown during that decoding step. Python assumes a is encoded in the default encoding, ASCII, but a is actually a GBK byte string (it would be UTF-8 on a UTF-8 terminal), and '好' does not exist in ASCII, so the error occurs.
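Python 3 no longer performs this implicit decode, but the equivalence above can be imitated explicitly. The sketch below is a simulation under that assumption; `py2_style_encode` is a hypothetical helper that spells out the hidden decode step, and ASCII is used as the default to match Python 2.7.

```python
def py2_style_encode(byte_string, target_encoding, default_encoding="ascii"):
    """Imitate Python 2's str.encode: implicit decode with the default encoding, then encode."""
    return byte_string.decode(default_encoding).encode(target_encoding)


gbk_bytes = b"\xba\xc3"  # GBK bytes of 好

try:
    py2_style_encode(gbk_bytes, "gbk")
    outcome = "no error"
except UnicodeDecodeError as exc:
    outcome = "UnicodeDecodeError"
    print(exc)  # 'ascii' codec can't decode byte 0xba in position 0: ordinal not in range(128)

# With the correct encoding for the decode step, the conversion succeeds.
print(py2_style_encode(gbk_bytes, "utf-8", default_encoding="gbk"))  # b'\xe5\xa5\xbd'
```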

3. Whatever the terminal's encoding is, make the output display correctly.

To do that, use a unicode string directly and let Python print it according to the system's current encoding.

    # coding=utf-8
    print u'我'

When a unicode string is printed, Python outputs its characters in the system's encoding, so the output is not garbled.
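What print does with text can be approximated by encoding it with the output stream's own encoding. This is a rough Python 3 sketch, not a description of the interpreter's internals. `encode_for_stream` and `FakeTerminal` are hypothetical names, and the UTF-8 fallback is an assumption made for this example.

```python
import sys


def encode_for_stream(text, stream=None):
    """Encode text with the stream's declared encoding, falling back to UTF-8 (assumption)."""
    stream = stream if stream is not None else sys.stdout
    encoding = getattr(stream, "encoding", None) or "utf-8"
    return text.encode(encoding, errors="replace")


class FakeTerminal:
    """Stand-in for a terminal stream that only carries an encoding attribute."""

    def __init__(self, encoding):
        self.encoding = encoding


text = "\u597d"  # the character 好
print(encode_for_stream(text, FakeTerminal("gbk")))    # b'\xba\xc3'
print(encode_for_stream(text, FakeTerminal("utf-8")))  # b'\xe5\xa5\xbd'
```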

A small related question: if I just want to see what a variable's Unicode value looks like, I can use the repr function (it returns the string form of an object).

    >>> a = u'好'
    >>> a
    u'\u597d'
    >>> print repr(a)
    u'\u597d'
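In Python 3, repr shows the character itself rather than its escape, so the closest equivalents are `ord` and `ascii`. A short sketch for comparison:

```python
text = "\u597d"  # the character 好

print(hex(ord(text)))                 # 0x597d
print(ascii(text))                    # '\u597d'
print(text.encode("unicode_escape"))  # b'\\u597d'
```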

A few words about the str function

The str function returns a representation of an object as a string (as I understand it, the representation meant for people to read).

str behaves differently for different objects. For a string, it returns the string itself.

For a function, str returns a string that gives the function's location in memory.
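A quick illustration of the two cases, written so that it runs on Python 3 as well. The function `greet` is made up for the example, and the memory address in the output varies from run to run.

```python
def greet():
    return "hi"


s = "abc"
print(str(s) == s)  # True: str of a string is the string itself
print(str(greet))   # something like <function greet at 0x...>
```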

When applying str to a string, note that if it is a unicode string, the current default encoding matters.

As shown below, I am in cmd, which is encoded as GBK, while Python's default encoding is ASCII:

    >>> a = u"好"
    >>> str(a)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u597d' in position 0: ordinal not in range(128)
    >>>

When str is applied to a unicode string as above, the conversion involves the default encoding. It first performs unicodestr.encode(defaultencoding).

If the default encoding cannot represent the characters in the string, an exception is thrown.
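The conversion can be imitated in Python 3 by making the encode call explicit. This is a simulation only: `py2_style_str` is a hypothetical helper, and in Python 3 the real str function performs no such encoding.

```python
def py2_style_str(unicode_text, default_encoding="ascii"):
    """Imitate Python 2's str(unicode): encode with the default encoding."""
    return unicode_text.encode(default_encoding)


text = "\u597d"  # the character 好

try:
    py2_style_str(text)
    outcome = "no error"
except UnicodeEncodeError:
    outcome = "UnicodeEncodeError"

print(outcome)                     # UnicodeEncodeError
print(py2_style_str(text, "gbk"))  # b'\xba\xc3'
```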

So the default encoding needs to be set, as follows:

    >>> import sys
    >>> reload(sys)
    <module 'sys' (built-in)>
    >>> sys.setdefaultencoding('gbk')  # set the default encoding to GBK
    >>> a = u"啦"
    >>> str(a)  # no error this time
    '\xc0\xb2'
    >>> str(a) == a
    True
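The same bytes can be obtained without touching the process-wide default by naming the encoding at the call site. The following Python 3 sketch shows that variant as one possible design choice, not as the article's method. It starts from the bytes '\xc0\xb2' seen in the session above.

```python
gbk_bytes = b"\xc0\xb2"

# Decode and encode with an explicit encoding instead of relying on a global default.
text = gbk_bytes.decode("gbk")
round_trip = text.encode("gbk")

print(len(text))                # 1 (a single character)
print(round_trip == gbk_bytes)  # True
```

Naming the encoding at each conversion keeps the behavior local to that call, whereas changing the default affects every implicit conversion in the process.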
