Python error: UnicodeDecodeError

Source: Internet
Author: User

In Python, encoding and decoding are conversions between unicode and str. Encoding is unicode -> str; conversely, decoding is str -> unicode. The remaining problem is deciding when to encode or decode. The "coding declaration" at the top of the file is the # -*- coding: -*- statement. By default, Python 2 treats a script file as ASCII-encoded, and the coding declaration is needed whenever the file contains characters outside the ASCII range. As for sys.defaultencoding (the value returned by sys.getdefaultencoding()), it is used when something has to be decoded and no codec has been specified explicitly. For example, I have the following code:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    s = "中文"  # "Chinese"; note that s here is of type str, not unicode
    s.encode('gb18030')


This code re-encodes s as GB18030, which is a unicode -> str conversion. Because s is itself a str, Python first decodes s to unicode automatically and then encodes the result as GB18030. Because the decoding is done automatically and we did not specify a codec, Python decodes with the one indicated by sys.defaultencoding. In many cases sys.defaultencoding is ascii, so if s is not ASCII-encoded the decode fails. In the case above, my sys.defaultencoding is ascii, while s is encoded the same way as the file, in UTF-8, hence the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
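
The hidden decode step can be made visible in Python 3, where the article's str corresponds to bytes and unicode corresponds to str, and where the decode always has to be written out. The following is only a sketch under that mapping, assuming the byte string holds the UTF-8 bytes of 中文; the names raw and message are illustrative.

```python
# Python 3 sketch: bytes plays the role that str plays in Python 2.
raw = "中文".encode("utf-8")  # UTF-8 bytes; the first byte is 0xe4

try:
    # Python 2 performs this decode implicitly (with the default codec)
    # before it can encode a str; here it is spelled out.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The printed message names the same codec, byte, and position as the traceback above, because 0xe4 lies outside the ASCII range of 0 to 127.
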
In this case, there are two ways to correct the error.
The first is to state the encoding of s explicitly:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    s = "中文"
    # Decode with the codec the bytes are actually in, then encode to the target.
    s.decode('utf-8').encode('gb18030')
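
For comparison, the same explicit decode-then-encode chain can be sketched in Python 3, where the intermediate text object is a str and both ends are bytes. This is an illustration rather than the author's code, and the variable names are assumptions.

```python
# Python 3 sketch of the explicit conversion: bytes -> text -> bytes.
raw_utf8 = "中文".encode("utf-8")    # bytes as they would sit in a UTF-8 source file
text = raw_utf8.decode("utf-8")      # decode with the codec named explicitly
raw_gb = text.encode("gb18030")      # encode the text in the target encoding

# Decoding the result with the target codec gives back the original text.
print(raw_gb.decode("gb18030") == text)
```

Because every step names its codec, nothing depends on the interpreter's default encoding.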


The second is to change sys.defaultencoding to match the encoding of the file:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import sys
    # Python 2.5 deletes sys.setdefaultencoding after initialization,
    # so sys has to be reloaded to get the method back.
    reload(sys)
    sys.setdefaultencoding('utf-8')

    s = "中文"  # renamed from str to avoid shadowing the built-in type
    s.encode('gb18030')


After reading this, I changed my code to:

    print "<P>ADDR:", form["addr"].value.decode('gb2312').encode('utf-8')

and it ran successfully.

To summarize why I wrote it this way:

1. An encoding conversion is needed when the data you receive is encoded differently from the encoding declared in your script.

2. To convert, first decode the data to unicode using its own encoding, then encode that unicode as UTF-8.

3. Why my browser sends GB2312-encoded data back to the server is probably related to the client's system encoding.
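
The two-step conversion in point 2 can be wrapped in a small helper. The sketch below is written for Python 3 (bytes in, bytes out); the function name transcode and the sample value are hypothetical and stand in for whatever the form actually delivers.

```python
def transcode(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Decode data from its own encoding, then encode it in the target encoding."""
    return data.decode(source).encode(target)


# Stand-in for GB2312 bytes submitted by a browser (hypothetical sample value).
received = "地址".encode("gb2312")
converted = transcode(received, "gb2312")
print(converted.decode("utf-8"))
```

Passing the source encoding as an argument keeps the caller responsible for knowing how the incoming data is encoded, which is the point of item 1.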
