Python UnicodeEncodeError: 'gbk' codec can't encode character: solution

Source: Internet
Author: User
Tags python script

When you write a file in Python, for example when saving a network stream to a local file, you will quite likely run into: UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position ... There are many articles on the web about this problem, and nearly all of them point to encode and decode calls. Is that the real cause? No. Often we have already used decode and encode and tried every encoding name (utf8, utf-8, gbk, gb2312 and so on), yet at run time the same error still appears: UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position XXX. It is enough to drive you crazy.
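The error message itself can be reproduced without any file or network access. The following is a minimal sketch with a made-up sample string. It only shows that U+00A0 (a non-breaking space, common in web pages) has no GBK representation, while UTF-8 can encode it.

```python
# Hypothetical sample text containing U+00A0, a non-breaking space.
text = "price:\xa0100"

try:
    # Encoding to GBK fails because GBK has no mapping for U+00A0.
    text.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError as exc:
    gbk_ok = False
    print(exc)

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)
```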

Encoding is a serious problem when writing Python scripts under Windows.

When you write a network data stream to a file, several encodings are involved:

1: The script file's encoding. The # coding=xxx declaration (on the first line of the Python file) refers to the encoding of the Python script file itself. It is irrelevant to this problem, as long as xxx matches the encoding the file is actually saved in. For example, the Notepad++ "Format" menu offers a variety of encodings; the one chosen there must be the same as the xxx in the declaration line, otherwise an error is raised.

2: The network data stream's encoding. When you fetch a web page, for instance, the stream's encoding is the page's encoding. You need to call decode to turn these bytes into a Unicode string.

3: The target file's encoding. When the network data stream is written to a new file, the encoding of that new file has to be specified. Suppose the code that writes the file is:

f.write(txt)

Here txt is a string, one that has already been decoded with decode. Now for the key point: the encoding of the target file is the culprit behind the error in the title. Suppose we open the file like this:

f = open("out.html", "w")

Under Windows with a Chinese locale, a new file's default encoding is GBK, so the Python interpreter uses GBK to encode our network data txt when writing it. But txt is a decoded Unicode string and may contain characters, such as '\xa0', that GBK cannot represent, and that is what raises the error. The solution is to change the encoding of the target file:

f = open("out.html", "w", encoding='utf-8')

With this change, the problem no longer occurs.
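
The three steps above can be put together in a small sketch. It is an illustration under stated assumptions, not the original author's script. The bytes in raw stand in for data read from a network stream, the page is assumed to be UTF-8, and the file is written to a temporary directory so that the example has no side effects.

```python
import locale
import os
import tempfile

# Stand-in for bytes read from a network stream (hypothetical data);
# a real script would take these from the response body.
raw = "<p>hello\xa0world</p>".encode("utf-8")

# Decode with the page's own encoding (assumed to be UTF-8 here).
txt = raw.decode("utf-8")

# The encoding open() normally falls back to when none is given;
# on Chinese-language Windows this is typically cp936 (GBK).
default_encoding = locale.getpreferredencoding(False)
print("default file encoding:", default_encoding)

# Name the target file's encoding explicitly instead of relying on the default.
out_path = os.path.join(tempfile.mkdtemp(), "out.html")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(txt)

# Read the file back with the same encoding to confirm nothing was lost.
with open(out_path, "r", encoding="utf-8") as f:
    round_trip = f.read()
```

Using a with block also closes the file automatically, which the bare open call in the text leaves to the caller.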
