Today, while writing Python, I hit this error when writing a network data stream to a file. A web search turned up the correct fix, so I have copied the original solution over here.
When writing a file in Python, or writing a network stream to a local file, you will often run into this error: UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position ... There are many articles online about how to solve it, and nearly all of them are about encode and decode. Is that the real cause of the problem? No. Many times we have used decode and encode and tried every codec name (utf8, utf-8, gbk, gb2312 and so on), yet at run time the same error still appears: UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position XXX. Maddening.
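The error can be reproduced without any network or file. The following is a minimal sketch, with illustrative variable names: U+00A0 (the non-breaking space, written as &nbsp; in HTML) has no representation in GBK, so encoding it as GBK fails, while encoding it as UTF-8 succeeds.

```python
text = "price:\xa0100"  # U+00A0 is a non-breaking space (&nbsp; in HTML)

try:
    text.encode("gbk")
    gbk_error = None
except UnicodeEncodeError as exc:
    # Same error as in the title: GBK cannot represent '\xa0'.
    gbk_error = exc

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = text.encode("utf-8")
```

This is the same encoding step that happens implicitly when a string is written to a file opened in text mode.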
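Encoding problems are a serious nuisance when writing Python scripts on Windows. When writing a network data stream to a file, we deal with several encodings:
1: #encoding='XXX' The encoding here (that is, the first line of the Python file) is the encoding of the Python script file itself, and it does not matter for this problem. XXX only has to match the actual encoding of the file. For example, the "Format" menu in Notepad++ lets you choose among various encodings; the encoding chosen in that menu must be the same as encoding XXX, otherwise an error is reported.
2: The encoding of the network data stream. When fetching a web page, for example, the encoding of the network data stream is the encoding of the page. It has to be decoded into Unicode with decode.
3: The encoding of the target file. To write the network data stream to a new file, we need to specify the encoding of that new file.
The code that writes the file looks like: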
f.write(txt)
Here txt is a string that has already been decoded with decode. Now comes the key point: the encoding of the target file is the culprit behind the error in the title. If we open a file:
f = open("Out.html", "w")
On Windows, a file opened this way uses the system default encoding, which is GBK on a Chinese-locale system. The Python interpreter therefore tries to encode txt as GBK when writing it. But txt is an already-decoded Unicode string and may contain characters, such as '\xa0', that GBK cannot represent, so the encoding fails and the error above occurs. The workaround is to change the encoding of the target file:
f = open("Out.html", "w", encoding="utf-8")
This way, the problem goes away.
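Putting the three encodings together, the whole flow might look like the sketch below. It assumes the page is UTF-8 encoded; save_page, raw and the temporary directory are hypothetical names used only for illustration, and the bytes are hard-coded where a real script would read them from the network.

```python
import locale
import os
import tempfile

# The encoding open() falls back to when none is given; on a
# Chinese-locale Windows system this is typically cp936 (GBK).
default_encoding = locale.getpreferredencoding(False)


def save_page(raw: bytes, path: str, page_encoding: str = "utf-8") -> str:
    """Decode network bytes, then write them with an explicit file encoding."""
    txt = raw.decode(page_encoding)  # network stream encoding -> str
    # Target file encoding, stated explicitly instead of relying on the default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(txt)
    return txt


# Stand-in for bytes read from the network: "a", a non-breaking space, "b".
raw = "a\xa0b".encode("utf-8")
path = os.path.join(tempfile.mkdtemp(), "Out.html")
txt = save_page(raw, path)

# Read the file back with the same encoding to confirm the round trip.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```

Printing default_encoding on the affected machine is a quick way to confirm that the default, and not the decode step, is what triggers the error.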
20170427 error UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position