Personally tested: this solves the problem completely!
2018/07/08 21:37
Environment: Windows, PyCharm, Python 3.6.2
When writing a file in Python, or when saving a network stream to a local file, you will very often run into: UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position ... There are many articles online about how to solve this, but they all revolve around encode and decode. Is that the real cause of the problem? No. Often we use decode and encode and try every encoding name (utf8, utf-8, gbk, gb2312 and so on), yet at run time the same error still appears: UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position XXX. It is maddening.
Encoding is a serious source of trouble when writing Python scripts under Windows.
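The error can be reproduced without any file or network access. The following is a minimal sketch (the sample string and variable names are illustrative, not from the original post) that tries to encode a string containing the non-breaking space '\xa0' with the GBK codec:

```python
# A string containing U+00A0 (non-breaking space), which often comes from &nbsp; in web pages
text = "price:\xa0100"

try:
    text.encode("gbk")
    error_message = None
except UnicodeEncodeError as exc:
    # The GBK codec has no mapping for U+00A0, so encoding fails
    error_message = str(exc)

print(error_message)

# UTF-8 can represent every Unicode character, so this succeeds
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)
```

Note that no amount of decode on the input side changes this: the failure happens when the already-decoded str is converted back to bytes.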
When you write a network data stream to a file, several encodings are involved:
1: # encoding: xxx (the first line of the Python file) declares the encoding of the Python script file itself. It does not matter here, as long as xxx matches the encoding the file is actually saved in. For example, the "Format" menu in Notepad++ lets you choose among various encodings; the encoding chosen there must match the xxx in the declaration line, otherwise an error occurs.
2: The encoding of the network data stream. For example, when fetching a web page, the encoding of the network data stream is the encoding of the page. The received bytes must be decoded into a Unicode string with decode.
3: The encoding of the target file. When writing the network data stream to a new file, we need to specify the encoding of that file. The code that writes the file looks like:
f.write(txt)
Here txt is a str, that is, a string already decoded with decode. Now the key point: the encoding of the target file is the real culprit behind the error in the title. If we open the file like this:
f = open("out.html", "w")
then under Windows with a Chinese locale the default encoding for the new file is GBK, so Python tries to encode txt as GBK when writing it. But txt is already decoded Unicode text and may contain characters such as '\xa0' that GBK cannot represent, which causes the error above. The fix is to change the encoding of the target file:
f = open("out.html", "w", encoding='utf-8')
With that, the problem no longer occurs.
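The whole flow can be sketched end to end as follows. This is only an illustration under stated assumptions: the bytes in raw are a hypothetical stand-in for data received from the network, the file is written to a temporary directory, and encoding="gbk" is passed explicitly to imitate the default on a Chinese-locale Windows machine so that the behaviour is the same on any platform.

```python
import locale
import os
import tempfile

# The encoding open() falls back to when none is given (platform dependent)
print(locale.getpreferredencoding(False))

# Hypothetical stand-in for bytes received from the network (a UTF-8 page)
raw = "<p>hello\xa0world</p>".encode("utf-8")
txt = raw.decode("utf-8")  # bytes -> str, using the page's encoding

out_path = os.path.join(tempfile.mkdtemp(), "out.html")

# Forcing GBK imitates the Windows default described above
try:
    with open(out_path, "w", encoding="gbk") as f:
        f.write(txt)
    gbk_failed = False
except UnicodeEncodeError:
    gbk_failed = True

# Specifying UTF-8 for the target file avoids the error
with open(out_path, "w", encoding="utf-8") as f:
    f.write(txt)

# Read the file back with the same encoding to confirm nothing was lost
with open(out_path, "r", encoding="utf-8") as f:
    round_trip = f.read()

print(gbk_failed, round_trip == txt)
```

Using a with block also closes the file automatically, which the bare open call in the text does not do.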
PS:
1. Converting str to bytes is called encode; converting bytes to str is called decode.
2. Commonly used Chinese encoding names
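Both notes can be illustrated with a short sketch. The sample string is an assumption chosen for illustration; the codec names utf-8, gbk and gb2312 are the ones mentioned earlier in this post.

```python
s = "中文"  # illustrative sample text

# str -> bytes is encode; the result depends on the codec chosen
utf8_bytes = s.encode("utf-8")
gbk_bytes = s.encode("gbk")
gb2312_bytes = s.encode("gb2312")

# UTF-8 uses 3 bytes per character here, GBK and GB2312 use 2
print(len(utf8_bytes), len(gbk_bytes), len(gb2312_bytes))

# bytes -> str is decode; it must use the codec the bytes were encoded with
decoded_from_utf8 = utf8_bytes.decode("utf-8")
decoded_from_gbk = gbk_bytes.decode("gbk")
print(decoded_from_utf8 == s, decoded_from_gbk == s)
```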
Reference: Blog Park article https://www.cnblogs.com/themost/p/6603409.html