Encoding problems encountered when implementing a crawler in Python:
Error: UnicodeEncodeError: 'gbk' codec can't encode character '\xXX' in position XX
Workaround: change the encoding of standard output
from urllib import request
import io
import sys

# Change the default encoding of standard output
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gb18030')

# Build the request and send a browser-like User-Agent header
req = request.Request('http://www.baidu.com')
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36')
resp = request.urlopen(req)

# Decode the response bytes as UTF-8 and print the page
print(resp.read().decode('utf-8'))
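Add the code that was highlighted in red on the original page (the sys.stdout line that changes the standard output encoding) to your script.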
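To see why the workaround helps, here is a minimal sketch that reproduces the failure without touching the real console. It assumes the console encoding is GBK, as in the error message; the helper name `write_with_encoding` and the sample string are made up for illustration. An in-memory `io.BytesIO` stands in for `sys.stdout.buffer`.

```python
import io


def write_with_encoding(text, encoding):
    """Write text through a TextIOWrapper and return the bytes it produced.

    Hypothetical helper: the BytesIO object plays the role of sys.stdout.buffer.
    """
    buffer = io.BytesIO()
    wrapper = io.TextIOWrapper(buffer, encoding=encoding)
    wrapper.write(text)
    wrapper.flush()
    return buffer.getvalue()


# '\xbb' is the character named in the error message.
sample = "page text with \xbb in it"

try:
    write_with_encoding(sample, "gbk")
    gbk_ok = True
except UnicodeEncodeError:
    # Same failure that print() hits on a GBK console.
    gbk_ok = False

# GB18030 can represent every Unicode character, so this write succeeds.
gb18030_bytes = write_with_encoding(sample, "gb18030")

print("gbk succeeded:", gbk_ok)
print("gb18030 round trip:", gb18030_bytes.decode("gb18030") == sample)
```

GB18030 is a superset of GBK, so text that already printed correctly should be unaffected by the switch.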
P.S.:
1. Converting str to bytes is called encode; converting bytes to str is called decode.
2. Names of commonly used Chinese encodings
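The two notes above can be tried out with a short sketch. The codec names used here are ones that Python's standard library registers and are chosen only for illustration; they are an assumption and not necessarily the list the original note had in mind.

```python
text = "中文"

# str -> bytes is encode; bytes -> str is decode.
utf8_bytes = text.encode("utf-8")
restored = utf8_bytes.decode("utf-8")

# Illustrative Chinese codec names (an assumption, see the note above).
chinese_codecs = ["gb2312", "gbk", "gb18030", "big5"]

round_trips = {}
for name in chinese_codecs:
    encoded = text.encode(name)
    # Record the byte length and whether decoding gives back the original str.
    round_trips[name] = (len(encoded), encoded.decode(name) == text)

for name, (size, ok) in round_trips.items():
    print(name, size, ok)
```

The bytes must be decoded with the same codec that produced them; decoding with a different one either raises UnicodeDecodeError or yields garbled text.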
Reference article: http://blog.csdn.net/jim7424994/article/details/22675759
UnicodeEncodeError: 'gbk' codec can't encode character '\xbb' in position