Python 3.x crawler:
I ran into the error "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte". I kept looking for a problem in the file, and only after a tip from another user did I find the cause: my request headers contained this entry:
'Accept-Encoding': 'gzip, deflate'
I had copied it straight from Fiddler. Why can the browser display the page normally when Python, imitating the same request, cannot?
Putting together the explanations found online:
This header tells the server that the client accepts compressed data, so the server sends the response body compressed with gzip (deflate is the other encoding the header offers, not the algorithm used to unpack gzip). A gzip stream starts with the bytes 0x1f 0x8b, which is why decoding the raw body as UTF-8 fails on byte 0x8b at position 1. The browser decompresses the body automatically, but the crawler code does not, so extra handling is needed. For how to set that up, see https://www.crifan.com/set_accept_encoding_header_to_gzip_deflate_return_messy_code/
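If the header has to stay, the body can be decompressed by hand before decoding. The following is only a minimal sketch using the standard gzip and zlib modules; the helper name decode_body and its parameters are made up for illustration and do not come from the linked article. It assumes the whole body is already in memory as bytes.

```python
import gzip
import zlib


def decode_body(raw: bytes, content_encoding: str = "", charset: str = "utf-8") -> str:
    """Decompress a response body if needed, then decode it to text.

    decode_body is a hypothetical helper written for this note.
    """
    encoding = content_encoding.strip().lower()
    # Trust the Content-Encoding header, but also check the gzip magic bytes.
    if encoding == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    elif encoding == "deflate":
        try:
            raw = zlib.decompress(raw)  # zlib-wrapped deflate
        except zlib.error:
            raw = zlib.decompress(raw, -zlib.MAX_WBITS)  # raw deflate stream
    return raw.decode(charset)


if __name__ == "__main__":
    page = "<html>hello</html>"
    compressed = gzip.compress(page.encode("utf-8"))

    # Decoding the compressed bytes directly reproduces the error.
    try:
        compressed.decode("utf-8")
    except UnicodeDecodeError as exc:
        print(exc)

    # Decompressing first gives back the original text.
    print(decode_body(compressed, "gzip"))
```

With urllib, this could be called as decode_body(resp.read(), resp.headers.get("Content-Encoding", "")), where resp is the object returned by urlopen.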
Summary: when writing a crawler, it is simplest not to send 'Accept-Encoding': 'gzip, deflate' at all, so that the server returns the original, uncompressed content.
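As a rough illustration of that advice, the request below is built with urllib and simply leaves the header out. The function name build_request, the URL, and the User-Agent value are placeholders chosen for this sketch; nothing is sent over the network here.

```python
from urllib.request import Request


def build_request(url: str) -> Request:
    """Build a crawler request that does not ask for a compressed response.

    build_request is a hypothetical helper; the header values are placeholders.
    """
    headers = {
        "User-Agent": "Mozilla/5.0",
        # "Accept-Encoding": "gzip, deflate",  # left out on purpose
    }
    return Request(url, headers=headers)


if __name__ == "__main__":
    req = build_request("https://example.com/")
    # urllib stores header names capitalized, e.g. "Accept-encoding".
    print(req.has_header("Accept-encoding"))  # False
```

Passing the resulting object to urllib.request.urlopen should then return a body that can be decoded directly, provided the server honors the absence of the header.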
Solution to the Python web crawler error "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position"