Fixing garbled Chinese pages when crawling with the Requests library in Python 3

Source: Internet
Author: User

Garbled text like this is almost always an encoding problem: the bytes are decoded with the wrong encoding, so the fix is to decode them with the right one. First, some background. In his course Python Crawler and Information Extraction, Song Tian explains that response.encoding is the encoding guessed from the HTTP headers; if the Content-Type header of a text response carries no charset, it defaults to ISO-8859-1, so pages from servers that do not declare a charset come out garbled. response.apparent_encoding, by contrast, is the encoding inferred from the response content itself. requests.utils also provides a function, get_encodings_from_content, that reads the encoding declared in the response body, so when the server's headers contain no charset, get_encodings_from_content can still tell you the page's correct encoding. The debugging process looks like this:

import requests
from requests.exceptions import RequestException


def get_one_page(url):
    try:
        response = requests.get(url)
        if response.status_code == 200:
            # print(response.text)
            print(response.encoding)           # encoding guessed from the headers
            print(response.apparent_encoding)  # encoding inferred from the content
            r = response.text
            # Encoding declared inside the page body itself
            print(requests.utils.get_encodings_from_content(r)[0])
            # Undo the ISO-8859-1 decoding, then decode with the declared encoding
            a = r.encode('iso-8859-1').decode(requests.utils.get_encodings_from_content(r)[0])
            print(a)
            print('------------------------------------')
            # Same round trip, using the encoding inferred from the content
            b = r.encode('iso-8859-1').decode(response.apparent_encoding)
            print(b)
        return None
    except RequestException:
        return None


def main():
    url = 'http://www.mh160.com/'
    get_one_page(url)


if __name__ == '__main__':
    main()
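
The encode/decode round trip above only works when the text was first decoded as ISO-8859-1. A possibly tidier route is to decode the raw bytes (response.content) once, with the right encoding. The sketch below is one way that might look, not the original author's method. The names sniff_meta_charset, decode_body and fetch_text are hypothetical. The regular expression is a simplified stand-in for requests.utils.get_encodings_from_content, and falling back to UTF-8 when nothing is declared is an assumption.

```python
import re
from typing import Optional

# Simplified stand-in (an assumption, not the Requests implementation) for
# requests.utils.get_encodings_from_content: find a charset in a <meta> tag.
_META_CHARSET = re.compile(rb'<meta[^>]*?charset=["\']?([\w-]+)', re.IGNORECASE)


def sniff_meta_charset(body: bytes) -> Optional[str]:
    """Return the charset declared in the HTML, or None if there is none."""
    match = _META_CHARSET.search(body[:4096])
    return match.group(1).decode('ascii') if match else None


def decode_body(body: bytes, header_encoding: Optional[str]) -> str:
    """Decode raw page bytes, distrusting the ISO-8859-1 header fallback."""
    encoding = header_encoding
    if encoding is None or encoding.lower() == 'iso-8859-1':
        # No usable charset in the headers: prefer the page's own declaration,
        # and assume UTF-8 if the page declares nothing either.
        encoding = sniff_meta_charset(body) or 'utf-8'
    try:
        return body.decode(encoding, errors='replace')
    except LookupError:
        # The declared name is not an encoding Python knows about.
        return body.decode('utf-8', errors='replace')


def fetch_text(url: str) -> str:
    """Fetch a page with Requests and decode it with decode_body."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # response.content holds the undecoded bytes; response.encoding is the header guess.
    return decode_body(response.content, response.encoding)
```

With Requests itself, setting response.encoding = response.apparent_encoding before reading response.text has a similar effect, at the cost of trusting the content-based guess.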

