This article describes how to fetch a web page's source code with Python 3's requests package and save it to a file. It is shared here for your reference.

Example: use Python 3's requests module to fetch a page's source and save it to a file:
import requests

html = requests.get("http://www.baidu.com")
with open('test.txt', 'w', encoding='utf-8') as f:
    f.write(html.text)
This is a basic file-save operation, but a few points are worth noting:
1. Install the requests package by running pip install requests at the command line. Many people recommend requests, but the standard library's urllib.request can also fetch page source.
2. Set the open method's encoding parameter to utf-8, otherwise the saved file will contain garbled characters.
3. If you print the fetched content directly in cmd, you will be hit with various encoding errors, so save it to a file and view it there instead.
4. The with open form is the better way to do this, because it releases the file resource automatically when the block finishes.
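As point 1 mentions, the standard library's urllib.request can do the same job without installing anything. A minimal sketch: the function name fetch and the output file name are my own choices, and unlike requests, urlopen returns raw bytes, so we must decode them ourselves.

```python
from urllib.request import urlopen

def fetch(url: str, encoding: str = "utf-8") -> str:
    # urlopen returns raw bytes; decode explicitly
    # (requests does this for you via response.text).
    with urlopen(url) as resp:
        return resp.read().decode(encoding)

# Usage (network required):
#   html = fetch("http://www.baidu.com")
#   with open("test.txt", "w", encoding="utf-8") as f:
#       f.write(html)
```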
Another example:
ff = open('testt.txt', 'w', encoding='utf-8')
with open('test.txt', encoding='utf-8') as f:
    for line in f:
        ff.write(line)
ff.close()
This example reads a txt file one line at a time and saves each line to another txt file.
Because printing a line of data at the command line can raise encoding errors when it contains Chinese text, each line is instead read and written to another file, which confirms the read worked correctly. (Note the encoding argument when calling open.)
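The copy loop above can be verified end to end without printing anything to the console. A self-contained sketch: write some Chinese text with UTF-8, copy it line by line as in the example, then compare the two files (the file names mirror the example's test.txt and testt.txt):

```python
src, dst = "test.txt", "testt.txt"
content = "第一行\n第二行\n"

# Write UTF-8 source data containing Chinese characters.
with open(src, "w", encoding="utf-8") as f:
    f.write(content)

# Copy line by line; iterating over a file object yields one line at a time.
with open(src, encoding="utf-8") as f, open(dst, "w", encoding="utf-8") as out:
    for line in f:
        out.write(line)

# Verify the round trip preserved the text exactly.
with open(dst, encoding="utf-8") as f:
    print(f.read() == content)  # → True
```

Opening every file with encoding="utf-8" on both the read and write side is what keeps the Chinese text intact through the round trip.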