Using Python to capture documents from the web
This article shows how to use Python's urllib module to fetch a document from a web URL and either print it or save it to a local file. It is shared here for your reference. The method is as follows.
The example code is as follows:
    # Python 2 code: urllib.urlopen and the print statement do not exist in Python 3
    import urllib

    # Read the whole page into a string and print it
    doc = urllib.urlopen("http://www.python.org").read()
    print doc

    def reporthook(*a):
        print a

    # Save the http://www.renren.com web page to renren.html;
    # reporthook is called once for each block that is read
    urllib.urlretrieve("http://www.renren.com", 'renren.html', reporthook)

    # Save the http://www.renren.com web page to renren.html without a progress callback
    urllib.urlretrieve("http://www.renren.com", 'renren.html')
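The code above uses the Python 2 urllib module. As a rough sketch of the same idea under Python 3, assuming the standard urllib.request module, the two steps might look like the following. The function names fetch_page, report_progress and save_page are illustrative and do not come from the original article.

```python
import urllib.request


def fetch_page(url):
    """Return the raw bytes of the document at url."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def report_progress(block_count, block_size, total_size):
    """Progress hook: urlretrieve calls it once at the start
    and then once after each block it reads."""
    print(block_count, block_size, total_size)


def save_page(url, filename, hook=None):
    """Download url into filename and return the local path."""
    path, _headers = urllib.request.urlretrieve(url, filename, hook)
    return path
```

With these definitions, fetch_page("http://www.python.org") would return the page as bytes, and save_page("http://www.renren.com", "renren.html", report_progress) would write the page to renren.html while printing the progress arguments. Note that read() returns bytes in Python 3, so the result needs a decode() call before it can be handled as text.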
The program output is as follows:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    ... (web page content) ...
    </body>
urllib.urlopen returns a file-like object, which is why read() can be called on the result to obtain the page content.
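To make the idea of a file-like response object concrete, here is a small sketch that inspects what urlopen hands back. It assumes Python 3 and urllib.request rather than the Python 2 module used in the article, and describe_response is a hypothetical helper name.

```python
import urllib.request


def describe_response(url):
    """Open url and summarize the response object that urlopen returns."""
    with urllib.request.urlopen(url) as response:
        body = response.read()  # file-like: read() returns the content as bytes
        return {
            "type": type(response).__name__,   # class of the returned object
            "url": response.geturl(),          # URL that was actually retrieved
            "content_type": response.headers.get("Content-Type"),
            "length": len(body),
        }
```

The returned dictionary shows that the object carries response headers and the retrieved URL in addition to the body, which a plain string would not.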
I hope this article will help you with Python programming.