The post I crawled is http://tieba.baidu.com/p/3125473879?pn=2. It runs to about 82 pages, and the following code crawls the pictures from all 82 of them. The code is as follows:
"" "Crawl Baidu Stick Image" "" #导入模块import reimport urllibfrom urllib.request import urlopen, urlretrieve# get crawl Page source code def gethtml (URL): page = urlopen (URL) html = str (Page.read ()) page.close () return html# matches our urldef getimg (HTML) by source code and regular Expressions: reg = r ' The crawl results are as follows, I'm just a little bit simpler here, in detail later.
[Screenshot of the crawl results: http://s2.51cto.com/wyfs02/M01/82/88/wKiom1dX5WzxSmXcAASy_ifjAEA695.jpg]
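The regular expression in the listing above is cut off, so the exact pattern used in the original cannot be recovered. As a rough sketch of the same idea (build the URL of each page, pull the image addresses out of the page source with a regular expression, then save each image), something like the following could work. Everything here is an assumption made for illustration: the pattern IMG_PATTERN, the helper names page_urls, extract_image_urls and download_all, the choice to start at pn=1, and the numbered output file names. The real Tieba markup may need a different pattern.

```python
import os
import re
from urllib.request import urlopen, urlretrieve

# Hypothetical pattern: the regex in the original listing is truncated, so this
# one simply assumes that images appear as <img ... src="....jpg"> tags.
IMG_PATTERN = re.compile(r'<img[^>]+src="(https?://[^"]+?\.jpg)"')


def page_urls(post_id, last_page):
    """Build the URL of every page of a post, from pn=1 to pn=last_page."""
    base = "http://tieba.baidu.com/p/{}?pn={}"
    return [base.format(post_id, pn) for pn in range(1, last_page + 1)]


def extract_image_urls(html):
    """Return the .jpg URLs found in the HTML, in order, without duplicates."""
    seen = []
    for url in IMG_PATTERN.findall(html):
        if url not in seen:
            seen.append(url)
    return seen


def download_all(post_id, last_page, out_dir):
    """Fetch each page and save its images as 1.jpg, 2.jpg, ... in out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    for url in page_urls(post_id, last_page):
        # Decode the bytes instead of calling str() on them, so the text
        # does not carry the b'...' wrapper and escaped characters.
        with urlopen(url) as page:
            html = page.read().decode("utf-8", errors="ignore")
        for img_url in extract_image_urls(html):
            count += 1
            urlretrieve(img_url, os.path.join(out_dir, "{}.jpg".format(count)))
    return count
```

Calling download_all(3125473879, 82, "images") would then walk all 82 pages of the post and return the number of files saved. The two helper functions do not touch the network, so they can be tried on a saved page first to check whether the pattern matches anything.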
This article is from the "Little Water Drop" blog. Please keep this source link when reposting: http://wangzan18.blog.51cto.com/8021085/1787514
Python 3: crawling Baidu Tieba images