Python crawlers crawl the image address instance code on a webpage,
The example in this article is to crawl an image address on a webpage, as shown below.
Read the source code of a web page:
Import urllib. requestdef getHtml (url): html = urllib. request. urlopen (url). read () return htmlprint (getHtml (http://image.baidu.com/search/flip? Tn = baiduimage & ie = UTF-8 & word = % E5 % A3 % 81% E7 % BA % B8 & ct = 201326592 & lm =-1 & v = flip ))
Use a regular expression to crawl an image address on a webpage:
Import reimport urllib. requestdef getHtml (url): html = urllib. request. urlopen (url ). read () return htmldef getImg (html): r = R' "thumbURL": "(http: // img. +? \. Jpg) "'# define regular imglist = re. findall (r, html) return imglisthtml = str (getHtml (" http://image.baidu.com/search/flip? Tn = baiduimage & ie = UTF-8 & word = % E5 % A3 % 81% E7 % BA % B8 & ct = 201326592 & lm =-1 & v = flip ")) print (getImg (html ))
Running result:
Summary
The above is all the content about the Python crawler crawling the image address instance code on a web page. I hope it will help you. If you are interested, you can continue to refer to other related topics on this site. If you have any shortcomings, please leave a message. Thank you for your support!