What is a web crawler?
A web crawler is a program that automatically fetches data from web pages.
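How do crawlers fetch web data?
Web pages have three main features:
- Every web page has its own unique URL.
- Web pages use HTML to describe their content.
- Web pages use the HTTP/HTTPS protocol to transfer HTML data.
A crawler therefore requests a URL over HTTP/HTTPS and extracts the data it needs from the HTML that comes back.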
Crawler design:
- Get the video IDs
- Stitch them into full URLs
- Get the video playback addresses
- Download the videos
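The URL-stitching step can be sketched on its own, without touching the network. This is a minimal sketch: the helper name build_video_urls and the sample IDs are made up for illustration, and it assumes the listing page returns relative links such as "video_1000001".

```python
from urllib.parse import urljoin

BASE_URL = "http://www.pearvideo.com"


def build_video_urls(video_ids, base_url=BASE_URL):
    """Join each relative video ID onto the site root to get a full URL."""
    return [urljoin(base_url + "/", video_id) for video_id in video_ids]


# Hypothetical IDs, in the relative form the listing page is assumed to return.
sample_ids = ["video_1000001", "video_1000002"]
video_urls = build_video_urls(sample_ids)
print(video_urls)
```

urljoin also copes with IDs that already start with a slash, which plain string concatenation would turn into a double slash.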
Module used: requests
Install it with "pip install requests"
The seven main methods of the requests library
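The seven functions usually counted as the main methods are request, get, head, post, put, patch and delete. The sketch below only checks that they exist and builds a GET request without sending it, so it runs offline. The query parameters are taken from the listing URL used later in this post.

```python
import requests

# The seven top-level functions of the requests library.
MAIN_METHODS = ["request", "get", "head", "post", "put", "patch", "delete"]

for name in MAIN_METHODS:
    print(name, callable(getattr(requests, name)))

# Build (but do not send) a GET request to see the final URL requests would use.
prepared = requests.Request(
    "GET",
    "http://www.pearvideo.com/category_loading.jsp",
    params={"reqType": 5, "categoryId": 9},
).prepare()
print(prepared.method, prepared.url)
```

This crawler only needs get. The other functions map to the HTTP verbs of the same name, and request is the general form behind all of them.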
Find the playback address of a single video
Get the web page source code
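Fetching the page source is a single GET request. The sketch below wraps it in a small helper. The name fetch_html, the timeout value and the optional session argument are illustrative choices, not something the original code uses.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Return the decoded HTML of `url`, raising on HTTP error status codes."""
    session = session or requests.Session()
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return response.text

# Example call (performs a real network request):
# html = fetch_html("http://www.pearvideo.com/category_9")
```

Passing the session in makes the helper easy to exercise with a stand-in object when no network is available.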
Get the playback address
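The playback address can be pulled out of the page source with a regular expression. This is a sketch that assumes the page embeds the address as srcUrl="..." inside an inline script. The sample HTML and the example.com address are invented for illustration.

```python
import re

# Non-greedy match of whatever sits between the quotes after srcUrl=
SRC_PATTERN = re.compile(r'srcUrl="(.*?)"')


def extract_play_url(html):
    """Return the first playback address found in `html`, or None if absent."""
    match = SRC_PATTERN.search(html)
    return match.group(1) if match else None


# Invented page fragment in the assumed layout.
sample_html = (
    '<script>var contId="1000001", '
    'srcUrl="http://example.com/videos/sample.mp4", vdoUrl=srcUrl;</script>'
)
print(extract_play_url(sample_html))
```

Returning None instead of indexing into an empty list lets the caller skip pages that have no playable video.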
Download the video
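Downloading uses urlretrieve from the standard library. Below is a minimal sketch. The helper name download_video and its folder parameter are illustrative, and the target folder is created first because urlretrieve will not create it.

```python
import os
from urllib.request import urlretrieve


def download_video(play_url, name, folder="./video"):
    """Save the file at `play_url` as <folder>/<name>.mp4 and return the path."""
    os.makedirs(folder, exist_ok=True)  # urlretrieve does not create directories
    target = os.path.join(folder, "%s.mp4" % name)
    urlretrieve(play_url, target)
    return target

# Example call (performs a real download; the address is a placeholder):
# download_video("http://example.com/videos/sample.mp4", "sample")
```

urlretrieve writes the response straight to disk, which is convenient for a small script like this one.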
Full code
# -*- coding: utf-8 -*-
import os
import re
from urllib.request import urlretrieve

import requests
from lxml import etree

# Steps:
# 1. Get the video IDs
# 2. Stitch them into full URLs
# 3. Get the video playback addresses
# 4. Download the videos

BASE_URL = 'http://www.pearvideo.com'


def download(url):
    # url = 'http://www.pearvideo.com/category_9'
    # Get the page source code
    html = requests.get(url).text
    # Parse the text into an element tree so it can be queried with XPath
    # (not needed if you use regular expressions instead)
    tree = etree.HTML(html)
    video_ids = tree.xpath('//div[@class="vervideo-bd"]/a/@href')
    # Stitch the full URLs
    video_urls = [BASE_URL + '/' + video_id for video_id in video_ids]

    os.makedirs('./video', exist_ok=True)  # urlretrieve does not create the folder
    # Get the video playback addresses
    for play_url in video_urls:
        # Get the source code of the video page
        html = requests.get(play_url).text
        # print(play_url)
        # Real playback address of the video
        purl = re.findall(r'srcUrl="(.*?)"', html)
        # print(purl)
        if not purl:
            continue  # skip pages without a playback address
        # Get the video name
        name_pattern = ''  # pattern for the video title; left empty here, fill it in for the target page
        pname = re.findall(name_pattern, html) if name_pattern else []
        # Fall back to the video ID when no name is found
        name = pname[0] if pname else play_url.rsplit('/', 1)[-1]
        print("Downloading: %s" % name)
        urlretrieve(purl[0], './video/%s.mp4' % name)

# download('http://www.pearvideo.com/category_9')


def download_more():
    # Load further listing pages in steps of 12, up to 48
    n = 12
    while n <= 48:
        url = "http://www.pearvideo.com/category_loading.jsp?reqType=5&categoryId=9&star=%d" % n
        n += 12
        download(url)


download_more()
Python can do almost anything: this is how to download short videos with it.