I. webbrowser module: open the browser to a specified page
The webbrowser.open() function launches a new browser window or tab at the given URL
#! python3
# mapIt.py - Launches a map in the browser using an address from the
# command line or clipboard.
import webbrowser, sys, pyperclip

if len(sys.argv) > 1:
    # Get address from command line.
    address = ' '.join(sys.argv[1:])
else:
    # Get address from clipboard.
    address = pyperclip.paste()

webbrowser.open('https://www.google.com/maps/place/' + address)
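The address-selection logic in mapIt.py can be pulled out into a small function so it can be tried without opening a browser. This is only a sketch: the name build_map_url is hypothetical, and the percent-encoding with urllib.parse.quote is an addition that the original script does not perform.

```python
from urllib.parse import quote

MAPS_PREFIX = 'https://www.google.com/maps/place/'


def build_map_url(argv, clipboard_text):
    """Return the map URL for the address given on the command line,
    falling back to the clipboard text when no arguments were passed."""
    if len(argv) > 1:
        # argv[0] is the script name; the rest is the address.
        address = ' '.join(argv[1:])
    else:
        address = clipboard_text
    # Assumption: percent-encode spaces and punctuation so the URL is valid.
    return MAPS_PREFIX + quote(address.strip())
```

The result could then be passed to webbrowser.open(), with sys.argv and pyperclip.paste() as the two arguments.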
II. requests module: download files and web pages from the Internet
Steps to download a file and save it to disk:
① Call requests.get() to download the file
② Call open() with 'wb' to create a new file in write-binary mode
③ Loop over the Response object's iter_content() method
④ Call write() on each iteration to write the chunk to the file
⑤ Call close() to close the file
import requests

res = requests.get('http://www.gutenberg.org/cache/epub/1112/pg1112.txt')
res.raise_for_status()  # Make sure the program stops when the download fails
playfile = open('RomeoAndJuliet.txt', 'wb')
for chunk in res.iter_content(100000):
    playfile.write(chunk)
playfile.close()
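Steps ② through ⑤ can also be written with a with statement, which closes the file even if writing fails partway through. The sketch below assumes only that the object passed in has an iter_content() method like a requests Response; the function name save_response and the returned byte count are hypothetical additions, not part of the original notes.

```python
def save_response(res, path, chunk_size=100000):
    """Write the body of a Response-like object to path in binary chunks
    and return the total number of bytes written."""
    total = 0
    # 'wb' opens the file in write-binary mode; the with block closes it.
    with open(path, 'wb') as out_file:
        for chunk in res.iter_content(chunk_size):
            # write() returns the number of bytes it wrote.
            total += out_file.write(chunk)
    return total
```

With requests installed, it would be called as save_response(res, 'RomeoAndJuliet.txt') after requests.get() and res.raise_for_status().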
III. Beautiful Soup: parsing HTML, the format web pages are written in
1. bs4.BeautifulSoup() returns a BeautifulSoup object
2. The soup.select() method returns a list of Tag objects, which is how Beautiful Soup represents an HTML element
select() takes a CSS selector string (many examples can be found online)
3. The getText() method returns the element's text, that is, the content between its opening and closing tags with any nested tags stripped
4. The get() method returns the value of one of the element's attributes
#! python3
# lucky.py - Opens several Google search results.
import requests, sys, webbrowser, bs4

print('Googling...')  # display text while downloading the Google page
res = requests.get('http://google.com/search?q=' + ' '.join(sys.argv[1:]))
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text, 'html.parser')

# Retrieve top search result links.
linkElems = soup.select('.r a')

# Open a browser tab for each result.
numOpen = min(5, len(linkElems))
for i in range(numOpen):
    webbrowser.open('http://google.com' + linkElems[i].get('href'))
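The select(), getText(), and get() calls listed above can be tried on a small in-memory HTML string, with no network access. The HTML snippet and variable names here are made up for illustration; only the bs4 calls themselves come from the notes.

```python
import bs4

# Hypothetical HTML, shaped like the '.r a' structure lucky.py looks for.
html = """
<html><body>
<p class="r"><a href="/url?q=first">First result</a></p>
<p class="r"><a href="/url?q=second">Second <b>result</b></a></p>
</body></html>
"""

soup = bs4.BeautifulSoup(html, 'html.parser')

# select() returns a list of Tag objects matching the CSS selector.
link_elems = soup.select('.r a')

# getText() returns the text inside each element, without nested tags.
texts = [elem.getText() for elem in link_elems]

# get() returns an attribute value, or None if the attribute is missing.
hrefs = [elem.get('href') for elem in link_elems]
missing = link_elems[0].get('id')
```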
IV. selenium: launch and control a web browser
(Selenium can fill out forms and simulate mouse clicks in this browser)
1. Launch a Selenium-controlled browser
>>> from selenium import webdriver
>>> browser = webdriver.Firefox()
>>> type(browser)
<class 'selenium.webdriver.firefox.webdriver.WebDriver'>
>>> browser.get('http://inventwithpython.com')
2. Find elements on the page
1. The find_element_* methods return a single WebElement object
2. The find_elements_* methods return a list of WebElement objects
3. click() method: clicks the element
4. send_keys() method: types text into a form field, used when filling in and submitting forms
5. from selenium.webdriver.common.keys import Keys: for sending special keys
Crawling information from the Web