Function introduction: Use Selenium and the Chrome browser to open the Baidu page automatically, set the search results to show 50 per page, type "selenium" into the Baidu search box, and run the query. Then, on the results page, select "Selenium-Open source China community" and open that page.

Knowledge brief: the role of Selenium:
1. It was originally used for automated testing of web sites; in recent years it has also been used to obtain accurate snapshots of sites.
2. It runs directly on the browser: it lets the browser load pages automatically and retrieve the required data, and it can also take screenshots or determine whether certain actions on a site have occurred.

Project steps:
1. Install the ChromeDriver driver if you use Google's Chrome browser. After downloading it, it is best to put it together with your Python files so that it can be called later.
2. Install Selenium (skip this step if it is already installed). Windows users can install it directly with the pip install selenium command.
3. Code display:
from selenium import webdriver
from time import sleep

# The argument is the location of your browser driver. Prefix the string
# with r so that the backslashes are not treated as escape characters.
# Note: the find_element_by_* helpers used here belong to Selenium 3 and
# earlier; Selenium 4 uses find_element(By.ID, ...) and similar instead.
driver = webdriver.Chrome(r'C:\Python34\chromedriver_x64.exe')
# Open the Baidu page with get
driver.get("http://www.baidu.com")
# The link texts below are English renderings; on the live Baidu page they
# appear in Chinese, so use the exact text displayed there.
# Find the "settings" option on the page and click it
driver.find_elements_by_link_text('Set')[0].click()
# After the settings menu opens, find "Search settings" to show 50 items per page
driver.find_elements_by_link_text('Search Settings')[0].click()
sleep(2)
m = driver.find_element_by_id('nr')
sleep(2)
m.find_element_by_xpath('//*[@id="nr"]/option[3]').click()
sleep(2)
# Save the preferences, then handle the pop-up alert
driver.find_element_by_class_name("prefpanelgo").click()
sleep(2)
# switch_to.alert replaces the deprecated switch_to_alert()
driver.switch_to.alert.accept()
sleep(2)
# Find Baidu's input box and enter "Selenium"
driver.find_element_by_id('kw').send_keys('Selenium')
sleep(2)
# Click the search button
driver.find_element_by_id('su').click()
sleep(2)
# On the page that opens, find "Selenium-Open source China community" and open it
driver.find_elements_by_link_text('Selenium-Open source China community')[0].click()
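The script pauses with fixed sleep(2) calls, which waste time on a fast page and can still be too short on a slow one. Below is a minimal sketch of a polling alternative for the search step. The names wait_for and search are hypothetical helpers, not part of Selenium (Selenium ships its own WebDriverWait class for this purpose), and the sketch assumes the driver accepts the string "id" as a locator strategy, which is the value of By.ID in current Selenium releases.

```python
import time


def wait_for(condition, timeout=10.0, poll=0.2):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if that does not happen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            # Deliberately broad for this sketch: the element is probably
            # not on the page yet, so try again after a short pause.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


def search(driver, keyword, timeout=10.0):
    """Type `keyword` into the search box and click the search button."""
    # 'kw' and 'su' are the element ids used by the script above.
    box = wait_for(lambda: driver.find_element("id", "kw"), timeout)
    box.send_keys(keyword)
    wait_for(lambda: driver.find_element("id", "su"), timeout).click()
```

With a driver that is already on the Baidu page, search(driver, 'Selenium') would stand in for the last two find-and-sleep pairs before the final link click.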
4. Run the script; the page actions described above are then carried out automatically.
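Since Selenium can also take screenshots and help determine whether an action actually happened, a small check can be run once the search has been submitted. This is only a sketch: confirm_search is a hypothetical helper name, the default file name result.png is arbitrary, and it assumes that the title of the results page contains the search keyword.

```python
def confirm_search(driver, keyword, screenshot_path="result.png"):
    """Report whether the page title mentions `keyword` and save a screenshot."""
    # driver.title is the title of the page the driver is currently on.
    title_matches = keyword.lower() in driver.title.lower()
    # save_screenshot writes a PNG file and returns True on success.
    screenshot_saved = bool(driver.save_screenshot(screenshot_path))
    return {"title_matches": title_matches, "screenshot_saved": screenshot_saved}
```

Calling confirm_search(driver, 'Selenium') right after the search button is clicked returns a small dictionary that can be printed or asserted on.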
"Python crawler" automates web search and browsing with selenium and Chrome browser