from selenium import webdriver
from scrapy.selector import Selector

# Simulated login
# executable_path is where chromedriver.exe is stored; it is not required
# if chromedriver is already on the Windows PATH
browser = webdriver.Chrome(executable_path='chromedriver.exe')
browser.get('http://w')  # the URL that requires login
# The form field to fill in, such as the account number
browser.find_element_by_xpath('//div[@view]/input').send_keys('..........')
# The form field to fill in, such as the password
browser.find_element_by_xpath('//div[@view]/input').send_keys('..........')
browser.find_element_by_id('Captcha').send_keys(input("Enter the verification code: "))
browser.find_element_by_xpath('//div[login button]').click()  # placeholder XPath for the login button; click to log in
browser.quit()  # exit the browser
Basic click and send_keys
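The login steps above can be wrapped so that the browser is always closed, even if a locator fails. The following is only a minimal sketch, not the author's code: `fill_and_submit` and `run_login` are hypothetical helper names, and the locators passed in are placeholders. It assumes the Selenium 4 style `find_element(by, value)` call, because recent Selenium 4 releases no longer provide the `find_element_by_*` helpers used above; the string `"xpath"` is the value of Selenium's `By.XPATH` constant.

```python
def fill_and_submit(driver, fields, submit_xpath):
    """Type each value into its field, then click the submit element.

    driver: an object exposing Selenium 4's find_element(by, value).
    fields: iterable of (xpath, text) pairs; the XPaths are up to the caller.
    """
    for xpath, text in fields:
        # "xpath" is the string behind selenium's By.XPATH constant.
        driver.find_element("xpath", xpath).send_keys(text)
    driver.find_element("xpath", submit_xpath).click()


def run_login(driver, url, fields, submit_xpath):
    """Open url, submit the form, and return the resulting page source.

    The finally clause closes the browser whether or not the login steps
    raised an exception.
    """
    try:
        driver.get(url)
        fill_and_submit(driver, fields, submit_xpath)
        return driver.page_source
    finally:
        driver.quit()
```

With a real browser this would be called as `run_login(webdriver.Chrome(), url, [(account_xpath, account), (password_xpath, password)], button_xpath)`, where every argument is supplied by the caller.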
from selenium import webdriver
from scrapy.selector import Selector

# Selenium can return the HTML after JS has run, e.g. for crawling JS-loaded content
browser = webdriver.Chrome(executable_path='')
browser.get('http://...')
print(browser.page_source)  # the source code after JS has finished loading

# If you need selectors quickly, use the Selector from Scrapy
seit = Selector(text=browser.page_source)
print(seit.xpath('//*[@...]/text()').extract())

# Note: JS loading is slow, while Scrapy crawls asynchronously and fetches content
# quickly, so some JS may not have finished loading when the page is read.
# In that case, just sleep for a few seconds.
Fetching dynamically loaded data
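A fixed sleep either wastes time or is too short on a slow page. One alternative is to poll until the content actually appears. The sketch below is an illustration in plain Python, not part of the original post: `wait_for` is a hypothetical helper, and its `timeout` and `interval` defaults are arbitrary. Selenium also ships its own `WebDriverWait` class for the same purpose.

```python
import time


def wait_for(condition, timeout=10.0, interval=0.5):
    """Call condition() until it returns a truthy value, then return that value.

    Raises TimeoutError if nothing truthy is returned within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)  # back off before checking again
```

With the objects from the listing above, a call could look like `wait_for(lambda: Selector(text=browser.page_source).xpath(some_xpath).extract())`, where `some_xpath` stands for whatever expression selects the JS-loaded nodes. An empty list is falsy, so polling continues until at least one node is found.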
import time
from selenium import webdriver
from scrapy.selector import Selector

browser = webdriver.Chrome(executable_path='..')
browser.get('http://.....')
seit = Selector(text=browser.page_source)  # Selector only parses the HTML text
# Typing and clicking must go through the browser, not the Selector
browser.find_element_by_xpath('//div[@class=""]').send_keys("00000000")
browser.find_element_by_xpath('//div[@class=""]').send_keys('************')
browser.find_element_by_xpath('//div[@class=""]').click()

# Scroll down
for i in range(3):
    # Scroll down three times; execute_script is used to execute JS code
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight); var lenOfPage = document.body.scrollHeight; return lenOfPage;")
    time.sleep(3)

# PhantomJS headless browser: http://phantomjs.org/download.html
Basic knowledge points of Selenium
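Scrolling a fixed three times may stop too early on a long page or keep going after the page has ended. A possible variation is to keep scrolling until `document.body.scrollHeight` stops growing. This is a sketch under assumptions rather than the original approach: `scroll_to_bottom`, `max_rounds`, and `pause` are hypothetical names, and the only Selenium call it relies on is `execute_script`.

```python
import time

HEIGHT_JS = "return document.body.scrollHeight;"
SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight);"


def scroll_to_bottom(driver, max_rounds=10, pause=3.0):
    """Scroll down until the page height stops changing or max_rounds is reached.

    Returns the last page height that was observed.
    """
    last_height = driver.execute_script(HEIGHT_JS)
    for _ in range(max_rounds):
        driver.execute_script(SCROLL_JS)
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script(HEIGHT_JS)
        if new_height == last_height:
            break  # nothing new was loaded, so this is the bottom
        last_height = new_height
    return last_height
```

Calling `scroll_to_bottom(browser)` before reading `browser.page_source` would then hand the Selector the fully expanded page, provided the page loads new content within `pause` seconds of each scroll.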