Background knowledge:
PhantomJS is a WebKit-based, server-side JavaScript API. It supports the web fully without needing a browser, and it has fast, native support for various web standards: DOM handling, CSS selectors, JSON, Canvas, and SVG. PhantomJS can be used for page automation, network monitoring, web page screenshots, and headless (no interface) testing.
Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as a real user would operate it. Supported browsers include IE (7, 8, 9), Mozilla Firefox, Mozilla Suite, and more. One of the main uses of the tool is browser compatibility testing: checking whether your application works well on different browsers and operating systems.
PhantomJS is used to render the page and parse its JS, Selenium is used to drive it and to connect it with Python, and Python does the later processing.
Selenium2 supports Python versions 2.7, 3.2, 3.3 and 3.4.
A Selenium server must be installed additionally if remote operation is required.
Installation:
First install Selenium2. Any installation method works; I usually download the compressed package directly and then install it with the python setup.py install command. Selenium 2.42.1: https://pypi.python.org/pypi/selenium/2.42.1
Then download PhantomJS, https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.7-windows.zip, and unzip it to find a phantomjs.exe file.
Example 1 :
#coding=utf-8
from selenium import webdriver
# Raw string so the backslashes in the Windows path stay literal
driver = webdriver.PhantomJS(executable_path=r'C:\Users\gentlyguitar\Desktop\phantomjs-1.9.7-windows\phantomjs.exe')
driver.get("http://phperz.com/")
# Type the query into the search box, then click the search button
driver.find_element_by_id('search_form_input_homepage').send_keys("Nirvana")
driver.find_element_by_id("search_button_homepage").click()
print(driver.current_url)
driver.quit()
The executable_path is just the path of the phantomjs.exe unzipped above. The result of the run:
https://phperz.com/?q=Nirvana
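One detail of executable_path that is easy to get wrong is the backslashes in a Windows path. The sketch below uses a hypothetical install folder (C:\tools\..., not the path from the example) to show that a raw string keeps every backslash literal, while an ordinary string literal turns a sequence such as \t into a tab character. The names PHANTOMJS_PATH and describe are illustrative, and ntpath is used only so the check behaves the same on any operating system.

```python
import ntpath  # Windows path rules, usable on any operating system

# Hypothetical install location; replace it with the folder you unzipped into.
PHANTOMJS_PATH = r'C:\tools\phantomjs-1.9.7-windows\phantomjs.exe'

# Without the r prefix, the "\t" inside 'C:\tools' is read as a tab character.
plain_fragment = 'C:\tools'
raw_fragment = r'C:\tools'


def describe(path):
    """Return (directory, file name) using Windows path rules."""
    return ntpath.split(path)


if __name__ == '__main__':
    print(describe(PHANTOMJS_PATH))
    print('\t' in plain_fragment, '\t' in raw_fragment)
```

Doubling each backslash or using forward slashes would be alternative ways to write the same path.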
Walkthrough of the example :
It is worth mentioning that:
The get method waits until the page is fully loaded before the program continues.
But for Ajax: it's worth noting that if your page uses a lot of AJAX on load then WebDriver may not know when it has completely loaded.
send_keys fills in the input.
Example 2 :
#coding=utf-8
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver import ActionChains
import time
import sys
driver = webdriver.PhantomJS(executable_path=r'C:\Users\gentlyguitar\Desktop\phantomjs-1.9.7-windows\phantomjs.exe')
driver.get("http://www.zhihu.com/#signin")
#driver.find_element_by_name('email').send_keys('your email')
driver.find_element_by_xpath('//input[@name="password"]').send_keys('your password')
#driver.find_element_by_xpath('//input[@name="password"]').send_keys(Keys.RETURN)
time.sleep(2)
driver.get_screenshot_as_file('show.png')
#driver.find_element_by_xpath('//button[@class="sign-button"]').click()
driver.find_element_by_xpath('//form[@class="zu-side-login-box"]').submit()
try:
    # Poll for up to 5 seconds until the user-info link is displayed
    dr = WebDriverWait(driver, 5)
    dr.until(lambda the_driver: the_driver.find_element_by_xpath('//a[@class="zu-top-nav-userinfo"]').is_displayed())
except:
    print('Login failed')
    sys.exit(0)
driver.get_screenshot_as_file('show.png')
#user = driver.find_element_by_class_name('zu-top-nav-userinfo')
#webdriver.ActionChains(driver).move_to_element(user).perform()  # move the mouse over my username
loadmore = driver.find_element_by_xpath('//a[@id="zh-load-more"]')
actions = ActionChains(driver)
actions.move_to_element(loadmore)
actions.click(loadmore)
actions.perform()
time.sleep(2)
driver.get_screenshot_as_file('show.png')
print(driver.current_url)
print(driver.page_source)
driver.quit()
This program logs in to Zhihu and then automatically clicks "more" at the bottom of the page to load more content.
Walkthrough of the example :
from selenium.webdriver.common.keys import Keys: the Keys class represents the keys on the keyboard; send_keys(Keys.RETURN) in the script presses the Enter key.
from selenium.webdriver.support.ui import WebDriverWait is for the wait operation later.
from selenium.webdriver import ActionChains imports the action class; it took me a long time to find the right form of this statement.
For find_element, the XPath method is recommended; it is very convenient.
Tutorial on XPath expression syntax: http://www.ruanyifeng.com/blog/2009/07/xpath_path_expressions.html
Note that you should avoid selecting by an attribute value that contains a space, such as class="country name"; otherwise an error is raised, probably a compound class error or something similar.
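To make the attribute predicates and the space-in-class caveat concrete, here is a minimal sketch that runs without a browser. It assumes made-up markup (not the markup of any real site) and uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath and is not what Selenium uses internally. The helper has_class is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up markup for illustration only.
PAGE = """
<html>
  <body>
    <form class="login-box">
      <input name="email" />
      <input name="password" />
    </form>
    <div class="country name">Norway</div>
    <div class="country">Sweden</div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Attribute predicate, the same shape as //input[@name="password"].
password_inputs = root.findall(".//input[@name='password']")

# An attribute predicate compares the whole attribute string, so the
# two-word value only matches when it is written out in full.
full_value = root.findall(".//div[@class='country name']")
one_word = root.findall(".//div[@class='country']")


def has_class(element, token):
    """True if `token` is one of the space-separated class names."""
    return token in element.get('class', '').split()


# Match a single class name regardless of what else is in the attribute.
by_token = [d.text for d in root.iter('div') if has_class(d, 'country')]

if __name__ == '__main__':
    print(len(password_inputs))
    print([d.text for d in full_value])
    print([d.text for d in one_word])
    print(by_token)
```

The point of the sketch is that class="country name" holds two class names, so a lookup that expects a single class name cannot take the whole string, while an attribute predicate has to match it exactly.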
One way to check that the username and password were entered correctly is to take a screenshot after filling them in.
To take a screenshot, this one line is enough:
driver.get_screenshot_as_file('show.png')
However, the screenshot here has no scroll bar; it captures the entire page from top to bottom.
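The example script writes every screenshot to the same show.png, so each one overwrites the last. A small sketch of one way around that is below; screenshot_step and FakeDriver are hypothetical names, and the fake driver only stands in for a real WebDriver so the snippet runs without a browser. The only Selenium call assumed is get_screenshot_as_file, which returns False when the file cannot be written.

```python
import os


def screenshot_step(driver, step, directory='.'):
    """Save a screenshot named after the step and return its path."""
    path = os.path.join(directory, '%s.png' % step)
    # get_screenshot_as_file reports failure by returning False.
    if not driver.get_screenshot_as_file(path):
        raise IOError('could not write screenshot to %s' % path)
    return path


class FakeDriver(object):
    """Stand-in for a WebDriver; it records file names instead of rendering."""

    def __init__(self, succeed=True):
        self.succeed = succeed
        self.saved = []

    def get_screenshot_as_file(self, filename):
        if self.succeed:
            self.saved.append(filename)
        return self.succeed


if __name__ == '__main__':
    driver = FakeDriver()
    print(screenshot_step(driver, 'after_password'))
    print(screenshot_step(driver, 'after_login'))
```

With a real driver, the call would be screenshot_step(driver, 'after_password') in place of driver.get_screenshot_as_file('show.png').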
try:
    dr = WebDriverWait(driver, 5)
    dr.until(lambda the_driver: the_driver.find_element_by_xpath('//a[@class="zu-top-nav-userinfo"]').is_displayed())
except:
    print('Login failed')
    sys.exit(0)
This is used to check whether an element has loaded, to see if the login succeeded; I think of it as a black-box approach. The 5 means: within 5 seconds, the page is checked every 500 milliseconds until the specified element is displayed. If it never is, until raises an exception, which the except branch catches.
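The polling described above can be sketched in plain Python. This is not Selenium's implementation, only an illustration of the same idea under stated assumptions: wait_until and WaitTimeout are hypothetical names, and LookupError stands in for "element not found yet".

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy in time."""


def wait_until(condition, timeout=5.0, poll=0.5):
    """Call `condition` every `poll` seconds until it returns a truthy
    value, or raise WaitTimeout once `timeout` seconds have passed."""
    end_time = time.time() + timeout
    while True:
        try:
            value = condition()
            if value:
                return value
        except LookupError:
            # Stand-in for "element not found yet": keep polling.
            pass
        time.sleep(poll)
        if time.time() > end_time:
            raise WaitTimeout('condition not met within %s seconds' % timeout)


if __name__ == '__main__':
    attempts = []

    def element_appears_on_third_check():
        attempts.append(1)
        return 'element' if len(attempts) >= 3 else None

    print(wait_until(element_appears_on_third_check, timeout=5, poll=0.1))
    print(len(attempts))
```

The defaults mirror the numbers in the text: a 5-second limit with a check every 500 milliseconds.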
To submit a form, you can select the login button and call click(), or select the form and call submit(). The latter also handles forms that have no login button, so submit() is recommended.
For a single click, you can use click() or a series of actions, as in the script:
loadmore = driver.find_element_by_xpath('//a[@id="zh-load-more"]')
actions = ActionChains(driver)
actions.move_to_element(loadmore)
actions.click(loadmore)
actions.perform()
These 5 statements are in effect equivalent to a single statement that finds the element and clicks it, but an action chain applies more widely. In this example the target is an <a> tag object, and I do not know why calling click() on it directly does not work.
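The reason the action version takes several statements is that an action chain only queues the steps; nothing is sent to the browser until perform() is called. The toy class below sketches that queue-then-perform pattern in plain Python. MiniActionChain is a hypothetical stand-in, not Selenium's ActionChains, and the log list takes the place of the browser.

```python
class MiniActionChain(object):
    """Toy model of an action chain: queue steps, run them on perform()."""

    def __init__(self, log):
        self._log = log      # stands in for the browser receiving events
        self._queue = []     # steps recorded but not yet executed

    def move_to_element(self, element):
        self._queue.append(('move_to', element))
        return self          # returning self allows chained calls

    def click(self, element):
        self._queue.append(('click', element))
        return self

    def perform(self):
        # Fire the queued steps in the order they were added.
        for step in self._queue:
            self._log.append(step)


if __name__ == '__main__':
    browser_log = []
    actions = MiniActionChain(browser_log)
    actions.move_to_element('loadmore')
    actions.click('loadmore')
    print(browser_log)   # still empty: nothing has run yet
    actions.perform()
    print(browser_log)
```

Because each method returns the chain itself, the same steps can also be written as one chained expression ending in perform().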
print(driver.current_url)
print(driver.page_source)
These print two properties of the web page: the URL and the source.
Reprinted from: http://www.phperz.com/article/15/0829/117337.html
Selenium + PhantomJS parsing JS