Principles of Selenium and Webdriver

Source: Internet
Author: User

Main content reposted from: http://blog.csdn.net/ant_ren/article/details/7968582 and http://blog.csdn.net/ant_ren/article/details/7970793

With the merger of Selenium and WebDriver, the combined testing tool is called Selenium 2.x. In the Selenium 1 era, Selenium relied on JavaScript to automate tests.

1. Selenium RC

Early Selenium worked with browsers through JavaScript injection. It required Selenium RC to start a server that translated the API calls for manipulating web elements into snippets of JavaScript, which were injected into the page after the Selenium core launched the browser. Anyone who has developed a web application knows that JavaScript can reach almost any element on the page and manipulate it freely, and this is exactly what Selenium is for: automating web operations. The drawbacks of JavaScript injection are that it is slow, and that stability depends heavily on the quality of the JavaScript that the Selenium core generates from the API calls.
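To make the idea of "translating an API call into JavaScript" concrete, here is a minimal sketch. It is not Selenium's actual generated code: the class `RcTranslationSketch`, the method `toJavaScript`, the two supported commands, and the element IDs are all hypothetical and only illustrate the kind of string an RC-style server might produce.

```java
final class RcTranslationSketch {

    // Escape backslashes and single quotes so the text is safe inside a JS '...' literal.
    static String jsString(String s) {
        return "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // Hypothetical translation of a command name plus arguments into a JavaScript snippet.
    // The value argument is only used by the "type" command.
    static String toJavaScript(String command, String elementId, String value) {
        String target = "document.getElementById(" + jsString(elementId) + ")";
        switch (command) {
            case "click":
                return target + ".click();";
            case "type":
                return target + ".value = " + jsString(value) + ";";
            default:
                throw new IllegalArgumentException("Unsupported command: " + command);
        }
    }

    public static void main(String[] args) {
        System.out.println(toJavaScript("click", "loginButton", null));
        System.out.println(toJavaScript("type", "UserName", "alice"));
    }
}
```

Every browser action has to pass through generated script like this, which is why the quality of the translation layer matters so much for stability.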

The Selenium server, and with it RC, is still shipped today, presumably for backward compatibility. The commands to start it are as follows:

    java -jar selenium-server-standalone-2.14.0.jar -role hub
    java -jar selenium-server-standalone-2.14.0.jar -role node -hub http://localhost:4444/grid/register

2. Webdriver

When Selenium 2.x introduced the concept of WebDriver, it provided a completely different way to interact with the browser: it uses the browser's native API, wrapped in a more object-oriented Selenium WebDriver API, to manipulate the elements of the page directly, and even the browser itself (screenshots, window size, starting, closing, installing plug-ins, configuring certificates, and so on). Because the browser's native API is used, speed improves greatly, and the stability of each call becomes the responsibility of the browser vendor, which is clearly a more sensible arrangement. A side effect is that browser vendors differ somewhat in how they operate on and render web elements, so Selenium WebDriver has to provide a separate implementation for each browser. For example, Firefox has its own FirefoxDriver, Chrome has its own ChromeDriver, and so on (there are even an AndroidDriver and an iOS WebDriver).

To quote a statement from the original that I personally agree with: if you use WebDriver, you can simply drop the Selenium server, because you no longer need to start a server to handle browser interaction.


A simple example of using WebDriver:

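    import org.openqa.selenium.By;
    import org.openqa.selenium.Dimension;
    import org.openqa.selenium.WebElement;
    import org.openqa.selenium.firefox.FirefoxDriver;

    // Class-level static initializer: tell FirefoxDriver where the Firefox binary lives.
    static {
        System.setProperty("webdriver.firefox.bin", "C:/Program Files (x86)/Mozilla Firefox/firefox.exe");
    }

    // Inside a test method; username, password, and sleep(...) are defined elsewhere.
    FirefoxDriver driver = new FirefoxDriver();
    String url = "http://ap13933:8080";
    driver.manage().window().setSize(new Dimension(1440, 1000));
    driver.get(url);
    WebElement name = driver.findElement(By.id("UserName"));
    WebElement pwd = driver.findElement(By.id("OldPassword"));
    // Poll every 100 ms until both input fields are visible.
    while (!name.isDisplayed() || !pwd.isDisplayed()) {
        sleep(100);
    }
    name.clear();
    pwd.clear();
    name.sendKeys(username);
    pwd.sendKeys(password);
    pwd.submit();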
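The polling loop in the example above never gives up if the fields stay hidden. One possible way to bound it is a small helper with a timeout, sketched here using only the JDK. The names `WaitSketch` and `waitUntil` are hypothetical and not part of Selenium; with a real driver, the condition would be something like `() -> name.isDisplayed() && pwd.isDisplayed()`.

```java
import java.util.function.BooleanSupplier;

final class WaitSketch {

    // Polls the condition every pollMillis until it holds or timeoutMillis has elapsed.
    // Returns true if the condition became true, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 300 ms after start.
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 300, 2000, 100);
        System.out.println("ready = " + ready);
    }
}
```

Returning a boolean instead of throwing leaves it to the test to decide whether a timeout is a failure.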

The WebDriver wire protocol is generic: whether you use FirefoxDriver or ChromeDriver, a web service based on this protocol is started on some port once the driver starts. For example, after FirefoxDriver initializes successfully it listens on http://localhost:7055 by default, while ChromeDriver might listen on something like http://localhost:46350. From then on, every WebDriver API call goes through a CommandExecutor, which sends a command that is in fact an HTTP request to the web service on the listening port. The body of the HTTP request is a JSON string in the format specified by the WebDriver wire protocol, and it tells Selenium what we want the browser to do next.
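As a rough illustration of what such a request looks like, the sketch below builds the request line and JSON body for an element lookup without sending anything over the network. The `using`/`value` body shape and the `/session/:sessionId/element` path follow the JSON wire protocol; the class `WireRequestSketch`, its method names, and the session ID `abc123` are made up for this example, and the base URL is simply the FirefoxDriver default quoted above.

```java
final class WireRequestSketch {

    // Minimal JSON string quoting: escapes quotes, backslashes, and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c < 0x20) {
                sb.append("\\").append("u").append(String.format("%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // JSON body for an element lookup: the locator strategy and its value.
    static String findElementBody(String using, String value) {
        return "{\"using\":" + quote(using) + ",\"value\":" + quote(value) + "}";
    }

    // Request line for the same command; sessionId is assigned by the driver at startup.
    static String findElementRequestLine(String baseUrl, String sessionId) {
        return "POST " + baseUrl + "/session/" + sessionId + "/element";
    }

    public static void main(String[] args) {
        System.out.println(findElementRequestLine("http://localhost:7055", "abc123"));
        System.out.println(findElementBody("id", "UserName"));
    }
}
```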

When we create a new WebDriver, Selenium first confirms that the browser's native component is available and that its version matches. It then starts a web service in the target browser. This web service speaks a protocol that Selenium designed and defined itself, named the WebDriver wire protocol. The protocol is powerful enough to make the browser do almost anything, including opening, closing, maximizing, minimizing, locating elements, clicking elements, uploading files, and so on.

The original author drew a diagram here to show how the various WebDriver implementations work:

As we can see, the WebDriver subclasses for different browsers rely on browser-specific native components. Firefox, for example, requires an add-on named webdriver.xpi, while IE uses a DLL file to convert web service commands into native browser calls. In addition, as the figure shows, the WebDriver wire protocol is a set of RESTful web services. If you are not familiar with REST, you can refer to the author's other blog post about it (http://blog.csdn.net/ant_yan/article/details/7963517).

For details about the WebDriver wire protocol, such as what exactly this web service can do, read the official Selenium protocol document. In Selenium's source code we can find a class named HttpCommandExecutor, which maintains a Map<String, CommandInfo>. This map converts the simple string key that represents a command into the corresponding URL, because the idea of REST is to treat every operation as a state, with each state corresponding to a URI. So when we send an HTTP request with a specific URL to this RESTful web service, it can work out which operation needs to be performed. An excerpt of the source code follows:

    nameToUrl = ImmutableMap.<String, CommandInfo>builder()
        .put(NEW_SESSION, post("/session"))
        .put(QUIT, delete("/session/:sessionId"))
        .put(GET_CURRENT_WINDOW_HANDLE, get("/session/:sessionId/window_handle"))
        .put(GET_WINDOW_HANDLES, get("/session/:sessionId/window_handles"))
        .put(GET, post("/session/:sessionId/url"))

        // The Alert API is still experimental and should not be used.
        .put(GET_ALERT, get("/session/:sessionId/alert"))
        .put(DISMISS_ALERT, post("/session/:sessionId/dismiss_alert"))
        .put(ACCEPT_ALERT, post("/session/:sessionId/accept_alert"))
        .put(GET_ALERT_TEXT, get("/session/:sessionId/alert_text"))
        .put(SET_ALERT_VALUE, post("/session/:sessionId/alert_text"))

You can see that the URLs actually sent are relative paths, and most of them start with /session/:sessionId. This means that each time WebDriver launches a browser it is assigned a separate session ID, so sessions running in parallel on multiple threads do not conflict or interfere with one another. For example, one of the most commonly used WebDriver APIs, findElement, is converted to the URL /session/:sessionId/element, and the body of the HTTP request carries the specific parameters, such as whether to locate by ID, CSS, or XPath, and the corresponding value. After the operation is received and performed, an HTTP response is returned. Its content is also JSON: it identifies the WebElement that was found, whose details, such as text, tag name, and class name, can then be queried in the same way. Here is the code snippet that parses the HTTP response:

    try {
        response = new JsonToBeanConverter().convert(Response.class, responseAsText);
    } catch (ClassCastException e) {
        if (responseAsText != null && "".equals(responseAsText)) {
            // The remote server has died, but has already set some headers.
            // Normally this occurs when the final window of the Firefox driver
            // is closed on OS X. Return null, as the return value _should_ be
            // being ignored. This is not an elegant solution.
            return null;
        }
        throw new WebDriverException("Cannot convert text to response: " + responseAsText, e);
    }
    // ...
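Returning to the request side, the command-name-to-URL lookup described earlier can be sketched with plain JDK collections. This is an illustration under stated assumptions, not Selenium's implementation: `CommandUrlSketch`, `Route`, and `resolve` are hypothetical names, only four commands are listed, and the session ID `abc123` is invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CommandUrlSketch {

    // HTTP verb plus URL template, loosely mirroring the role of CommandInfo.
    static final class Route {
        final String method;
        final String template;

        Route(String method, String template) {
            this.method = method;
            this.template = template;
        }
    }

    private static final Map<String, Route> NAME_TO_URL = new LinkedHashMap<>();

    static {
        NAME_TO_URL.put("newSession", new Route("POST", "/session"));
        NAME_TO_URL.put("quit", new Route("DELETE", "/session/:sessionId"));
        NAME_TO_URL.put("get", new Route("POST", "/session/:sessionId/url"));
        NAME_TO_URL.put("findElement", new Route("POST", "/session/:sessionId/element"));
    }

    // Looks up the command and fills the :sessionId placeholder with the live session ID.
    static String resolve(String command, String sessionId) {
        Route route = NAME_TO_URL.get(command);
        if (route == null) {
            throw new IllegalArgumentException("Unknown command: " + command);
        }
        return route.method + " " + route.template.replace(":sessionId", sessionId);
    }

    public static void main(String[] args) {
        for (String command : NAME_TO_URL.keySet()) {
            System.out.println(command + " -> " + resolve(command, "abc123"));
        }
    }
}
```

Because the session ID is part of every path, two browsers driven in parallel never share a URL, which is what keeps their commands from interfering.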

With that, the working principle of WebDriver should be clear. I genuinely admire the design of this set of RESTful web services, although I feel the public API that WebDriver exposes could be friendlier and a bit more powerful. I will stop here for now, continue analyzing the Selenium source code, and keep sharing what I find.

3. Summary of experience with Selenium 2.x

WebDriver's more object-oriented approach greatly lowers the barrier to entry for Selenium, and manipulating web elements is simple and easy to learn. The most important part of a real project is how you locate the various elements on your target project's pages. Say you want to locate a button: you can use its ID, a CSS selector, or XPath, and to click it you write a function that calls the Selenium API, that is, WebElement's click() or submit(). But what about the next button? What about hundreds of buttons?

You therefore need your own algorithm or wrapper that provides a general approach to locating elements, based on the characteristics of the project's pages. Once your general locating logic can accurately find any element, the rest follows naturally and can be handed over to Selenium's WebElement API. In my view, this locating logic is the largest part of the workload when using Selenium for web automation. Of course, some companies' web projects use their own UI framework, as the author's company does, which makes the locating rules and algorithms for web elements relatively easy to design. If the page code produced by the web project is messy, you will need more sophisticated and rigorous logic to find the elements you want to manipulate and inspect.
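One small building block of such locating logic is a function that turns a visible label into a locator, so that hundreds of buttons share one rule. The sketch below generates an XPath string for a button by its text. It assumes buttons are rendered as `<button>` elements, which may not hold for a given UI framework; `LocatorSketch`, `buttonByText`, and `xpathLiteral` are hypothetical names. With Selenium, the resulting string would be passed to `By.xpath(...)`.

```java
final class LocatorSketch {

    // XPath 1.0 has no escape character, so pick a quoting style that fits the text,
    // falling back to concat(...) when the text contains both quote kinds.
    static String xpathLiteral(String text) {
        if (!text.contains("'")) {
            return "'" + text + "'";
        }
        if (!text.contains("\"")) {
            return "\"" + text + "\"";
        }
        StringBuilder sb = new StringBuilder("concat(");
        String[] parts = text.split("'", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", \"'\", ");
            }
            sb.append('\'').append(parts[i]).append('\'');
        }
        return sb.append(')').toString();
    }

    // Locator for a button whose trimmed visible text equals the given label.
    static String buttonByText(String label) {
        return "//button[normalize-space(.)=" + xpathLiteral(label) + "]";
    }

    public static void main(String[] args) {
        System.out.println(buttonByText("Save"));
        System.out.println(buttonByText("Don't save"));
    }
}
```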

In the author's project, I designed and encapsulated a generic API that intelligently locates the various types of elements on a page. For example, the project's pages contain a large number of dialogs and wizards, all implemented with div+CSS. I provided a dialog component with methods such as next(), save(), finish(), click(String buttonName), and cancel(), which then tracks when the operation completes by watching how long the mask layer and loading icon stay visible. This is only a small example; I will share more details when I get the chance.
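The author's component is not shown, so the following is only a guess at its general shape. To keep the sketch runnable without Selenium, browser access is hidden behind a hypothetical `Page` interface with two operations; in a real project it would be backed by WebDriver calls. The mask locator, button locator, timeout, and `FakePage` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DialogSketch {

    // Minimal stand-in for the two browser operations the dialog needs.
    interface Page {
        void click(String xpath);
        boolean isVisible(String xpath);
    }

    static final class Dialog {
        private static final String MASK = "//div[@class='mask']"; // hypothetical mask-layer locator
        private final Page page;
        private final long timeoutMillis;

        Dialog(Page page, long timeoutMillis) {
            this.page = page;
            this.timeoutMillis = timeoutMillis;
        }

        void next() { click("Next"); }
        void save() { click("Save"); }
        void finish() { click("Finish"); }
        void cancel() { click("Cancel"); }

        // Clicks the named button, then blocks until the mask layer disappears.
        // The inline locator assumes button names contain no single quote.
        void click(String buttonName) {
            page.click("//button[normalize-space(.)='" + buttonName + "']");
            waitForMaskToClear();
        }

        private void waitForMaskToClear() {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (page.isVisible(MASK)) {
                if (System.currentTimeMillis() > deadline) {
                    throw new IllegalStateException("Mask still visible after " + timeoutMillis + " ms");
                }
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for mask", e);
                }
            }
        }
    }

    // Fake page for demonstration: the mask is reported visible for two polls after each click.
    static final class FakePage implements Page {
        final List<String> clicks = new ArrayList<>();
        private int maskPolls = 0;

        public void click(String xpath) {
            clicks.add(xpath);
            maskPolls = 2;
        }

        public boolean isVisible(String xpath) {
            return maskPolls-- > 0;
        }
    }

    public static void main(String[] args) {
        FakePage page = new FakePage();
        Dialog dialog = new Dialog(page, 2000);
        dialog.next();
        dialog.click("Apply");
        dialog.finish();
        System.out.println(page.clicks);
    }
}
```

Keeping the wait inside `click` means test code reads as a plain sequence of dialog actions, with no explicit sleeps between steps.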
