Recently I took a closer look at Selenium's source code. Since I mainly use WebDriver, I focused on how WebDriver works. In a previous post I explained that WebDriver differs from Selenium's older JavaScript-injection implementation: it operates the browser directly through the browser's native support. Each browser on each platform therefore depends on a browser-specific native component that converts WebDriver API calls into native browser invocations.
When a new WebDriver is created, Selenium first checks whether the browser's native component is available and whether its version matches. It then starts a complete web service in the target browser. This web service speaks a protocol designed and defined by Selenium, called the WebDriver wire protocol. The protocol is very powerful and lets you make the browser do almost anything: opening, closing, maximizing, minimizing, locating elements, clicking elements, uploading files, and so on.
The WebDriver wire protocol is universal. Whether you use FirefoxDriver or ChromeDriver, once the driver has started, a web service based on this protocol listens on a port. For example, after FirefoxDriver initializes successfully it listens on http://localhost:7055 by default, while ChromeDriver might be at something like http://localhost:46350. From then on, every WebDriver API call goes through a CommandExecutor that sends a command, which is really an HTTP request to the web service on the listening port. The body of that HTTP request is a JSON string in the format specified by the WebDriver wire protocol, and it tells Selenium what we want the browser to do next.
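To make the idea of "a command is an HTTP request with a JSON body" concrete, here is a minimal sketch of what creating a session might look like. The /session path and the desiredCapabilities key follow the legacy JSON wire protocol; the class name, the method names, and the use of Java 11's java.net.http are my own assumptions for illustration, not Selenium's actual implementation. The request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of Selenium: builds (but does not send)
// the HTTP request that a "new session" command boils down to.
final class NewSessionRequestSketch {

    // JSON body asking the driver for a session with the given browser.
    // Assumes browserName is a plain identifier such as "firefox",
    // so no JSON escaping is performed.
    static String newSessionBody(String browserName) {
        return "{\"desiredCapabilities\":{\"browserName\":\"" + browserName + "\"}}";
    }

    // POST <baseUrl>/session with the JSON body above.
    // baseUrl is wherever the driver's web service is listening,
    // for example the http://localhost:7055 mentioned in the text.
    static HttpRequest newSession(String baseUrl, String browserName) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/session"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(newSessionBody(browserName)))
                .build();
    }
}
```

With a driver actually listening, the request could be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the JSON in the response would carry the new session's ID.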
Here I have drawn a diagram to demonstrate the working principles of various webdrivers:
As the diagram shows, the WebDriver subclass for each browser depends on that browser's native component. Firefox, for example, requires an add-on called webdriver.xpi, while IE uses a DLL file to convert web service commands into native browser calls. The diagram also shows that the WebDriver wire protocol is a RESTful web service. If you are not familiar with REST, see my previous blog post about it (http://blog.csdn.net/ant_yan/article/details/7963517).
For details about the WebDriver wire protocol, such as what the web service can do, read the official Selenium protocol documentation. In the Selenium source code there is an HttpCommandExecutor class that maintains a Map<String, CommandInfo>, which converts the simple string keys that represent commands into URLs. The idea of REST is to treat every operation as a state, with each state corresponding to one URI. So when an HTTP request is sent to this RESTful web service at a specific URL, the service can work out which operation to perform. The source code is as follows:
nameToUrl = ImmutableMap.<String, CommandInfo>builder()
    .put(NEW_SESSION, post("/session"))
    .put(QUIT, delete("/session/:sessionId"))
    .put(GET_CURRENT_WINDOW_HANDLE, get("/session/:sessionId/window_handle"))
    .put(GET_WINDOW_HANDLES, get("/session/:sessionId/window_handles"))
    .put(GET, post("/session/:sessionId/url"))
    // The Alert API is still experimental and should not be used.
    .put(GET_ALERT, get("/session/:sessionId/alert"))
    .put(DISMISS_ALERT, post("/session/:sessionId/dismiss_alert"))
    .put(ACCEPT_ALERT, post("/session/:sessionId/accept_alert"))
    .put(GET_ALERT_TEXT, get("/session/:sessionId/alert_text"))
    .put(SET_ALERT_VALUE, post("/session/:sessionId/alert_text"))
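The templates in this map contain placeholders such as :sessionId that have to be filled in before a request can be sent. The following is a minimal sketch of that substitution, not Selenium's actual code; the class and method names are hypothetical, and I assume a placeholder always occupies a whole path segment.

```java
import java.util.Map;

// Hypothetical helper, not part of Selenium: fills ":name" placeholders
// in a URL template such as "/session/:sessionId/url".
final class UrlTemplateSketch {

    // Replaces every path segment of the form ":name" with params.get("name").
    // Throws if a placeholder has no value. Values are inserted as-is,
    // without URL encoding, to keep the sketch short.
    static String resolve(String template, Map<String, String> params) {
        StringBuilder out = new StringBuilder();
        for (String segment : template.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty piece before the leading slash
            }
            out.append('/');
            if (segment.startsWith(":")) {
                String name = segment.substring(1);
                String value = params.get(name);
                if (value == null) {
                    throw new IllegalArgumentException("Missing value for :" + name);
                }
                out.append(value);
            } else {
                out.append(segment);
            }
        }
        return out.toString();
    }
}
```

For example, resolving "/session/:sessionId/url" with a sessionId of "abc123" yields "/session/abc123/url", while a template without placeholders such as "/session" comes back unchanged.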
As you can see, the URLs actually sent are all relative paths, and almost all of them begin with /session/:sessionId. This means that each time WebDriver starts a browser it is assigned its own session ID, so sessions running in parallel on multiple threads do not conflict or interfere with one another. For example, the most commonly used WebDriver API, findElement, is converted to the /session/:sessionId/element URL, and the request body carries the specific parameters: the locator strategy (by ID, CSS selector, XPath, and so on) and its value. After receiving and executing the command, the service returns an HTTP response whose content is also JSON. For an element lookup, the response carries a reference to the located WebElement, and details such as its text, tag name, and attributes (the class name, for instance) are fetched with further requests. The following is the code snippet that parses the HTTP response:
try {
  response = new JsonToBeanConverter().convert(Response.class, responseAsText);
} catch (ClassCastException e) {
  if (responseAsText != null && "".equals(responseAsText)) {
    // The remote server has died, but has already set some headers.
    // Normally this occurs when the final window of the firefox driver
    // is closed on OS X. Return null, as the return value _should_ be
    // being ignored. This is not an elegant solution.
    return null;
  }
  throw new WebDriverException("Cannot convert text to response: " + responseAsText, e);
}
//...
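Going back to the element lookup described above, this is a rough sketch of the JSON body that such a request might carry. The "using" and "value" keys and strategy names like "css selector" and "xpath" come from the legacy JSON wire protocol; the class, the method names, and the tiny escaping helper are hypothetical and only handle quotes and backslashes.

```java
// Hypothetical helper, not part of Selenium: builds the JSON body for
// POST /session/:sessionId/element.
final class FindElementBodySketch {

    // Wraps s in double quotes, escaping only backslashes and double quotes.
    // A real client would use a JSON library instead.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // using: the locator strategy, e.g. "id", "css selector", "xpath".
    // value: the locator itself, e.g. "#kw" or "//input[@name='q']".
    static String findElementBody(String using, String value) {
        return "{\"using\":" + quote(using) + ",\"value\":" + quote(value) + "}";
    }
}
```

Calling findElementBody("css selector", "#kw") produces {"using":"css selector","value":"#kw"}. The JSON that comes back is what the parsing code above turns into a Response object.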
I hope that by now the way WebDriver works is clear. I really admire the design of this RESTful web service, although I feel that the public API exposed by the WebDriver package could be friendlier and more powerful. That is my summary for this time; I will continue analyzing the Selenium source code and share what I find!