Reprinted from: https://www.cnblogs.com/testermark/p/3546287.html
How WebDriver works:
When we create a new WebDriver instance, Selenium first checks that the browser's native component is available and that its version matches. It then starts a complete web service alongside the target browser (in practice this is the driver provided by the browser vendor, such as IEDriver or ChromeDriver, all of which implement WebDriver's wire protocol). This web service speaks a protocol designed and defined by Selenium itself, called the WebDriver Wire Protocol. The protocol is powerful enough to make the browser do almost anything: open, close, maximize, minimize, locate elements, click elements, upload files, and so on.
The WebDriver Wire Protocol is generic: whether it is FirefoxDriver or ChromeDriver, once started, the driver runs a web service based on this protocol on some port. For example, after FirefoxDriver initializes successfully it listens on http://localhost:7055 by default, while ChromeDriver might be at something like http://localhost:46350. From then on, every WebDriver API call goes through a CommandExecutor, which sends a command; that command is really an HTTP request to the web service on the listening port. In the body of the HTTP request, a JSON string in the format specified by the WebDriver Wire Protocol tells Selenium what we want the browser to do next.
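To make the "command is really an HTTP request" point concrete, here is a minimal sketch that builds such a request without sending it. The class and helper names are hypothetical, the base URL http://localhost:7055/hub and the session ID are only sample values, and java.net.http (Java 11+) stands in for whatever HTTP client the real CommandExecutor uses.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds, but never sends, the request that a call like driver.get(url) boils down to.
final class NavigateRequestSketch {

    // Hypothetical helper: quotes a value as a JSON string (handles backslash and double quote only).
    static String jsonString(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Body for the navigation command: {"url": "<target>"}.
    static String navigateBody(String url) {
        return "{\"url\": " + jsonString(url) + "}";
    }

    // serverBase is an assumption, e.g. "http://localhost:7055/hub" for the Firefox driver.
    static HttpRequest buildNavigateRequest(String serverBase, String sessionId, String url) {
        return HttpRequest.newBuilder(URI.create(serverBase + "/session/" + sessionId + "/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(navigateBody(url)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildNavigateRequest(
                "http://localhost:7055/hub",
                "285b12e4-2b8a-4fe6-90e1-c35cba245956",
                "http://google.com");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(navigateBody("http://google.com"));
    }
}
```

With a driver actually listening on that port, the request could be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); the sketch stops short of that so it runs without a browser.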
Put more simply: because a client script (Java, Python, Ruby) cannot talk to the browser directly, the web service acts as a translator that turns client code into something the browser understands (such as JS). The client (that is, the test script) creates a session and, within that session, sends RESTful HTTP requests to the web service. The web service translates each request into a script the browser understands and passes it to the browser. The browser returns the result of execution to the web service, which wraps it (usually as JSON) and returns it to the client. The client then uses the return value to decide whether the operation on the browser succeeded.
The official site describes ChromeDriver as follows:
The ChromeDriver consists of three separate pieces. There is the browser itself ("chrome"), the language bindings provided by the Selenium project ("the driver") and an executable downloaded from the Chromium project which acts as a bridge between "chrome" and the "driver". This executable is called "chromedriver", but we'll try and refer to it as the "server" in this page to reduce confusion.
In other words, the chromedriver executable (.exe) that we download is the bridge between the browser and the client (the language binding); it is the web service (the "server") discussed above.
To give a practical example:
    WebDriver driver = new FirefoxDriver();
    driver.get("http://google.com");
When driver.get("http://google.com") executes, the client (our test code) sends the following request to the web service (the remote server):
    POST session/285b12e4-2b8a-4fe6-90e1-c35cba245956/url
    post_data {"url": "http://google.com"}
That is, it sends a POST to localhost:port/hub/session/session_id/url, asking the browser to navigate to the URL. If the request is acceptable, that is, if the web service implements this interface, the web service makes the browser navigate to the URL contained in the POST data and returns the following response:
    {"name": "get", "sessionId": "285b12e4-2b8a-4fe6-90e1-c35cba245956", "status": 0, "value": ""}
The response contains the following fields:
    name: the name of the method on the web service side that produced the response. Here it is get, meaning "navigate to the specified URL".
    sessionId: the ID of the current session.
    status: the status code of the request. A non-zero value means the command was not executed correctly; here it is 0, meaning everything is OK.
    value: the return value of the request. Here it is empty. If the client had called the title interface, the value would be the title of the current page.
If the client sends a request that locates a particular page element, the response might be:
    {"name": "findElement", "sessionId": "285b12e4-2b8a-4fe6-90e1-c35cba245956", "status": 0, "value": {"ELEMENT": "{2192893e-f260-44c4-bdf6-7aad3c919739}"}}
name, sessionId, and status are as in the previous example. The difference is the return value, ELEMENT: {2192893e-f260-44c4-bdf6-7aad3c919739}, which is the ID of the element that was located. With this ID the client can send further requests, such as click, to interact with the element through the server.
That is how the various WebDriver implementations work.
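The status and value fields of the two responses in the example can be read on the client side with something like the sketch below. It is only an illustration: the class name is hypothetical, and the regular expressions are a stand-in for a real JSON parser, good enough for flat responses like the ones shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: pulls "status" and "ELEMENT" out of a wire-protocol response with regexes.
// A real client would use a JSON library instead.
final class WireResponseSketch {

    private static final Pattern STATUS = Pattern.compile("\"status\"\\s*:\\s*(-?\\d+)");
    private static final Pattern ELEMENT = Pattern.compile("\"ELEMENT\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the numeric status code; 0 means the command succeeded.
    static int status(String json) {
        Matcher m = STATUS.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("No status field in: " + json);
        }
        return Integer.parseInt(m.group(1));
    }

    static boolean succeeded(String json) {
        return status(json) == 0;
    }

    // Returns the element ID from a findElement response, or null if there is none.
    static String elementId(String json) {
        Matcher m = ELEMENT.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String findResponse = "{\"name\": \"findElement\", "
                + "\"sessionId\": \"285b12e4-2b8a-4fe6-90e1-c35cba245956\", "
                + "\"status\": 0, "
                + "\"value\": {\"ELEMENT\": \"{2192893e-f260-44c4-bdf6-7aad3c919739}\"}}";
        System.out.println("succeeded: " + succeeded(findResponse));
        System.out.println("element id: " + elementId(findResponse));
    }
}
```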
As we can see, the WebDriver subclasses for different browsers rely on browser-specific native components. Running Firefox, for example, requires an add-on called webdriver.xpi, while IE needs a DLL that converts the web service's commands into native browser calls. In addition, the WebDriver Wire Protocol is a set of RESTful web services.
For the details of the WebDriver Wire Protocol, such as what exactly this web service can do, read Selenium's official protocol document. In Selenium's source code we can find a class called HttpCommandExecutor, which maintains a Map<String, CommandInfo>. This map converts the simple string key that represents a command into the corresponding URL, because the idea of REST is to treat every operation as a state, and each state corresponds to a URI. So when we send an HTTP request with a specific URL to this RESTful web service, it can work out what operation needs to be performed. An excerpt of the source code follows:
    .put(NEW_SESSION, post("/session"))
    .put(QUIT, delete("/session/:sessionId"))
    .put(GET_CURRENT_WINDOW_HANDLE, get("/session/:sessionId/window_handle"))
    .put(GET_WINDOW_HANDLES, get("/session/:sessionId/window_handles"))
    .put(GET, post("/session/:sessionId/url"))
    // The Alert API is still experimental and should not be used.
    .put(GET_ALERT, get("/session/:sessionId/alert"))
    .put(DISMISS_ALERT, post("/session/:sessionId/dismiss_alert"))
    .put(ACCEPT_ALERT, post("/session/:sessionId/accept_alert"))
    .put(GET_ALERT_TEXT, get("/session/:sessionId/alert_text"))
    .put(SET_ALERT_VALUE, post("/session/:sessionId/alert_text"))
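The idea behind this table can be sketched in a few self-contained lines. This is not Selenium's code: Route is a hypothetical stand-in for CommandInfo, the string keys are illustrative, and only a handful of commands are listed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a command-name -> (HTTP verb, URL template) table, resolved per session.
final class CommandTableSketch {

    // Hypothetical stand-in for Selenium's CommandInfo.
    static final class Route {
        final String verb;
        final String template;

        Route(String verb, String template) {
            this.verb = verb;
            this.template = template;
        }

        // Fills the :sessionId placeholder and returns "VERB /path".
        String resolve(String sessionId) {
            return verb + " " + template.replace(":sessionId", sessionId);
        }
    }

    static final Map<String, Route> ROUTES = new LinkedHashMap<>();

    static {
        ROUTES.put("newSession", new Route("POST", "/session"));
        ROUTES.put("quit", new Route("DELETE", "/session/:sessionId"));
        ROUTES.put("get", new Route("POST", "/session/:sessionId/url"));
        ROUTES.put("findElement", new Route("POST", "/session/:sessionId/element"));
    }

    static String resolve(String command, String sessionId) {
        Route route = ROUTES.get(command);
        if (route == null) {
            throw new IllegalArgumentException("Unknown command: " + command);
        }
        return route.resolve(sessionId);
    }

    public static void main(String[] args) {
        String sessionId = "285b12e4-2b8a-4fe6-90e1-c35cba245956";
        for (String command : ROUTES.keySet()) {
            System.out.println(command + " -> " + resolve(command, sessionId));
        }
    }
}
```

Keeping the verb and the URL template together in one value means the executor only needs the command name and the session ID to know exactly which HTTP request to issue.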
As you can see, the URLs actually sent are relative paths, and most of them start with /session/:sessionId. This means that each browser WebDriver launches is assigned its own session ID, so multiple threads running in parallel neither conflict nor interfere with one another. For example, one of the most commonly used WebDriver APIs, findElement, is converted to the URL /session/:sessionId/element, and the body of the resulting HTTP request carries the specific parameters: whether to locate by ID, CSS, or XPath, and the corresponding value. After the web service receives and performs the operation, it returns an HTTP response whose content is also JSON. For findElement the value holds the ID of the WebElement that was found (as in the example above); details such as its text, tag name, or attributes are obtained through further requests that use this ID. Here is the code snippet that parses the HTTP response:
    try {
      response = new JsonToBeanConverter().convert(Response.class, responseAsText);
    } catch (ClassCastException e) {
      if (responseAsText != null && "".equals(responseAsText)) {
        // The remote server has died, but has already set some headers.
        // Normally this occurs when the final window of the firefox driver
        // is closed on OS X. Return null, as the return value _should_ be
        // being ignored. This is not an elegant solution.
        return null;
      }
      throw new WebDriverException("Cannot convert text to response: " + responseAsText, e);
    }
//...
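Returning to the findElement request described earlier: in the JSON wire protocol its body names a locator strategy under "using" and the locator itself under "value". The sketch below only builds that body; the class and helper names are hypothetical, and the sample locators are made up for illustration.

```java
// Sketch only: builds the JSON body for POST /session/:sessionId/element.
final class FindElementBodySketch {

    // Hypothetical helper: quotes a value as a JSON string (handles backslash and double quote only).
    static String jsonString(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // "using" is the locator strategy (e.g. "id", "css selector", "xpath"); "value" is the locator.
    static String findElementBody(String using, String value) {
        return "{\"using\": " + jsonString(using) + ", \"value\": " + jsonString(value) + "}";
    }

    public static void main(String[] args) {
        // The locators below are sample values, not taken from any real page.
        System.out.println(findElementBody("id", "kw"));
        System.out.println(findElementBody("css selector", "input.search"));
        System.out.println(findElementBody("xpath", "//input[@name=\"q\"]"));
    }
}
```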
PS: If you want to know more about the architecture of WebDriver, see the article at http://www.aosabook.org/en/selenium.html.