The communication between the browser and the server involves a lot of networking. It is covered only lightly here, because as front-end developers our focus is part four (browser rendering).
I. Network environment prerequisites
Let us assume that the URL we visit is www.abc.com and that its address is not inside our LAN.
First of all, the main router of our local network must be connected to an ISP (Internet service provider). For our host to communicate over the network, it needs the following four settings:
1. The IP address of this machine
2. The subnet mask
3. The IP address of the gateway (not needed if the URL we visit is inside the LAN)
4. The IP address of the DNS server
There are two basic ways to obtain these four settings: manual configuration (static) and DHCP (dynamic).
The details are skipped here; for front-end work, knowing this much is enough. If you want to dig into DHCP (which communicates via UDP packets), it is recommended to first understand broadcast, the basic MAC-address-based communication mode in Ethernet.
Assume that our host has already obtained the following settings:
IP address of this machine: 192.168.1.100
Subnet Mask: 255.255.255.0
IP address of the gateway: 192.168.1.1
IP address of DNS: 68.68.68.222
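The role of the subnet mask can be made concrete with a small sketch. It checks whether a destination lies in the same subnet as our host, which is what decides whether the gateway is needed. The function names ipToInt and sameSubnet are made up for this illustration; the addresses are the ones assumed above.

```javascript
// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// True when both addresses fall in the same subnet under the given mask.
function sameSubnet(ipA, ipB, mask) {
  const m = ipToInt(mask);
  return ((ipToInt(ipA) & m) >>> 0) === ((ipToInt(ipB) & m) >>> 0);
}

const host = '192.168.1.100';
const mask = '255.255.255.0';
console.log(sameSubnet(host, '192.168.1.1', mask));  // true: the gateway is on our LAN
console.log(sameSubnet(host, '68.68.68.222', mask)); // false: the DNS server is reached via the gateway
```

Because the DNS server is outside the subnet, packets addressed to it are handed to the gateway first.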
II. Getting the IP address of the server
After we enter the URL, the browser picks up the event and, after a series of processing steps, creates a TCP socket that will be used to send an HTTP request to www.abc.com. To create the socket, our host needs to know the IP address of www.abc.com, and this is where the DNS service (which communicates via UDP messages) comes in. The DNS query proceeds as follows.
The host's operating system generates a DNS query message, which is placed in a UDP segment, which in turn is placed in an IP packet and then in an Ethernet frame (the link layer of the 5-layer network model). We already know the DNS server address 68.68.68.222, but the link layer cannot deliver a frame by IP address; it needs a MAC address (network card address). Because the DNS server is outside our subnet, the frame is addressed to the gateway, so the host first has to look up the gateway's MAC address using the ARP protocol.
The full process is somewhat more complicated, so here is a simplified version:
DNS query message ---> gateway router ---> local DNS server ---> returns the IP address we want to access
The local DNS server has not necessarily cached the IP address we are asking for, so our query may be resolved through several further queries. For example, the fully qualified form of www.abc.com is www.abc.com. (with a trailing dot that stands for the root).
The query proceeds in the order . ---> com. ---> abc.com. ---> www.abc.com. (root name server ---> top-level domain name server ---> second-level domain name server ---> name server for www.abc.com.)
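The order of zones visited can be derived mechanically from the host name. The following is only a sketch of that ordering, not a real resolver; resolutionChain is a name invented for this example.

```javascript
// Return the zones visited in an iterative lookup, from the root down.
function resolutionChain(hostname) {
  const labels = hostname.replace(/\.$/, '').split('.'); // drop a trailing dot if present
  const chain = ['.'];                                   // start at the root zone
  for (let i = labels.length - 1; i >= 0; i--) {
    chain.push(labels.slice(i).join('.') + '.');
  }
  return chain;
}

console.log(resolutionChain('www.abc.com'));
// [ '.', 'com.', 'abc.com.', 'www.abc.com.' ]
```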
Top-level domains: .com, .net, .org, .cn and so on are top-level domains. Under the current Internet Domain Name System, top-level domains fall into two categories: generic top-level domains (gTLD) and country code top-level domains (ccTLD). Generic top-level domains are those ending in "com", "net", "org", "biz", "info" and so on, and each is managed by a designated registry operator. A country code top-level domain ends in a country or region code, such as "cn" for China and "uk" for the United Kingdom, and is typically managed by the respective country or region.
Second-level domains: a second-level domain sits directly under a top-level domain. In China, for example, the second-level domains include .com.cn, .net.cn, .org.cn, .gd.cn and so on. A subdomain is a child of its parent domain: if the parent domain is abc.com, its subdomains are www.abc.com or *.abc.com.
In everyday usage, a "second-level domain" often means a record under a registered domain. For example, alidiedie.com is a registered domain and www.alidiedie.com is one of its most commonly used records (usually the default), but every name of the form *.alidiedie.com is called a second-level domain of alidiedie.com.
Afterwards the local DNS server caches the IP for the queried domain name, so the next query does not have to go through this whole process again. If you are interested in how the DNS server hierarchy is organized, you can study it on your own; it is not elaborated here.
If the server we are accessing sits behind a proxy (such as the common reverse proxy nginx), DNS returns the IP address of the proxy, and the client-server interaction described below passes through one more layer of proxy communication.
III. Client and server interaction (TCP and HTTP)
Before sending the actual request message, the client (browser) exchanges three TCP segments with the server, each being a single one-way message. This is the so-called three-way handshake, simplified here as:
Client ---SYN=1, seq=client_isn---> Server
Server ---SYN=1, seq=server_isn, ack=client_isn+1---> Client
Client ---SYN=0, seq=client_isn+1, ack=server_isn+1---> Server
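The sequence and acknowledgment numbers in the three segments follow a fixed pattern, which the sketch below spells out. It only models the header fields as plain objects and sends nothing over the network; threeWayHandshake and the two initial sequence numbers are assumptions for illustration.

```javascript
// Model the three handshake segments for given initial sequence numbers.
function threeWayHandshake(clientIsn, serverIsn) {
  return [
    // 1. The client asks to open a connection.
    { from: 'client', to: 'server', SYN: 1, seq: clientIsn },
    // 2. The server agrees and acknowledges the client's SYN.
    { from: 'server', to: 'client', SYN: 1, seq: serverIsn, ack: clientIsn + 1 },
    // 3. The client acknowledges the server's SYN; the connection is established.
    { from: 'client', to: 'server', SYN: 0, seq: clientIsn + 1, ack: serverIsn + 1 },
  ];
}

console.log(threeWayHandshake(1000, 5000));
```

Each side acknowledges the other's initial sequence number plus one, which is why the second and third segments carry different ack values.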
1. The browser generates an HTTP request similar to this:
GET / HTTP/1.1
Host: www.abc.com
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows NT 6.1) ...
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,zh;q=0.8
Accept-Charset: GBK,utf-8;q=0.7,*;q=0.3
Cookie: ...
We assume that this part is 5000 bytes long; it will be embedded in the TCP packet.
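To see how the request text turns into a number of bytes, here is a minimal sketch that assembles a request from a request line and a headers object and measures its encoded length. buildRequest and byteLength are hypothetical helpers, and the headers passed in are just a shortened sample.

```javascript
// Assemble an HTTP/1.1 request: request line, one "Name: value" line per
// header, then an empty line that ends the header section.
function buildRequest(method, path, headers) {
  const lines = [`${method} ${path} HTTP/1.1`];
  for (const [name, value] of Object.entries(headers)) {
    lines.push(`${name}: ${value}`);
  }
  return lines.join('\r\n') + '\r\n\r\n';
}

// Number of bytes the text occupies when encoded as UTF-8.
function byteLength(text) {
  return new TextEncoder().encode(text).length;
}

const request = buildRequest('GET', '/', {
  Host: 'www.abc.com',
  Connection: 'keep-alive',
});
console.log(request);
console.log(byteLength(request), 'bytes');
```

A real request with long cookies is much larger than this sample, which is why the walkthrough simply assumes a round figure.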
2. The browser's TCP socket places the HTTP message into a TCP packet and sets the receiver's port (www.abc.com) to 80, the default for HTTP. The sender's port (this machine, i.e. the socket created earlier) is a randomly chosen integer between 1024 and 65535; assume it is 55555.
The TCP header is 20 bytes long, so with the embedded HTTP message the total length becomes 5020 bytes.
3. The TCP packet is then embedded in an IP packet. The IP packet needs both sides' IP addresses, which are known: the sender is 192.168.1.100 (this machine) and the receiver is 172.172.72.222 (assume this is the server IP returned by the DNS query in part two).
The IP header is 20 bytes long, so with the embedded TCP packet the total length becomes 5040 bytes.
4. Finally the data enters the link layer, where the IP packet is embedded in Ethernet frames. An Ethernet frame needs both sides' MAC addresses: the sender is the MAC address of the local network card, and the receiver is the MAC address of the gateway 192.168.1.1 (obtained through the ARP protocol).
The data portion of an Ethernet frame can be at most 1500 bytes, while the IP packet is now 5040 bytes long, so in this simplified model it must be split into four packets. Each fragment carries its own 20-byte IP header and therefore at most 1480 bytes of data, so the 5020 bytes of data are split into 1480 + 1480 + 1480 + 580, and the four IP packets are 1500, 1500, 1500 and 600 bytes long.
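The arithmetic of the encapsulation steps can be checked with a short sketch. It takes the 5000-byte HTTP message implied by the 5020-byte TCP total, uses the 20-byte headers and 1500-byte limit from the text, and ignores details of real IP fragmentation such as offsets being counted in 8-byte units. fragmentLengths is a name made up for this example.

```javascript
const TCP_HEADER = 20;     // bytes, as assumed in the text
const IP_HEADER = 20;      // bytes, as assumed in the text
const ETHERNET_MTU = 1500; // maximum data portion of an Ethernet frame

// Split one IP packet into fragments that each fit into the MTU.
// Every fragment repeats the IP header, so it carries mtu - ipHeader bytes of data.
function fragmentLengths(ipPacketLength, mtu = ETHERNET_MTU, ipHeader = IP_HEADER) {
  let payload = ipPacketLength - ipHeader; // data carried by the original packet
  const maxPayload = mtu - ipHeader;       // data each fragment can carry
  const lengths = [];
  while (payload > 0) {
    const chunk = Math.min(payload, maxPayload);
    lengths.push(chunk + ipHeader);
    payload -= chunk;
  }
  return lengths;
}

const httpLength = 5000;
const tcpLength = httpLength + TCP_HEADER; // 5020
const ipLength = tcpLength + IP_HEADER;    // 5040
console.log(fragmentLengths(ipLength));    // [ 1500, 1500, 1500, 600 ]
```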
5. Server-side response
After being forwarded by multiple gateways, the server at 172.172.72.222 receives the four Ethernet frames.
Using the identification and fragment offset fields in the IP headers, the server reassembles the four packets, extracts the complete TCP packet, reads the "HTTP request" inside, produces an "HTTP response" and sends it back over TCP. The browser receives the data, decodes it, and obtains an HTML document.
IV. Browser rendering
1. How the browser renders a page
1.1 After the browser kernel (rendering engine) gets the document content from the browser's network module, the process is:
A. Parse the HTML document and create the Document Object Model (DOM)
B. Parse the CSS and create the CSS Object Model (CSSOM)
C. Execute JS scripts, which can read and modify the DOM and CSSOM
D. Build the render tree from the DOM and CSSOM
E. Lay out the elements using the render tree (layout)
F. Have the browser's UI backend paint all elements (paint)
The process above is gradual: the rendering engine displays content on the screen as early as possible and does not wait until all the HTML has been parsed before it starts building and laying out the render tree. While it keeps receiving and parsing content from the network, the part already parsed is rendered and displayed.
The main flowcharts of the WebKit and Mozilla Gecko rendering engines belong here:
[Figure: main flow of the WebKit rendering engine]
[Figure: main flow of Mozilla's Gecko rendering engine]
1.2 Parsing (see "How Browsers Work")
The main flow above involves the following kinds of parsing:
A. HTML parsing
HTML cannot be described by a context-free grammar. The relevant characteristics of HTML are:
1. The language is forgiving by nature.
2. Browsers are well known to tolerate invalid HTML.
3. The parsing process is reentrant. Normally the source does not change during parsing, but in HTML a script containing "document.write" can add extra tokens, so the parsing process actually modifies the input.
Therefore HTML cannot be parsed with conventional top-down or bottom-up parsers, and browsers implement a special parser for it.
B. CSS parsing
Each CSS file is parsed into a StyleSheet object. Each StyleSheet object contains CSS rules, and each CSS rule object contains selector and declaration objects, plus other objects corresponding to the CSS grammar.
C. JS script parsing and execution
The model of the web is synchronous: authors expect a script to be parsed and executed as soon as the parser reaches the <script> tag, and parsing of the document pauses until the script has executed. If the script is external, the resource must first be fetched from the network, which is also synchronous, so parsing pauses until the resource arrives. This model has been in use for many years and is specified in HTML4 and HTML5. Authors can mark a script as "defer" so that it does not pause document parsing and is executed after the document has been parsed. HTML5 adds the option to mark a script as "async", so that it is fetched without blocking the parser and executed as soon as it is available.
Speculative parsing
Both WebKit and Firefox do this optimization. While a script is executing, another thread parses the rest of the document, finds out which resources need to be loaded from the network, and loads them. This way resources are loaded over parallel connections and the overall speed is better. Note that the speculative parser does not modify the DOM tree, which is left to the main parser; it only parses references to external resources such as external scripts, style sheets and images.
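The idea of scanning ahead for resources without touching the DOM can be sketched roughly as follows. Real browsers use a proper HTML tokenizer for this; the regular expressions below are a deliberate simplification (they assume quoted attribute values, and rel before href on link tags), and findExternalResources is an invented name.

```javascript
// Scan raw HTML text for external resources worth preloading.
// Nothing is added to any DOM tree; only the URLs are collected.
function findExternalResources(html) {
  const patterns = [
    { type: 'script', regex: /<script\b[^>]*\bsrc=["']([^"']+)["']/gi },
    { type: 'stylesheet', regex: /<link\b[^>]*\brel=["']stylesheet["'][^>]*\bhref=["']([^"']+)["']/gi },
    { type: 'image', regex: /<img\b[^>]*\bsrc=["']([^"']+)["']/gi },
  ];
  const found = [];
  for (const { type, regex } of patterns) {
    for (const match of html.matchAll(regex)) {
      found.push({ type, url: match[1] });
    }
  }
  return found;
}

const restOfDocument =
  '<script src="app.js"></script><link rel="stylesheet" href="main.css"><img src="logo.png">';
console.log(findExternalResources(restOfDocument));
```

The collected URLs could then be fetched in parallel while the main parser is still blocked on a script.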
Style sheets follow a different model. In theory, because style sheets do not change the DOM tree, there is no reason to wait for them and stop parsing the document. There is a problem, though: scripts may ask for style information during the document parsing phase, and if the style has not been loaded and parsed yet, the script will get wrong answers, which obviously can cause a lot of problems. This looks like an edge case but is actually quite common. Firefox blocks all scripts while a style sheet is still being loaded and parsed, whereas WebKit blocks scripts only when they try to access style properties that may be affected by style sheets that are not yet loaded.
1.3 Building the render tree
Render objects correspond to DOM elements, but the relationship is not one-to-one. Non-visual elements are not inserted into the render tree; the "head" element is an example. Elements whose display property is "none" do not appear in the tree either (elements with visibility "hidden" do appear).
Building the render tree requires computing the visual properties of each render object, which is done by computing the style properties of each element. Styles come from several sources: style sheets, inline styles, and visual attributes in the HTML (such as the "background" attribute), which are converted to matching CSS style properties.
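The filtering rule for the render tree can be illustrated with plain objects standing in for DOM nodes. This is only a sketch under simplifying assumptions: each node is a { tag, style, children } object with already computed styles, the list of non-visual tags is a small sample, and buildRenderTree is a hypothetical name.

```javascript
// Tags that never produce a render object in this simplified model.
const NON_VISUAL_TAGS = ['head', 'script', 'meta', 'title'];

// Build a render tree from a DOM-like tree, skipping non-visual elements
// and elements with display: none (together with their subtrees).
function buildRenderTree(node) {
  if (NON_VISUAL_TAGS.includes(node.tag)) return null;
  const style = node.style || {};
  if (style.display === 'none') return null;
  const children = (node.children || [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  return { tag: node.tag, style, children };
}

const dom = {
  tag: 'html',
  children: [
    { tag: 'head', children: [{ tag: 'title' }] },
    {
      tag: 'body',
      children: [
        { tag: 'div', style: { display: 'none' } },      // dropped
        { tag: 'p', style: { visibility: 'hidden' } },   // kept: still occupies space
        { tag: 'span' },
      ],
    },
  ],
};
console.log(JSON.stringify(buildRenderTree(dom), null, 2));
```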
CSS rule matching is a large topic and is not covered in detail here.
1.4 Layout and painting (paint)
Layout can be global or incremental. A "global" layout happens when a global style change affects all renderers, for example a font size change or a screen resize. Layout can also be incremental, in which case only the dirty renderers and their descendants are laid out. An incremental layout is triggered (asynchronously) when renderers become dirty, for example when additional content arrives from the network and is added to the DOM tree, so that new renderers are appended to the render tree. (As mentioned above, the whole process is progressive.)
Painting can also be global (the entire tree is painted) or incremental. In incremental painting, some renderers change in a way that does not affect the entire tree. A changed renderer invalidates its rectangle on the screen, which causes the operating system to treat it as a "dirty region" and generate a "paint" event. The operating system does this cleverly and merges several regions into one. In Chrome it is more complicated, because the renderer lives in a different process from the main process; Chrome simulates the operating system behaviour to some extent, listens for these events, and delegates the message to the root of the render tree. The tree is traversed until the relevant renderer is reached, which then repaints itself (and usually its children).
Two concepts are involved in layout and painting: reflow and repaint. Every element in the DOM is represented as a box, and the browser has to compute its position, size and so on; this process is called reflow. Once the position and size, along with properties such as font and color, have been determined, the browser draws the node, and this process is called repaint. Reflow and repaint are unavoidable when the page first loads. Both are expensive (a reflow always leads to a repaint), especially on mobile devices, so the number of reflows and repaints should be kept to a minimum.
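One common way to reduce reflows is to avoid alternating layout reads and style writes. The sketch below contrasts the two patterns; it assumes elements that expose offsetHeight and style.height like DOM elements do, and both function names are made up. Reading offsetHeight after a style change generally forces the browser to recompute layout first, so the interleaved version can trigger one reflow per element.

```javascript
// Interleaved: write a style, then the next iteration reads layout again,
// which may force a fresh layout calculation on every pass.
function doubleHeightsInterleaved(elements) {
  for (const el of elements) {
    el.style.height = el.offsetHeight * 2 + 'px';
  }
}

// Batched: do all layout reads first, then all style writes,
// so the browser can recompute layout once afterwards.
function doubleHeightsBatched(elements) {
  const heights = elements.map((el) => el.offsetHeight); // reads
  elements.forEach((el, i) => {
    el.style.height = heights[i] * 2 + 'px';             // writes
  });
}
```

In a page this would be called with a list of real elements, for example Array.from(document.querySelectorAll('.item')), where the .item selector is just a placeholder.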
2. Some points to note
If you have questions about the order of page rendering and JS execution, this part is worth reading.
2.1 The page rendering process described in part 1 is progressive. It is important enough to be worth repeating.
2.2 The process above is not carried out by a single thread. JS execution is single-threaded, but the browser is not.
To illustrate this, let's look at the threads of a browser.
The kernel of a typical browser has at least three resident threads (implementations differ between browsers):
GUI rendering thread ---- the main thread of the browser kernel (rendering engine)
JS engine thread ---- the main thread that processes JS scripts
Browser event trigger thread ---- controls interaction and responds to the user
There are also threads that terminate once their work is done, such as the HTTP request thread.
Here is an example. In the rendering flow in part 1, parsing the HTML page and building the DOM tree and the render tree are all done by the GUI rendering main thread. When a <script> tag is encountered, the JS is first loaded and then handed to the JS engine thread to be parsed and executed. JS execution is always single-threaded. As mentioned earlier, executing JS blocks parsing of the document. This is because the JS engine thread and the GUI rendering main thread are mutually exclusive: while JS executes, the rendering main thread is suspended, which blocks page parsing. The reason for this design was also mentioned earlier: both can operate on the DOM tree, and multiple threads manipulating the same object at the same time would conflict.
Loading and parsing CSS in turn blocks JS execution (Chrome blocks only when the script accesses style information), because, as mentioned earlier, JS may access CSS style information. (Given that JS execution and the rendering main thread are mutually exclusive, some descriptions say that the browser executes JS on the rendering main thread itself. I am not sure about this point; after all, I did not write the browser.)
2.3 Execution mechanism of the JS engine thread (referred to below as the JS main thread):
(1) All synchronous tasks are executed on the JS main thread, forming an execution stack (there is also a heap, which stores objects allocated during JS execution).
(2) Besides the main thread there is a "task queue". Whenever an asynchronous task produces a result, an event is placed in the task queue.
(3) Once all synchronous tasks in the "execution stack" have finished, the system reads the "task queue" to see which events are in it. The asynchronous tasks corresponding to those events leave the waiting state, enter the execution stack, and start executing.
(4) The main thread keeps repeating step (3).
The main thread reads events from the task queue in a continuous loop, so the whole mechanism is called the event loop.
When the JS main thread runs, it creates a heap and a stack. Code on the stack calls various external APIs, which add events (click, load, done) to the task queue. As soon as the code on the stack finishes executing, the main thread reads the task queue and runs the callback functions corresponding to those events in turn. Code on the stack (synchronous tasks) always runs before the task queue (asynchronous tasks) is read. In this procedure, the event trigger thread listens for DOM events and adds the event callbacks to the task queue.
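The loop described above can be modelled in a few lines. This is a toy simulation, not how an engine is implemented: it uses a virtual clock instead of real time, supports only a timer API, and the names runEventLoop and api are invented for the sketch. It shows that all synchronous code finishes before any queued callback runs.

```javascript
// A toy event loop: run the synchronous code first, then drain the task queue.
function runEventLoop(main) {
  const log = [];
  const taskQueue = [];
  let now = 0;      // virtual clock in milliseconds
  let counter = 0;  // keeps insertion order stable for equal times
  const api = {
    log: (message) => log.push(message),
    setTimeout: (callback, delay = 0) => {
      taskQueue.push({ callback, time: now + delay, order: counter++ });
    },
  };

  main(api); // the execution stack: synchronous tasks run to completion

  // Repeatedly take the earliest pending task and run its callback.
  while (taskQueue.length > 0) {
    taskQueue.sort((a, b) => a.time - b.time || a.order - b.order);
    const task = taskQueue.shift();
    now = Math.max(now, task.time);
    task.callback(api);
  }
  return log;
}

const output = runEventLoop(({ log, setTimeout }) => {
  log('sync 1');
  setTimeout(() => log('timer 100ms'), 100);
  setTimeout(() => log('timer 0ms'), 0);
  log('sync 2');
});
console.log(output); // [ 'sync 1', 'sync 2', 'timer 0ms', 'timer 100ms' ]
```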
Why asynchronous processing: with a single thread, tasks have to queue. If the queuing were caused by heavy computation keeping the CPU busy, that would be acceptable, but much of the time the CPU is idle because IO devices are slow (for example, an Ajax operation reading data from the network), and the thread would have to wait for the result before continuing. The designers of the JavaScript language realized that the main thread can ignore the IO device in the meantime, suspend the waiting task, and run the tasks further back in the queue first. When the IO device returns its result, the main thread goes back and resumes the suspended task.
So asynchrony is achieved by two or more browser threads working together, as with Ajax asynchronous requests and setTimeout.
Now look at the code below. What is the difference between the two snippets?
// Snippet 1: chained setTimeout
setTimeout(function test() {
  doSomething();
  setTimeout(test, 100);
}, 100)
// Snippet 2: setInterval
setInterval(function test() {
  doSomething();
}, 100)
And why do you sometimes see code written like this?
setTimeout(function test() {
  doSomething();
}, 0)
The answers to these questions can be found in the JS execution mechanism described above.
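As a hint for the first question, the following sketch computes when each call to doSomething would start under a simplified model: doSomething takes a fixed time, timers are exact, and nothing else runs on the JS main thread. Both function names and the 150 ms duration are assumptions for illustration.

```javascript
// Chained setTimeout: the next timer is armed only after doSomething() returns,
// so consecutive starts are separated by duration + delay.
function chainedTimeoutStarts(delay, duration, runs) {
  const starts = [];
  let start = delay;
  for (let i = 0; i < runs; i++) {
    starts.push(start);
    start = start + duration + delay;
  }
  return starts;
}

// setInterval: the timer fires every `delay` ms regardless of how long the
// callback takes; a callback starts once it has fired and the thread is free.
function intervalStarts(delay, duration, runs) {
  const starts = [];
  let threadFreeAt = 0;
  for (let i = 1; i <= runs; i++) {
    const firedAt = i * delay;
    const start = Math.max(firedAt, threadFreeAt);
    starts.push(start);
    threadFreeAt = start + duration;
  }
  return starts;
}

console.log(chainedTimeoutStarts(100, 150, 3)); // [ 100, 350, 600 ]
console.log(intervalStarts(100, 150, 3));       // [ 100, 250, 400 ]
```

With a slow doSomething, the setInterval callbacks end up running back to back, while the chained version always leaves a 100 ms pause between runs. The setTimeout(fn, 0) form relies on the same queue: the callback is placed in the task queue and runs only after the current synchronous code has finished.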
See: "How Browsers Work" and "Computer Networking: A Top-Down Approach"
A front-end answer to: what happens from entering a URL to the page being displayed