This question can also be phrased as:
Describe the whole process from opening http://www.oldboyedu.com in a browser and sending the request to seeing the page.
What happens between typing a URL into the browser and pressing Enter, and the page being displayed?
Outline:
1. Overview of the process of a user accessing a website
2. The principle of DNS resolution ****
3. The principle of the TCP/IP three-way handshake ****
4. The principle of the HTTP protocol (WWW service request process): details of the request message
5. Details of the cluster architecture of a large-scale site
6. The principle of the HTTP protocol (WWW service response process): details of the response message
7. The principle of the TCP/IP four-way wave *****
When we type a URL into the browser and press Enter, two main steps take place before we see the page. First, the domain name is resolved to an IP address. Second, the browser uses that IP to find the site's server and requests a specific web page; the server responds to the request, and after the client browser receives the response message, it renders the HTML document into the page we finally see.
First, the DNS resolution process. As we all know, computers can only communicate with each other by IP address, and because IP addresses are hard to remember, DNS servers resolve domain names to the corresponding IPs. Take resolving www.oldboyedu.com as an example. When we enter this URL, the browser first queries its own cache, whose lifetime may be only about 1 minute. If nothing is found there, it queries the local DNS cache and the hosts file. If either contains an IP for www.oldboyedu.com, the browser accesses the web server directly through that IP.
If the local DNS cache and the hosts file have no entry, the request is sent to the DNS servers set in the network card configuration. By default there are two, and DNS2 is used only when DNS1 cannot be reached. We call the DNS servers in the network card configuration the Local DNS. The Local DNS first checks its own cache for a record of www.oldboyedu.com and, if one exists, returns it to the user.
If there is no record, the Local DNS queries a root name server (there are 13 root name servers worldwide). The root name server sees that the name being looked up ends in .com, so it sends the IP of the .com top-level domain server to the Local DNS. The Local DNS then queries the .com top-level domain server, which sees that the domain is oldboyedu.com and sends back the IP of the authoritative DNS server for oldboyedu.com. The Local DNS keeps querying downward in this way until the authoritative DNS returns the A record or CNAME of www.oldboyedu.com. The Local DNS then sends the IP of www.oldboyedu.com to the client and records it in its cache, so that the next time another user asks for www.oldboyedu.com the answer is already cached. When the client receives the IP from the Local DNS, it accesses the server through that IP and records the IP in its own DNS cache.
The above is the principle of DNS resolution.
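The lookup order described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the caches are plain dictionaries, cache lifetimes are ignored, the server names such as "tld-com" are hypothetical, and 203.0.113.10 is a documentation-only address, not the real IP of the site.

```python
# Simulated lookup order: browser cache -> OS cache/hosts file -> Local DNS
# cache -> root -> .com TLD -> authoritative server. All data here is made up.

BROWSER_CACHE = {}
OS_CACHE_AND_HOSTS = {}
LOCAL_DNS_CACHE = {}

# Hypothetical name-server tables (203.0.113.10 is a documentation address).
ROOT_SERVER = {"com": "tld-com"}
TLD_SERVERS = {"tld-com": {"oldboyedu.com": "auth-oldboyedu"}}
AUTH_SERVERS = {"auth-oldboyedu": {"www.oldboyedu.com": "203.0.113.10"}}


def iterative_lookup(name):
    """What the Local DNS does on a cache miss: root, then TLD, then authority."""
    labels = name.split(".")
    tld = labels[-1]
    zone = ".".join(labels[-2:])
    tld_server = ROOT_SERVER[tld]                # root points to the .com server
    auth_server = TLD_SERVERS[tld_server][zone]  # .com points to the zone's authority
    return AUTH_SERVERS[auth_server][name]       # authority returns the A record


def resolve(name):
    """Return (ip, source), where source names the step that answered."""
    name = name.lower()
    for source, cache in (("browser cache", BROWSER_CACHE),
                          ("OS cache/hosts file", OS_CACHE_AND_HOSTS),
                          ("Local DNS cache", LOCAL_DNS_CACHE)):
        if name in cache:
            return cache[name], source
    ip = iterative_lookup(name)
    LOCAL_DNS_CACHE[name] = ip      # the Local DNS caches it for other users
    OS_CACHE_AND_HOSTS[name] = ip   # the client records it in its own DNS cache
    return ip, "authoritative server"


first = resolve("www.oldboyedu.com")
second = resolve("www.oldboyedu.com")
print(first)
print(second)
```

The first call walks all the way to the authoritative server; the second call is answered from the client's own cache, which is why repeat visits resolve faster.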
After DNS resolution, the client has the server's IP and can send the HTTP request to the server through that IP. However, because HTTP works at layer 7 (the application layer) and TCP works at layer 4 (the transport layer), a TCP three-way handshake has to be completed before the HTTP request is sent.
The TCP three-way handshake works like this: the client first sends the server a packet with the SYN flag and a random seq number. After the server receives it, it responds to the client with an ACK whose value is the client's seq number + 1; the same response packet also carries a SYN flag and a random seq number of the server's own. After the client receives the server's response packet, it sends the server an ACK whose value is the seq number just sent by the server + 1. Once these three steps are complete, the three-way handshake is done and data can be transmitted.
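The seq and ACK arithmetic of the handshake can be checked with a short sketch. It only models the numbers in the three packets, not a real network connection; the function name three_way_handshake and the dictionary layout are assumptions made for illustration.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(rng=None):
    """Return the three handshake packets as dictionaries (a simplified model)."""
    rng = rng or random.Random()

    # Step 1: client -> server, SYN with a random initial seq number.
    syn = {"from": "client", "flags": {"SYN"}, "seq": rng.randrange(SEQ_SPACE)}

    # Step 2: server -> client, ACK = client seq + 1, plus its own SYN and seq.
    syn_ack = {
        "from": "server",
        "flags": {"SYN", "ACK"},
        "seq": rng.randrange(SEQ_SPACE),
        "ack": (syn["seq"] + 1) % SEQ_SPACE,
    }

    # Step 3: client -> server, ACK = server seq + 1.
    ack = {
        "from": "client",
        "flags": {"ACK"},
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % SEQ_SPACE,
    }
    return [syn, syn_ack, ack]


for packet in three_way_handshake(random.Random(0)):
    print(packet)
```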
Now the HTTP request message can be sent.
An HTTP request message mainly consists of a request line, request headers, a blank line, and a request body.
The request line contains the request method, the URL, and the protocol version. The main request methods are GET, HEAD, POST, PUT, DELETE, and MOVE. The URL is the Uniform Resource Locator, which uniquely identifies a web resource on the server. As for the protocol version, the current mainstream is HTTP/1.1, while the version that first became popular was HTTP/1.0. Compared with HTTP/1.0, HTTP/1.1 is improved mainly in extensibility, cache handling, bandwidth optimization, persistent connections, the Host header, error notification, message delivery, and content negotiation. That covers the request line.
Next, the request headers. They mainly carry the media types, the language, supported compression, the client type, the host name, and so on. The media types are mainly text files, image files, video files, and the like. The language tells the server which languages the client accepts. Compression support saves bandwidth. The client type shows the client browser's version, operating system information, and so on.
The blank line marks the end of the request headers and the beginning of the request body.
The request body is typically used only when a form is submitted with POST.
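To make the four parts of the request message concrete, here is a minimal sketch that assembles one as text. The helper name build_request and the sample header values are assumptions for illustration; a real browser sends many more headers.

```python
def build_request(method, path, host, headers=None, body=""):
    """Assemble request line + headers + blank line + body as one string."""
    lines = [f"{method} {path} HTTP/1.1"]   # request line
    lines.append(f"Host: {host}")           # Host header (required in HTTP/1.1)
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The body length is counted in bytes, not characters.
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # "\r\n\r\n" is the blank line that ends the headers and starts the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


get_request = build_request(
    "GET", "/index.html", "www.oldboyedu.com",
    {"Accept": "text/html", "Accept-Language": "zh-CN", "Accept-Encoding": "gzip"},
)
post_request = build_request(
    "POST", "/login", "www.oldboyedu.com",
    {"Content-Type": "application/x-www-form-urlencoded"},
    body="user=alice&pw=secret",
)
print(get_request)
print(post_request)
```

The GET request ends right after the blank line, while the POST request carries the form data after it.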
After the server receives the request message, it returns a response message.
The response message mainly consists of a start line, response headers, a blank line, and a response body.
The start line typically contains the HTTP version, a numeric status code, and a status description.
Common status codes are as follows:
200 OK
301 Permanent redirect
403 Forbidden (no permission to access)
404 Not found (the file does not exist)
500 Internal server error (unknown error)
502 Bad gateway (gateway error)
503 Service unavailable (server overloaded or down for maintenance)
504 Gateway Timeout
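As a quick cross-check of the list above, Python's standard library ships the official reason phrases in http.HTTPStatus. The describe helper below is a hypothetical name introduced for this sketch; it looks up the phrase and labels the class of the code by its first digit.

```python
from http import HTTPStatus


def describe(code):
    """Return 'code phrase (class)' using the standard reason phrase."""
    status = HTTPStatus(code)
    if 200 <= code < 300:
        kind = "success"
    elif 300 <= code < 400:
        kind = "redirection"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "informational"
    return f"{status.value} {status.phrase} ({kind})"


for code in (200, 301, 403, 404, 500, 502, 503, 504):
    print(describe(code))
```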
The response headers include the server's web software and version, the server time, whether the connection is persistent (long) or short, the character set, and so on.
The blank line here serves the same purpose as in the request message.
The response body wraps the data to be returned to the client.
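The structure of the response message can be illustrated by splitting a sample response back into its parts. This is a simplified sketch: the sample response text is made up, and parse_response is a hypothetical helper that ignores details such as repeated headers and chunked bodies.

```python
def parse_response(raw):
    """Split a raw response into start line fields, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")        # the blank line separates them
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "status": int(code),
        "reason": reason,
        "headers": headers,
        "body": body,
    }


SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Server: nginx\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
    "<html><body>hello</body></html>"
)

parsed = parse_response(SAMPLE_RESPONSE)
print(parsed["status"], parsed["reason"])
print(parsed["headers"])
```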
There are three common kinds of web resources: static pages, dynamic pages, and pseudo-static pages.
A static web page has no backend database and contains no PHP, JSP, ASP, or similar programs. It is not interactive: what the developer writes is exactly what is shown, and it does not change.
A dynamic web page has a backend database and supports more features, such as user registration, login, posting, ordering, and blogging. A dynamic page does not exist on the server as a standalone page file. Instead, when the user requests the dynamic program on the server, the server runs the program, calls the database, and returns the complete page content. Its URL differs from that of a static page: it contains special symbols such as ? and &, which causes some problems when search engines index it. To make indexing easier, dynamic pages often use rewrite technology to disguise the dynamic URL as a static URL. This is pseudo-static.
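The pseudo-static idea can be shown with a single rewrite rule. This is a sketch only: the URL pattern /article/<id>.html and the target /article.php?id=<id> are hypothetical, and in practice such rules usually live in the web server's rewrite configuration rather than in Python.

```python
import re

# Hypothetical rule: a static-looking URL that is really served by a dynamic program.
REWRITE_RULE = re.compile(r"^/article/(\d+)\.html$")


def rewrite(path):
    """Map a pseudo-static path to the dynamic URL behind it."""
    match = REWRITE_RULE.match(path)
    if match:
        return f"/article.php?id={match.group(1)}"
    return path  # no rule matched, so serve the path unchanged


print(rewrite("/article/42.html"))
print(rewrite("/about.html"))
```

Visitors and search engines only ever see /article/42.html, while the server internally handles the request with the dynamic program and its query string.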
Different web resources are opened in different ways. The following assumes that we are visiting a static site:
The client downloads the HTML file from the server over HTTP, then reads the HTML file and, following the links in the HTML page, requests resources from top to bottom, one request per link. An image is rendered as it downloads. When the browser encounters JS, it loads the JS; if the JS is large or complex, the browser waits and the mouse pointer shows a spinning circle. We call this JS blocking: the page we see is not displayed until the JS has been downloaded and executed.
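The step of reading the HTML top-down and finding the linked resources to request can be sketched with the standard library's HTML parser. The sample page and the class name ResourceCollector are assumptions for illustration; a real browser does far more (scheduling, caching, rendering).

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect linked resources in the order they appear in the document."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.resources.append(("image", attrs["src"]))
        elif tag == "script" and attrs.get("src"):
            self.resources.append(("script", attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(("stylesheet", attrs["href"]))


def find_resources(html):
    collector = ResourceCollector()
    collector.feed(html)
    return collector.resources


# A made-up static page used only for this example.
PAGE = (
    '<html><head><link rel="stylesheet" href="/css/site.css">'
    '<script src="/js/app.js"></script></head>'
    '<body><img src="/img/logo.png"></body></html>'
)

for kind, url in find_resources(PAGE):
    print(kind, url)
```

Each entry in the result corresponds to one further HTTP request the browser would make after downloading the HTML itself.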
When we visit a dynamic web page, the user sends the request and the server receives it. Assuming the server runs nginx, nginx passes the request to PHP, PHP queries the database, and a complete page is generated from the values the database returns and sent to the user. After receiving it, the browser again renders while downloading and loads the JS, and once that is complete the page we see is displayed.
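The "query the database, then generate a complete page" step can be imitated in a few lines. This sketch swaps in Python and an in-memory SQLite database for the PHP and database of the description above, purely for illustration; the posts table and the function names are hypothetical.

```python
import sqlite3
from html import escape


def make_db():
    """Create a throwaway in-memory database with one sample row."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    db.execute(
        "INSERT INTO posts (id, title, body) VALUES (?, ?, ?)",
        (1, "Hello", "First post"),
    )
    db.commit()
    return db


def render_post(db, post_id):
    """Build the page from database values, as a dynamic program would."""
    row = db.execute(
        "SELECT title, body FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    if row is None:
        return 404, "<html><body><h1>Not found</h1></body></html>"
    title, body = row
    page = (
        f"<html><head><title>{escape(title)}</title></head>"
        f"<body><p>{escape(body)}</p></body></html>"
    )
    return 200, page


db = make_db()
print(render_post(db, 1))
print(render_post(db, 99))
```

No page file for post 1 exists anywhere; the HTML is produced only when the request arrives, which is the difference from a static page.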
When the site reaches billions of page views (PV), the access process is more complicated. The user's request first goes to CDN nodes across the country, and the CDN blocks about 80% of the requests. When a request passes the CDN and reaches the server cluster, the cluster typically has a layer-4 proxy in front: LVS when it is done in software, F5 when it is done in hardware. Behind the layer-4 proxy is layer-7 load balancing, commonly HAProxy or nginx, and behind that are multiple web servers. With many web servers there are two problems. The first is consistency of user data: data must not get out of sync because different web servers serve the requests, so we need NFS shared storage. The second is sessions: a session must not be lost because a different web server serves the request, so we use Memcached to store and share sessions. Because user traffic is so large, the bottleneck at this point is pressure on the database, so we generally use a distributed cache such as Memcached or Redis, and the database also needs read/write splitting. The rest of the process is similar to accessing a dynamic web page.
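The session problem behind a load balancer, and why a shared store fixes it, can be demonstrated with a tiny model. Everything here is a stand-in: itertools.cycle plays the role of round-robin load balancing, a plain dictionary plays the role of Memcached, and the WebServer class is hypothetical.

```python
from itertools import cycle


class WebServer:
    def __init__(self, name, session_store):
        self.name = name
        self.sessions = session_store  # per-server dict, or one shared dict

    def handle(self, session_id, user=None):
        """Log the user in if given, then report who this session belongs to."""
        if user is not None:
            self.sessions[session_id] = user
        return self.name, self.sessions.get(session_id)


def make_cluster(shared):
    """Return a round-robin iterator over two web servers."""
    store = {}  # stands in for a shared Memcached instance
    servers = [WebServer(f"web{i}", store if shared else {}) for i in (1, 2)]
    return cycle(servers)


# Without shared sessions: login lands on web1, the next request on web2.
local = make_cluster(shared=False)
print(next(local).handle("s1", user="alice"))
print(next(local).handle("s1"))

# With a shared session store, web2 can still find the session.
shared = make_cluster(shared=True)
print(next(shared).handle("s1", user="alice"))
print(next(shared).handle("s1"))
```

In the first cluster the second request reaches web2, which has never seen the session, so the user appears logged out; in the second cluster both servers read the same store.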
After the browser has loaded the full page, it also needs to disconnect from the server, which is the TCP four-way wave.
First the client sends a packet with the FIN flag and its current seq number. After the server receives it, it responds with an ACK whose value equals that seq number + 1. After that, the server sends another packet, which carries a FIN flag and a seq number of its own. The client receives it and responds with an ACK whose value equals that seq number + 1. With that, the four-way wave between the server and the client is finished.
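The four packets of the close can be modelled the same way as the handshake. This is a simplified sketch that follows the description above: the function name four_way_close is an assumption, the seq numbers are passed in as plain integers, and the fact that real FIN segments usually also carry an ACK flag is left out.

```python
SEQ_MOD = 2 ** 32  # sequence numbers wrap around at 32 bits


def four_way_close(client_seq, server_seq):
    """Return the four teardown packets as dictionaries (a simplified model)."""
    # Wave 1: the client announces it has finished sending.
    fin1 = {"from": "client", "flags": {"FIN"}, "seq": client_seq}
    # Wave 2: the server acknowledges with ACK = client seq + 1.
    ack1 = {"from": "server", "flags": {"ACK"}, "ack": (fin1["seq"] + 1) % SEQ_MOD}
    # Wave 3: the server sends its own FIN once it has finished too.
    fin2 = {"from": "server", "flags": {"FIN"}, "seq": server_seq}
    # Wave 4: the client acknowledges with ACK = server seq + 1.
    ack2 = {"from": "client", "flags": {"ACK"}, "ack": (fin2["seq"] + 1) % SEQ_MOD}
    return [fin1, ack1, fin2, ack2]


for packet in four_way_close(1000, 5000):
    print(packet)
```

Closing takes four packets instead of three because the server's ACK and the server's FIN are sent separately: the server may still have data to send after acknowledging the client's FIN.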
The process is quite complicated, and the question is hard to answer completely. On a first visit, the name is first resolved through DNS, searching from near to far and from low level to high level to find the IP address of oldboyedu. If everything is OK, the server returns the requested page, such as a login page, index.jsp, or a PHP page. Of course, what the server returns after processing is an HTML page, or it returns a static HTML page directly, depending on the server's configuration and architecture, and the first visit takes longer. Later visits are faster because of caches, such as the DNS cache and cookies.
This article is from the "Lee blog" blog; please keep this source: http://lidao.blog.51cto.com/3388056/1914578
Old boy Education Daily, March 22, 2017: Please describe the process of a user accessing a website