The basic process of the B/S (browser/server) network model:
When we enter "www.google.com" in the browser, the browser first asks a DNS server to resolve the domain name into an IP address. It then uses that IP address to locate Google's server on the Internet and sends it a GET request. The server decides which data resource to return to the requesting user. (The server side may involve complex business logic: there are usually many machines, so load balancing must decide which server responds, and the requested content may be stored in a static file, in a distributed cache, or in a database.) When the data is returned to the browser, the browser finds that the page references static resources (CSS files, JS files, image files, and so on) and issues further HTTP requests for them. These requests are likely to go to a CDN, in which case the CDN handles them.
Every request identifies its target with a URL (Uniform Resource Locator).
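As a small illustration of what a URL carries, the sketch below splits one into the parts a client needs before it can resolve the host and connect. The helper name describe_url and the example URL are assumptions made for this illustration.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts needed to locate and request a resource."""
    parts = urlsplit(url)
    # Fall back to the scheme's well-known port when the URL does not name one.
    default_ports = {"http": 80, "https": 443}
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or default_ports.get(parts.scheme),
        "path": parts.path or "/",
        "query": parts.query,
    }


if __name__ == "__main__":
    print(describe_url("http://www.google.com/search?q=cdn"))
```

The host is what DNS resolves, the port is where the socket connects, and the path and query go into the request line.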
How to initiate a request
Usually our requests are sent by the browser, but we can also simulate an HTTP request ourselves. Establishing an HTTP request is essentially establishing a socket connection:
(1) Connect: establish a socket connection using the domain name and the default HTTP port, 80;
(2) Send: the client sends data in HTTP protocol format (OutputStream.write);
(3) Receive: the client waits for InputStream.read to return the server's data;
(4) Close: the client and server disconnect.
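The four steps above can be sketched with Python's standard socket module. This is a minimal illustration, not a full HTTP client: it assumes plain HTTP on port 80, and the function names build_get_request and fetch are chosen for this example only.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Format a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        # Ask the server to close the connection after responding,
        # so the receive loop below ends when the response is complete.
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80, timeout: float = 5.0) -> bytes:
    """Connect, send, receive, close; return the raw response bytes."""
    # (1) Connect: open a TCP socket to the host on the HTTP port.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # (2) Send: write the request in HTTP format.
        sock.sendall(build_get_request(host, path))
        # (3) Receive: read until the server closes its side.
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    # (4) Close: leaving the "with" block closes the socket.
    return b"".join(chunks)
```

Calling fetch("www.example.com") needs network access and returns the whole response, status line and headers included.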
Knowing the above process, we can easily simulate a browser issuing an HTTP request. Many toolkits wrap these steps; HttpClient is one such packaged toolkit.
Of course, on Linux we can also simulate a request with the curl command followed by a URL.
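For comparison with the raw socket steps, here is a sketch using Python's standard urllib.request, which plays roughly the role that a toolkit such as HttpClient plays in Java. It is a stand-in chosen for this illustration, not the HttpClient API; the User-Agent value and the function names are assumptions.

```python
import urllib.request


def make_request(url: str) -> urllib.request.Request:
    """Describe a GET request; nothing is sent until it is opened."""
    # The User-Agent value is a made-up name for this example.
    return urllib.request.Request(url, headers={"User-Agent": "demo-client/0.1"})


def fetch_status_and_body(url: str, timeout: float = 5.0):
    """Send the request and return (status code, body bytes)."""
    with urllib.request.urlopen(make_request(url), timeout=timeout) as resp:
        return resp.status, resp.read()
```

The library handles the connect, send, receive, and close steps internally, so the caller deals only with the URL and the result.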
HTTP protocol parsing
The importance of the HTTP protocol needs no restating. An HTTP message has the following parts: a start line (the request line or the status line), header fields, a blank line, and an optional message body.
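The parts of an HTTP message (status line, header fields, blank line, body) can be seen by splitting a raw response by hand. The sketch below is a simplified parser for illustration only; it assumes a single complete response and ignores details such as chunked encoding and repeated headers. The name parse_response is hypothetical.

```python
def parse_response(raw: bytes) -> dict:
    """Split a raw HTTP response into status line, headers, and body."""
    # The first blank line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: version, status code, optional reason phrase.
    parts = lines[0].split(" ", 2)
    version, status = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    # Header fields: "Name: value", with case-insensitive names.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {
        "version": version,
        "status": status,
        "reason": reason,
        "headers": headers,
        "body": body,
    }
```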
Browser caching mechanism
The browser caching mechanism is relatively important: when we access static files such as JS, CSS, and image files, serving them from the cache reduces the number of connections to the server and speeds up browsing.
In the browser, pressing Ctrl+F5 makes the browser send the request directly to the target server instead of using the data in the browser cache. Even then, a request that reaches the server side may still be answered from a server-side cache. To be sure of getting the latest data, this must be controlled through the HTTP protocol: the browser adds Pragma: no-cache and Cache-Control: no-cache to the request headers.
Fields in the HTTP header, such as Cache-Control, Expires, Last-Modified, and ETag, control whether the data requested by the browser may come from a cache or must be up to date.
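As a small illustration of the forced-refresh headers, the sketch below adds them to an existing set of request headers. The helper name force_refresh_headers is an assumption for this example; a real browser does this internally.

```python
def force_refresh_headers(headers: dict) -> dict:
    """Return a copy of the request headers that asks caches to be bypassed."""
    refreshed = dict(headers)  # leave the caller's dict untouched
    refreshed["Pragma"] = "no-cache"         # understood by HTTP/1.0 caches
    refreshed["Cache-Control"] = "no-cache"  # understood by HTTP/1.1 caches
    return refreshed


if __name__ == "__main__":
    for name, value in force_refresh_headers({"Host": "www.google.com"}).items():
        print(f"{name}: {value}")
```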
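To make the role of these header fields concrete, here is a deliberately simplified sketch of how a cache might decide whether a stored response can be reused, looking only at Cache-Control. Real browsers also consider Expires, validators such as ETag, and heuristics; the function names are assumptions for this example.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header value into {directive: argument or True}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def can_reuse(cache_control: str, age_seconds: float) -> bool:
    """Return True if a stored response of the given age may be served as-is."""
    directives = parse_cache_control(cache_control)
    # no-store forbids caching; no-cache requires checking with the server first.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    # Simplification: without a valid explicit lifetime, always go to the server.
    if not isinstance(max_age, str) or not max_age.isdigit():
        return False
    return age_seconds < int(max_age)
```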
DNS domain name resolution
The process of DNS domain name resolution can be broadly divided into 10 steps:
When we enter "www.google.com" in the browser and press Enter, the approximate process is:
(1) The browser checks whether its cache already holds a resolved IP address for www.google.com; if it does, resolution stops here. Note that the browser cache is limited in both size and time, with entries usually kept for a few minutes to a few hours. How long a domain name is cached can be set through the TTL attribute. If the time is too long, then once the IP address behind the domain name changes, the stale cached entry makes the site unreachable; if it is too short, the browser has to query the name server on every visit.
(2) If the browser cache has no entry, the browser checks the operating system's hosts file for a mapping of the domain name.
(3) If nothing is found locally, the request goes to the local domain name server, whose address is known from the network configuration.
(4) If the local domain name server has no answer either, it queries a root domain name server; there are only 13 root servers (named a through m) worldwide.
..........
The nslookup command lets you view the resolution result of a domain name, and on Windows the local DNS cache can be cleared with ipconfig /flushdns.
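Steps (1) to (3), together with the idea of flushing the cache, can be sketched as a small lookup chain: a TTL cache first, then hosts-file entries, then an upstream resolver. This is a toy model, not how any particular browser or operating system implements resolution; the class and function names are assumptions, and the upstream resolver is passed in as a plain function.

```python
import time
from typing import Callable, Dict, Tuple


def parse_hosts(text: str) -> Dict[str, str]:
    """Parse hosts-file text into {hostname: ip}, ignoring comments."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)
    return table


class CachingResolver:
    """Resolve names via cache, then hosts entries, then an upstream resolver."""

    def __init__(self, hosts: Dict[str, str], upstream: Callable[[str], str],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._hosts = hosts
        self._upstream = upstream
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expiry)

    def resolve(self, name: str) -> str:
        name = name.lower()
        now = self._clock()
        cached = self._cache.get(name)
        if cached and cached[1] > now:
            return cached[0]               # step (1): unexpired cache entry
        ip = self._hosts.get(name)         # step (2): hosts file
        if ip is None:
            ip = self._upstream(name)      # step (3): ask the name server
        self._cache[name] = (ip, now + self._ttl)
        return ip

    def flush(self) -> None:
        """Drop all cached entries, similar in spirit to flushing the DNS cache."""
        self._cache.clear()
```

In real use the upstream function could be socket.gethostbyname from the standard library, which hands the query to the system's configured resolver.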
CDN working mechanism
CDN stands for content delivery network, an advanced traffic distribution network built on the Internet. Its goal is to add a new layer of network architecture to the existing Internet and publish a site's content to the network edge closest to users, so that users can fetch the resources they need from nearby. In short, CDN = mirror + cache + global server load balancing (GSLB). At present, CDNs mainly cache a website's static data: the user fetches dynamic content from the origin server and then downloads the static files from the CDN.
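The cache part of this formula can be sketched as an edge node that serves static files from its own store and goes back to the origin server only on a miss. This is a toy model with no expiry or eviction; the class name EdgeNode and the origin callback are assumptions for this example.

```python
from typing import Callable, Dict


class EdgeNode:
    """A CDN edge node that caches static content fetched from the origin."""

    def __init__(self, name: str, fetch_from_origin: Callable[[str], bytes]):
        self.name = name
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self._store:
            self.hits += 1          # served from the edge, close to the user
            return self._store[path]
        self.misses += 1            # first request: go back to the origin
        content = self._fetch_from_origin(path)
        self._store[path] = content
        return content


if __name__ == "__main__":
    origin_files = {"/static/app.js": b"console.log('hi');"}
    edge = EdgeNode("edge-1", lambda path: origin_files[path])
    edge.get("/static/app.js")
    edge.get("/static/app.js")
    print(edge.hits, edge.misses)
```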
First, the CDN architecture:
A CDN implementation must consider load balancing, which includes DNS-resolution load balancing, cluster load balancing, and operating-system load balancing.
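Two of these levels can be illustrated together: a DNS-style step that picks the cluster for the client's region, and a cluster-level step that rotates through the servers in that cluster. Operating-system load balancing is not shown. This is a sketch under simple assumptions (regions are known strings, selection is round-robin); the class name, region names, and addresses are hypothetical.

```python
import itertools
from typing import Dict, List


class GlobalLoadBalancer:
    """Pick a server address: by region first, then round-robin in the cluster."""

    def __init__(self, clusters: Dict[str, List[str]], default_region: str):
        if default_region not in clusters:
            raise ValueError("default_region must be one of the clusters")
        if any(not servers for servers in clusters.values()):
            raise ValueError("every cluster needs at least one server")
        self._default = default_region
        # One endless round-robin iterator per regional cluster.
        self._cycles = {
            region: itertools.cycle(servers)
            for region, servers in clusters.items()
        }

    def resolve(self, client_region: str) -> str:
        # DNS-level choice: use the client's region, else the default cluster.
        region = client_region if client_region in self._cycles else self._default
        # Cluster-level choice: next server in rotation.
        return next(self._cycles[region])
```

A real GSLB would infer the client's location from the address of the resolver that sent the query and would also weigh server health and load.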
Keywords: CDN, load balancer, browser cache
Reference: In-depth Analysis of Java Web Technology Insider
In-depth analysis of Java Web Technologies (1)