1. CDN overview
CDN stands for Content Delivery Network. Its purpose is to add a new cache layer to the existing Internet and publish a website's content to the "edge" nodes closest to the user's network, so that users can fetch the content they need from a nearby location and the website responds faster. Technically, it addresses the slow responses caused by limited network bandwidth, heavy user traffic, and unevenly distributed network points.
CDN network node
Cache-layer technology eliminates the congestion at node devices caused by peaks in data access. The cache server stores most web objects: page files such as html, htm, and php, image files such as gif, tif, png, and bmp, and files in other formats. Repeated requests for these objects do not require retransmitting the file body from the origin site. While a cached copy is within its validity period (TTL) it can be served directly, and once the TTL has lapsed a simple freshness validation, which exchanges only a header of a few tens of bytes, is enough to confirm that the local copy can still be sent to the visitor. Because the cache server is usually deployed close to the user, it can achieve response times similar to those of a local area network and effectively reduce wide-area bandwidth consumption. Besides improving response speed and saving bandwidth, it is very effective at accelerating the web server and reducing the load on the origin server.
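The TTL and freshness-validation behaviour described above can be sketched as follows. This is only a minimal illustration, assuming the common HTTP-style model in which a fresh copy is served locally and a stale copy is revalidated with a header-only check. The `Origin` and `CacheServer` classes, the `etag` version tag, and the single-object cache are all hypothetical simplifications, not part of any real cache product.

```python
import time
from dataclasses import dataclass


@dataclass
class Origin:
    """Hypothetical origin server holding one object and its version tag."""
    body: bytes
    etag: str
    full_transfers: int = 0   # how many times the whole body was sent
    validations: int = 0      # how many header-only checks were answered

    def fetch(self):
        """Full transfer: return the body together with its version tag."""
        self.full_transfers += 1
        return self.body, self.etag

    def validate(self, etag):
        """Header-only check: True if the cached version is still current."""
        self.validations += 1
        return etag == self.etag


class CacheServer:
    """Single-object cache that applies a TTL and revalidates stale copies."""

    def __init__(self, origin, ttl, clock=time.monotonic):
        self.origin = origin
        self.ttl = ttl
        self.clock = clock
        self.entry = None  # (body, etag, expires_at) once something is cached

    def get(self):
        now = self.clock()
        if self.entry is not None:
            body, etag, expires_at = self.entry
            if now < expires_at:
                return body  # still fresh: served without contacting the origin
            if self.origin.validate(etag):
                # Stale but unchanged: renew the TTL, no body is retransmitted.
                self.entry = (body, etag, now + self.ttl)
                return body
        # Nothing cached, or the origin's copy changed: do a full transfer.
        body, etag = self.origin.fetch()
        self.entry = (body, etag, now + self.ttl)
        return body
```

Passing a fake `clock` makes the expiry behaviour easy to exercise: repeated calls inside the TTL never reach the origin, and a call after expiry costs one validation rather than a full transfer as long as the object has not changed.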
Depending on what is being accelerated, caching is divided into client acceleration and server acceleration:
- Client acceleration: the cache is deployed at the network egress and stores frequently accessed content locally, which improves response speed and saves bandwidth.
- Server acceleration: the cache is deployed in front of the web server as a caching proxy, which improves the web server's performance and speeds up access.
If multiple cache acceleration servers are distributed across different regions, the cache network must be managed through an effective mechanism that directs users to a nearby server (for example, through DNS) and balances traffic globally. This is the basic idea behind a CDN.
The optimization effect of a CDN on the network is mainly reflected in the following aspects:
Solves the "first mile" problem on the server side
Alleviates or even eliminates the impact of interconnection bottlenecks between different operators
Reduces the pressure on each province's egress bandwidth
Relieves the pressure on the backbone network
Optimizes the distribution of popular online content
2. How CDN works
2.1. Traditional access process (without cache acceleration)
The process for a user to access a website without CDN cache is:
The user enters the domain name to be accessed, and the operating system queries the local DNS server (LocalDns) for the IP address of the domain name.
LocalDns queries the root DNS for the authoritative server of the domain name (assuming here that the LocalDns cache has expired).
The root DNS responds to LocalDns with the authoritative DNS record for the domain name.
After LocalDns obtains the authoritative DNS record, it queries the domain's authoritative DNS for the IP address of the domain name.
The authoritative DNS looks up the domain name record and responds to LocalDns.
LocalDns returns the IP address of the domain name to the client.
After the user obtains the IP address, the user accesses the site server.
The site server responds to the request and returns the content to the client.
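The resolution steps above can be modelled with a few in-memory lookup tables. This is only a sketch: the host names, the addresses (taken from documentation-only ranges), the `registered_domain` helper, and the `LocalDns` class are all made up for illustration, and the cache here never expires, whereas a real resolver honours record TTLs.

```python
# Hypothetical data: root DNS maps a domain to its authoritative server,
# and each authoritative server maps host names to IP addresses.
ROOT_DNS = {"example.com": "ns1.example.com"}
AUTHORITATIVE_DNS = {
    "ns1.example.com": {"www.example.com": "203.0.113.10"},
}


def registered_domain(hostname):
    """Naive helper: treat the last two labels as the registered domain."""
    return ".".join(hostname.split(".")[-2:])


class LocalDns:
    """Local resolver that asks the root, then the authoritative server."""

    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, hostname):
        if hostname in self.cache:
            return self.cache[hostname]  # answered from the local cache
        # Ask the root DNS which server is authoritative for the domain.
        self.upstream_queries += 1
        name_server = ROOT_DNS[registered_domain(hostname)]
        # Ask the authoritative DNS for the host's IP address.
        self.upstream_queries += 1
        ip = AUTHORITATIVE_DNS[name_server][hostname]
        self.cache[hostname] = ip
        return ip


local_dns = LocalDns()
site_ip = local_dns.resolve("www.example.com")  # the client then connects to site_ip
```

A second call to `local_dns.resolve("www.example.com")` is answered from the cache and sends no further upstream queries, which is why the walkthrough assumes the LocalDns cache has expired.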
2.2. CDN access process (using caching service)
A CDN adds a cache layer between the user and the server. It does this mainly by taking over DNS resolution so that the user's request is directed to a cache, which serves the origin server's data.
The access process of the website after using the CDN cache becomes:
The user enters the domain name to be accessed, and the operating system queries the local DNS server (LocalDns) for the IP address of the domain name.
LocalDns queries the root DNS for the authoritative server of the domain name (assuming here that the LocalDns cache has expired).
The root DNS responds to LocalDns with the authoritative DNS record for the domain name.
After LocalDns obtains the authoritative DNS record, it queries the domain's authoritative DNS for the IP address of the domain name.
The authoritative DNS looks up the domain name record (usually a CNAME) and responds to LocalDns.
After LocalDns receives the domain name record, it queries the intelligent scheduling DNS for the IP address of the domain name.
The intelligent scheduling DNS responds to LocalDns with the IP address of the most suitable CDN node, chosen according to certain algorithms and policies (such as static topology and capacity).
LocalDns returns the IP address to the client.
After the user obtains the IP address, the user accesses the CDN node at that address rather than the site server.
The CDN node server responds to the request and returns the content to the client. (If the node does not yet hold the content, it fetches it from the origin server, stores a local copy for later use, and returns the fetched data to the client to complete the data service process.)
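The two differences from the traditional flow, following a CNAME to the scheduling DNS and having the CDN node fill its cache from the origin, can be sketched like this. Every name here is an assumption for illustration: the CNAME target, the addresses, the `resolve` function, and the `CdnNode` class are hypothetical, and node selection is reduced to a fixed table entry.

```python
# Hypothetical records: the site's authoritative DNS hands out a CNAME that
# points into the CDN's domain, and the scheduling DNS answers with a node IP.
AUTHORITATIVE = {"www.example.com": ("CNAME", "www.example.com.cdn.example.net")}
SCHEDULER = {"www.example.com.cdn.example.net": ("A", "198.51.100.7")}

# Hypothetical content held by the origin (source) server.
ORIGIN_CONTENT = {"/index.html": b"<html>hello</html>"}


def resolve(hostname):
    """Follow CNAME records until an address (A) record is found."""
    zones = (AUTHORITATIVE, SCHEDULER)
    name = hostname
    for _ in range(8):  # guard against CNAME loops
        record = next((zone[name] for zone in zones if name in zone), None)
        if record is None:
            raise LookupError(f"no record for {name}")
        record_type, value = record
        if record_type == "A":
            return value
        name = value  # CNAME: continue the lookup with the alias target
    raise LookupError("CNAME chain too long")


class CdnNode:
    """Cache node: serves local copies and pulls from the origin on a miss."""

    def __init__(self):
        self.store = {}
        self.origin_fetches = 0

    def handle(self, path):
        if path not in self.store:
            # Cache miss: fetch from the origin and keep a copy for later use.
            self.origin_fetches += 1
            self.store[path] = ORIGIN_CONTENT[path]
        return self.store[path]


NODES = {"198.51.100.7": CdnNode()}

node_ip = resolve("www.example.com")         # resolves to a CDN node, not the origin
first = NODES[node_ip].handle("/index.html")   # miss: pulled from the origin
second = NODES[node_ip].handle("/index.html")  # hit: served from the node's copy
```

The client code is unchanged from the traditional case: it still resolves `www.example.com` and connects to whatever address comes back, which is what makes the acceleration transparent.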
From the above analysis, access can be made transparent to ordinary users (the client needs no configuration changes after caching is introduced) only if DNS (domain name resolution) is used to direct users to the cache server. Since domain name resolution is the first step in visiting a website, changing the DNS records is the simplest and most effective way to redirect users.
2.3. The components of the CDN network
For ordinary Internet users, each CDN node is equivalent to a website server placed close to them. Because DNS has been taken over, the user's request is transparently directed to the nearest node, and the CDN server in that node responds to the request just as the original website's server would. Since the node is closer to the user, the response time is generally shorter.
Intelligent scheduling DNS (such as F5's 3-DNS)
Intelligent scheduling DNS is a key system in a CDN service. When users visit a website that has joined the CDN service, the domain name resolution request is ultimately handled by the intelligent scheduling DNS. Using a set of predefined policies, it gives the user the address of the nearest node so that the user is served quickly. It must also stay in communication with the CDN nodes distributed in various locations, tracking each node's health status, capacity, and other information, to ensure that user requests are assigned to nearby nodes that are available.
Cache function service
Load balancing equipment (such as LVS, F5 BIG-IP)
Content cache server (such as Squid)
Shared storage (whether it is needed depends on the amount of cached data)
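The selection policy of the intelligent scheduling DNS can be sketched as a filter followed by a ranking. This is one possible policy under stated assumptions, not how any particular product works: the `Node` fields, the `DISTANCE` table standing in for a static topology, and the tie-break on load ratio are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """State the scheduler tracks for each CDN node (hypothetical fields)."""
    ip: str
    region: str
    healthy: bool
    load: int       # current number of connections
    capacity: int   # maximum connections the node can take


# Hypothetical static topology: cost of serving a user region from a node region.
DISTANCE = {
    ("north", "north"): 0, ("north", "south"): 2,
    ("south", "north"): 2, ("south", "south"): 0,
}


def pick_node(user_region, nodes):
    """Return the nearest healthy node with spare capacity, or None."""
    candidates = [n for n in nodes if n.healthy and n.load < n.capacity]
    if not candidates:
        return None
    # Prefer the closest region; among equally close nodes, the least loaded.
    return min(
        candidates,
        key=lambda n: (DISTANCE[(user_region, n.region)], n.load / n.capacity),
    )


nodes = [
    Node("192.0.2.1", "north", True, 90, 100),
    Node("192.0.2.2", "north", True, 10, 100),
    Node("192.0.2.3", "south", True, 0, 100),
]
chosen = pick_node("north", nodes)  # the lightly loaded northern node
```

Filtering before ranking means an unhealthy or full node is never returned no matter how close it is, so a user falls back to a more distant region only when every nearby node is unavailable.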