Many websites do not need large scale at the beginning, but as a website designer you must take expansion into account from the very start and build a highly scalable architecture. Scalability means that the system can raise its carrying capacity by growing in scale. Vertical expansion of a single server quickly runs into limits, and a single machine will soon be unable to meet our needs, so this capacity is usually gained by adding physical servers or cluster nodes: the stronger this ability, the more room there is to raise the load the site can carry. Load balancing is the most common way to scale a website horizontally. The following sections describe several load balancing methods.
HTTP redirection
HTTP redirection should be familiar to every web programmer. For example, when we request a page that requires login, we are redirected to the login page and then back to the original page. Generally speaking, after the browser requests a URL, the server can return a new URL in the Location field of the HTTP response header, and the browser will then request that new URL, completing an automatic jump. Precisely because HTTP redirection can both transfer a request and jump automatically, we can use it to implement load balancing and thereby scale the site.
A classic example: for years, entering www.google.com in a browser from mainland China returned a redirect to www.google.cn. This not only disperses requests, but also lets users access a nearby server, shortening network transmission time.
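To make the mechanism concrete, here is a minimal sketch of such a redirecting "master server" in Python; the server pool and port are hypothetical, and a real site would plug its own scheduling policy in where `random.choice` is used here.

```python
# Minimal sketch of redirect-based load balancing. The server names and
# port are placeholders, not real hosts.
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

SERVERS = [
    "http://dl1.example.com",
    "http://dl2.example.com",
    "http://dl3.example.com",
]

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = random.choice(SERVERS) + self.path  # pick an actual server
        self.send_response(302)                      # "moved temporarily"
        self.send_header("Location", target)         # the new URL to jump to
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), RedirectHandler).serve_forever()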
Let us analyze some commonly seen redirection scheduling policies and what it takes to make an excellent one:
1. Redirection by region. For example, Google used to redirect requests from mainland China to google.cn. In some cases, however, partitioning by region alone is not reasonable enough: even within the same region, multiple servers may serve together, and other load balancing methods are still required.
2. Random allocation. We maintain a list of available servers, and each time a request arrives, a server is picked from the list at random to serve it. Over many requests, each server receives a roughly even share.
3. Round robin. Requests are handed to the servers in turn, an even allocation in the absolute sense. However, we must keep the sequence number of the previous request where every arriving HTTP request can read it, and that number can only be modified by one request at a time; such a policy is a disaster for highly concurrent web requests.
Therefore, we prefer policies in which each request is independent of external state: the server number is computed from the request itself yet still spreads the load well, so efficiency is much higher. For example, compute the server number from a hash of the user's IP address. This not only avoids the price of round robin, but also means each user's requests are always processed by the same server, which makes some per-user state easy to handle. The sketches below illustrate these policies.
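As a rough illustration (server names are placeholders), note how round robin needs shared, locked state while the IP hash computes the server number from the request alone:

```python
# Sketches of the scheduling policies discussed above; "web1".."web3"
# are placeholder server names.
import hashlib
import itertools
import random
import threading

SERVERS = ["web1", "web2", "web3"]

def pick_random():
    return random.choice(SERVERS)

# Round robin: every request must serialize on this shared counter,
# which is exactly the bottleneck described above.
_counter = itertools.count()
_lock = threading.Lock()

def pick_round_robin():
    with _lock:
        n = next(_counter)
    return SERVERS[n % len(SERVERS)]

# IP hash: no shared state, and the same client IP always lands on the
# same server, which keeps per-user state simple.
def pick_by_ip_hash(client_ip):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]
```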
But in a real environment, can we actually achieve balance?
Let us call the server that transfers requests the master server. From the above we can see that the master server can only ensure that the requests arriving at it are evenly allocated; it cannot know how long a user will stay on the actual server after being redirected, or how many pages will be accessed there. Some users may save the actual server's address and bypass the master server on later visits, and so on. Many such factors can defeat real load balance. For one-off requests, however, such as file downloads or ad display, the main site's program always keeps control, so load balancing can be guaranteed.
What are the major bottlenecks?
In addition, the overhead of processing a request must be weighed against the overhead of transferring it. The higher the processing overhead, the better suited redirection is: the main site finishes the cheap redirect quickly and moves on to other requests, while the actual download takes its time on another server. If the processing overhead is small, the master server will be busy transferring while the actual servers sit idle, wasting resources.
And since every request must pass through the master server before reaching an actual server, the master server becomes the center of the system, and its capacity largely determines the capacity of the whole system. When requests arrive too fast to be transferred in time, we should consider other strategies.
DNS load balancing
We know that DNS is responsible for domain name resolution: when we access a site by domain name, a DNS server is consulted to obtain the IP address the name points to. In other words, the DNS server completes a mapping from domain name to IP address, and this mapping can be one-to-many. DNS can therefore distribute requests for a domain name across different servers according to some policy, which gives us load balancing. It looks like HTTP redirection, but the implementation mechanism is completely different. (On Windows, you can use the nslookup command to query the IP address list for a domain name; what it returns is what the DNS server nearest to you has cached, not necessarily all records.)
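You can observe this one-to-many mapping from Python as well, in the same spirit as nslookup; the domain below is only an example:

```python
# Query the A records for a domain name; a round-robin DNS name may
# return several addresses. The domain here is illustrative.
import socket

name, aliases, addresses = socket.gethostbyname_ex("www.example.com")
print(name)
print(addresses)  # more than one IP means DNS is spreading the load
```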
Compared with redirection-based load balancing, the DNS solution does away with the so-called master site, or rather the DNS server takes over the master site's transfer function. And whether you use a DNS service provider or run your own DNS server, there is almost no need to worry about DNS server performance. In practice, DNS records are cached by the user's browser and by the DNS servers of Internet access providers, and only when the cache expires is the domain's DNS server asked to resolve again. Therefore, even with a round-robin scheduling policy, we will hardly ever hit a performance bottleneck at the DNS server.
Although DNS load balancing has almost none of the performance constraints of HTTP redirection, it lacks redirection's flexibility. After all, HTTP redirection is implemented by our own program: we can develop whatever policy best fits the business, even filtering or transferring URLs after fully inspecting the HTTP content. Developing custom scheduling policies for a DNS server is not so easy, and HTTP content cannot enter the policy at all. Fortunately, DNS servers do provide some scheduling policies to choose from; for example, intelligent resolution based on the user's IP address can pick the server in the list closest to the user. Still, this falls far short of the flexibility of redirection.
In addition, DNS as a request scheduler cannot take each server's load capacity or current load status into account when balancing requests, so real load balance may not be achieved. Fault response is another problem: with HTTP redirection we can maintain the list of available servers ourselves and have changes take effect immediately, whereas with DNS-based load balancing we have to modify the DNS mapping, which in general takes a while to take effect.
Reverse proxy load balancing
In the earlier introduction to caching we met reverse proxy servers, and a reverse proxy can also act as a scheduler to implement load balancing. The core work of a reverse proxy server is forwarding HTTP requests; it works at the HTTP layer, that is, layer 7, the application layer, of the network stack, so load balancing based on a reverse proxy is also called layer-7 load balancing. Almost all mainstream web servers currently support reverse-proxy-based load balancing, so it is not difficult to implement.
Where the first two methods we introduced transfer requests, a reverse proxy forwards them: every request is scheduled by the reverse proxy server, which waits for the actual server to respond and then returns the response content to the user. A minimal sketch of this follows.
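Here is a toy layer-7 forwarder, assuming a single hypothetical backend address; unlike redirection, the client never talks to the actual server directly. It ignores request headers and non-GET methods, so it is only a sketch of the idea, not a real proxy.

```python
# Minimal sketch of reverse-proxy forwarding: receive the request,
# replay it to the backend, wait for the answer, relay it back.
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BACKEND = "http://127.0.0.1:9000"  # hypothetical actual server

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        with urllib.request.urlopen(BACKEND + self.path) as resp:
            body = resp.read()                 # wait for the backend's response
            self.send_response(resp.status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)             # relay it to the user

if __name__ == "__main__":
    HTTPServer(("", 8080), ProxyHandler).serve_forever()
```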
This method removes almost all the limitations of DNS-based load balancing. First, forwarding happens at the HTTP level, so the proxy can inspect the HTTP content and process it appropriately. Second, many reverse proxy servers let us set a weight for each actual server, so allocation need not be even: "the larger the capacity, the greater the responsibility", assigning an appropriate workload according to each server's actual capacity (see the weighted sketch below). Third, the proxy can monitor the health of the actual servers and react in time.
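A weighted policy can be as simple as the following sketch; the names and weights are hypothetical, and real proxies such as nginx let you configure a weight for each back-end server.

```python
# Weighted random selection: servers with larger weights receive
# proportionally more requests. Names and weights are hypothetical.
import random

BACKENDS = {"web1": 5, "web2": 3, "web3": 1}  # capacity weights

def pick_weighted():
    servers = list(BACKENDS)
    weights = list(BACKENDS.values())
    return random.choices(servers, weights=weights, k=1)[0]
```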
But it also gives up redirection's advantages. Forwarding has its own overhead, such as creating threads, establishing connections with the back-end servers, and receiving the results the actual servers return; when the processing overhead of a request is small, the forwarding overhead becomes particularly visible. File downloading, the task redirection handles best, is basically a disaster for a reverse proxy. And since every request passes through the proxy, the proxy's performance largely determines the performance of the whole system.
IP load balancing
An IP-based load balancing system works at the transport layer, modifying the IP addresses and port numbers in packets, and is therefore also called layer-4 load balancing. Forwarding is completed before the data reaches the application layer, and because this work is done in the system kernel (the application knows nothing about it), performance improves greatly.
The working principle, briefly: the scheduling server has two NICs, one facing the Internet, configured with a public IP address and responsible for receiving user requests, and the other configured with an intranet IP address, forwarding requests to the actual servers on the internal network and receiving their data. The actual servers all live on the intranet, invisible to the outside world; they only need intranet IP addresses, but their default gateway must be the scheduler's intranet IP. When a user's packet arrives, the scheduler rewrites it, changing the destination address to the intranet IP of the actual server that will handle it. After that server finishes processing, its reply is addressed by the source address of the original packet, but it must pass through its default gateway, that is, the scheduler; the scheduler then rewrites the packet again, replacing the source address with its own public IP, completing the request.
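The real rewriting happens inside the kernel and is invisible to applications. Purely as a userspace illustration of what "layer 4" means (relaying raw bytes without ever parsing HTTP), here is a toy TCP relay; the addresses are hypothetical, and kernel-based IP load balancing does the equivalent far more cheaply by rewriting packet addresses.

```python
# Toy layer-4 relay: accepts a TCP connection and shuttles raw bytes to
# a back-end server, never looking at the application-layer content.
import socket
import threading

BACKEND = ("10.0.0.2", 80)  # hypothetical intranet address of an actual server

def pipe(src, dst):
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    dst.shutdown(socket.SHUT_WR)  # propagate the half-close

def serve():
    listener = socket.socket()
    listener.bind(("", 8080))
    listener.listen()
    while True:
        client, _ = listener.accept()
        upstream = socket.create_connection(BACKEND)
        threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()

if __name__ == "__main__":
    serve()
```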
At this point the scheduler becomes the key point of the entire system, and its capacity largely determines the system's performance and scalability.
Direct routing
Unlike IP load balancing, direct-routing load balancing works all the way down at the data link layer. It modifies the destination MAC address of a packet and forwards it to the actual server, and the server's response is sent directly to the user without passing back through the scheduler. For this to work, the actual servers must be directly connected to the Internet, and the scheduler must not be their default gateway.
The key here is the IP alias: a network card can be configured with multiple IP addresses, all corresponding to the same MAC address. After configuring public IP addresses for the scheduler and the actual servers, we also configure the same IP alias (an additional IP address) on all of them and bind the domain name to that alias. On each actual server the alias is added to the loopback interface lo, routing rules are set so that only the scheduler answers for the alias, and the actual servers are forbidden from responding to ARP broadcasts for it on the network. In this way, when a request arrives at the scheduler, it only needs to modify the packet's MAC address and forward it to an actual server; because the scheduler and the actual server share the alias, the actual server can accept the packet and return its response directly to the user.
IP tunneling
Simply put, the scheduler encapsulates each received packet inside a new IP packet and forwards it to the actual server; the actual server then unwraps it, processes the request, and responds directly to the client.
The book's treatment of these last three methods is mostly based on the Linux kernel, with which I have little experience, so I have only sketched their principles as far as I understand them; if you want a deeper understanding, consult the original text or other materials. In fact, once load balancing has addressed scale expansion and the single point of failure, another challenge appears: user state and similar data can be handled with cookies and the distributed caching discussed in the earlier article on caching, but how are files to be shared? The next article will discuss file sharing and content distribution and synchronization.