Linux Cluster Concepts for Beginners

Last Update:2014-06-13 Source: Internet

Author: User



I learned about cluster applications a couple of days ago. To sum up, clusters are not very hard to understand: once you grasp the principles, implementing one is not very difficult either. Let's take a brief look at clusters.
What is a cluster?

A cluster combines multiple computers to complete a specific task. Workloads such as weather forecasting and large-scale online games need a great deal of computation, and handling them on a single computer is too expensive to be realistic. Instead, we combine computers, whether obsolete or still in use, into a cluster and solve the problem with their collective power.
Clusters fall roughly into three types:
1. LB: Load Balancing cluster
2. HA: High Availability cluster
3. HP: High Performance cluster
A brief description of the three types:
1. A load balancing cluster aims to increase the number of concurrent requests a service can handle. For example, if three web servers are combined into one cluster, load balancing spreads the load evenly across the three servers and keeps any of them from sitting idle.
2. A high availability cluster aims to keep the service available: redundant servers prevent a service interruption when a machine goes down.
3. A high-performance cluster aims to complete a large amount of complex computation in a short time. Common examples are weather forecasting systems, scientific exploration, and census processing.
Clusters also provide good scalability and flexibility: servers can be added or removed conveniently.
Cluster implementation methods
LB load balancing
F5 (hardware)
LVS
HAProxy
HA high availability
Heartbeat (since split into several smaller projects)
Corosync + OpenAIS: currently used by RHCS in Red Hat 6.0; it is more configurable and works better than Heartbeat
UltraMonkey
Keepalived
HP high performance
Beowulf
===================================
Next, we look in detail at how load balancing clusters and high availability clusters work.
Load balancing cluster
---- A load balancer is like the conductor of an orchestra, directing the players below it

To achieve load balancing, we need a front-end load balancer, the Director: a forwarding server (or dedicated hardware) that accepts client requests and forwards them to the back-end servers. While forwarding, the Director uses a scheduling algorithm to distribute the requests evenly according to each server's load.
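To make the idea of a scheduling algorithm concrete, here is a minimal Python sketch of two common strategies: round robin, which ignores load, and least connections, which looks at it. The backend names and connection counts are invented for illustration, and the code is not taken from LVS or HAProxy.

```python
from itertools import cycle


def round_robin(backends):
    """Return an endless iterator that hands out backends in rotating order."""
    return cycle(backends)


def least_connections(active):
    """Pick the backend with the fewest active connections.

    `active` maps a backend name to its current connection count.
    Names are sorted first so that ties are broken deterministically.
    """
    return min(sorted(active), key=lambda name: active[name])


# Hypothetical backend names, used only for illustration.
backends = ["web1", "web2", "web3"]

rr = round_robin(backends)
first_four = [next(rr) for _ in range(4)]

active = {"web1": 12, "web2": 3, "web3": 7}
chosen = least_connections(active)
active[chosen] += 1  # the Director records the new connection
```

Round robin is the simpler of the two, but it treats a heavily loaded server the same as an idle one. Least connections uses the load information that the Director already has.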
As mentioned above, a dedicated server can take on the role of the load balancer. Which software can provide this service?
There are two main options:
1. LVS (Linux Virtual Server): open-source cluster software developed by Wensong Zhang, a Chinese developer. It is one of the most widely used cluster software packages.
2. HAProxy
LVS is optimized to deliver performance close to that of hardware, and it is widely used as open-source software. However, if the Director goes down, the entire cluster stops working: the Director is a single point of failure. We therefore need to combine it with another type of cluster to make the cluster reliable.
Benefits:
1. Load balancing
2. A degree of high availability. For example, if the first web server goes down, the scheduling algorithm forwards requests to the second server.
3. Easy scaling: hosts can be added conveniently
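The second benefit, forwarding past a server that is down, can be sketched as a round-robin pick that skips unhealthy backends. This is only an illustration: the `healthy` table is filled in by hand here, whereas a real Director would update it from periodic health checks, and all names are hypothetical.

```python
def pick_backend(backends, healthy, start):
    """Return (backend, next_start) for the first healthy backend at or after `start`.

    Walks the list in round-robin order and skips any backend whose
    health flag is False. Raises RuntimeError if every backend is down.
    """
    n = len(backends)
    for offset in range(n):
        index = (start + offset) % n
        name = backends[index]
        if healthy.get(name, False):
            return name, (index + 1) % n
    raise RuntimeError("no healthy backend available")


backends = ["web1", "web2", "web3"]                     # hypothetical names
healthy = {"web1": False, "web2": True, "web3": True}   # web1 is down

served = []
position = 0
for _ in range(4):
    name, position = pick_backend(backends, healthy, position)
    served.append(name)
```

Requests keep flowing to `web2` and `web3` while `web1` is down. Adding a host only means appending it to `backends` and marking it healthy.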
High availability cluster
----- Anything that can go wrong will go wrong ---- Murphy's law

As mentioned above, a high availability cluster is designed to provide service around the clock (24x7). What does it take to achieve this?
A high availability cluster must provide the following:
1. Failover: once a service fails, it is transferred to another server.
Imagine that server A has a "heart". While A is providing the service, the other two servers listen for its "heartbeat" to determine whether it is still "alive". If the heartbeat stops, they immediately listen again; if it is still absent, one of them takes over at once and provides the service.
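The "listen again before taking over" rule can be sketched as a counter of consecutive missed heartbeats. The class name and the threshold of two misses are assumptions made for this example, and the heartbeats are simulated as a list of booleans rather than read from a network.

```python
class HeartbeatMonitor:
    """Declare a peer dead after `max_missed` consecutive missed heartbeats."""

    def __init__(self, max_missed=2):
        self.max_missed = max_missed
        self.missed = 0
        self.took_over = False

    def check(self, heartbeat_received):
        """Process one listening interval; return True once takeover has happened."""
        if heartbeat_received:
            self.missed = 0            # the peer is alive, so reset the counter
        else:
            self.missed += 1           # one more silent interval
            if self.missed >= self.max_missed:
                self.took_over = True  # the standby starts providing the service
        return self.took_over


monitor = HeartbeatMonitor(max_missed=2)
# True = heartbeat heard, False = silence during that interval.
history = [monitor.check(beat) for beat in [True, False, True, False, False]]
```

A single missed heartbeat does not trigger a takeover in this sketch; only the second consecutive miss does, which matches the description above.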
2. Data synchronization
Ways to implement data synchronization:
1) Shared storage through a service such as NFS. NFS has to send everything over the network, so it is less efficient.
2) Send the content that changed on node A to node B through some mechanism (file synchronization with rsync).
# rsync is a command, but there are now dedicated tools that synchronize data through an rsync server. This is efficient; the disadvantage is that every file must be stored twice.
Both methods above synchronize at the file level, which is relatively inefficient, although rsync is far more efficient than NFS. The disadvantage is that the data exists in two copies.
3) DRBD: a block-level replication solution that runs in the kernel. It is similar to rsync but works at a lower level and is more efficient. DRBD has also been merged into newer kernels, so it costs little to use.
4) A professional SAN (storage area network), which synchronizes block devices over optical fiber. This is a high-end storage solution that also works at the block-device level.
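The idea of sending only the changed content from node A to node B can be sketched as follows: split the data into fixed-size blocks, compare digests, and transfer only the blocks that differ. This is a simplified illustration, not the actual rsync algorithm (which uses rolling checksums) and not how DRBD is implemented. The block size and sample data are assumptions chosen to keep the example small.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a bytes object into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(source, replica, size=BLOCK_SIZE):
    """Return {block_index: new_bytes} for blocks whose digests differ."""
    src = split_blocks(source, size)
    dst = split_blocks(replica, size)
    delta = {}
    for index, block in enumerate(src):
        old = dst[index] if index < len(dst) else None
        if old is None or hashlib.sha256(block).digest() != hashlib.sha256(old).digest():
            delta[index] = block
    return delta


def apply_delta(replica, delta, total_length, size=BLOCK_SIZE):
    """Rebuild the replica by overwriting or appending only the changed blocks."""
    blocks = split_blocks(replica, size)
    for index, block in delta.items():
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:total_length]


node_a = b"AAAABBBBCCCCDD"   # current data on node A
node_b = b"AAAAXXXXCCCC"     # stale copy on node B

delta = changed_blocks(node_a, node_b)
synced = apply_delta(node_b, delta, len(node_a))
```

Only two of the four blocks travel from A to B here. In a real setup the digests, not the data, would be exchanged first, so that unchanged blocks never cross the network.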
However, data synchronization raises a problem:
Suppose node A is so busy that it fails to send its heartbeat in time. Node B concludes that A is dead, although it is not, and takes over the service. Now A and B both provide the service and read and write the same files on the shared file system at the same time, which corrupts the file system. This situation is commonly called split-brain.

Solution:
When B takes over the service, B powers A off directly (through a power switch). There are other ways to do this as well.
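The order of operations matters in this solution: the old node is powered off first, and only then does the new node start the service. The sketch below simulates that order. The `Node` class and the `simulated_power_switch` function are hypothetical stand-ins; a real cluster would call a fencing device instead.

```python
class Node:
    """A cluster node with a power state and a flag for whether it serves."""

    def __init__(self, name):
        self.name = name
        self.powered = True
        self.serving = False


def take_over(standby, failed, power_off):
    """Power off `failed` first, then start the service on `standby`.

    `power_off` is a callable standing in for the power switch. If it
    reports failure, the standby does not start the service, because the
    old node might still be writing to the shared storage.
    """
    if not power_off(failed):
        return False
    failed.serving = False
    standby.serving = True
    return True


def simulated_power_switch(node):
    """Pretend to cut the node's power; always succeeds in this sketch."""
    node.powered = False
    return True


node_a, node_b = Node("A"), Node("B")
node_a.serving = True   # A is busy but alive when B decides to act
ok = take_over(node_b, node_a, simulated_power_switch)
writers = [n.name for n in (node_a, node_b) if n.powered and n.serving]
```

After the takeover only B is both powered on and serving, so only one node can write to the shared file system.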
The above is only a simplified description. A real cluster may have more than 100 nodes, with heartbeats announced by broadcast. Once a node's broadcast stops, the node is judged dead, and dozens of nodes then compete to take over its service. Another mechanism is needed to limit this competition, such as a queue: the node at the head of the queue takes over while the others keep listening. Other methods exist as well.
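The queuing idea can be sketched as a deterministic rule that every node evaluates the same way: the first living node in an agreed order takes over, and everyone else keeps listening. The node names and the fixed queue are assumptions for illustration. Real cluster managers use more elaborate membership and quorum protocols.

```python
def choose_successor(queue, alive, failed):
    """Return the first living node in `queue` other than `failed`, or None."""
    for node in queue:
        if node != failed and node in alive:
            return node
    return None


# Hypothetical node names; the queue order is agreed on in advance.
queue = ["node1", "node2", "node3", "node4"]
alive = {"node2", "node3", "node4"}   # node1 has stopped broadcasting

successor = choose_successor(queue, alive, failed="node1")
listeners = [n for n in queue if n in alive and n != successor]
```

Because every node applies the same rule to the same queue, they all reach the same answer without competing for the service.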
High-performance cluster
It is similar to LVS, but the front end splits a request into N small tasks, dispatches them to different back-end hosts for processing, and collects the results returned to the front end.
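The split, dispatch, and collect pattern can be sketched on a single machine with a thread pool, where each worker thread stands in for a back-end host. The sum-of-squares job and the function names are assumptions made for this example and have nothing to do with any specific cluster software.

```python
from concurrent.futures import ThreadPoolExecutor


def split(numbers, parts):
    """Divide a list into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(numbers), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(numbers[start:end])
        start = end
    return chunks


def worker(chunk):
    """The job a back-end host would run: here, a sum of squares."""
    return sum(x * x for x in chunk)


def run_job(numbers, hosts=3):
    """Scatter the chunks to workers, then gather and combine the results."""
    chunks = split(numbers, hosts)
    with ThreadPoolExecutor(max_workers=hosts) as pool:
        partials = list(pool.map(worker, chunks))
    return sum(partials)


total = run_job(list(range(1, 11)), hosts=3)
```

The front end's work is limited to `split` and the final `sum`; the expensive part runs in the workers, which is what lets the job scale with the number of hosts.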

It can be implemented with Beowulf software, which is not covered here because we have not studied it.
It suits websites with a large number of online users, games, cloud computing, and other fields that need many complex operations completed in a short time.

Next, I will introduce the principles, algorithms, and a simple implementation of LVS. Thank you for reading.
Author: "Dean's Linux"

This article is an English version of an article originally published in Chinese on aliyun.com.
