Key Points of Linux Cluster Technology

Source: Internet
Author: User

Many enterprises and websites now run the Linux operating system, and in many of these settings its advantages have left Microsoft's systems behind. This article covers Linux cluster technology so that you can learn more about Linux, understand what a Linux cluster can do, and have a reference when selecting a system.

One reason Linux is so competitive is that it runs on ordinary, widely available PCs, with no need to buy expensive dedicated hardware. Given several PCs running Linux, you only need to add suitable cluster software to form a Linux cluster with high reliability, load capacity, and computing capability. Each server in the cluster is called a node.

Linux clusters can be divided into three types according to their focus.

The first is the high-availability cluster, which runs on two or more nodes so that it can keep providing service to the outside when a system fault occurs. Its design goal is to minimize service interruption time. Examples include Turbolinux TurboHA, Heartbeat, and Kimberlite.

The second is the load balancing cluster, which aims to provide load capacity proportional to the number of nodes. This type suits Web services with high traffic volumes. Load balancing clusters also have some high-availability characteristics. Turbolinux Cluster Server and Linux Virtual Server both belong to this type.

The third is the supercomputing cluster, which comes in two kinds depending on how tightly coupled the computation is. In the first kind, a computing task is divided into task slices, the slices are assigned to the nodes, each node computes its slices independently, and the partial results are then combined into the final result. The second kind is the parallel computing mode, in which nodes exchange large amounts of data during the computation and can therefore handle strongly coupled computations. The two kinds suit different types of data processing. With supercomputing cluster software, an enterprise can use several PCs to complete computing tasks that previously only a supercomputer could handle. Such software includes Turbolinux EnFusion and SCore.
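The task-slice mode of a computing cluster can be illustrated with a small sketch. This is not how EnFusion or SCore work internally; it is a minimal illustration in which worker threads on one machine stand in for cluster nodes, and the function names (`split_into_slices`, `compute_slice`, `run_sliced_job`) and the sum-of-squares workload are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(data, num_nodes):
    """Divide the input into num_nodes roughly equal task slices."""
    size, extra = divmod(len(data), num_nodes)
    slices, start = [], 0
    for i in range(num_nodes):
        # The first `extra` slices take one additional element.
        end = start + size + (1 if i < extra else 0)
        slices.append(data[start:end])
        start = end
    return slices


def compute_slice(task_slice):
    """Work done independently on one node: here, a sum of squares."""
    return sum(x * x for x in task_slice)


def run_sliced_job(data, num_nodes):
    """Hand the slices to the workers, then combine the partial results."""
    slices = split_into_slices(data, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(compute_slice, slices))
    return sum(partials)


total = run_sliced_job(list(range(1000)), num_nodes=4)
```

Because the slices never need to talk to each other, this mode tolerates a slow network between nodes. The parallel computing mode does not have that property, since nodes exchange data throughout the computation.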

High-availability clusters and load balancing clusters work in different ways and suit different types of services. Generally, load balancing clusters suit services that provide static data, such as HTTP services. High-availability clusters suit both services that provide static data, such as HTTP services, and services that provide dynamic data, such as databases. A high-availability cluster can handle dynamic data because its nodes share the same storage medium, such as a RAID box. In other words, in a high-availability cluster each service has only one copy of its user data, stored on a shared storage device, and at any given time only one node can read and write that data.

Take Turbolinux TurboHA as an example. The cluster has two nodes, A and B. It provides only an Oracle service, and the user data is stored in partition /dev/sdb3 of the shared storage device. Under normal conditions, node A provides the Oracle database service, and node A mounts /dev/sdb3 at /mnt/oracle. When node A suffers a fault and the TurboHA software detects it, TurboHA stops the Oracle service on node A and unmounts /dev/sdb3. The TurboHA software on node B then mounts the partition on node B and starts the Oracle service there. The Oracle service has a virtual IP address. When the service switches from node A to node B, the virtual IP address is rebound to node B, so users can still reach the service at the same address.
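The switchover sequence in this example can be modeled as a small in-memory simulation. This is only a sketch of the ordering of steps, not TurboHA's implementation: no real mount, service, or network commands are run, the class and function names are invented for the illustration, and the virtual IP 192.0.2.10 is a placeholder from the documentation address range.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.mounts = {}          # device -> mount point
        self.services = set()
        self.virtual_ips = set()


class SharedPartition:
    """Tracks which node currently has the shared partition mounted."""
    def __init__(self, device):
        self.device = device
        self.mounted_by = None


def start_service(node, part, mount_point, service, vip):
    # Refuse to mount if another node still holds the partition.
    if part.mounted_by is not None:
        raise RuntimeError("partition still mounted by node " + part.mounted_by.name)
    node.mounts[part.device] = mount_point
    part.mounted_by = node
    node.services.add(service)
    node.virtual_ips.add(vip)


def stop_service(node, part, service, vip):
    # Stop the service first, then give up the address and the partition.
    node.services.discard(service)
    node.virtual_ips.discard(vip)
    node.mounts.pop(part.device, None)
    if part.mounted_by is node:
        part.mounted_by = None


def fail_over(active, standby, part, mount_point, service, vip):
    stop_service(active, part, service, vip)
    start_service(standby, part, mount_point, service, vip)
    return standby


node_a, node_b = Node("A"), Node("B")
sdb3 = SharedPartition("/dev/sdb3")
VIP = "192.0.2.10"  # hypothetical virtual IP

start_service(node_a, sdb3, "/mnt/oracle", "oracle", VIP)
node_a.healthy = False  # fault detected on node A
active = fail_over(node_a, node_b, sdb3, "/mnt/oracle", "oracle", VIP)
```

The check in `start_service` reflects the rule that only one node may have the partition mounted at any time. If node A cannot release it cleanly, the takeover must not simply proceed.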

The analysis above shows that a high-availability cluster does not balance load for any single service. It improves the reliability of the whole system but cannot increase its load capacity. Of course, a high-availability cluster can run several services and assign them to different nodes as appropriate. For example, node A can provide the Oracle service while node B provides a Sybase service. In a sense this is also load balancing, but it distributes multiple services rather than the load of one service.

A load balancing cluster suits services that provide relatively static data, such as HTTP services. Generally, the nodes in a load balancing cluster have no shared storage medium. Instead, the user data is replicated and a copy is stored on each node that provides the service. The following uses Turbolinux Cluster Server as an example to briefly introduce how a load balancing cluster works. The cluster has a master node called the Advanced Traffic Manager (ATM). Assume the cluster provides only an HTTP service and all other nodes are set up as HTTP service nodes. All user page requests go to the ATM, because the ATM is bound to the external IP address of the service. The ATM distributes the requests it receives evenly among the service nodes. After receiving a request, a service node sends the corresponding Web page directly to the user. Thus, if there are 1000 HTTP page requests in one second and 10 service nodes in the cluster, each node processes 100 requests. From the outside, it looks as if a single computer ten times as fast were handling the traffic. This is load balancing in the true sense.
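The arithmetic above (1000 requests spread over 10 nodes) can be checked with a short sketch. The text says only that requests are distributed evenly; the round-robin policy used here is an assumption chosen because it is the simplest even distribution, and the node names are invented.

```python
from collections import Counter
from itertools import cycle


def dispatch_round_robin(requests, service_nodes):
    """Assign each request to the next service node in turn and count the load."""
    assignment = Counter()
    for _request, node in zip(requests, cycle(service_nodes)):
        assignment[node] += 1
    return assignment


nodes = [f"node{i}" for i in range(1, 11)]   # 10 hypothetical service nodes
load = dispatch_round_robin(range(1000), nodes)
```

With this policy every node ends up with the same count whenever the number of requests is a multiple of the number of nodes. Real dispatchers may also weigh nodes by capacity or current connection count.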

However, since the ATM handles all 1000 page requests, will it become the bottleneck for the cluster's processing speed? In practice it does not: a page request carries relatively little data, while the page content returned is relatively large, and the service nodes send that content directly to the user rather than through the ATM, so this approach is very efficient. A second concern is that if the ATM fails, the whole system stops working. For this reason, Turbolinux Cluster Server lets you set one or more computers as backup ATM nodes. When the primary ATM node fails, a new primary ATM is chosen from among the backup ATMs to take over its work. This load balancing cluster therefore also has a degree of high availability.
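The takeover by a backup ATM can be sketched as follows. The text does not say how Turbolinux Cluster Server chooses the new primary, so the fixed priority order used here is purely an assumption, as are the node names and the `elect_primary` helper.

```python
def elect_primary(atm_nodes, is_alive):
    """Return the first live node in priority order, or None if all are down."""
    for node in atm_nodes:
        if is_alive(node):
            return node
    return None


# Hypothetical ATM nodes, listed in priority order.
atm_nodes = ["atm-primary", "atm-backup-1", "atm-backup-2"]
alive = {"atm-primary": True, "atm-backup-1": True, "atm-backup-2": True}

before = elect_primary(atm_nodes, alive.get)   # normal operation
alive["atm-primary"] = False                   # the primary ATM fails
after = elect_primary(atm_nodes, alive.get)    # a backup takes over
```

Whichever node wins would also need to take over the external IP address of the service, since that address is what user requests are sent to.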

HTTP pages are relatively static, but sometimes need to be modified. Turbolinux Cluster Server provides a data synchronization tool to easily synchronize page changes to all nodes that provide this service.

The following describes how high-availability clusters and load balancing clusters can be combined. If a user has a minimal cluster of two nodes, can the benefits of both a high-availability cluster and a load balancing cluster be obtained at the same time? The answer is yes. Since high-availability clusters suit dynamic data services and load balancing clusters suit static data services, suppose that an Oracle service and an HTTP service must be provided at the same time. Install both the Turbolinux TurboHA and Turbolinux Cluster Server software on nodes A and B. For TurboHA, make node A the normal working node for Oracle and node B the backup node for the Oracle service. For Cluster Server, set node B as the primary ATM node and node A as the backup ATM node, and make both node A and node B HTTP service nodes.

In this way, node A and node B each play two roles, and the user obtains a high-availability Oracle service and a load-balanced HTTP service. Even if one node fails, neither the Oracle service nor the HTTP service is interrupted.
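The two-node layout can be written down as a role table and checked against single-node failures. This is a toy model rather than a configuration format for either product: the role labels and the `services_available` helper are invented, and the model assumes a surviving node can always take over the roles it backs up.

```python
# Roles assigned to each node in the combined two-node cluster.
ROLES = {
    "A": {"oracle_active", "atm_backup", "http"},
    "B": {"oracle_backup", "atm_primary", "http"},
}


def services_available(surviving_nodes):
    """Report which services can still be offered by the surviving nodes."""
    roles = set().union(*(ROLES[n] for n in surviving_nodes))
    # Oracle needs a node that is either its working node or its backup.
    oracle = bool(roles & {"oracle_active", "oracle_backup"})
    # HTTP needs an ATM (primary or backup) plus at least one HTTP service node.
    http = "http" in roles and bool(roles & {"atm_primary", "atm_backup"})
    return {"oracle": oracle, "http": http}


only_a = services_available(["A"])   # node B has failed
only_b = services_available(["B"])   # node A has failed
```

In both single-failure cases the model reports that Oracle and HTTP remain available, although HTTP then runs on one node and no longer gains any load capacity from the cluster.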

However, high availability and load balancing cannot both be achieved for the same service. For a given service, either a single copy of the data is kept on a shared storage device and accessed by one node at a time, which gives high availability, or the data is replicated, with a copy stored on the local hard disk of each node and user requests spread across the nodes, which gives load balancing.

High-availability clusters are designed to minimize service interruption time, so service switchover receives a great deal of attention: when a service on one node fails, the fault must be detected quickly and the service switched to another node. Data integrity protection, however, must not be neglected during the switchover.

Under what circumstances can data integrity be damaged? A high-availability cluster has at least two nodes connected to a shared storage device. For a partition that holds a file system (that is, a non-raw partition), if two nodes read and write it at the same time, the file system will be corrupted. An I/O barrier, often called I/O fencing, is therefore needed to prevent this.

The purpose of the I/O barrier is to ensure that a faulty node can no longer read or write the shared partitions of a service. It can be implemented in several ways. Kimberlite uses hardware power switches. When one node fails and the other node detects the failure, the healthy node sends commands through a serial port to the power switch connected to the faulty node's power supply. The switch cuts the power briefly and then restores it, which restarts the faulty node.
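The power-switch approach can be sketched with a simulated switch. Nothing here talks to a real serial port or reflects Kimberlite's actual protocol: the `PowerSwitch` class, the "off" and "on" command strings, and the helper functions are all hypothetical stand-ins used to show the order of operations.

```python
class PowerSwitch:
    """Simulated remote power switch that records the commands it receives."""
    def __init__(self):
        self.powered = {}
        self.log = []

    def send(self, command, node):
        # A real switch would receive this over a serial line.
        self.log.append((command, node))
        self.powered[node] = (command == "on")


def fence_by_power_cycle(switch, failed_node):
    """Cut power to the failed node, then restore it so the node reboots."""
    switch.send("off", failed_node)
    switch.send("on", failed_node)


def take_over_if_failed(switch, peer_is_responding, peer_name):
    """Fence the peer before taking over its services; do nothing if it is healthy."""
    if peer_is_responding:
        return False
    fence_by_power_cycle(switch, peer_name)
    return True


switch = PowerSwitch()
took_over = take_over_if_failed(switch, peer_is_responding=False, peer_name="A")
```

The important point is the ordering: the healthy node fences its peer first and only then mounts the shared partition, so the two nodes never write to it at the same time.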

The I/O barrier can take other forms as well. For storage devices that support the SCSI Reserve/Release commands, the barrier can also be implemented through the SCSI generic (sg) interface. The healthy node uses the SCSI Reserve command to "lock" the shared storage device so that the faulty node cannot read or write it. If the cluster software on the faulty node is still running and finds that the shared storage device has been locked by the other node, it should reboot its own node to return to a normal working state.
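The reservation-based barrier can be modeled in memory as follows. This simulation issues no SCSI commands and does not use the sg driver; the `SharedDisk` class, its methods, and the "reboot" return value are assumptions that only mirror the behavior described above.

```python
class SharedDisk:
    """Simulated disk with a single-holder reservation (no real SCSI I/O)."""
    def __init__(self):
        self.reserved_by = None
        self.blocks = {}

    def reserve(self, node):
        # Succeeds if the disk is free or already reserved by this node.
        if self.reserved_by in (None, node):
            self.reserved_by = node
            return True
        return False

    def release(self, node):
        if self.reserved_by == node:
            self.reserved_by = None

    def write(self, node, block, data):
        # A node other than the reservation holder is refused.
        if self.reserved_by is not None and self.reserved_by != node:
            raise PermissionError("reservation conflict for node " + node)
        self.blocks[block] = data


def guarded_write(disk, node, block, data):
    """Write if allowed; on a reservation conflict, signal that the node should reboot."""
    try:
        disk.write(node, block, data)
        return "ok"
    except PermissionError:
        return "reboot"


disk = SharedDisk()
disk.reserve("B")                                   # healthy node B locks the disk
result_b = guarded_write(disk, "B", 0, "from B")
result_a = guarded_write(disk, "A", 0, "from A")    # faulty node A is refused
```

Compared with a power switch, this approach needs no extra hardware, but it depends on the storage device supporting reservations.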

The above covers the basic principles of Linux cluster technology and several well-known software packages. In short, Linux cluster technology makes full use of the advantages of PCs and networks, can deliver considerable performance, and is a promising technology. I hope this article helps you learn more about it.
