Analysis of key points for Linux cluster technology

Source: Internet
Author: User

Many companies and websites now run the Linux operating system, and its advantages have led some of them to move away from Microsoft platforms. This article looks at Linux cluster technology and its main capabilities, and is meant as a reference when you choose a system.

One reason Linux is competitive is that it runs on commodity PCs, with no need to buy expensive dedicated hardware. Given suitable cluster software, several PCs running Linux can be combined into a Linux cluster with high reliability, load capacity, and computing power. Each server in a cluster is called a node.

Depending on their focus, Linux clusters fall into three categories. The first is the high-availability cluster, which runs on two or more nodes so that service continues when certain failures occur in the system. Its design goal is to minimize service downtime. Well-known clusters of this kind include Turbolinux TurboHA, Heartbeat, and Kimberlite.

The second category is the load-balancing cluster, which aims to provide load capacity proportional to the number of nodes and is well suited to high-traffic Web services. Load-balancing clusters often have some high-availability features as well. Turbolinux Cluster Server and Linux Virtual Server are load-balancing clusters.

The third category is the supercomputing cluster, which comes in two types depending on how tightly coupled the computation is. The first type is task slicing: the computing task is divided into slices, the slices are assigned to the nodes, each node computes its slice independently, and the results are then combined into the final result. The second type is parallel computing, in which the nodes exchange large amounts of data during the computation, so that strongly coupled problems can be handled. The two types suit different kinds of data-processing work. With supercomputing cluster software, a company can use a group of PCs to complete computing tasks that normally only a supercomputer could handle. Software of this kind includes Turbolinux EnFuzion and SCore.
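The task-slicing approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: worker threads stand in for cluster nodes, the workload (a sum of squares) is invented for the example, and the function names are hypothetical rather than taken from any of the products mentioned.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(data, node_count):
    """Divide the input into node_count nearly equal task slices."""
    size, extra = divmod(len(data), node_count)
    slices, start = [], 0
    for i in range(node_count):
        # The first `extra` slices take one additional item each.
        end = start + size + (1 if i < extra else 0)
        slices.append(data[start:end])
        start = end
    return slices


def compute_slice(task_slice):
    """Work done independently on one node (here: a sum of squares)."""
    return sum(x * x for x in task_slice)


def run_task_sliced(data, node_count):
    """Hand the slices to workers (stand-ins for nodes), then combine."""
    slices = split_into_slices(data, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partial_results = list(pool.map(compute_slice, slices))
    # Summarize the per-node results into the final result.
    return sum(partial_results)


total = run_task_sliced(list(range(1, 101)), node_count=4)
```

Because no slice depends on another, the workers never exchange data while computing. That independence is what separates task slicing from the tightly coupled parallel computing case.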

High-availability clusters and load-balancing clusters work in different ways and suit different types of services. Typically, load-balancing clusters suit services that provide static data, such as HTTP services, while high-availability clusters suit both services that provide static data, such as HTTP services, and services that provide dynamic data, such as databases. A high-availability cluster can serve dynamic data because its nodes share the same storage medium, such as a RAID box. That is, in a high-availability cluster the user data for each service is stored on a shared storage device, and only one node can read and write that data at any one time.

Take Turbolinux TurboHA as an example. The cluster has two nodes, A and B, and provides only an Oracle service, with user data stored on partition /dev/sdb3 of the shared storage device. In the normal state, Node A provides the Oracle database service, and /dev/sdb3 is mounted on /mnt/oracle by Node A. When Node A fails and the TurboHA software detects the failure, TurboHA stops the Oracle service and unmounts /dev/sdb3. The TurboHA software on Node B then mounts the partition on Node B and starts the Oracle service there. The Oracle service has a virtual IP address, and when the service switches from Node A to Node B the virtual IP address is bound to Node B as well, so users can still access the service.
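The failover sequence can be modeled as a small simulation. The sketch below is not TurboHA code: the `Node` class and `fail_over` function are hypothetical, the virtual IP address `192.0.2.10` is a made-up documentation address, and nothing is actually mounted or started. It only records the order of the steps described above.

```python
class Node:
    """One cluster node, tracking what it currently holds."""

    def __init__(self, name):
        self.name = name
        self.services = set()    # services running on this node
        self.mounts = {}         # partition -> mount point
        self.addresses = set()   # virtual IPs bound to this node


def fail_over(failed, standby, service, partition, mount_point, virtual_ip):
    """Move one service from the failed node to the standby node.

    Returns the ordered list of steps, so the sequence can be inspected.
    """
    steps = []

    # Release everything on the failed node first, so the partition
    # is never mounted on both nodes at the same time.
    failed.services.discard(service)
    steps.append(f"{failed.name}: stop {service}")
    failed.mounts.pop(partition, None)
    steps.append(f"{failed.name}: unmount {partition}")
    failed.addresses.discard(virtual_ip)

    # Bring the service up on the standby node.
    standby.mounts[partition] = mount_point
    steps.append(f"{standby.name}: mount {partition} on {mount_point}")
    standby.services.add(service)
    steps.append(f"{standby.name}: start {service}")
    standby.addresses.add(virtual_ip)
    steps.append(f"{standby.name}: bind {virtual_ip}")
    return steps


node_a, node_b = Node("A"), Node("B")
node_a.services.add("oracle")
node_a.mounts["/dev/sdb3"] = "/mnt/oracle"
node_a.addresses.add("192.0.2.10")  # hypothetical virtual IP

steps = fail_over(node_a, node_b, "oracle", "/dev/sdb3",
                  "/mnt/oracle", "192.0.2.10")
```

The ordering is the point of the sketch. The partition is released on Node A before it is mounted on Node B, which matches the rule that only one node reads and writes the shared data at a time.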

As the analysis above shows, a high-availability cluster does not balance load for any single service: it improves the reliability of the whole system but does not increase its load capacity. A high-availability cluster can, of course, run several services and distribute them across nodes, for example with Node A providing the Oracle service and Node B providing a Sybase service. This can be seen as load balancing of a sort, but it is the distribution of multiple services rather than the balancing of one.

Load-balancing clusters are useful for services that provide relatively static data, such as HTTP services. Because the nodes of a load-balancing cluster typically have no shared storage medium, user data is replicated and a copy is stored on every node that provides the service. The following uses Turbolinux Cluster Server as an example to describe briefly how a load-balancing cluster works. The cluster has a master node called the Advanced Traffic Manager (ATM). Assuming the cluster provides only an HTTP service, the remaining nodes are configured as HTTP service nodes. All user page requests go to the ATM, because the service's IP address is bound to the ATM. The ATM distributes the requests it receives among the service nodes, and each service node, on receiving a request, sends the corresponding Web page directly to the user. Thus, if 1000 HTTP page requests arrive within 1 second and the cluster has 10 service nodes, each node handles 100 requests. To the user it looks as if a single computer 10 times as fast were handling the traffic. This is the real meaning of load balancing.
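The arithmetic in this example (1000 requests over 10 nodes) can be checked with a short dispatch sketch. The article does not say which scheduling policy the ATM uses, so round robin is an assumption here, and the `dispatch` function and node names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def dispatch(request_ids, service_nodes):
    """Assign each request to the next service node in turn (round robin)."""
    assignment = {}
    nodes = cycle(service_nodes)
    for request_id in request_ids:
        assignment[request_id] = next(nodes)
    return assignment


service_nodes = [f"node{i}" for i in range(1, 11)]   # 10 HTTP service nodes
assignment = dispatch(range(1000), service_nodes)    # 1000 page requests
load = Counter(assignment.values())                  # requests per node
```

With an even split, `load` holds a count of 100 for each of the 10 nodes, which is the per-node figure given above.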

But since the ATM receives all 1000 page requests, does it become a bottleneck for the cluster's processing speed? A page request carries relatively little data, while the page content returned is relatively large and is sent to the user by the service nodes directly, so the approach is efficient. Nor does an ATM failure bring down the whole system. Turbolinux Cluster Server can configure one or more computers as backup ATM nodes, and when the primary ATM node fails, a new primary ATM is chosen from the backup ATMs to take over its work. This kind of load-balancing cluster therefore also offers a degree of high availability.
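The takeover by a backup ATM can be pictured as picking the first live node from an ordered list. The article does not describe how Turbolinux Cluster Server actually chooses the new primary, so the fixed priority order below is an assumption, and `elect_primary` is a hypothetical name.

```python
def elect_primary(atm_nodes, alive):
    """Return the first live ATM node in priority order, or None."""
    for node in atm_nodes:
        if node in alive:
            return node
    return None


# Priority order: the configured primary first, then the backups.
atm_nodes = ["atm-primary", "atm-backup-1", "atm-backup-2"]

before = elect_primary(atm_nodes, alive={"atm-primary", "atm-backup-1", "atm-backup-2"})
after = elect_primary(atm_nodes, alive={"atm-backup-1", "atm-backup-2"})
```

Here `before` is the configured primary, and `after` is the first backup once the primary has dropped out of the set of live nodes.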

HTTP pages are relatively static, but sometimes they need to be changed. Turbolinux Cluster Server provides data synchronization tools that make it easy to sync changes to pages to all the nodes that provide the service.

High-availability clusters and load-balancing clusters can also be combined. If a user has a minimal cluster of two nodes, can the benefits of both kinds of cluster be had at the same time? The answer is yes. Since a high-availability cluster suits services that provide dynamic data and a load-balancing cluster suits services that provide static data, assume that an Oracle service and an HTTP service are provided together. The user installs both the Turbolinux TurboHA and the Turbolinux Cluster Server software on nodes A and B. For the TurboHA software, Node A is set as the normal Oracle node and Node B as the backup node for the Oracle service. For the Cluster Server software, Node B is set as the primary ATM node and Node A as the backup ATM node, and both Node A and Node B are HTTP service nodes.

As a result, Node A and Node B are both doing useful work, and users get a high-availability Oracle service together with a load-balanced HTTP service. Even if one node fails, neither the Oracle service nor the HTTP service is interrupted.
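The two-node role assignment can be written down as data and checked against a single node failure. The dictionary layout and the `services_after_failure` function below are illustrative only, not a configuration format of either product.

```python
# Role assignment from the two-node example. Key names are hypothetical.
ROLES = {
    "A": {"oracle": "active", "atm": "backup", "http": True},
    "B": {"oracle": "standby", "atm": "primary", "http": True},
}


def services_after_failure(roles, failed_node):
    """Report which services can still be offered once one node is down."""
    survivors = [r for name, r in roles.items() if name != failed_node]
    return {
        # Oracle survives if a remaining node is its active or standby host.
        "oracle": any(r.get("oracle") in ("active", "standby")
                      for r in survivors),
        # HTTP needs a remaining ATM (primary, or a backup that can be
        # promoted) and at least one remaining HTTP service node.
        "http": any(r.get("atm") in ("primary", "backup") for r in survivors)
                and any(r.get("http") for r in survivors),
    }


after_a_fails = services_after_failure(ROLES, "A")
after_b_fails = services_after_failure(ROLES, "B")
```

Both results report the Oracle and HTTP services as still available. After a failure, however, the surviving node carries every role alone, so the HTTP service keeps running but is no longer balanced across two nodes.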

For any single service, however, high availability and load balancing cannot be had at the same time. Either the service keeps one copy of its data on a shared storage device that is accessed by one node at a time, which gives high availability, or it replicates the data so that each node stores a copy on its local hard disk and user requests are spread over several nodes, which gives load balancing.

For high-availability clusters, service switchover is a major concern, because these clusters are designed from the outset to minimize service downtime. When a service on one node fails, the failure is detected quickly and the service is switched to another node. During the switchover, however, the protection of data integrity cannot be ignored.

Under what circumstances is data integrity compromised? A high-availability cluster has at least two nodes connected to a shared storage device, and if two nodes read and write a non-raw partition (one that holds a file system) at the same time, the file system can be corrupted. An I/O barrier is therefore needed to prevent this from happening.

The purpose of the I/O barrier is to ensure that a failed node can no longer read or write the shared partition of a service. This can be achieved in a variety of ways. Kimberlite uses a hardware switch: when one node fails and the other node detects it, the healthy node sends a command through its serial port to the hardware switch that controls the failed node's power supply, cutting the power briefly and then restoring it so that the failed node restarts.
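The power-switch approach comes down to one rule: cut off the failed node first, and only then take over its service. The sketch below illustrates that ordering. The `PowerSwitch` class is a stand-in that records commands rather than driving a serial port, and none of the names come from Kimberlite.

```python
class PowerSwitch:
    """Stand-in for a serially controlled power switch (hypothetical)."""

    def __init__(self):
        self.log = []  # commands received, in order

    def send(self, command, outlet):
        self.log.append((command, outlet))


def fence_node(switch, outlet):
    """Power-cycle the outlet that feeds the failed node."""
    switch.send("off", outlet)
    switch.send("on", outlet)


def take_over(switch, failed_outlet, start_service):
    """Fence the failed node, then start the service on the healthy node."""
    fence_node(switch, failed_outlet)
    return start_service()


switch = PowerSwitch()
events = []
take_over(switch, "outlet-A", lambda: events.append("oracle started on B"))
```

After the call, the switch has received `off` and then `on` for the failed node's outlet, and the service start is recorded only after both commands have been issued.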

Other forms of I/O barrier exist as well. For storage devices that support the SCSI reserve/release commands, an I/O barrier can also be implemented with sg (SCSI generic) commands. The healthy node uses the SCSI reserve command to "lock" the shared storage device so that the failed node cannot read or write it. If the cluster software on the failed node is still running and finds that the shared storage device has been locked by the other node, it should restart its node to return to a normal working state.
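The effect of a reservation can be illustrated with a toy model of a shared disk. This is a simulation under stated assumptions, not real SCSI code: the `SharedDisk` class and its methods are hypothetical, and an actual implementation would issue reserve and release commands to the device itself.

```python
class SharedDisk:
    """Toy model of a shared device with reserve/release semantics."""

    def __init__(self):
        self.holder = None   # node currently holding the reservation
        self.blocks = {}     # block number -> data

    def reserve(self, node):
        """Grant the reservation unless another node already holds it."""
        if self.holder not in (None, node):
            return False
        self.holder = node
        return True

    def release(self, node):
        """Drop the reservation if this node holds it."""
        if self.holder == node:
            self.holder = None

    def write(self, node, block, data):
        """Write a block; return False on a reservation conflict."""
        if self.holder is not None and self.holder != node:
            return False
        self.blocks[block] = data
        return True


def react_to_conflict(write_succeeded):
    """Per the text: a node that finds the disk locked should restart."""
    return "continue" if write_succeeded else "restart"


disk = SharedDisk()
disk.reserve("B")                        # healthy node locks the disk
ok_b = disk.write("B", 0, "from B")      # healthy node can write
ok_a = disk.write("A", 0, "stale data")  # failed node is refused
action_a = react_to_conflict(ok_a)
```

In this model the write from Node A is refused, block 0 keeps the data written by Node B, and Node A's reaction is to restart, as the paragraph above describes.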

This article has introduced the basic principles of Linux cluster technology along with several well-known software packages. In short, Linux cluster technology makes the most of the advantages of PCs and networks, can deliver considerable performance, and is a promising technology. I hope this article helps you learn more about it.
