What is cluster technology?

Source: Internet
Author: User
Cluster technology has been under development for many years, but it still has no precise definition or classification. Different people understand it differently.

In fact, it doesn't matter what it is, as long as it can benefit users.

As I understand it, clusters fall into the following types. As mentioned above, different people understand the term differently, so this classification is open to discussion. Mine leans more toward engineering practice than toward technology.

1. HA cluster
This type achieves high availability but does not improve the performance of any single application. Most products on the market belong to this type, and the technology is relatively simple.

2. IP Load Balancing Cluster
IP-level techniques are used to support general IP applications. The technology is not new. It was first implemented in hardware; after Linux emerged, many pure-software implementations became available, which is one of the benefits of open source.

3. Parallel Computing Cluster
This type includes message-passing mechanisms and API libraries such as PVM and Beowulf, as well as task-scheduling products. The most technically difficult part is the more intelligent products, such as parallelizing compilers and parallel systems.

4. Application Load Balancing Cluster
The ultimate goal of a cluster is true dynamic load balancing that is independent of the application program. Because of technical limitations, however, this can only be achieved for specific applications, and those applications must be modified. As a result there is no universal product, and most vendors offer their own parallel versions, for example Oracle Parallel Server.

The classification above is essentially by project or product, which differs from a classification by technology.

The following is an article I wrote a long time ago for media publicity. It has a commercial flavor and is technically incorrect in places. I am posting it to exchange ideas, not to promote Turbolinux's products (I am a Turbolinux employee). I am simply too lazy to revise it, although posting this kind of commercial article in a public community is bad form. It is offered for reference only; please understand that I will not discuss the strengths and weaknesses of Turbolinux products here.

I have never taken part in Linux community discussions before. I registered this time because I have worked with clusters for a long time and have a strong interest in them.

With the growing popularity of Internet and intranet applications, computer systems are becoming ever more important. A low failure rate and high performance have always been the main goals, but a single server cannot deliver both.

• Availability: many servers claim to have reached 99% availability. What does this number mean? It means that up to 1% of each year may be unplanned downtime: 365 (days/year) × 24 (hours/day) × 1% = 87.6 (hours/year). For an enterprise that requires continuous 24x7 service, 87.6 hours of downtime is a disaster.

• High performance: assume that a desktop machine can process thousands of requests per second and an IA server can process tens of thousands. For an enterprise that needs to process hundreds of thousands of requests per second without cluster technology, the only choice is to buy higher-end minicomputers or midrange machines. In that case system performance increases only about tenfold, while the purchase price and maintenance cost increase by dozens of times or more.
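
The availability arithmetic in the first point can be checked with a few lines of Python. This is only a minimal sketch: the function name `annual_downtime_hours` is made up for illustration, and the calculation assumes a 365-day year as the text does.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years as the text does


def annual_downtime_hours(availability: float) -> float:
    """Return the hours per year a system may be down at a given availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return HOURS_PER_YEAR * (1.0 - availability)


# Compare the 99% figure from the text with two stricter targets.
for level in (0.99, 0.999, 0.9999):
    hours = annual_downtime_hours(level)
    print(f"{level:.2%} available: about {hours:.2f} hours of downtime per year")
```

For 99% availability this gives the 87.6 hours per year quoted above; each additional "nine" cuts the allowed downtime by a factor of ten.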

The emergence and development of cluster technology have effectively solved these two problems.
I. Clusters
A cluster is a parallel or distributed system composed of interconnected computers. To the outside, a cluster appears as a single system that provides unified services.

Cluster technology has many categories and many products on the market, but no standard definition. It can be divided roughly into the following types:

1. Redundant Cluster

Strictly speaking, such a redundant system cannot be called a real cluster because it can only improve the availability of the system, but cannot improve the overall performance of the system.

There are several types.

A. Fault Tolerant Machine

The defining feature is that all hardware components of the machine are redundant (including hard disks, controller cards, buses, power supplies, and so on).

Fault tolerance is independent of the software system and switchover is seamless, but such machines are extremely expensive.

Typical market products: Compaq NonStop (Tandem), Micron (NetFRAME), Stratus

B. Dual-host System Based on System Images

The defining feature is that system data and running state (including data in memory) are mirrored between the two machines to achieve hot standby.

Seamless switchover is possible, but the software-controlled mirroring consumes a large share of system resources, and because the two machines need identical configurations, the price/performance ratio is poor.

Typical market products: Novell SFT III, Marathon Endurance 4000 for NT

C. Dual-host system based on system switching

The defining feature is that the dual-host system mirrors only system data (hard disk data). When the primary host fails, a system-level switchover to the standby host is performed.

The price is moderate, but seamless switching is not possible.

Typical market products: Legato (Vinca) StandbyServer for NetWare, Savoir (small micro) SavWareHA (Sentinel), Compaq StandbyServer

2. Application-switched Cluster

The defining feature is that when one node in the cluster fails, other nodes take over at the application level. All nodes can therefore provide their own services under normal conditions, which also makes this a static load balancing method.

The price/performance ratio is high, but seamless switchover is not possible, and load balancing within a single application cannot be achieved.

Typical market products: Legato (Vinca) Co-StandbyServer for NT, Novell HA Server, Microsoft Cluster Server, DEC Cluster for NT, Legato Octopus, Legato FullTime, NeoHigh Rose HA, Sun Clusters, Veritas Cluster Server (FirstWatch), Ca attached VIT, 1776

3. Parallel Computing-based clusters

It is mainly used in scientific computing, workloads with large numbers of tasks, and similar environments. Implementation approaches include parallel compilation, inter-process communication, and task distribution.

Typical market products: turbolinux enfuzion, Beowulf, supercomputer ubuntures, Platform

4. Clusters Based on Dynamic Load Balancing

All nodes provide the same services externally, so load balancing for a single application and high availability can both be achieved.

The price/performance ratio is very high, but databases cannot currently be supported.

Typical market products: TurboCluster Server, Linux Virtual Server, F5 BIG-IP, Microsoft Windows NT Load Balancing Service

II. Load Balancing
Load balancing is a key technique for improving system performance. In the earlier example, an IA server can process tens of thousands of requests per second, so it obviously cannot handle hundreds of thousands. However, if 10 such servers form one system and all requests are distributed evenly among them, the system can process hundreds of thousands of requests per second. This is the basic idea of load balancing.
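
The basic idea can be sketched in Python as follows. This is only an illustration: the server names and the function `distribute_round_robin` are made up, and simple rotation is just one of several ways to spread requests evenly.

```python
from collections import Counter
from itertools import cycle


def distribute_round_robin(requests, servers):
    """Assign each request to the next server in turn and count the load per server."""
    load = Counter({server: 0 for server in servers})
    next_server = cycle(servers)
    for _ in requests:
        load[next(next_server)] += 1
    return load


# Ten hypothetical servers sharing 100,000 requests.
servers = [f"server-{i}" for i in range(1, 11)]
load = distribute_round_robin(range(100_000), servers)
for name, count in load.items():
    print(name, count)
```

With an even split, each of the 10 servers sees one tenth of the requests, which is why servers rated at tens of thousands of requests per second can together serve hundreds of thousands.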

There are many load balancing products on the market. Because they are built on different core technologies, they differ in characteristics and performance.

1. Round-Robin DNS
Round-robin DNS is the simplest and most intuitive solution. It implements only load balancing, however, and cannot guarantee high availability.
The principle is to map one Internet host name to multiple IP addresses in the DNS server. When the DNS server receives a query for the host name, it returns the corresponding IP addresses in rotation, one after another. Different client connections are thus directed to hosts at different IP addresses, which provides simple load balancing. However, this solution has two fatal disadvantages:

• Load balancing applies only to requests made by Internet host name. It does nothing for requests made directly to an IP address.

• If a node in the cluster fails, the DNS server still returns that node's IP address to clients, so clients keep trying to connect to the failed node. Even if you manually edit the DNS server settings to delete the failed IP address, DNS servers across the Internet cache their answers, so thousands of clients still cannot connect to the cluster until every DNS cache entry times out.
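
The rotation and the caching problem described above can be modeled with a toy simulation. This is a sketch under stated assumptions, not real DNS code: the classes `RoundRobinDNS` and `CachingResolver`, the 300-second TTL, and the 192.0.2.x addresses are all invented for illustration.

```python
class RoundRobinDNS:
    """Toy authoritative server: rotates through the addresses of one host name."""

    def __init__(self, addresses):
        self.addresses = list(addresses)
        self._next = 0

    def resolve(self):
        address = self.addresses[self._next % len(self.addresses)]
        self._next += 1
        return address

    def remove(self, address):
        """Model an administrator deleting a failed node's record."""
        self.addresses.remove(address)


class CachingResolver:
    """Toy downstream DNS server: reuses an answer until its TTL expires."""

    def __init__(self, upstream, ttl):
        self.upstream = upstream
        self.ttl = ttl
        self._cached = None
        self._expires_at = 0

    def resolve(self, now):
        if self._cached is None or now >= self._expires_at:
            self._cached = self.upstream.resolve()
            self._expires_at = now + self.ttl
        return self._cached


dns = RoundRobinDNS(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
print([dns.resolve() for _ in range(4)])   # answers rotate through the list

cache = CachingResolver(dns, ttl=300)
first = cache.resolve(now=0)               # the cache stores one address
dns.remove(first)                          # that node fails and its record is deleted
print(cache.resolve(now=100) == first)     # still served from the cache
print(cache.resolve(now=300) == first)     # refreshed once the TTL has expired
```

Until the cached entry expires, clients behind that resolver are still sent to the removed address, which is the second disadvantage listed above.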
2. Hardware solutions
Some manufacturers provide hardware solutions for load balancing: high-end routers or switches that use NAT (Network Address Translation) to balance load. NAT works by translating multiple private IP addresses to a single public IP address. Representative products are some of the high-end hardware switch series from Cisco and Alteon. This solution has the following disadvantages:

• Because special hardware is used, the system contains components that are not industry standard, which greatly hampers expansion, maintenance, and upgrades.

• It is extremely expensive, an order of magnitude more than a software solution.

• In general, health checks are possible only at the node (system) level and cannot be refined to the service level.

• Because NAT is used, the cluster management node itself has to do a great deal of work and easily becomes the bottleneck of the whole system.

• The special hardware is a single point of failure.

• It is very difficult to build clusters with remote nodes.
3. Negotiated processing (parallel filtering)
In this scheme, every client request is received by all nodes at the same time, and the nodes then negotiate according to fixed rules to decide which one will process it. The distinguishing feature is that the cluster has no dedicated management node; all decisions are made by the working nodes through mutual negotiation. A representative product is Microsoft's load balancing service. The scheme has the following characteristics:

• The nodes must exchange so much traffic that the network load increases. A dedicated network for inter-node communication is usually needed, which adds to the difficulty and cost of installation and maintenance.

• Every node must receive and analyze every client request, which greatly increases the load on the network driver layer and lowers the efficiency of the node itself. The network driver layer easily becomes the bottleneck of the node.

• Because the network driver layer must be modified, this is not a general solution and supports only specific platforms.

• Negotiation is acceptably efficient with a small number of nodes. As the number of nodes grows, communication and negotiation become extremely complex and inefficient, and overall system performance drops significantly. In theory, such a scheme generally allows at most a dozen or so nodes.

• Clusters with remote nodes cannot be implemented.

• Because the cluster has no unified manager, confusion and anomalies may occur.
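
A rough Python sketch of parallel filtering follows. The text does not say what rule the nodes agree on, so the rule used here is an assumption: every node applies the same CRC32 hash to the client address and only the node whose index matches accepts the request. The class `FilterNode` and the function `broadcast` are hypothetical names, and no real product is claimed to work exactly this way.

```python
import zlib


class FilterNode:
    """One working node; it sees every request but accepts only its own share."""

    def __init__(self, index, cluster_size):
        self.index = index
        self.cluster_size = cluster_size
        self.inspected = 0  # requests this node had to receive and analyze
        self.handled = 0    # requests this node actually accepted

    def offer(self, client_ip):
        self.inspected += 1
        # Assumed shared rule: a hash that is identical on every node
        # (Python's built-in hash() of a string differs between processes).
        owner = zlib.crc32(client_ip.encode("ascii")) % self.cluster_size
        if owner == self.index:
            self.handled += 1
            return True
        return False


def broadcast(nodes, client_ip):
    """Deliver one request to every node; return the indexes of nodes that accepted it."""
    return [node.index for node in nodes if node.offer(client_ip)]


nodes = [FilterNode(i, 4) for i in range(4)]
for n in range(1000):
    accepted = broadcast(nodes, f"192.0.2.{n % 250 + 1}")
    assert len(accepted) == 1  # exactly one node takes each request

for node in nodes:
    print(node.index, "inspected", node.inspected, "handled", node.handled)
```

Every node inspects all 1000 requests although it handles only a fraction of them, which illustrates the per-node overhead noted in the list above.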
4. Traffic Distribution
In traffic distribution, all user requests first reach the cluster's management node, which distributes each request to a service node based on the processing capability and current state of all service nodes. When a service node fails because of a hardware or software fault, the management node detects this automatically and stops sending traffic to it. Sharing the traffic in this way increases the performance and processing capacity of the whole system and also improves its availability.
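
The dispatch logic can be sketched as below. The text does not name a scheduling rule, so this example assumes one: pick the healthy node with the fewest active requests relative to its capacity. The names `ServiceNode`, `ManagementNode`, `capacity`, and `mark_failed` are hypothetical, and failure detection is reduced to a manual call.

```python
from dataclasses import dataclass


@dataclass
class ServiceNode:
    name: str
    capacity: int         # relative processing capability (assumed weight)
    active: int = 0       # requests currently being served
    healthy: bool = True


class ManagementNode:
    """Receives every request and picks a service node for it."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def mark_failed(self, name):
        """Stand-in for automatic failure detection."""
        for node in self.nodes:
            if node.name == name:
                node.healthy = False

    def dispatch(self):
        candidates = [node for node in self.nodes if node.healthy]
        if not candidates:
            raise RuntimeError("no healthy service node available")
        # Least load relative to capacity; ties go to the first node listed.
        chosen = min(candidates, key=lambda node: node.active / node.capacity)
        chosen.active += 1
        return chosen.name


manager = ManagementNode([ServiceNode("a", capacity=2), ServiceNode("b", capacity=1)])
print([manager.dispatch() for _ in range(3)])  # the larger node gets the larger share
manager.mark_failed("a")
print(manager.dispatch())                      # traffic now goes only to the healthy node
```

A real management node would probe the service nodes itself and would also decrement `active` when a request completes; both are left out to keep the sketch short.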

Building a sub-cluster of management nodes eliminates the single point of failure that a lone management node would otherwise be. Some traditional engineers believe that because all client traffic passes through the management node, it easily becomes the bottleneck of the whole system. TurboCluster Server solves this problem with direct routing or IP tunneling forwarding, so that all response traffic is returned to the client directly by the service node instead of passing through the management node again. For service providers, incoming traffic is much smaller than outgoing traffic, so the management node is no longer a bottleneck.

The specific methods of traffic distribution include direct routing, IP tunneling, and network address translation. TurboCluster Server currently supports the first two, which are the most efficient. Thanks to this structure, the number of service nodes in a TurboCluster Server cluster is not limited, and the efficiency of cooperation among a large number of nodes is also maintained.
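
To see why returning responses directly matters, the following sketch compares the bytes that pass through the management node when replies go back through it (as with network address translation) and when service nodes reply to clients directly (as with direct routing or IP tunneling). The request and response sizes are hypothetical figures chosen only to show the effect, and `manager_bytes` is an illustrative helper, not part of any product.

```python
def manager_bytes(requests, request_size, response_size, replies_via_manager):
    """Bytes crossing the management node for a batch of requests."""
    inbound = requests * request_size
    outbound = requests * response_size if replies_via_manager else 0
    return inbound + outbound


# Hypothetical workload: small requests, much larger responses.
REQUESTS = 1000
REQUEST_SIZE = 500       # bytes per request (assumed)
RESPONSE_SIZE = 20_000   # bytes per response (assumed)

via_manager = manager_bytes(REQUESTS, REQUEST_SIZE, RESPONSE_SIZE, replies_via_manager=True)
direct = manager_bytes(REQUESTS, REQUEST_SIZE, RESPONSE_SIZE, replies_via_manager=False)

print("replies through the management node:", via_manager, "bytes")
print("replies sent directly to clients:   ", direct, "bytes")
print("reduction factor:", via_manager / direct)
```

The larger the responses are compared with the requests, the more load direct return takes off the management node.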
III. Market Prospects
Cluster technology has been developing for many years and has many branches. It is now gradually moving toward a layered structure, and in the future there will certainly be specialized front-end and back-end cluster products.

As computer applications grow in importance and system security matters more and more, cluster technology will surely have very broad application prospects.
