Cluster Scalability and Distributed Architecture (3): Hierarchy, Classification, and Elements (I)

Author: Lin Fan (R&D department manager, Chen Xun Software Studio)
This is the third article in the series on cluster scalability and distributed architecture. It introduces the layered model of a cluster's hardware and software structure, the main classification methods, and the four main elements that determine cluster design: HA (high availability), SSI (single system image), job management, and communication. The aim is to build an abstract cluster model from several different viewpoints, giving readers a reference for analyzing and designing real-world clusters.

Layered Model

First, let's take a look at the main components of the Cluster Computer System:

  • Multiple high-performance computers (PCs, workstations, or SMP servers)
  • Capable operating systems (layered or microkernel-based)
  • High-performance network switches (Gigabit Ethernet, or a switched interconnect such as Myrinet)
  • Network interface cards (NICs)
  • Fast communication protocols and services (such as Active Messages and Fast Messages)
  • Cluster middleware (single system image and availability support)
    • Hardware, such as DEC Memory Channel, hardware DSM, and SMP techniques
    • Operating system kernel or gluing layer, such as Solaris MC and GLUnix
    • Applications and subsystems
      • Applications, system management tools, and electronic forms
      • Runtime systems, such as software DSM and parallel file systems
      • Resource management and scheduling software, such as LSF (Load Sharing Facility) and CODINE (COmputing in DIstributed Networked Environments)
  • Parallel programming environments and tools, such as compilers, PVM, and MPI
  • Applications: serial and parallel

Not every cluster system requires all of these components; most systems implement only a subset, depending on their specific requirements. As a reference model, however, a layered description of where each component sits and what role it plays helps you understand the cluster as a whole.

Here we find that a cluster's composition covers almost every aspect from software to hardware. If you picture a cluster as a hierarchy analogous to the OSI reference model, then from bottom to top it spans hardware architecture, network design, operating system, middleware, and application environment. Choosing a technology at each layer is therefore important: using mature, existing products greatly reduces the technical and financial risks of building a cluster, and selecting the right layer and technique as a breakthrough point is often the key to meeting performance, security, or other special requirements.

The benefits of cluster design were described in detail earlier, so we will not repeat them here. In short, relative to its low cost, a cluster must offer the following features:

  • High Performance
  • Scalability
  • High Throughput
  • Ease of use
  • High Availability

Cluster Classification

In reality, products are often a combination of several features. Depending on the reference factor, clusters can be classified in the following ways:


Purpose

By purpose, clusters are generally divided into three types:

  • The famous Beowulf cluster is an excellent example of a high-performance (HP) computing cluster, which emphasizes computing power.
  • Commercial HA (high-availability) clusters emphasize availability. The Mon project from the open-source community and Red Hat's Piranha are both inexpensive, cost-effective HA solutions. And do not forget the old brands: SP2, TruCluster, and Solaris MC.
  • Integrated clusters emphasize high throughput and can deliver both HA and HP capabilities; examples include MOSIX and LVS.

Applications are always changing. The earliest clusters were born solely to solve computing problems; as demand evolved, commercial HA clusters and, later, integrated high-throughput systems appeared, and newer kinds of clusters will certainly follow.
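As a rough illustration of the dispatching idea behind high-throughput clusters such as LVS, the sketch below implements plain round-robin scheduling in Python. The node names and request labels are invented for the example; LVS itself does this inside the Linux kernel and offers several further algorithms (weighted round-robin, least-connections, and so on).

```python
from itertools import cycle

def round_robin_dispatch(nodes, requests):
    """Assign each incoming request to the next node in turn --
    the simplest policy a load-balancing cluster can use."""
    node_cycle = cycle(nodes)
    return [(req, next(node_cycle)) for req in requests]

# Hypothetical two-node cluster handling three requests:
assignments = round_robin_dispatch(["node-a", "node-b"], ["r1", "r2", "r3"])
print(assignments)  # [('r1', 'node-a'), ('r2', 'node-b'), ('r3', 'node-a')]
```

Round-robin ignores how loaded each node actually is, which is why real balancers also track connection counts or node weights.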

Node attribution

Looking at node attribution, clusters can be divided into dedicated clusters and "part-time" clusters (also called dedicated clusters and enterprise clusters).

Dedicated clusters, often assembled from large numbers of workstations or low-cost PCs, are typically used for supercomputing tasks. They have the following features:

  • Usually installed in racks in a data center
  • Nodes are mostly of the same type and assembled in the same way
  • Accessed through a front-end host

This kind of cluster mainly replaces traditional mainframes or supercomputers. A dedicated cluster can be installed, used, and managed like a single computer; multiple users can log on to run interactive or batch jobs, which greatly increases throughput and shortens response time.

A "part-time" cluster is built mainly to save costs and make full use of nodes' idle resources. Its features are as follows:

  • Each node is a complete SMP, workstation, or PC with all necessary peripherals attached
  • Nodes are geographically distributed and need not be in the same room
  • Each node has its own "owner". The cluster administrator has only limited management rights over a node, and the owner's local tasks usually take priority over the cluster's
  • The cluster is mostly heterogeneous, with nodes interconnected over standard communication networks

From this comparison it is easy to see that the two kinds of cluster arise from how node resources are used. In a dedicated cluster, no individual owns a workstation or node; resources are shared across the cluster, and parallel computations can run on the whole cluster. In a "part-time" cluster, each workstation has an individual owner, and applications compute by "stealing" CPU time. This works because most workstations' CPU time sits idle, rarely exceeding 50% utilization even at peak hours. Parallel programs that run on such non-dedicated, dynamically changing clusters are called adaptive parallel computing. In some articles, high-throughput clusters also fall into this category.
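The "CPU-stealing" policy described above can be sketched in a few lines. This is not code from any real harvesting system; the threshold value and the hypothetical daemon helpers in the comments are assumptions for illustration only.

```python
import os

IDLE_THRESHOLD = 0.5  # assumed cutoff: donate CPU only while 1-min load stays below this

def node_is_idle(load_avg=None):
    """Return True when this workstation can donate CPU time to the cluster.
    The owner's local work always wins: the node volunteers only while
    its one-minute load average stays under the threshold."""
    if load_avg is None:
        load_avg = os.getloadavg()[0]  # POSIX-only; raises OSError elsewhere
    return load_avg < IDLE_THRESHOLD

# A part-time cluster daemon would poll this periodically, e.g.:
#   if node_is_idle(): resume_guest_job()    # hypothetical helpers
#   else:              suspend_guest_job()
```

The key design point is the asymmetry of ownership: the cluster never competes with the owner, it only fills gaps the owner leaves.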

Assembly Method

The assembly method depends mainly on the interconnect technology and how the machines occupy space. In a loosely coupled cluster, each node is a relatively independent PC or workstation with a complete set of peripherals: keyboard, mouse, and display. Nodes connect over a LAN; the distance may span a data center, several buildings, or even a campus network. As bandwidth technology has developed, loosely coupled clusters can now integrate resources across regions. For example, in network load-balancing environments there are many solutions that balance cluster traffic across several cities; NetEase's site servers and 263's mail servers balance load between different cities yet still present strict "consistency" as a unified resource.

Tightly coupled clusters consider interconnection from the standpoint of space utilization and effective bandwidth. We all know that, to some extent, network distance and bandwidth are inversely related: the shorter the distance, the higher the bandwidth. Dedicated clusters therefore tend to use high-bandwidth, low-latency interconnects, strip nodes of unnecessary peripherals, keep only the essential host parts (CPU, memory, and hard disk), and place them in one or more racks. This not only exploits the effective bandwidth of short-distance communication, such as Gigabit or even 10-Gigabit Ethernet, but also greatly saves the space nodes occupy and eases centralized management. Tightly coupled technology has even produced a product that packs a whole cluster into a single chassis: the blade cluster server, also known as the blade server.

The blade server is a low-cost, high-availability, high-density server platform designed for vertical applications and high-density computing environments. Each "blade" is really an enhanced system motherboard. Blades can boot their own operating systems, such as Windows NT/2000, Linux, or Solaris, from a local hard disk, acting like independent server nodes. In this mode, each board runs its own system, serves a different user group, and is not associated with the others. SSI software, however, can assemble these boards into a single cluster image.

In cluster mode, all the boards can be connected through a high-speed backplane, and resources can be shared to serve the same user group. Inserting a new "blade" into the cluster raises overall performance. Because each "blade" is hot-swappable and as easy to replace as a plug-in card, the system can be repaired with minimal maintenance time. Such a cluster is easy to manage, saves valuable data-center rack space, and makes full use of short-distance, high-performance communication technology.

These are two typical cluster assembly methods: a loosely coupled cluster installed across a LAN, typically covered by Gigabit-class Ethernet, and a tightly coupled cluster installed in a rack, which can use interconnects of higher bandwidth and a high degree of electrical integration. A blade server is more compact still. Note also that both kinds described here would be dedicated clusters.

Control Mode

There are two control modes: centralized and decentralized. Most centrally controlled clusters are tightly coupled. For the sake of space and ease of management, administrators are allowed to control all nodes from one point; the control interface may be a character terminal or a GUI. The Beowulf cluster used for parallel computing adopts centralized control: the administrator operates the master server through shell tools or an X interface, and the individual compute nodes cannot be accessed directly.

Loosely coupled clusters adopt a hybrid of distributed and centralized control. Centralized control over a loosely coupled structure needs the support of a special middleware layer and is difficult to implement; instead, mature management protocols such as SNMP can be used to adjust resource allocation and scheduling. Moreover, in loosely coupled non-dedicated ("part-time") clusters, day-to-day control remains with each node's "owner", and only idle computing time is handed over to the cluster controller.

Homogeneity

Homogeneity is relative; a completely homogeneous structure exists only in the ideal theoretical model. As mentioned in the previous chapter, SMP delivers the best single system image of any distributed system, SSI being an important indicator here. In most cases, cluster nodes use the same operating system and compatible hardware platforms to preserve binary portability as far as possible. For example, in a Beowulf cluster both the compute nodes and the server run the Linux kernel; combined with the standard PVM and MPI-2 interfaces, a computing task can span the address spaces of the nodes, and consistent code and data representations let data migrate smoothly between nodes.

Heterogeneity is increasingly important in cluster development. With enhanced OS extension APIs or middleware, tasks can move freely between heterogeneous nodes, implementing SSI capabilities at some level, and load-balancing environments and availability support require SSI to a certain extent. However, because binary code and data structures are incompatible across platforms, heterogeneous "homogeneity" must be achieved through lower-performance intermediate code, interpreters, or extended dynamic link libraries; the Java language and the PVM parallel library are widely applied in this field. With the development of Web Services and XML, satisfactory SSI across many programming languages and runtime environments may become possible in the future.
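One concrete piece of the heterogeneity problem, making data representations agree across nodes, can be handled with an explicit byte order. The sketch below, using only Python's standard struct module, packs a record in network byte order so that big-endian and little-endian nodes decode the same values; the record layout itself is invented for the example.

```python
import struct

# '!' forces network (big-endian) byte order, so the bytes mean the
# same thing regardless of the sending or receiving node's CPU.
RECORD_FMT = "!if"  # a 32-bit int (node id) followed by a 32-bit float

def pack_record(node_id, value):
    """Serialize one record into an architecture-independent byte string."""
    return struct.pack(RECORD_FMT, node_id, value)

def unpack_record(buf):
    """Decode a record packed by pack_record, on any platform."""
    return struct.unpack(RECORD_FMT, buf)

buf = pack_record(7, 1.5)   # 1.5 is exactly representable in 32-bit float
print(unpack_record(buf))   # (7, 1.5)
```

Agreeing on a wire format like this is the same idea, in miniature, that XDR, Java serialization, and later XML-based Web Services apply at larger scale.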

Security

Security depends on the degree to which cluster nodes are connected to the outside world. If the nodes are exposed, in both physical connectivity and IP addressing, and communication within the cluster is completely unprotected, we consider such a cluster system insecure: hackers or malicious users can render it unavailable. Such clusters are, however, easy to implement, because little extra needs to be considered during planning.

If the cluster nodes are hidden behind firewalls and other protective technologies so that they cannot be accessed illegally from outside, or the node operating systems' security capabilities are hardened, then the cluster has a certain level of security. Because a secure environment involves many factors, it also raises the difficulty of building the system. Currently, most commercial cluster products either use proprietary internal communication protocols for efficiency and security, or integrate with existing security products to extend security at the system or network-protocol layer.

Author profile:

Lin Fan is currently engaged in Linux-related research at Xiamen University. He has a great interest in cluster technology and hopes to exchange ideas with like-minded friends. You can reach him by email at iamafan@21cn.com.
