What is the difference between a distributed system and a cluster?

Source: Internet
Author: User

Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.

http://blog.csdn.net/cutesource/article/details/5811914

At IDF05 (Intel Developer Forum 2005), Intel chief executive Craig Barrett, announcing the cancellation of the 4GHz chip plan, half-jokingly went down on one knee in public to apologize. This sent software developers a clear signal: the era of improving performance simply by scaling hardware vertically had ended, and distributed development had quietly become the mainstream. The much-hyped cloud computing is really just distributed computing wrapped in a business concept. Many developers (including me) want to join the cloud computing trend, but searching Google for the keyword "cloud computing" mostly turns up conceptual or promotional material. What really needs to be studied in depth is still a concept we have long been familiar with: distributed systems.

A distributed system can be quite simple. The simplest and most common form is to put a heap of web servers behind a load balancer, add a cache server to hold temporary state, and share one database behind them. In fact, many people who call themselves distributed experts stop at this structure.

In this environment only the web servers are truly distributed, and there is no connection between the web servers, so the structure and implementation are very simple.
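
The simple structure described above can be sketched in a few lines. This is only an illustration under assumptions: the class names `WebServer` and `RoundRobinBalancer` are hypothetical, a plain dictionary stands in for the cache server, and round-robin is just one possible balancing policy (the article does not name one).

```python
from itertools import cycle


class WebServer:
    """A stateless web server: all temporary state lives in the shared cache."""

    def __init__(self, name, shared_cache):
        self.name = name
        self.cache = shared_cache  # the same object is handed to every server

    def handle(self, session_id):
        # Count visits per session in the shared cache, not in the server itself.
        visits = self.cache.get(session_id, 0) + 1
        self.cache[session_id] = visits
        return f"{self.name} served {session_id} (visit {visits})"


class RoundRobinBalancer:
    """Hands each incoming request to the next server in turn."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def dispatch(self, session_id):
        return next(self._servers).handle(session_id)


shared_cache = {}  # stand-in for a cache server (assumption for this sketch)
servers = [WebServer(f"web-{i}", shared_cache) for i in range(1, 4)]
balancer = RoundRobinBalancer(servers)

responses = [balancer.dispatch("alice") for _ in range(4)]
for line in responses:
    print(line)
```

Because the servers never talk to each other and keep no state of their own, any of them can serve any request, which is why this form of distribution is so easy to build.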

In some cases distribution is not so simple. Every layer may need to be distributed, such as load balancing, DB, cache, and files. When distributed nodes depend on one another, the communication between them must also be considered, and when there are very many nodes, monitoring and management are needed to support them. This adds up to a very large distributed system, although you can tailor it to your specific needs. The most complete distributed system consists of the following modules:

Distributed Task Processing services: Responsible for specific business logic processing

Distributed node registration and query: Responsible for registering and querying the naming and physical information of all distributed nodes; it is the bridge that links the nodes together

Distributed DB: Distributed structured data access

Distributed cache: Distributed cached data (non-persistent) access

Distributed files: Distributed file access

Network communication: Network data communication between nodes

Monitoring management: Collects, monitors, and diagnoses all node running states

Distributed programming languages: Programming languages specialized for distributed environments, such as Erlang and Scala

Distributed algorithms: Algorithms for solving problems peculiar to distributed environments, such as the Paxos algorithm for the consistency problem
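
To make one of these modules concrete, here is a minimal sketch of node registration and query. It is an in-memory toy under assumptions: the `NodeRegistry` class, the service names, and the addresses are all hypothetical, and a real registry would also need persistence, health checks, and its own replication.

```python
class NodeRegistry:
    """In-memory stand-in for a node registration and query service."""

    def __init__(self):
        self._nodes = {}  # service name -> {node name: (host, port)}

    def register(self, service, node, host, port):
        self._nodes.setdefault(service, {})[node] = (host, port)

    def unregister(self, service, node):
        # Removing an unknown node is silently ignored.
        self._nodes.get(service, {}).pop(node, None)

    def lookup(self, service):
        """Return the addresses of all nodes currently registered for a service."""
        return sorted(self._nodes.get(service, {}).values())


registry = NodeRegistry()
registry.register("cache", "cache-1", "10.0.0.11", 11211)
registry.register("cache", "cache-2", "10.0.0.12", 11211)
registry.register("db", "db-1", "10.0.0.21", 3306)

cache_nodes = registry.lookup("cache")
registry.unregister("cache", "cache-1")
remaining = registry.lookup("cache")
print(cache_nodes, remaining)
```

Other nodes only need to know where the registry is; everything else they discover by name, which is what makes it the bridge between nodes.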

Therefore, to delve into cloud computing and distributed systems, we have to delve into the areas above. Each of these areas runs very deep and requires very low-level knowledge and technology to support it, so for developers who want to improve their skills, distributed systems make a very good point of entry: a thread to follow in exploring every corner of the computer world.

A cluster is a physical form, and distributed is a way of working.

As long as there is a bunch of machines, you can call it a cluster; whether they actually work together is another matter. A program or system can be called distributed as long as it runs on different machines, so a C/S (client/server) architecture can also be called distributed.

Clusters are generally physically centralized and managed uniformly, while distributed systems do not emphasize this point.


Therefore, a cluster may run one or more distributed systems, or none at all; a distributed system may run on a cluster, or on multiple machines (two or more) that do not belong to a cluster.

Distributed is defined relative to centralized: it emphasizes that a task is performed on multiple physically isolated nodes. The main problem of a centralized system is reliability: if the central node goes down, the whole system is unavailable. Besides solving some of the problems of centralization, distribution also tends to spread the load, but it brings many other problems of its own, the most important being consistency.

A cluster is a collection of machines that logically process the same task; they may be in the same machine room or in different machine rooms. A distributed system can run on a cluster, and a cluster can serve as one node of a distributed system.

In a word, it is the difference between "working separately" and "a bunch of people".

Author: cold night
Link: http://www.zhihu.com/question/20004877/answer/61025046
Source: Zhihu
Copyright belongs to the author; please contact the author for authorization.

Distributed means spreading different businesses across different places. Clustering means bringing several servers together to perform the same business.

Each node in a distributed system can be a cluster, but a cluster is not necessarily distributed.

Example: Take Sina. With so many visitors, it can build a cluster: one response server in front and several servers behind it that all perform the same business. When a request arrives, the response server checks which server is not heavily loaded and hands the request to that one.
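
The dispatching step in this example can be sketched as a least-loaded policy. This is an assumption for illustration: the text only says the response server picks a server whose load is "not very heavy", and the `Dispatcher` class and server names here are hypothetical.

```python
class Dispatcher:
    """Front 'response server' that tracks how many requests each backend holds."""

    def __init__(self, server_names):
        self.load = {name: 0 for name in server_names}

    def assign(self):
        # Pick the backend with the fewest in-flight requests.
        # On a tie, min() returns the first one in insertion order.
        target = min(self.load, key=self.load.get)
        self.load[target] += 1
        return target

    def finish(self, name):
        # Called when a backend has completed one request.
        self.load[name] -= 1


dispatcher = Dispatcher(["s1", "s2", "s3"])
assigned = [dispatcher.assign() for _ in range(4)]
dispatcher.finish("s2")
next_target = dispatcher.assign()
print(assigned, next_target)
```

Since every backend performs the same business, the dispatcher is free to choose purely on load.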

Distributed, understood in the narrow sense, is similar to a cluster, but its organization is looser. A cluster is organized: if one server goes down, the other servers can take over.

In a distributed system each node performs a different business, so if a node goes down, that business becomes inaccessible.
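
The difference in failure behaviour can be shown with a small sketch. The node names, the service names, and the `available_services` helper are hypothetical, and the model is deliberately crude: a service counts as available if at least one healthy node runs it.

```python
def available_services(deployment, failed_nodes):
    """Return the services that still have at least one healthy node.

    deployment maps node name -> the service that node runs.
    """
    return {
        service
        for node, service in deployment.items()
        if node not in failed_nodes
    }


# Cluster: every node runs the same business.
cluster = {"n1": "orders", "n2": "orders", "n3": "orders"}
# Distributed (in the narrow sense used here): each node runs a different business.
distributed = {"n1": "orders", "n2": "payments", "n3": "search"}

cluster_after_failure = available_services(cluster, {"n2"})
distributed_after_failure = available_services(distributed, {"n2"})
print(cluster_after_failure, distributed_after_failure)
```

With node `n2` down, the cluster still serves its one business, while the distributed deployment loses exactly the business that `n2` was responsible for.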

To put it simply, distributed improves efficiency by shortening the execution time of a single task, while clustering improves efficiency by increasing the number of tasks executed per unit of time.

For example:

If a task consists of 10 subtasks, each of which takes 1 hour to execute on its own, the task takes 10 hours on a single server.

The distributed scheme provides 10 servers, each responsible for only one subtask. Ignoring dependencies between subtasks, the task completes in 1 hour. (A typical example of this mode of operation is Hadoop's MapReduce distributed computing model.)

The cluster scheme also provides 10 servers, each of which can handle the whole task independently. If 10 tasks arrive at the same time, the 10 servers work in parallel and, 10 hours later, all 10 tasks complete together. Taken as a whole, that is still one task completed per hour.
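
The arithmetic of this example can be checked with a short sketch. The variable names are my own; the numbers come from the example, and the sketch assumes (as the example does) that subtasks have no dependencies and that a whole task on one server takes the full 10 hours.

```python
SUBTASKS_PER_TASK = 10
HOURS_PER_SUBTASK = 1
SERVERS = 10

# Single server: the subtasks run one after another.
single_latency = SUBTASKS_PER_TASK * HOURS_PER_SUBTASK  # hours per task

# Distributed: the 10 subtasks of one task run in parallel, one per server.
distributed_latency = HOURS_PER_SUBTASK                  # hours per task
distributed_throughput = 1 / distributed_latency         # tasks per hour

# Cluster: each server runs a whole task alone; 10 tasks run side by side.
cluster_latency = single_latency                         # hours per task
cluster_throughput = SERVERS / cluster_latency           # tasks per hour

print(f"single server: {single_latency} h per task")
print(f"distributed:   {distributed_latency} h per task, "
      f"{distributed_throughput} tasks/h")
print(f"cluster:       {cluster_latency} h per task, "
      f"{cluster_throughput} tasks/h")
```

Both schemes reach the same throughput of one task per hour, but only the distributed scheme shortens the time a single task has to wait.
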
Clusters are generally divided into three types: high-availability clusters such as RHCS and LifeKeeper, load-balancing clusters such as LVS, and high-performance computing clusters. Distributed systems should fall within the category of high-performance computing clusters.

Distributed: Different business modules are deployed on different servers, or the same business module is split into multiple sub-services deployed on different servers, to solve high-concurrency problems.

Cluster: The same business is deployed on multiple machines to improve system availability.

A small restaurant originally had only one chef, who did everything: washing and cutting the vegetables, preparing the ingredients, and cooking. Later, as the number of guests grew, one chef could not keep up, so a second chef was hired. Both chefs can cook the same dishes, and the relationship between the two chefs is a cluster. To let the chefs concentrate on cooking and perfect the dishes, a prep cook was hired to cut vegetables and prepare ingredients. The relationship between the chef and the prep cook is distributed. When one prep cook became too busy, another prep cook was hired, and the relationship between the two prep cooks is a cluster.
