Design of HDFS data block multi-copy storage

Source: Internet
Author: User

Hadoop owes much of its wide adoption to HDFS, which works quietly behind it. As a file system that can run on hundreds of nodes, HDFS was designed with careful attention to reliability.

3.2.1 Design of HDFS data block multi-copy storage

As a distributed file system, HDFS keeps multiple replicas of its data (referred to below as multiple copies): several copies of the same data block are stored on different nodes, as shown in Figure 3-2. Using multiple copies has the following advantages: (1) clients can read the data from different copies of a block, which speeds up transfer; (2) because HDFS DataNodes exchange data with each other over the network, having multiple copies makes it possible to determine whether an error occurred during transmission; (3) multiple copies ensure that no data is lost when a single DataNode fails.
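The third advantage can be illustrated with a small simulation. This is only a sketch, not HDFS code: it assumes each block's copies are placed on distinct nodes chosen at random, and the names `place_replicas`, `surviving_replicas`, and the `datanode-N` labels are hypothetical.

```python
import random


def place_replicas(nodes, replication=3, rng=None):
    """Pick `replication` distinct nodes for one block (toy model, random choice)."""
    rng = rng or random.Random()
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")
    # sample() never returns the same node twice, so copies land on different nodes
    return rng.sample(nodes, replication)


def surviving_replicas(holders, failed_node):
    """Return the nodes that still hold a copy after `failed_node` goes down."""
    return [node for node in holders if node != failed_node]


nodes = [f"datanode-{i}" for i in range(1, 7)]  # hypothetical six-node cluster
rng = random.Random(0)                          # fixed seed for repeatable runs
placement = {block: place_replicas(nodes, 3, rng)
             for block in ("blk_1", "blk_2", "blk_3")}

failed = "datanode-1"
for block, holders in placement.items():
    alive = surviving_replicas(holders, failed)
    print(block, "stored on", holders, "still readable from", alive)
```

Because the three copies of a block always sit on three different nodes, losing any one node leaves at least two readable copies of every block.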

HDFS selects storage nodes at random for each block. So that it can determine whether a file contains errors, the number of copies defaults to 3 (note: with only 1 or 2 copies, it is not possible to tell which copy is correct). Because of the cost of data transmission and error recovery, the copies are not distributed evenly across the cluster. For more detail on the distribution of copies and on load balancing among DataNodes, see the introduction to the balancer in section 3.4.4.
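The note about 1 or 2 copies follows from a simple comparison argument, sketched below. This illustrates the reasoning in the text only and is not how HDFS itself is implemented; the function name `pick_trusted_copy` and the sample byte strings are hypothetical.

```python
from collections import Counter


def pick_trusted_copy(replicas):
    """Return the content that a strict majority of copies agree on, else None."""
    if len(replicas) < 2:
        return None  # a single copy has nothing to be compared against
    content, count = Counter(replicas).most_common(1)[0]
    # Require more than half of the copies to agree
    return content if count > len(replicas) / 2 else None


good = b"block-data"
bad = b"block-dat\x00"  # same block with one corrupted byte

print(pick_trusted_copy([good, good, bad]))  # three copies, one corrupted
print(pick_trusted_copy([good, bad]))        # two copies that disagree
print(pick_trusted_copy([bad]))              # one copy only
```

With three copies, the two that agree outvote the corrupted one. With two copies that disagree, or with a single copy, the function returns `None` because there is no way to decide which content is right.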
