HDFS in Hadoop: Data Block Recovery and File Upload Tests

Source: Internet
Author: User

Question Guide


1. Which parameters mainly affect block recovery operations?


2. What does the data block recovery test scenario require?


3. Based on the analysis of communication performance between the client and the datanode, how is reading and writing small files related to performance?











1. Data Block Recovery


When a datanode process on a machine goes down, HDFS re-replicates its data blocks to ensure that every file still has the configured number of replicas. Block recovery operations are mainly affected by two parameters:


a) dfs.namenode.replication.work.multiplier.per.iteration: for each replication cycle, the NameNode schedules on average this many blocks per datanode for recovery across the cluster. If this parameter is configured too small, the dfs.namenode.replication.max-streams setting has no effect;





b) dfs.namenode.replication.max-streams: the maximum number of blocks a single datanode recovers at the same time; it indirectly controls the network pressure that block recovery puts on each datanode.


At the same time, data block recovery, like ordinary file-system reads and writes, is not limited by the block-moving bandwidth parameter below; that parameter only takes effect during balancing:


hdfs dfsadmin -setBalancerBandwidth 62914563
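As a rough sketch of how the two parameters above interact (this is an illustrative model, not the actual NameNode source code), the per-cycle scheduling budget can be thought of as the smaller of the cluster-wide work multiplier and the per-node stream headroom:

```python
# Hedged sketch (not HDFS source code): how the two replication
# parameters bound the number of blocks the NameNode schedules
# for recovery in one replication cycle.

def blocks_scheduled_per_cycle(live_datanodes: int,
                               work_multiplier: int,
                               max_streams: int,
                               in_flight_per_node: int = 0) -> int:
    """Upper bound on blocks scheduled in one cycle.

    work_multiplier -> dfs.namenode.replication.work.multiplier.per.iteration
    max_streams     -> dfs.namenode.replication.max-streams
    """
    # The NameNode computes a cluster-wide budget per cycle ...
    cluster_budget = live_datanodes * work_multiplier
    # ... but each datanode carries at most max_streams concurrent
    # recovery streams, so per-node headroom also caps the total.
    per_node_headroom = max(0, max_streams - in_flight_per_node)
    return min(cluster_budget, live_datanodes * per_node_headroom)

# If work_multiplier is too small, the cluster budget is the binding
# constraint and raising max-streams has no effect (as noted above).
print(blocks_scheduled_per_cycle(30, 2, 600))   # budget-bound: 60
print(blocks_scheduled_per_cycle(30, 100, 50))  # stream-bound: 1500
```

This is why the text says max-streams "has no effect" when the work multiplier is too small: the cluster budget is exhausted long before any datanode reaches its stream limit.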





1.1 Data Block recovery test Scenario


In all of the following test scenarios, the file size is 1 MB, each of the 3 datanode machines has 16 GB of memory, and the network card is 1000 Mb. (In all network diagrams below, the rightmost waveform is taken as the network value for the test.)





1.1.1 Test Scenario 1


With dfs.namenode.replication.max-streams=600, 18,016 data blocks need to be recovered. Two datanode nodes participate in the recovery, so each node needs to recover 9,008 blocks on average.


Start time: 14:18


End time: 14:27


Each node repairs nearly 1,000 blocks per minute; 30 nodes would thus repair about 30,000 blocks per minute, so 3 million blocks would take roughly 100 minutes.
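The extrapolation above can be checked directly from the measured numbers (9,008 blocks per node over the 9-minute window from 14:18 to 14:27):

```python
# Arithmetic check for the scenario-1 extrapolation.
blocks_per_node = 9008
minutes = 27 - 18                          # 14:18 to 14:27
rate_per_node = blocks_per_node / minutes  # ~1,000 blocks/min per node
cluster_rate = rate_per_node * 30          # ~30,000 blocks/min on 30 nodes
eta_minutes = 3_000_000 / cluster_rate     # ~100 minutes for 3M blocks
print(round(rate_per_node), round(cluster_rate), round(eta_minutes))
```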





Judging from the network card, each machine outputs about 20 MB of data per second, i.e. 20*60 = 1,200 MB per minute, or 12,000 MB of output over 10 minutes. Since each file is 1 MB, that corresponds to roughly 12,000 files. Some of these blocks probably failed to transmit and were retransmitted, so this is broadly consistent with the 9,008 blocks assigned to each node.
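The network-side estimate works out as follows; the ~3,000-file overshoot relative to the 9,008 assigned blocks is what the retransmission explanation accounts for:

```python
# Cross-check of the network-card estimate for scenario 1.
mb_per_sec = 20
minutes = 10
total_mb = mb_per_sec * 60 * minutes  # 12,000 MB sent in 10 minutes
files = total_mb // 1                 # 1 MB per file -> ~12,000 files
overshoot = files - 9008              # ~3,000 extra; plausibly retransmissions
print(total_mb, files, overshoot)
```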








1.1.2 Test Scenario 2


With dfs.namenode.replication.max-streams=3000, 24,618 data blocks need to be recovered. Two datanode nodes participate in the recovery, so each node needs to recover 12,308 blocks on average.


Start time: 16:07


End time: 16:13


Total time: 6 minutes, i.e. about 2,051 blocks per node per minute (roughly 100 every 3 seconds); 30 nodes would repair about 61,530 blocks per minute, completing 3 million blocks in about 50 minutes.


The network traffic here is twice that of test scenario 1, so twice the amount of data moves in the same time.
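The same arithmetic applied to scenario 2's numbers confirms the roughly 50-minute extrapolation:

```python
# Arithmetic for the scenario-2 extrapolation.
blocks_per_node = 12308
minutes = 6                                   # 16:07 to 16:13
per_node_per_min = blocks_per_node / minutes  # ~2,051 blocks/min per node
cluster_per_min = per_node_per_min * 30       # ~61,500 blocks/min on 30 nodes
eta = 3_000_000 / cluster_per_min             # ~49 minutes for 3M blocks
print(round(per_node_per_min), round(cluster_per_min), round(eta))
```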


Network traffic on one of the machines:





The first two test results show that as the dfs.namenode.replication.max-streams parameter increases, data block recovery speeds up and network throughput rises.





1.1.3 Test Scenario 3


With dfs.namenode.replication.max-streams=3000, 34,780 blocks need to be recovered. Three datanode nodes participate in the recovery, two of which run on the same machine, so the concurrent recoveries on that machine can reach 3000*2. Each node needs to recover 11,593 blocks on average.


Start time: 15:26


End time: 15:41

Total time: 15 minutes, averaging about 772 blocks per minute for each node.


Network traffic on one of the machines:





The third test shows that although 3 datanodes participate in block repair, the number of recovery threads is too large, causing some block recoveries to time out; those blocks have to be recovered again a few minutes later. In the second waveform diagram above, output network traffic drops almost to 0 at around 15:33, and only after about 4 minutes does output traffic resume. If this idle period is ignored, block recovery takes about 7 minutes, i.e. each node recovers about 1,656 blocks per minute, twice the result calculated previously. So running more block-recovery threads at the same time is not always better; you need to find a reasonable value based on the cluster's situation.
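The "twice the result" claim follows from excluding the stall window (here the effective window is taken as the 7 minutes the text arrives at after discounting the timeout-induced idle period):

```python
# Scenario-3 effective rate once the recovery stall (near-zero
# network output around 15:33) is excluded from the wall time.
blocks_per_node = 11593
wall_minutes = 15                 # 15:26 to 15:41
effective_minutes = 7             # per the text, after excluding the stall
naive_rate = blocks_per_node / wall_minutes           # ~773 blocks/min
effective_rate = blocks_per_node / effective_minutes  # ~1,656 blocks/min
print(round(naive_rate), round(effective_rate))
```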





1.1.4 Test Scenario 4


With dfs.namenode.replication.max-streams=6000, 25,947 blocks need to be recovered. Three datanode nodes participate in the recovery, each datanode on a separate machine, and each node needs to recover 8,649 blocks on average.


Start time: 10:32


End time: 10:40


Total time: 8 minutes, with each node repairing about 1,235 blocks per minute.


Network traffic on one of the machines:











Comparing the third and fourth tests: with 6,000 recovery threads per machine in both cases, running a single datanode per machine performs better than running multiple datanodes on one machine. Of course, with a relatively small number of threads the results would be roughly equal.





2. Multiple-DataNode-per-Machine Test


In a further comparison between an environment with two datanodes started on a single machine and one with a single datanode, write tests with 10 KB files produced the same results; observing the network data, throughput peaked at only 13 MB. According to the earlier analysis of communication performance between the client and the datanode, most of the time in reading and writing small files is spent on disk interaction, so small-file write traffic should be tied to disk performance. The test environment used SATA disks; SSD or SAS disks should perform better.
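A quick sanity check supports the disk-bound interpretation: even at the observed 13 MB/s ceiling, far below the NIC limit, 10 KB files mean over a thousand per-file operations per second, which is where per-file disk seeks and metadata work dominate:

```python
# Rough check of the small-file write observation: the bottleneck
# is not the 1000 Mb network card but per-file disk interaction.
nic_limit_mb = 1000 / 8                   # 1000 Mb/s NIC -> 125 MB/s
observed_mb = 13                          # measured write ceiling
files_per_sec = observed_mb * 1024 // 10  # 10 KB files -> ~1,331 files/s
print(nic_limit_mb, files_per_sec)
```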

Original link: http://www.aboutyun.com/thread-9349-1-1.html
