Manually handling uneven usage across DataNode disks

Source: Internet
Author: User
Tags: hadoop wiki

http://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F

Hadoop does not currently provide an automatic solution for this issue. It is on the agenda and is tracked in JIRA.

The Hadoop wiki describes a manual procedure at the link above.

Problem description: the data directory setting of a DataNode (dfs.datanode.data.dir) lists multiple disks or directories. If, for some reason such as a bad-disk replacement or the disk selection policy, some of those directories become heavily used while others are barely used, the blocks can be rebalanced by hand.
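Before moving anything, it helps to confirm how uneven the disks really are. The following is a minimal sketch, not part of the wiki procedure: it reads the used fraction of the filesystem behind each data directory with Python's standard `shutil.disk_usage`. The function names and the example paths in the usage note are assumptions made for illustration.

```python
import shutil
from typing import Dict, List


def usage_by_dir(data_dirs: List[str]) -> Dict[str, float]:
    """Return the used fraction (0.0 to 1.0) of the filesystem behind each directory."""
    result = {}
    for d in data_dirs:
        usage = shutil.disk_usage(d)  # named tuple: total, used, free (in bytes)
        # Guard against a filesystem that reports a total size of zero.
        result[d] = usage.used / usage.total if usage.total else 0.0
    return result


def spread(fractions: Dict[str, float]) -> float:
    """Difference between the fullest and the emptiest directory."""
    values = list(fractions.values())
    return max(values) - min(values)
```

For example, `spread(usage_by_dir(["/data1", "/data2"]))` (hypothetical mount points) gives a single number to judge whether a manual move is worth the downtime. Note that this measures whole filesystems, so it is only meaningful when each data directory sits on its own disk.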

1. Stop the datanode process on this DataNode first, so that no data is being written while the next step is carried out.

2. Move the blocks with mv, for example from /data1/block to /data2/, keeping the directory structure as consistent as possible. The wiki notes that on newer versions an inconsistent subdirectory structure will not work.

When using mv, pay attention to permissions. Do not run mv as root and then leave the ownership and permissions uncorrected, because the datanode process will then be unable to read the files.

3. Start the datanode process again.
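The move step can be sketched in Python as follows. This is an illustrative helper under stated assumptions, not a tool shipped with Hadoop: the function name `move_preserving_layout` is hypothetical, and it must only be run while the datanode process is stopped. It keeps the relative path of the file identical under the destination root and restores the original owner and group if the move changed them.

```python
import os
import shutil
from pathlib import Path


def move_preserving_layout(src_file: Path, src_root: Path, dst_root: Path) -> Path:
    """Move src_file to the same relative location under dst_root."""
    rel = src_file.relative_to(src_root)  # e.g. subdir10/blk_...
    target = dst_root / rel
    # Newly created parent directories belong to the current user.
    target.parent.mkdir(parents=True, exist_ok=True)

    before = src_file.stat()
    # Across filesystems, shutil.move copies the file and deletes the source,
    # so the new file is owned by whoever runs the script.
    shutil.move(str(src_file), str(target))

    after = target.stat()
    if (after.st_uid, after.st_gid) != (before.st_uid, before.st_gid):
        # Restoring another user's ownership normally requires root privileges.
        os.chown(target, before.st_uid, before.st_gid)
    return target
```

A block file and its matching meta file should be moved as a pair by calling the helper once for each. Running the script as the user that owns the data directories avoids the ownership problem in the first place.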

I tested this myself on CDH 5.0.2 (Hadoop 2.3).

The data directory was /hdp2/dfs/data, and I added a new path, /hdp2/dfs/data2.

Because the data2 directory is empty, the DataNode initializes it on startup, for example by creating the VERSION file.

Instead, I created the block pool directory and the directories beneath it, down to finalized, directly in data2. I then moved the block and meta files that sat directly under data's finalized directory, and did not move the subdir directories.

After I started the datanode process, I found that the data I had moved was deleted (I do not know why).

Fortunately, the replication factor was 2, so the data would be re-replicated. I stopped the datanode process again and moved subdir10 (which contained data) directly into the corresponding directory under data2.

I then started the datanode process again, and a check in the web UI showed that the blocks were found.

To be safe when working in a production environment, make sure to back up the data first, or upload a large file and test with the blocks of that file before doing this at scale.
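To find and later re-check the blocks of such a test file, `hdfs fsck` with the `-files -blocks -locations` options lists each block and the DataNodes that hold it. The wrapper below is a rough sketch that assumes the `hdfs` command is on the PATH; the function names are hypothetical, and so is the file path in the usage note.

```python
import subprocess
from typing import List


def fsck_command(hdfs_path: str) -> List[str]:
    """Build the fsck command line that lists a file's blocks and their locations."""
    return ["hdfs", "fsck", hdfs_path, "-files", "-blocks", "-locations"]


def run_fsck(hdfs_path: str) -> str:
    """Run fsck and return its standard output as text for manual inspection."""
    completed = subprocess.run(
        fsck_command(hdfs_path),
        capture_output=True,  # requires Python 3.7 or later
        text=True,
        check=False,  # the caller reads the report instead of relying on the exit code
    )
    return completed.stdout
```

Running `run_fsck("/tmp/balance-test.bin")` once before the move and once after restarting the DataNode makes it possible to compare the two reports and confirm that every block of the test file is still reported.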
