Source code analysis: Hadoop's storage strategy across the different storage paths of a single DataNode

Source: Internet
Author: User

The problem arises when the DataNodes in a cluster have storage disks of different sizes: after a while, the smaller disks run short of space.

In fact, configuring the disk storage strategy early on solves this problem. Some posts online claim that this strategy has no effect, but it does work in Hadoop 2.0.1, the version that applies to CDH 4.6.
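To make the effect of such a strategy concrete, here is a minimal sketch that compares two ways of picking a disk for each new block: plain rotation and "most free space first". The class, the method names, and the numbers are all hypothetical. This is not Hadoop's own code, only an illustration of why unequal disks fill unevenly under rotation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VolumeChoiceSketch {
    /** Hypothetical stand-in for one DataNode storage directory (one disk). */
    static final class Volume {
        final String path;
        long availableBytes;

        Volume(String path, long availableBytes) {
            this.path = path;
            this.availableBytes = availableBytes;
        }
    }

    private int next = 0; // rotation cursor for the round-robin choice

    /** Round-robin: rotate through the volumes, skipping any that cannot hold the block. */
    Volume chooseRoundRobin(List<Volume> volumes, long blockSize) {
        for (int tried = 0; tried < volumes.size(); tried++) {
            Volume v = volumes.get(next);
            next = (next + 1) % volumes.size();
            if (v.availableBytes >= blockSize) {
                return v;
            }
        }
        throw new IllegalStateException("no volume has room for " + blockSize + " bytes");
    }

    /** Space-aware: pick the volume that currently has the most free space. */
    static Volume chooseMostAvailable(List<Volume> volumes, long blockSize) {
        Volume best = null;
        for (Volume v : volumes) {
            if (v.availableBytes >= blockSize
                    && (best == null || v.availableBytes > best.availableBytes)) {
                best = v;
            }
        }
        if (best == null) {
            throw new IllegalStateException("no volume has room for " + blockSize + " bytes");
        }
        return best;
    }

    /** Writes `blocks` blocks of `blockSize` bytes and returns the free bytes left per volume. */
    static long[] simulate(boolean spaceAware, long[] capacities, int blocks, long blockSize) {
        List<Volume> volumes = new ArrayList<>();
        for (int i = 0; i < capacities.length; i++) {
            volumes.add(new Volume("/data" + i, capacities[i]));
        }
        VolumeChoiceSketch roundRobin = new VolumeChoiceSketch();
        for (int b = 0; b < blocks; b++) {
            Volume v = spaceAware
                    ? chooseMostAvailable(volumes, blockSize)
                    : roundRobin.chooseRoundRobin(volumes, blockSize);
            v.availableBytes -= blockSize;
        }
        long[] remaining = new long[volumes.size()];
        for (int i = 0; i < remaining.length; i++) {
            remaining[i] = volumes.get(i).availableBytes;
        }
        return remaining;
    }

    public static void main(String[] args) {
        long[] capacities = {1000, 4000}; // one small disk, one large disk
        System.out.println(Arrays.toString(simulate(false, capacities, 16, 100)));
        System.out.println(Arrays.toString(simulate(true, capacities, 16, 100)));
    }
}
```

With these made-up numbers, rotation leaves the small disk with 200 bytes free and the large one with 3200, while the space-aware choice leaves the small disk untouched at 1000 and the large one at 2400.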

To find the right entry point in the code, refer to the following Hadoop design document.

Reference

Append / Hflush / Read design document for the HDFS file system in Hadoop:

http://blog.csdn.net/chenpingbupt/article/details/7972589

The document states:

On a DN's disk, each DN has three directories: current, tmp, and rbw. current contains finalized replicas, tmp contains temporary replicas, and rbw contains rbw, rwr, and rur replicas. When a replica is first created by the DFS client, it is placed in rbw. When it is first created during block replication or cluster balancing, it is placed in tmp. Once a replica has been finalized, it is moved to current. When a DN restarts, replicas in tmp are deleted, replicas in rbw are loaded in the rwr state, and replicas in current are loaded in the finalized state.
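The placement rules quoted above can be summarized as a small lookup. The following is only a sketch of those rules: the enum and method names are hypothetical and do not correspond to Hadoop's own classes.

```java
import java.util.Optional;

public class ReplicaDirSketch {
    /** The three per-DataNode directories named in the design document. */
    enum Dir { CURRENT, TMP, RBW }

    /** The replica states mentioned for the restart rules. */
    enum State { FINALIZED, RWR }

    /** Who triggered the first creation of the replica (hypothetical labels). */
    enum Origin { CLIENT_WRITE, REPLICATION_OR_BALANCING }

    /** A new replica goes to rbw for client writes, to tmp for replication or balancing. */
    static Dir dirForNewReplica(Origin origin) {
        return origin == Origin.CLIENT_WRITE ? Dir.RBW : Dir.TMP;
    }

    /** A finalized replica is moved to current, whichever directory it started in. */
    static Dir dirAfterFinalize() {
        return Dir.CURRENT;
    }

    /**
     * State a replica is loaded with after a DataNode restart, by directory.
     * An empty result means the replica is deleted rather than loaded.
     */
    static Optional<State> stateAfterRestart(Dir dir) {
        switch (dir) {
            case TMP:
                return Optional.empty();          // tmp replicas are deleted
            case RBW:
                return Optional.of(State.RWR);    // rbw replicas are loaded as rwr
            default:
                return Optional.of(State.FINALIZED); // current replicas are finalized
        }
    }

    public static void main(String[] args) {
        System.out.println(dirForNewReplica(Origin.CLIENT_WRITE));
        System.out.println(dirForNewReplica(Origin.REPLICATION_OR_BALANCING));
        System.out.println(stateAfterRestart(Dir.TMP).isPresent());
    }
}
```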

We start from the creation of files in tmp or rbw.

See the Java class BlockPoolSlice.

Judging from the class description, BlockPoolSlice is the foundation for creating the cluster's data blocks.
