Configuring CDH and Managed Services: Tuning HDFS Prior to Decommissioning DataNodes

Source: Internet
Author: User

Configuring CDH and Managed Services

Tuning HDFS Prior to Decommissioning DataNodes

Required role: Configurator, Cluster Administrator, Full Administrator

When a DataNode is decommissioned, the NameNode makes sure that every block stored on that DataNode remains available across the cluster at the configured replication factor. To do this, blocks are copied off the DataNode to other DataNodes in small batches. When a DataNode holds thousands of blocks, decommissioning can take several hours. Before decommissioning hosts that run DataNodes, first tune HDFS:
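To get a feel for why the copy can run for hours, here is a rough back-of-the-envelope sketch. It assumes that on each iteration the NameNode schedules about (number of live DataNodes x replication work multiplier) blocks for copying, and that an iteration runs every 3 seconds; both are my understanding of the default behaviour and are not stated in the text above. The block count and cluster size are made-up illustration values, and the result only covers scheduling, not the time spent actually moving data.

```python
import math


def estimate_scheduling_seconds(blocks_on_node, live_datanodes,
                                work_multiplier=2, interval_seconds=3.0):
    """Rough lower bound on the time needed just to schedule the block copies.

    Assumption (not from the article): each iteration schedules about
    live_datanodes * work_multiplier blocks, one iteration per interval.
    """
    if blocks_on_node <= 0 or live_datanodes <= 0 or work_multiplier <= 0:
        raise ValueError("all counts must be positive")
    blocks_per_iteration = live_datanodes * work_multiplier
    iterations = math.ceil(blocks_on_node / blocks_per_iteration)
    return iterations * interval_seconds


# Hypothetical cluster: 200,000 blocks on the node, 20 live DataNodes.
for multiplier in (2, 10):
    seconds = estimate_scheduling_seconds(200_000, 20, multiplier)
    print(f"multiplier={multiplier}: about {seconds / 3600:.1f} hours of scheduling")
```

Under these assumptions, raising the multiplier from 2 to 10 cuts the scheduling time by a factor of five, which is why the multiplier is one of the settings tuned in this procedure.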

1. Increase the heap size of the DataNodes. Each DataNode should have at least 4 GB of heap to allow for the increase in iterations and max streams.

a. Go to the HDFS service page.

b. Click the Configuration tab.

c. Under each DataNode role group (DataNode Default Group and any additional DataNode role groups), go to the Resource Management category and set the Java Heap Size of DataNode in Bytes property.

d. Click Save Changes to commit the changes.

2. Set the DataNode balancing bandwidth:

a. Expand the DataNode Default Group > Performance category.

b. Set the DataNode Balancing Bandwidth property according to the bandwidth your disks and network can sustain.

c. Click Save Changes to commit the changes.

3. Increase the replication work multiplier per iteration (the default is 2; 10 is recommended):

a. Expand the NameNode Default Group > Advanced category.

b. Set the Replication Work Multiplier Per Iteration property to 10.

c. Click Save Changes to commit the changes.

4. Increase the maximum number of replication threads and the hard limit on replication threads:

a. Expand the NameNode Default Group > Advanced category.

b. Set the Maximum number of replication threads on a Datanode property to 50 and the Hard limit on the number of replication threads on a Datanode property to 100.

c. Click Save Changes to commit the changes.

5. Restart the HDFS service.
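For readers who want to see what these UI fields correspond to outside Cloudera Manager, the sketch below renders the values from the steps above as an hdfs-site.xml fragment. The mapping from UI labels to property names is my assumption, based on the standard HDFS property names, and is not taken from the article. The 100 MiB/s bandwidth is a made-up example, since the article only says to match your disks and network. On a cluster managed by Cloudera Manager, make the changes through the UI as described rather than editing the file by hand.

```python
from xml.sax.saxutils import escape

GIB = 1024 ** 3

# The heap property is entered in bytes, so 4 GB becomes:
DATANODE_HEAP_BYTES = 4 * GIB


def decommission_tuning(balance_bandwidth_bytes_per_sec):
    """Values from the steps above, keyed by the assumed HDFS property names."""
    return {
        "dfs.datanode.balance.bandwidthPerSec": balance_bandwidth_bytes_per_sec,
        "dfs.namenode.replication.work.multiplier.per.iteration": 10,
        "dfs.namenode.replication.max-streams": 50,
        "dfs.namenode.replication.max-streams-hard-limit": 100,
    }


def to_hdfs_site_fragment(props):
    """Render a dict of properties as <property> elements."""
    lines = []
    for name, value in props.items():
        lines.append("<property>")
        lines.append(f"  <name>{escape(name)}</name>")
        lines.append(f"  <value>{escape(str(value))}</value>")
        lines.append("</property>")
    return "\n".join(lines)


# Hypothetical bandwidth: 100 MiB/s. Pick a value that fits your hardware.
fragment = to_hdfs_site_fragment(decommission_tuning(100 * 1024 * 1024))
print(f"DataNode heap in bytes: {DATANODE_HEAP_BYTES}")
print(fragment)
```

The heap size is kept separate because, as far as I know, it is passed to the DataNode JVM as a heap option rather than stored in hdfs-site.xml.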


My translation skills are limited, so the original English text follows:

Configuring CDH and Managed Services

Tuning HDFS Prior to Decommissioning DataNodes

Required Role: Configurator, Cluster Administrator, Full Administrator

When a DataNode is decommissioned, the NameNode ensures that every block from the DataNode will still be available across the cluster as dictated by the replication factor. This procedure involves copying blocks off the DataNode in small batches. In cases where a DataNode has thousands of blocks, decommissioning can take several hours. Before decommissioning hosts with DataNodes, you should first tune HDFS:

1. Raise the heap size of the DataNodes. DataNodes should be configured with at least 4 GB heap size to allow for the increase in iterations and max streams.

A. Go to the HDFS service page.

B. Click the Configuration tab.

C. Under each DataNode role group (DataNode Default Group and additional DataNode role groups) go to the Resource Management category, and set the Java Heap Size of DataNode in Bytes property as recommended.

D. Click Save Changes to commit the changes.

2. Set the DataNode balancing bandwidth:

A. Expand the DataNode Default Group > Performance category.

B. Configure the DataNode Balancing Bandwidth property to the bandwidth you have on your disks and network.

C. Click Save Changes to commit the changes.

3. Increase the replication work multiplier per iteration to a larger number (the default is 2, however 10 is recommended):

A. Expand the NameNode Default Group > Advanced category.

B. Configure the Replication Work Multiplier Per Iteration property to a value such as 10.

C. Click Save Changes to commit the changes.

4. Increase the replication maximum threads and maximum replication thread hard limits:

A. Expand the NameNode Default Group > Advanced category.

B. Configure the Maximum number of replication threads on a Datanode and Hard limit on the number of replication threads on a Datanode properties to 50 and 100 respectively.

C. Click Save Changes to commit the changes.

5. Restart the HDFS service.
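After the restart it can be worth confirming that the new values reached the generated configuration. The following sketch parses an hdfs-site.xml document and reports which of the three NameNode settings are missing or below the recommended values from the procedure. The property names are my assumption about what the Cloudera Manager fields map to, and the sample XML is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Recommended minimums from the procedure, keyed by assumed property names.
MINIMUMS = {
    "dfs.namenode.replication.work.multiplier.per.iteration": 10,
    "dfs.namenode.replication.max-streams": 50,
    "dfs.namenode.replication.max-streams-hard-limit": 100,
}


def below_recommended(hdfs_site_xml):
    """Return {property: current value or None} for settings under the minimum."""
    root = ET.fromstring(hdfs_site_xml)
    found = {p.findtext("name"): p.findtext("value") for p in root.iter("property")}
    problems = {}
    for name, minimum in MINIMUMS.items():
        raw = found.get(name)
        # Treat a missing or empty value as "not set".
        current = int(raw) if raw and raw.strip() else None
        if current is None or current < minimum:
            problems[name] = current
    return problems


# Invented sample: multiplier still at the default, hard limit not set at all.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.replication.max-streams</name>
    <value>50</value>
  </property>
</configuration>"""

for name, current in below_recommended(SAMPLE).items():
    print(f"{name}: current={current}, recommended>={MINIMUMS[name]}")
```

A value reported as None only means the property is absent from the file, in which case HDFS falls back to its built-in default.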

