Hadoop Rack Awareness: enhancing cluster robustness, and how to configure it

Source: Internet
Author: User
Tags: switches
We know that Hadoop clusters are fault-tolerant and distributed. Why do they have these characteristics? What follows is one of the underlying principles.

Distributed clusters typically contain a very large number of machines. Because of the limits on rack slots and switch ports, larger clusters usually span several racks, with the machines on multiple racks together forming one distributed cluster. The network speed between machines within a rack is usually higher than the speed between machines on different racks, and traffic between racks is limited by the bandwidth of the uplink switches.

Specific to Hadoop: HDFS stores data files in chunks (blocks), and each block has multiple replicas (3 by default). For both data safety and efficiency, Hadoop's default placement policy for the 3 replicas is:

The first replica is placed on the node where the client resides (if the client is not inside the cluster, the first node is chosen at random).

The second replica is placed on a node (chosen at random) in a different rack from the first node.

The third replica is placed on another node in the same rack as the second replica.

If there are more replicas, they are placed on random nodes in the cluster.
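The placement rules above can be sketched in a few lines. This is only a simplified illustration of the policy, not HDFS's actual placement code; the cluster layout, rack names, and node names are invented for the example:

```python
import random

# Hypothetical cluster layout: rack name -> datanodes (names invented for illustration)
CLUSTER = {
    "/rack1": ["h1", "h2", "h3"],
    "/rack2": ["h4", "h5", "h6"],
}

def place_replicas(client_node=None):
    """Return (rack, node) targets following the default 3-replica policy."""
    # 1st replica: the client's own node, or a random node if the
    # client is outside the cluster
    if client_node:
        first_rack = next(r for r, ns in CLUSTER.items() if client_node in ns)
        first = (first_rack, client_node)
    else:
        first_rack = random.choice(list(CLUSTER))
        first = (first_rack, random.choice(CLUSTER[first_rack]))

    # 2nd replica: a random node on a *different* rack
    other_rack = random.choice([r for r in CLUSTER if r != first_rack])
    second = (other_rack, random.choice(CLUSTER[other_rack]))

    # 3rd replica: another node on the *same* rack as the second replica
    third_node = random.choice([n for n in CLUSTER[other_rack] if n != second[1]])
    third = (other_rack, third_node)
    return [first, second, third]

targets = place_replicas("h1")
```

With the client on h1, the first target is always h1 itself, the second lands on /rack2, and the third on a different node of /rack2.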

This strategy ensures that reads of a block are served from the local rack first, and that if an entire rack fails, a replica of the block can still be found on another rack. This is efficient enough, and at the same time achieves data fault tolerance.

However, Hadoop's rack awareness is not automatic: the cluster cannot by itself detect which rack a slave machine belongs to. The Hadoop administrator must tell Hadoop which machine is in which rack. When the Namenode starts and initializes, it loads this machine-to-rack mapping into memory and uses it in every subsequent HDFS write when choosing the list of datanodes (e.g. the 3 datanodes for a block's 3 replicas), so that Hadoop can apply its block-allocation strategy: distribute the three replicas across different racks as far as possible.

The next question is: how do you tell the Hadoop Namenode which slave machines belong to which racks? The configuration steps follow.

----------------------------------------------------------------------


By default, Hadoop's rack awareness is not enabled, so HDFS chooses machines at random. That is, when writing data, Hadoop may write the first block, block1, to rack1, then randomly write block2 to rack2, producing data traffic between the two racks; then, again at random, write block3 back to rack1, producing yet another inter-rack data flow. When the volume of data a job processes is very large, or the volume of data pushed into Hadoop is very large, this situation multiplies the network traffic between racks, becomes a performance bottleneck, and in turn degrades the service performance of the whole cluster.

To enable Hadoop's rack awareness, the configuration is very simple: add one option to the hadoop-site.xml configuration file on the machine where the Namenode runs (the path in the value is an example; point it at wherever you place the script):

<property>
  <name>topology.script.file.name</name>
  <value>/path/to/rackaware.py</value>
</property>

The value of this option is the path to an executable program, typically a script, that accepts one argument and prints one value. The argument is usually the IP address of a datanode machine, and the output is the rack that datanode belongs to, e.g. "/rack1". When the Namenode starts, it checks whether this option is set; if it is not empty, rack awareness is in use: the Namenode locates the script, and on receiving each datanode's heartbeat it passes the datanode's IP address to the script as an argument and saves the script's output as that datanode's rack in an in-memory map.

As for the script itself, it must encode the real network topology and rack information so that a machine's IP address can be correctly mapped to the appropriate rack. A simple implementation follows (rewritten here for Python 3; the original listed a second set of entries, apparently the nodes' IP addresses, which were lost from the source and are noted in a comment):

#!/usr/bin/env python3
import sys

# Hostname -> rack. The original also mapped each node's IP address to
# its rack; those entries were lost from the source text.
rack = {
    "hadoopnode-176.tj": "Rack1",
    "hadoopnode-178.tj": "Rack1",
    "hadoopnode-179.tj": "Rack1",
    "hadoopnode-180.tj": "Rack1",
    "hadoopnode-186.tj": "Rack2",
    "hadoopnode-187.tj": "Rack2",
    "hadoopnode-188.tj": "Rack2",
    "hadoopnode-190.tj": "Rack2",
}

if __name__ == "__main__":
    # Unknown machines fall back to a default rack, "rack0"
    print("/" + rack.get(sys.argv[1], "rack0"))

Because no authoritative document states whether the hostname or the IP address is passed to the script, the script should ideally accept both. If the machine-room architecture is more complex, the script can return multi-level strings such as /dc1/rack1.

Execute command: chmod +x rackaware.py
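Before restarting the Namenode, you can sanity-check the script by hand. The hostname below comes from the sample mapping above; the bare IP is an invented, unmapped address used to show the fallback:

```shell
chmod +x rackaware.py
./rackaware.py hadoopnode-176.tj   # prints /Rack1 (a mapped hostname)
./rackaware.py 10.1.2.3            # prints /rack0 (unmapped -> default rack)
```

Any datanode address the script cannot resolve will silently land in the default rack, so it is worth checking a few known hosts this way before relying on the mapping.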

Restart the Namenode. If the configuration succeeded, the Namenode startup log will contain lines such as:

2011-12-21 14:28:44,495 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /rack1/

The distance between machines in the network topology

The following is a network-topology example, describing the distances between machines in a Hadoop cluster with a complex network topology.

[Figure: datanode network topology diagram (2013010315170913.jpg)]

With rack awareness, the Namenode can build the datanode network topology shown in the figure above. D1 and R1 are switches, with datanodes at the leaves. H1's rackid is /D1/R1/H1: H1's parent is R1, and R1's parent is D1. This rackid information is supplied by the script configured via topology.script.file.name. With these rackids, the distance between any two datanodes can be calculated.

distance(/D1/R1/H1, /D1/R1/H1) = 0  (the same datanode)
distance(/D1/R1/H1, /D1/R1/H2) = 2  (different datanodes in the same rack)
distance(/D1/R1/H1, /D1/R2/H4) = 4  (datanodes in different racks of the same IDC)
distance(/D1/R1/H1, /D2/R3/H7) = 6  (datanodes in different IDCs)
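The rule behind these numbers is that each hop from a node up to the pair's lowest common ancestor in the topology tree counts 1. A small sketch of that computation, for path strings in the same /dc/rack/host form used above:

```python
def distance(a: str, b: str) -> int:
    """Network distance = hops from each node up to their lowest
    common ancestor in the topology tree."""
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    # depth of the lowest common ancestor = length of the common prefix
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)
```

For example, distance("/d1/r1/h1", "/d1/r2/h4") is 4: two hops from h1 up to d1, and two hops from h4 up to d1.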
