Implement rack awareness in Hadoop

Source: Internet
Author: User
Tags: dns names

Principle

Hadoop is documented as having a rack-awareness feature that can improve its performance. The Hadoop cluster we use had never actually used this feature.
Rack awareness in Hadoop actually works as follows:

  • When Hadoop starts, it checks one configuration option in hadoop-default.xml and hadoop-site.xml:
    topology.script.file.name
    When a slave connects to the jobtracker, the slave's IP address is passed as an argument to this script, and the script is expected to return the name of the rack that the slave belongs to.
    How slaves map to racks is not something Hadoop decides. Which machine belongs to which rack is determined entirely by the person who writes the script.
  • A second configuration option accompanies topology.script.file.name:
    topology.script.number.args
    This option sets the maximum number of arguments the script accepts, because the script may be called with more than one argument at a time.
    Each argument is the IP address of a machine.
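
The calling convention described above can be sketched as a small shell script. This is only an illustration of the input and output contract, not Hadoop's own code: the function name rack_for, the subnets, and the rack names are all made up for the example. The fallback name /default-rack is an assumption chosen to match the name Hadoop uses for unmapped hosts.

```shell
#!/bin/sh
# Sketch of a topology script: takes one or more IP addresses as
# arguments and prints one rack name per argument, in the same order.

# Map a single address to a rack name.
# The subnets and rack names here are hypothetical examples.
rack_for() {
  case "$1" in
    10.1.1.*) echo "/rack1" ;;
    10.1.2.*) echo "/rack2" ;;
    *)        echo "/default-rack" ;;  # assumed fallback for unknown hosts
  esac
}

# The script may receive several addresses in one call, so loop over
# every argument rather than reading only the first.
for addr in "$@"; do
  rack_for "$addr"
done
```

Saved under a hypothetical name such as rackmap.sh and run as `./rackmap.sh 10.1.1.5 10.1.2.40`, this would print /rack1 and /rack2 on separate lines.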

Steps
  • Add the following configuration options to the jobtracker's hadoop-site.xml configuration file:
<property>
<name>topology.script.file.name</name>
<value>/path/to/rackmap.sh</value>
<description> The script name that should be invoked to resolve DNS names to
NetworkTopology names. Example: the script would take host.foo.bar as an
argument, and return /rack1 as the output.
</description>
</property>

<property>
<name>topology.script.number.args</name>
<value>1000</value>
<description> The max number of args that the script configured with
topology.script.file.name should be run with. Each arg is an
IP address.
</description>
</property>
  • Write the rackmap.sh script so that it outputs the rack for each address passed to it.
  • Restart the jobtracker.
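
One possible way to write rackmap.sh is to keep the mapping in a separate table file, so that adding a machine does not require editing the script. The sketch below assumes a plain-text file with one "address rack" pair per line. The RACK_TABLE variable, its placeholder path, the lookup_rack function, and the /default-rack fallback are all assumptions made for this example, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical table-driven rackmap.sh.
# RACK_TABLE is an assumed file with lines of the form:
#   10.1.1.5 /rack1
#   10.1.2.7 /rack2
RACK_TABLE="${RACK_TABLE:-/path/to/rack_table.txt}"

# Print the rack for one address, or a fallback if it is not listed
# (or if the table file cannot be read).
lookup_rack() {
  rack=$(awk -v a="$1" '$1 == a { print $2; exit }' "$RACK_TABLE" 2>/dev/null) || rack=""
  echo "${rack:-/default-rack}"
}

# Emit one rack name per argument, in the order the arguments were given.
for addr in "$@"; do
  lookup_rack "$addr"
done
```

The script must be executable by the user that runs the jobtracker, and it is worth running it by hand with a few addresses before restarting.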
