Hadoop installation and environment setup (Apache version)


This morning I helped a newcomer remotely build a Hadoop cluster (1.x, or versions earlier than 0.22), and it left a deep impression on me. Here I will write down the simplest way to set up Apache Hadoop as help for new users, explaining it in as much detail as I can. (The avatorhadoop setup steps are covered in a separate post.)

1. Environment preparation:

1). Machine preparation: the machines must be able to ping each other. Virtual machines running on different physical hosts therefore need a bridged network connection (if the node is a physical host, disable the host machine's firewall first; for the specifics of the networking setup, search for VMware bridged networking or KVM bridged networking; with Xen the LAN IP can be configured manually during installation; leave a message if you get stuck). Disable the firewall on every machine: /etc/init.d/iptables stop; chkconfig iptables off. It is recommended to change each machine's hostname to hadoopserverN, where N is the number you actually assign to that machine, because hostnames containing '_', '.' or other special characters will cause startup problems. Modify /etc/hosts on every machine and add the mapping between IP address and hostname, as in the sketch below.
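A rough sketch of this preparation on a RHEL/CentOS-style system (the IP addresses below are made-up examples; use your own):

# disable the firewall on every node
/etc/init.d/iptables stop
chkconfig iptables off

# set the hostname to hadoopserverN (avoid '_' and '.'), here for machine 1
hostname hadoopserver1

# add the IP-to-hostname mappings on every node
cat >> /etc/hosts <<'EOF'
192.168.1.101 hadoopserver1
192.168.1.102 hadoopserver2
192.168.1.103 hadoopserver3
EOF

# verify that the nodes can reach each other
ping -c 1 hadoopserver2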

2). Download and unpack a stable release of the Hadoop package and configure the Java environment (Java environment variables usually go in ~/.bash_profile rather than a system-wide file, for machine-security reasons), for example as sketched below;
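A minimal sketch of the per-user Java setup (the JDK and Hadoop paths are assumptions; substitute your own directories):

# ~/.bash_profile on every node
export JAVA_HOME=/usr/java/jdk1.6.0_45
export HADOOP_HOME=/xxx/hadoop-version
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH

# reload the profile
source ~/.bash_profile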

3). Passwordless SSH. Here is a small trick: on hadoopserver1

ssh-keygen -t rsa -P ''; press ENTER

ssh-copy-id user@host;

Then copy id_rsa and id_rsa.pub under ~/.ssh/ to the other machines;

ssh hadoopserver2; run scp -r ~/.ssh/authorized_keys hadoopserver1:~/.ssh/; with that, the passwordless setup is complete and the machines can SSH to each other. There are many practical ways to do this and they are worth studying; most write-ups on the Internet do not mention that ssh-copy-id can be used to simplify Hadoop's passwordless setup. The whole flow is sketched below.
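Putting the trick together, a minimal sketch run as the Hadoop user on hadoopserver1 (the user name and the number of hosts are examples):

# generate a key pair with an empty passphrase
ssh-keygen -t rsa -P ''

# authorize the key on this machine and on every other node
ssh-copy-id user@hadoopserver1
ssh-copy-id user@hadoopserver2
ssh-copy-id user@hadoopserver3

# copy the same key pair to the other nodes so they can also ssh out without a password
scp ~/.ssh/id_rsa ~/.ssh/id_rsa.pub user@hadoopserver2:~/.ssh/
scp ~/.ssh/id_rsa ~/.ssh/id_rsa.pub user@hadoopserver3:~/.ssh/

# test: this should log in without asking for a password
ssh hadoopserver2 hostname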

2. Steps:

1). On hadoopserver1 (the namenode), modify the following files in the conf directory of the extracted Hadoop package:

core-site.xml:

      

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoopserver1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/xxx/hadoop-version/tmp</value>
  </property>
</configuration>

 

hdfs-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/xxx/hadoop-version/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/xxx/hadoop-version/data</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>670720</value>
  </property>
  <!--
  <property>
    <name>dfs.secondary.http.address</name>
    <value>0.0.0.0:60090</value>
    <description>
      The secondary namenode http server address and port.
      If the port is 0 then the server will start on a free port.
    </description>
  </property>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:60010</value>
    <description>
      The address where the datanode server will listen to.
      If the port is 0 then the server will start on a free port.
    </description>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:60075</value>
    <description>
      The datanode http server address and port.
      If the port is 0 then the server will start on a free port.
    </description>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:60020</value>
    <description>
      The datanode ipc server address and port.
      If the port is 0 then the server will start on a free port.
    </description>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:60070</value>
    <description>
      The address and the base port where the dfs namenode web ui will listen on.
      If the port is 0 then the server will start on a free port.
    </description>
  </property>
  -->
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
    <description>Does HDFS allow appends to files?
                 This is currently set to false because there are bugs in the
                 "append code" and is not supported in any production cluster.
    </description>
  </property>
</configuration>

mapred-site.xml:

      

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoopserver1:9001</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
  <!--
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>0.0.0.0:50030</value>
    <description>
      The job tracker http server address and port the server will listen on.
      If the port is 0 then the server will start on a free port.
    </description>
  </property>
  <property>
    <name>mapred.task.tracker.http.address</name>
    <value>0.0.0.0:60060</value>
    <description>
      The task tracker http server address and port.
      If the port is 0 then the server will start on a free port.
    </description>
  </property>
  -->
</configuration>

In the masters file, enter the hostname of the secondary namenode to tell Hadoop to start the secondary namenode on that machine;

The slaves file lists the datanode nodes, one hostname per line, for example:
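For example, with hadoopserver2 acting as the secondary namenode and hadoopserver2/3/4 as datanodes (this assignment is only an illustration):

conf/masters:
hadoopserver2

conf/slaves:
hadoopserver2
hadoopserver3
hadoopserver4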

2). Modify hadoop-env.sh:

Set JAVA_HOME to your Java installation directory.

Add a startup option: export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true". This makes the daemons bind to IPv4 addresses; a sketch of the relevant lines follows.
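The relevant lines in conf/hadoop-env.sh then look roughly like this (the JDK path is an assumption, use your own):

# conf/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_45
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"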

3). Manual distribution: scp -r the Hadoop directory to each of the other machines (hadoopserver2 ... hadoopserverN), into a directory with the same path prefix, for example:
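A sketch of the distribution step, run from hadoopserver1 (the install path and host numbers are examples; the target path must be identical on every node):

for n in 2 3 4; do
  scp -r /xxx/hadoop-version hadoopserver$n:/xxx/
done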

4). Start:

bin/hadoop namenode -format

bin/start-all.sh

5). Enter http://<hadoopserver1 IP>:50070 in a browser to view the cluster status.
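To double-check from the command line after start-all.sh (a sketch, run on hadoopserver1):

# list the Java daemons running on this node (NameNode, JobTracker, ...)
jps

# report the datanodes that have registered with the namenode
bin/hadoop dfsadmin -report

The JobTracker web UI is normally on port 50030 (see the commented-out mapred.job.tracker.http.address above).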
