1. Copy the Hadoop software to the virtual machine
Use WinSCP to put the Hadoop installation package into the Linux downloads folder.
2. Select the installation directory
Copy the Hadoop installation package into the installation directory; here we use the /usr/local directory in CentOS.
3. Unpack the installation package
The Hadoop installation package has the suffix .tar.gz, so it can be unpacked directly with the tar command.
#tar -zxvf xxx // unpack a .tar.gz file
Unpacking produces a folder named hadoop-1.1.2. That name is long and awkward to type, so rename it:
#mv hadoop-1.1.2 hadoop
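The unpack-and-rename sequence above can be sketched end to end. This demo builds a dummy archive first so it can run anywhere; with the real package the file would be the hadoop-1.1.2 tarball placed under /usr/local:

```shell
# Demo of steps 1-3 on a dummy archive (the archive name mirrors the
# real hadoop-1.1.2 package; the temp dir stands in for /usr/local).
WORK=$(mktemp -d) && cd "$WORK"
mkdir hadoop-1.1.2 && tar -zcf hadoop-1.1.2.tar.gz hadoop-1.1.2 && rm -r hadoop-1.1.2
tar -zxf hadoop-1.1.2.tar.gz    # unpack the .tar.gz file
mv hadoop-1.1.2 hadoop          # shorten the directory name
ls -d hadoop
```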
4. Set the Hadoop environment variables
After unpacking, the Hadoop directory has a fixed structure; whatever else it contains, there is always a bin directory holding the executable commands. So add that bin directory to the PATH environment variable.
#vi /etc/profile
In the configuration file, add:
export HADOOP_HOME=/usr/local/hadoop
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
To make the configuration file take effect:
#source /etc/profile
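Echoing the variable afterwards is a quick way to confirm the setting took effect. This sketch repeats the two lines added to /etc/profile (it assumes JAVA_HOME was already set when the JDK was installed):

```shell
# The two lines this guide adds to /etc/profile, then a sanity check.
export HADOOP_HOME=/usr/local/hadoop
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
echo "$HADOOP_HOME"
```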
5. Modify the configuration files
For a pseudo-distributed installation of Hadoop, several configuration files must be modified. The Hadoop configuration files live in the $HADOOP_HOME/conf directory. The files to modify are:
hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml
To make editing easier, we do not need vi; the files can be edited directly from Windows with WinSCP.
Find each file in WinSCP and edit it.
(1) hadoop-env.sh content to modify:
export JAVA_HOME=/usr/local/jdk
(2) core-site.xml content to modify:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop:9000</value>
    <description>change your own hostname</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
Note: the value hadoop in <value> above is this machine's hostname; change it to match your own settings.
(3) hdfs-site.xml content to modify:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
(4) mapred-site.xml content to modify:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop:9001</value>
    <description>change your own hostname</description>
  </property>
</configuration>
Note: the value hadoop in <value> above is this machine's hostname; change it to match your own settings.
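Since the hostname appears in more than one file, one way to avoid typos is to generate the file from a heredoc with the hostname in a single variable. A minimal sketch for core-site.xml ("hadoop" is this guide's example hostname; a temp dir stands in for $HADOOP_HOME/conf):

```shell
# Generate core-site.xml with the hostname substituted in one place.
MYHOST=hadoop              # replace with your own hostname
CONF=$(mktemp -d)          # stand-in for $HADOOP_HOME/conf
cat > "$CONF/core-site.xml" <<EOF
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$MYHOST:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
EOF
grep -c "<property>" "$CONF/core-site.xml"   # expect 2 properties
```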
6. Format Hadoop
This initializes the HDFS file system.
#hadoop namenode -format // format Hadoop's HDFS file system
7. Start Hadoop
The start script is in the $HADOOP_HOME/bin directory, so it can be run directly:
#start-all.sh
As mentioned earlier, Hadoop runs as a set of Java processes, so after starting Hadoop you can see the corresponding Java processes. To view them:
#jps // view the currently running Java processes
The jps command is not part of the operating system; it ships with the JDK and is designed specifically for viewing Java processes.
8. View Hadoop through a browser
Enter hadoop:50070 in a browser on Linux to see the NameNode page, which shows that the NameNode process is alive and that the NameNode itself acts as a web server.
Enter hadoop:50030 in a browser on Linux to see the JobTracker page, which shows that the JobTracker process is alive and that the JobTracker itself acts as a web server.
The same pages can also be viewed from the Windows host machine via <IP address>:50070 and <IP address>:50030. If you want to access Hadoop by hostname instead, bind the IP address to the Hadoop machine's hostname:
On Windows, add this line to the file C:\Windows\System32\drivers\etc\hosts:
192.168.80.100 hadoop
You can then access Hadoop from Windows via hostname:port.
Note: for this access to work, the two machines must first be able to ping each other.
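The binding above is just a line appended to the hosts file. A sketch of the edit, using a temp file as a stand-in for the real hosts file (C:\Windows\System32\drivers\etc\hosts on Windows, /etc/hosts on Linux) since editing those requires administrator rights:

```shell
# Append the IP-to-hostname binding and confirm it is resolvable-looking.
HOSTS=$(mktemp)                          # stand-in for the real hosts file
echo "192.168.80.100 hadoop" >> "$HOSTS"
grep -w "hadoop" "$HOSTS"
```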
9. Troubleshooting
<1> The NameNode process did not start successfully?
(1) HDFS was not formatted
(2) the configuration files were only copied, and the hostname was not changed
(3) the hostname is not bound to the IP address
(4) password-free SSH login was not configured successfully
<2> Is it a problem to format Hadoop multiple times?
Workaround: delete the /usr/local/hadoop/tmp folder and reformat.