hadoop-1.x Installation and Configuration

Source: Internet
Author: User

1. Before installing Hadoop, you need to install the JDK and SSH.

Hadoop is developed in Java, so running MapReduce jobs and compiling Hadoop both depend on the JDK. You must therefore first install JDK 1.6 or later (JDK 1.6 is generally used in real-world production environments, because some Hadoop components do not support JDK 1.7 and above). Hadoop uses SSH to start the daemons on the slave machines, and it handles pseudo-distributed operation on a single computer the same way as a cluster, so SSH must also be installed.

Installation and configuration steps for JDK 1.6:

(1) Download JDK1.6 installation package from the Internet

(2) Install JDK 1.6

Unzip the installation package to /Library/Java/JavaVirtualMachines/.

(3) Configuring environment variables

Add the Java environment configuration to .bash_profile:

export JAVA_6_HOME=/Library/Java/JavaVirtualMachines/jdk1.6.0.jdk/Contents/Home
export JAVA_HOME=$JAVA_6_HOME
export PATH=$PATH:$JAVA_HOME/bin

In the terminal, run source ~/.bash_profile to load the configuration.

(4) Verify that the JDK is installed successfully

In the terminal, enter the command: java -version

If output like the following is displayed, the JDK was installed successfully:

$ java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06-434)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01-434, mixed mode)

Installation and configuration of SSH

(1) Install SSH, enter command:

$ sudo apt-get install openssh-server

(2) Configure the machine for password-free login.

Generate an SSH key pair. Afterwards, two files, id_dsa and id_dsa.pub, will appear in the ~/.ssh/ directory; these are the SSH private and public keys.

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
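The same key setup can be rehearsed safely in a throwaway directory before touching ~/.ssh. A minimal sketch, using RSA here because modern OpenSSH builds often reject -t dsa; the temporary path is illustrative:

```shell
# Generate a passphrase-less key pair in a temporary directory and
# install the public key, mirroring the ~/.ssh steps above.
dir=$(mktemp -d)
ssh-keygen -t rsa -N '' -f "$dir/id_rsa" -q
cat "$dir/id_rsa.pub" >> "$dir/authorized_keys"
chmod 600 "$dir/authorized_keys"   # sshd ignores keys with loose permissions
ls "$dir"
```

If you apply the same chmod to the real ~/.ssh/authorized_keys (and 700 to ~/.ssh itself), sshd will accept the key; loose permissions are a common reason passwordless login silently fails.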
(3) Verify that SSH is installed successfully

Enter the command: ssh -V

The output:

$ ssh -V
OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
Log in to the machine with ssh wucloud (the hostname) or ssh localhost. The first time, type yes at the prompt, then exit and log in again. If no password is requested, this step is complete.
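The login check can also be done non-interactively: with BatchMode enabled, ssh refuses to prompt and simply fails if a password would still be required. A sketch, assuming sshd is running locally:

```shell
# Exit status 0 means key-based login worked; any password prompt becomes a failure.
if ssh -o BatchMode=yes -o StrictHostKeyChecking=no -o ConnectTimeout=3 localhost true 2>/dev/null
then
    echo "passwordless login OK"
else
    echo "password still required (or sshd not reachable)"
fi
```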


2. Installing and configuring Hadoop

(1) Download the installation package from the Hadoop website and unzip it. The version I'm using here is hadoop-1.2.1.

$ sudo tar -zxvf hadoop-1.2.1.tar.gz

(2) Hadoop configuration

You can now start a Hadoop cluster in one of the following three supported modes:

Standalone mode, pseudo-distributed mode, and fully distributed mode.

Compared with a fully distributed deployment, pseudo-distributed mode cannot show the benefits of cloud computing, but it is convenient for developing and testing programs. Because of resource constraints, this article configures Hadoop in pseudo-distributed mode. Go to Hadoop's conf directory and modify the following files.

Specify the JDK installation location in hadoop-env.sh:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.6.0.jdk/Contents/Home

In core-site.xml, configure the address and port number of HDFS:

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://127.0.0.1:9000</value>
    </property>
</configuration>
In hdfs-site.xml, configure the HDFS replication factor. The default value is 3; in a single-machine version of Hadoop it needs to be 1.

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
In mapred-site.xml, configure the address and port of the JobTracker.

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
The first time you run Hadoop, you need to format Hadoop's file system.

In the Hadoop directory, enter:

$ bin/hadoop namenode -format

To start the Hadoop service:

$ bin/start-all.sh

If there is no error, it means that the launch was successful.
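Besides watching for errors in the console output, you can list the running daemons with jps, the JVM process tool that ships with the JDK (a sanity check, not part of Hadoop itself):

```shell
# A healthy pseudo-distributed cluster shows NameNode, DataNode,
# SecondaryNameNode, JobTracker and TaskTracker (plus Jps itself).
jps
```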

(3) Verify that Hadoop is installed successfully. Enter the following URL in your browser:

http://localhost:50030 (Web page for MapReduce)

http://localhost:50070 (HDFS Web page)
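The same two pages can also be probed from the terminal. A sketch using curl (the ports are the Hadoop 1.x defaults):

```shell
# Print the HTTP status for each web UI: 200 means the page is up,
# 000 means nothing is listening on that port.
curl -s -o /dev/null -m 3 -w 'JobTracker UI: %{http_code}\n' http://localhost:50030/ || true
curl -s -o /dev/null -m 3 -w 'NameNode UI:   %{http_code}\n' http://localhost:50070/ || true
```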

If both pages display correctly, the installation succeeded. You can then run the WordCount example to check whether Hadoop can run a job.

In the Hadoop directory, hadoop-examples-1.2.1.jar is a test jar containing many example programs. Create a directory, such as /home/hadoop/input/, and copy some text files into it.

Run the following command:

$ bin/hadoop jar hadoop-examples-1.2.1.jar wordcount /home/hadoop/input/ /home/hadoop/output/
When the run finishes, an output directory is generated under /home/hadoop/ containing two files, part-r-00000 and _SUCCESS. The presence of _SUCCESS tells you the job completed; open part-r-00000 to see how many times each word occurred.
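What the job computes can be mimicked locally with standard shell tools, which is handy for sanity-checking the Hadoop output on small inputs (a rough equivalent of the result, not how the MapReduce job is implemented):

```shell
# Split text into one word per line, then count occurrences of each word,
# analogous to the contents of part-r-00000.
printf 'hello hadoop\nhello world\n' > /tmp/wc_demo.txt
tr -s ' \t' '\n' < /tmp/wc_demo.txt | sort | uniq -c | sort -rn
```

For this demo input you should see hello counted twice and hadoop and world once each.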
