Installing Hadoop on the Windows platform

Source: Internet
Author: User

1. Install JDK1.6 or later

Download and install the JDK. It is best to choose an installation path that contains no spaces (avoid, for example, Program Files); otherwise Hadoop will not find the JDK when you enter the path in the Hadoop configuration file. (Reportedly, quoting the path in the configuration file solves this, but I could not get that to work.)
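One way to catch this early is to test the intended path for spaces from any shell before editing the Hadoop files. The following is only a minimal sketch: has_space is a hypothetical helper name, and the sample path is an illustration, not a required location.

```shell
# Hypothetical helper: succeeds when the given path contains a space.
has_space() {
    case "$1" in
        *" "*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example path (an assumption for illustration; use your own JDK location).
jdk_path='D:\hadoop\Java\jdk1.7.0_25'

if has_space "$jdk_path"; then
    printf '%s\n' "warning: JDK path contains a space"
else
    printf '%s\n' "ok: JDK path has no spaces"
fi
```

A path such as C:\Program Files\Java would take the warning branch.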

2. Install Cygwin

Cygwin is a tool that simulates a UNIX environment on the Windows platform. Hadoop has to be installed on top of Cygwin. Download it from: http://www.cygwin.com/

Download the 32-bit or 64-bit installer that matches your operating system.

1) Double-click the downloaded installer, click Next, and select Install from the Internet.

2) Select the installation path.

3) Select the Local Package Directory.

4) Select your Internet connection mode.

5) Select a suitable download site and click Next.

6) On the Select Packages screen, expand the Net category and select the two packages OpenSSH and OpenSSL.

If you want to compile Hadoop in Eclipse, also install sed under the Base category.

If you want to edit the Hadoop configuration files directly in Cygwin, you can install vim under the Editors category.

7) Click Next and wait for the installation to complete.

3. Configure environment variables

Right-click "My Computer" and select "Properties" from the menu. In the Properties dialog, open the Advanced tab and click the "Environment Variables" button. In the list of system variables, double-click the "Path" variable and append the bin directory of the Cygwin installation to the end of the variable value, separated from the existing entries by a semicolon, for example: D:\hadoop\Cygwin64\bin

4. Install the sshd service

Double-click the Cygwin icon on the desktop to start Cygwin, then run the command ssh-host-config -y

While the command runs, you are prompted for a password; if you do not provide one, the configuration exits. Enter the password, confirm it, and press Enter. When the message "Host configuration finished. Have fun!" appears, the installation succeeded.

Enter net start sshd to start the service, or find the Cygwin sshd service in the Windows Services list and start it there.

If the sshd service cannot be installed or started, see http://www.cnblogs.com/kinglau/p/3261886.html.

In addition, on Windows 8 you need to start Cygwin as administrator (right-click the icon and choose Run as administrator); otherwise, because of insufficient permissions, you get the message "System error 5 has occurred."

5. Configure SSH password-free login

Run the ssh-keygen command to generate the key files.

Enter: ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa. Note that the -t, -P, and -f options are case-sensitive.

ssh-keygen is the key-generation command.

-t specifies the type of key to generate (dsa or rsa).

-P specifies the passphrase (empty here).

-f specifies the file in which the generated key is stored.

Note: ~ stands for the current user's home folder, /home/<user name>.

After you run this command, the .ssh folder is created under cygwin\home\<user name>. You can list it with ls -a /home/<user name>; the ssh -V command shows the installed version.

After running ssh-keygen, run the following commands to create the authorized_keys file.

cd ~/.ssh/

cp id_dsa.pub authorized_keys

Then run the exit command to close the Cygwin window.
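The copy step in this section can also be wrapped in a small function. This is a sketch under assumptions: authorize_own_key is a hypothetical name, the public key is assumed to be id_dsa.pub as generated above, and the chmod calls are an addition that is not part of the original steps (sshd is often configured to reject key files with loose permissions).

```shell
# Hypothetical helper: make the key in the given .ssh directory an authorized key.
# Usage (after running ssh-keygen as described above):
#   authorize_own_key "$HOME/.ssh"
authorize_own_key() {
    ssh_dir=$1

    # Same effect as: cd ~/.ssh/ ; cp id_dsa.pub authorized_keys
    cp "$ssh_dir/id_dsa.pub" "$ssh_dir/authorized_keys" || return 1

    # Assumption, not in the original article: restrict access to the owner.
    chmod 700 "$ssh_dir" && chmod 600 "$ssh_dir/authorized_keys"
}
```

Note that the function overwrites an existing authorized_keys file, exactly as the plain cp command does.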

6. Double-click the Cygwin icon on the desktop again to open a Cygwin window, and run the ssh localhost command. The first time you run it, you are asked to confirm the connection; type yes and press Enter.

7. Install Hadoop

Download Hadoop from the Hadoop website: http://hadoop.apache.org/releases.html

Unzip the Hadoop package into the /home/<user name> directory and rename the folder to hadoop. Renaming is optional, but without it the later commands are more cumbersome to type.

(1) Standalone mode

Standalone mode requires no configuration. In this mode Hadoop runs as a single Java process, which is often used for debugging.

(2) Pseudo-distributed mode

Pseudo-distributed mode can be regarded as a cluster with only one node. In this cluster the single node is both master and slave: it acts as NameNode and DataNode, and as JobTracker and TaskTracker.

This mode requires changes to several configuration files.

Configure hadoop-env.sh: open the file in Notepad and set JAVA_HOME to your JDK installation path, for example:

export JAVA_HOME="D:\hadoop\Java\jdk1.7.0_25"

Configure core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.child.tmp</name>
    <value>/home/u/hadoop/tmp</value>
  </property>
</configuration>

Configure hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Configure mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.child.tmp</name>
    <value>/home/u/hadoop/tmp</value>
  </property>
</configuration>
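Because a single mistyped tag or value in these files is easy to overlook, it can help to print the configured values back before starting Hadoop. The sketch below rests on assumptions: the files are taken to live in a conf directory under the Hadoop folder (adjust to your layout), each name and value tag is assumed to sit on a single line, and get_property is a hypothetical helper, not a Hadoop command.

```shell
# Hypothetical helper: print the <value> that follows <name>PROPERTY</name>
# in a Hadoop-style XML configuration file.
# Usage: get_property FILE PROPERTY
get_property() {
    awk -v name="$2" '
        # Remember that the wanted <name> element has been seen.
        index($0, "<name>" name "</name>") { found = 1 }
        # Print the first <value> element at or after that point, then stop.
        found && match($0, /<value>[^<]*<\/value>/) {
            # Skip the 7 characters of <value> and drop the 8 of </value>.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Example calls (the conf/ location is an assumption):
#   get_property conf/core-site.xml   fs.default.name
#   get_property conf/mapred-site.xml mapred.job.tracker
```

If a call prints nothing, the property name was not found or its value element is malformed.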

8. Start Hadoop

Open the Cygwin window and run cd ~/hadoop to enter the Hadoop folder.

Before starting Hadoop for the first time, format Hadoop's file system, HDFS, with the command: bin/hadoop namenode -format

Note that namenode must be lowercase. If you type Namenode instead, you get an error saying that the main class Namenode could not be found or loaded.

Run bin/start-all.sh to start all processes.

Next, verify that the installation succeeded.

Open a browser and enter the following URLs. If both pages load correctly, the installation succeeded.

http://localhost:50030 opens the MapReduce web page.

http://localhost:50070 opens the HDFS web page.

After the first start, if neither page, or only one of them, can be opened, exit Cygwin, reopen it, and run bin/start-all.sh again.

If you only want to start MapReduce, run bin/start-mapred.sh.

If you only want to start HDFS, run bin/start-dfs.sh.
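Besides the two web pages, the running daemons can be checked from the Cygwin window with the JDK's jps tool, which lists Java processes. The following is only a sketch: missing_daemons is a hypothetical helper, and the four names it looks for are the roles listed earlier for the single node in pseudo-distributed mode.

```shell
# Hypothetical helper: read jps output on standard input and print every
# expected daemon that is NOT listed. No output means all four were found.
# Usage: jps | missing_daemons
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode JobTracker TaskTracker; do
        # -w matches whole words only, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$jps_output" | grep -qw "$daemon" || printf '%s\n' "$daemon"
    done
}
```

If the function prints JobTracker and TaskTracker, for example, only HDFS is up, and bin/start-mapred.sh would be the command to try next.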

Reference documents:

This article draws on section "2.3 Installing and configuring Hadoop on Windows" of the book "Hadoop Combat" (Lu Jiaheng).

It is hereby stated that if copyright issues are involved, please inform us.
