Build a Hadoop development environment on Windows

Last Update:2016-05-07 Source: Internet

Author: User


Objective

There are usually two ways to run Hadoop on Windows. One is to install Linux in a virtual machine, so that Hadoop runs in a full Linux environment. The other is to emulate a Linux environment with Cygwin, which is easier to use and simpler to install. This article covers the second approach.

Preparatory work

(1) Install JDK 1.6 or later. Avoid an installation path that contains spaces, such as Program Files; otherwise Hadoop will not find the JDK when you reference it in the Hadoop configuration files.

(2) Download Hadoop from http://hadoop.apache.org/releases.html.
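
The path requirement in (1) can be checked before any Hadoop file is edited. The following is a minimal shell sketch that should run in any POSIX shell, including Cygwin's bash once it is installed. The function name path_is_space_free and the second example path are illustrative assumptions, not part of the original setup.

```shell
#!/bin/sh
# Return success (0) if the given path contains no space character.
path_is_space_free() {
    case "$1" in
        *" "*) return 1 ;;
        *)     return 0 ;;
    esac
}

# Report on two candidate JDK locations (the second one is hypothetical).
for candidate in 'D:\javatools\jdk1.6.0' 'C:\Program Files\Java\jdk1.6.0'; do
    if path_is_space_free "$candidate"; then
        printf 'OK:  %s\n' "$candidate"
    else
        printf 'BAD: %s (contains a space)\n' "$candidate"
    fi
done
```

A path reported as BAD is one to avoid when choosing where to install the JDK.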

Installing Cygwin

Cygwin is a tool that emulates a UNIX environment on the Windows platform, and Hadoop on Windows requires it to be installed first. Download the 32-bit or 64-bit installer that matches your operating system from http://www.cygwin.com/.

(1) Double-click the downloaded installer and click Next to reach the download source page. It offers three options; select Install from Internet:

Install from Internet: download packages over the network and install them
Download Without Installing: download packages over the network only
Install from Local Directory: install from packages that were already downloaded

(2) Select the installation path.

(3) Select the Local Package Directory.

(4) Choose your Internet connection.

(5) Select a suitable download site and click Next.

(6) This step is the most important. On the Select Packages page, make sure the following packages are selected:

Expand the Net category and select openssh and openssl.

If you want to compile Hadoop in Eclipse, also select sed under the Base category.

If you want to edit the Hadoop configuration files directly in Cygwin, also select vim under the Editors category.

(7) Click Next and wait for the installation to complete.

(8) Configure the environment variables. Right-click My Computer and select Properties. In the Properties dialog, open the Advanced tab and click the Environment Variables button. In the list of system variables, double-click the Path variable and append the bin directory of the Cygwin installation to the variable value, separated from the previous entry by a semicolon, for example: D:\cygwin64\bin
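
To confirm that the Path edit took effect, the variable value can be inspected for the new entry. The sketch below works on a semicolon-separated string such as the one Windows prints for `echo %Path%`. The function name win_path_has_entry and the sample value are assumptions made for illustration, and the comparison is case-sensitive even though Windows itself is not.

```shell
#!/bin/sh
# Return 0 if the semicolon-separated Windows Path value ($1)
# contains the directory ($2) as an exact entry.
win_path_has_entry() {
    case ";$1;" in
        *";$2;"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical Path value, for illustration only.
sample='C:\Windows\system32;C:\Windows;D:\cygwin64\bin'

if win_path_has_entry "$sample" 'D:\cygwin64\bin'; then
    printf '%s\n' 'Cygwin bin directory is on the Path'
else
    printf '%s\n' 'Cygwin bin directory is missing from the Path'
fi
```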

Installation of SSHD Services

Double-click the Cygwin icon on the desktop to start Cygwin and run the command ssh-host-config -y. The script then prompts for a password.

Type the password, press Enter, then confirm it and press Enter again. When the message "Host configuration finished. Have fun!" appears, the installation was successful.

Run net start sshd to start the service, or find the Cygwin sshd service in the Windows Services console and start it there.
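
The two commands from this section can be wrapped in one small function, sketched below. The function name setup_sshd is an assumption; the guard at the top only exists so that the function fails cleanly when it is run outside Cygwin. On some newer Cygwin releases the service may be registered under a different name (reportedly cygsshd), in which case the last command needs adjusting.

```shell
#!/bin/sh
# Sketch of the sshd setup described above; call it from a Cygwin terminal.
setup_sshd() {
    # Fail early where ssh-host-config does not exist (for example outside Cygwin).
    command -v ssh-host-config >/dev/null 2>&1 || {
        echo "ssh-host-config not found; is the openssh package installed?" >&2
        return 1
    }
    ssh-host-config -y || return 1   # -y answers "yes" to the default questions
    net start sshd                   # Windows command that starts the service
}
```

After calling setup_sshd, running ssh localhost is a quick way to check that the service accepts connections.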

Installing Hadoop
The previous steps were performed on a work computer and the following ones on a different local machine; the procedure itself is not affected.

Download Hadoop

Hadoop website: http://hadoop.apache.org/releases.html.

Extract the Hadoop package into the /home/user directory and rename the extracted folder to hadoop. Renaming is optional, but without it the commands that follow are more cumbersome to type.

(1) Standalone mode

Standalone mode requires no configuration. In this mode Hadoop runs as a single Java process, which is mostly useful for debugging.

(2) Pseudo-distributed mode

Pseudo-distributed mode can be regarded as a cluster with only one node. That node is both master and slave: it acts as NameNode and DataNode, and as JobTracker and TaskTracker.

Pseudo-distributed mode only requires changes to a few configuration files.

Configure hadoop-env.sh: open the file in Notepad and set JAVA_HOME to your JDK installation path, for example:

export JAVA_HOME="D:\javatools\jdk1.6.0"
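
The JDK path above is written Windows-style. Because Hadoop's scripts run under Cygwin's bash, some setups prefer the POSIX-style /cygdrive form instead. Cygwin's own cygpath -u performs that conversion; the sketch below spells out the same mapping in plain shell so the rule is visible. The function name win_to_cygdrive is an assumption, and it only handles simple drive-letter paths.

```shell
#!/bin/sh
# Convert a drive-letter Windows path to Cygwin's /cygdrive form.
win_to_cygdrive() {
    # First character is the drive letter; lower-case it.
    drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
    # Skip "X:" and turn every backslash into a forward slash.
    rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
    printf '/cygdrive/%s%s\n' "$drive" "$rest"
}

win_to_cygdrive 'D:\javatools\jdk1.6.0'
# prints /cygdrive/d/javatools/jdk1.6.0
```

Whether hadoop-env.sh needs the converted form depends on the Hadoop and Cygwin versions in use, so treat this as an option to try if the Windows-style path is not accepted.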

Configure core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.child.tmp</name>
    <value>/home/u/hadoop/tmp</value>
  </property>
</configuration>

Configure hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Configure mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.child.tmp</name>
    <value>/home/u/hadoop/tmp</value>
  </property>
</configuration>
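
All three files share the same skeleton: an XML declaration, the stylesheet reference, and a configuration element holding name/value properties. As a rough sketch of that structure, the helper below prints a site file with a single property. The function name site_xml is an assumption, and it is meant to illustrate the layout rather than replace editing the files by hand.

```shell
#!/bin/sh
# Print a minimal Hadoop *-site.xml document with one property.
# $1 = property name, $2 = property value
site_xml() {
    cat <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>$1</name>
    <value>$2</value>
  </property>
</configuration>
EOF
}

# Show what hdfs-site.xml would look like with a replication factor of 1.
site_xml dfs.replication 1
```

Redirecting the output, for example `site_xml dfs.replication 1 > conf/hdfs-site.xml`, would write the file; the conf/ location is an assumption based on the Hadoop 1.x directory layout.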

Start Hadoop

Open a Cygwin window and run cd ~/hadoop to enter the Hadoop folder. Before starting Hadoop for the first time, format the Hadoop file system, HDFS, with the command bin/hadoop namenode -format. (Note: namenode must be written in lowercase. If you type Namenode, the command fails with an error saying the main class Namenode cannot be found or loaded.)

Run the command bin/start-all.sh to start all processes.
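
The format and start commands can be kept apart in two small functions, since formatting is only wanted once while starting is repeated. This is a sketch under the article's assumption that Hadoop lives in ~/hadoop; the function names are made up for illustration. jps ships with the JDK and lists running Java processes.

```shell
#!/bin/sh
# Format HDFS. Needed once before the very first start; it erases HDFS metadata.
# $1 = Hadoop directory (defaults to ~/hadoop)
format_hdfs() {
    ( cd "${1:-$HOME/hadoop}" && bin/hadoop namenode -format )
}

# Start all daemons, then list the running Java processes.
# $1 = Hadoop directory (defaults to ~/hadoop)
start_daemons() {
    ( cd "${1:-$HOME/hadoop}" && bin/start-all.sh && jps )
}
```

In pseudo-distributed mode the jps listing would be expected to include NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker; a missing entry usually points to an error in that daemon's log file.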

Verify that the installation is successful

Open a browser and go to http://localhost:50030. If the page loads, the installation succeeded.
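
The same check can be made from the Cygwin terminal. The sketch below assumes curl is installed (it is a separate Cygwin package and not part of the package list above). Port 50030 is the JobTracker page from this article; port 50070 for the NameNode page is the Hadoop 1.x default and is an addition here. The function names are illustrative.

```shell
#!/bin/sh
# Map a daemon name to its web UI address.
ui_url() {
    case "$1" in
        jobtracker) echo "http://localhost:50030" ;;
        namenode)   echo "http://localhost:50070" ;;
        *)          return 1 ;;
    esac
}

# Probe the web UI of the named daemon; returns 0 if it answers.
check_ui() {
    url=$(ui_url "$1") || { echo "unknown daemon: $1" >&2; return 2; }
    if curl -s -o /dev/null --max-time 5 "$url"; then
        echo "$1 UI reachable at $url"
    else
        echo "$1 UI not reachable at $url"
        return 1
    fi
}
```

Calling check_ui jobtracker shortly after startup may fail while the daemons are still initializing, so a retry after a few seconds is reasonable.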

Reference: "Hadoop Combat"
