Building a Hadoop development environment with Cygwin


This article does not go into conceptual background; for an introduction to Cygwin and Hadoop, refer to other articles. It walks through everything from downloading Cygwin to getting a Hadoop environment running. The screenshots are taken from material found online, because I did not save my own when I deployed, but the steps are the same.

Hadoop is a huge ecosystem with dozens of related technologies, but every journey starts with a single step. For a beginner like me, a fully distributed Hadoop cluster was out of reach: I did not know Linux well, and I did not have enough machines to build a fully distributed environment. Cygwin, however, lets me avoid installing Linux natively; it simulates a Linux environment on Windows. So the first step is to download Cygwin, which is available from its official website.

One: Install Cygwin

Double-click the downloaded EXE file


Click Next



Here you choose between two options: the first downloads and installs directly, the second only downloads the packages locally without installing. The default (the first option) is recommended. Click Next.



By default the installation goes to the C drive; you can choose another location. Click Next.




This is where the downloaded packages are stored; specify a directory and click Next.



Keep the default (first) selection and click Next.



Here you choose the download mirror. The default works, but it can be very slow; depending on your network it may take half a day. I recommend http://mirrors.163.com, which was the fastest mirror I used. If it is not in the list, click the Add button to add it, then select it and click Next.




Here you select the specific components. Clicking the word "Skip" next to a package switches it to the current version number, which means the package is selected. I selected the following:

Under Devel: binutils, gcc-core, gcc-g++, gcc-mingw-core, gcc-mingw-g++, gdb

Under Net: openssh and openssl, required for the SSH access Hadoop needs; select them the same way as above

Under Base: sed, needed for connecting Eclipse to Hadoop during development

You can also download vim and other tools according to your needs. Even if a package is not selected now, it can be added after installation by running the installer again; in that case it is best to choose the same mirror you used for the first download, since I have not tried switching to a different one and cannot say whether that causes problems. Once these are chosen, click Next to start the download. Using the 163 mirror, the download took me less than 10 minutes.
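As a side note, packages can also be added later without clicking through the GUI again. The sketch below shows an unattended install from a Windows command prompt; the installer name setup-x86_64.exe, the package list and the -q/-P/-s flags are my assumptions about the installer you downloaded, not something covered in the original article:

# run from the folder containing the Cygwin installer you downloaded
setup-x86_64.exe -q -P openssh,openssl,sed,vim -s http://mirrors.163.com
# -q  unattended ("quiet") mode
# -P  comma-separated list of packages to add
# -s  mirror to download from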


After installation completes, a shortcut appears on the desktop. Click the icon.

In the Cygwin terminal, execute ssh-host-config and follow the prompts step by step:

*** Info: Generating missing SSH host keys
ssh-keygen: generating new host keys: RSA1 RSA DSA ECDSA ED25519
*** Info: Creating default /etc/ssh_config file
*** Info: Creating default /etc/sshd_config file
*** Info: StrictModes is set to 'yes' by default.
*** Info: This is the recommended setting, but it requires that the POSIX
*** Info: permissions of the user's home directory, the user's .ssh
*** Info: directory, and the user's ssh key files are tight so that
*** Info: only the user has write permissions.
*** Info: On the other hand, StrictModes don't work well with default
*** Info: Windows permissions of a home directory mounted with the
*** Info: 'noacl' option, and they don't work at all if the home
*** Info: directory is on a FAT or FAT32 partition.
*** Query: Should StrictModes be used? (yes/no) no

*** Info: Privilege separation is set to 'sandbox' by default since
*** Info: OpenSSH 6.1.  This is unsupported by Cygwin and has to be set
*** Info: to 'yes' or 'no'.
*** Info: However, using privilege separation requires a non-privileged account
*** Info: called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.
*** Query: Should privilege separation be used? (yes/no) no

*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes
*** Query: Enter the value of CYGWIN for the daemon: []

*** Info: On Windows Server 2003, Windows Vista, and above, the
*** Info: SYSTEM account cannot setuid to other users -- a capability
*** Info: sshd requires.  You need to have or to create a privileged
*** Info: account.  This script will help you do so.

*** Info: You appear to be running Windows XP 64bit, Windows 2003 Server,
*** Info: or later.  On these systems, it's not possible to use the LocalSystem
*** Info: account for services that can change the user id without an
*** Info: explicit password (such as passwordless logon [e.g. public key
*** Info: authentication] via sshd).
*** Info: If you want to enable that functionality, it's required to create
*** Info: a new account with special privileges (unless a similar account
*** Info: already exists).  This account is then used to run these special
*** Info: servers.
*** Info: Note that creating a new user requires that the current account
*** Info: have Administrator privileges itself.

*** Info: No privileged account could be found.
*** Info: This script plans to use 'cyg_server'.
*** Info: 'cyg_server' will only be used by registered services.
*** Query: Do you want to use a different name? (yes/no) no
*** Query: Create new privileged user account 'cyg_server'? (yes/no) yes
*** Info: Please enter a password for new user cyg_server.  Please be sure
*** Info: that this password matches the password rules given on your system.
*** Info: Entering no password will exit the configuration.
*** Query: Please enter the password:
*** Query: Reenter:
*** Info: User 'cyg_server' has been created with password 'cyg_server'.
*** Info: If you change the password, please remember also to change the
*** Info: password for the installed services which use (or will soon use)
*** Info: the 'cyg_server' account.
*** Info: Also keep in mind that the user 'cyg_server' needs read permissions
*** Info: on all users' relevant files for the services running as 'cyg_server'.
*** Info: In particular, for the sshd server all users' .ssh/authorized_keys
*** Info: files must have appropriate permissions to allow public key
*** Info: authentication.  (Re-)running ssh-user-config for each user will set
*** Info: these permissions correctly.  [Similar restrictions apply, for
*** Info: instance, for .rhosts files if the rshd server is running, etc].

*** Info: The sshd service has been installed under the 'cyg_server'
*** Info: account.  To start the service now, call `net start sshd' or
*** Info: `cygrunsrv -S sshd'.  Otherwise, it will start automatically
*** Info: after the next reboot.
*** Info: Host configuration finished.  Have fun!

The script above prompts you to create a user cyg_server and asks for that user's password. I entered cyg_server, the same as the user name; it will be needed later.
Note that creating the cyg_server user is mandatory. Without it, sshd will not work even though it is installed, and later you will get a "connection closed" error when you try to use it. I got stuck here and wasted a lot of time.

Now look in the Windows Services panel: a new service named CYGWIN sshd has appeared. You can set it to start manually; then start it.
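The service can also be started from a terminal opened as administrator; both commands were suggested at the end of the ssh-host-config output above:

net start sshd
# or, using Cygwin's own service helper:
cygrunsrv -S sshd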

Go back to the Cygwin terminal and execute the ssh localhost command.



Type yes at the first prompt; when the second prompt asks for a password, enter the user password set above.

Run ssh-keygen in Cygwin and press Enter at every prompt.


Then, under Cygwin, execute the following commands:

cd ~/.ssh
cp id_rsa.pub authorized_keys

When that is done, exit the Cygwin environment, open it again, and run ssh localhost. As the figure below shows, you can now log in without a password, which means the setup succeeded.
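For reference, this is an equivalent variant of the key setup that appends to authorized_keys instead of overwriting it, then re-tests the login; the chmod line is my own precaution and was not part of the original steps:

cd ~/.ssh
cat id_rsa.pub >> authorized_keys   # append rather than replace any existing keys
chmod 600 authorized_keys           # restrict the file to the current user
ssh localhost                       # should log in without asking for a password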


Two: Deploying Hadoop

I use the first generation of Hadoop here, with the plain NameNode, DataNode, JobTracker, TaskTracker, and SecondaryNameNode daemons. I used a downloaded 0.20.2 release; you can also get it from the Apache Hadoop website.

After decompressing the Hadoop package, put it into the Cygwin home directory.
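A minimal sketch of that step from inside the Cygwin terminal, assuming the archive is named hadoop-0.20.2.tar.gz and has been copied into the Cygwin home directory (the file name is my assumption based on the version used here):

cd ~
tar -xzf hadoop-0.20.2.tar.gz   # unpack into ~/hadoop-0.20.2
cd hadoop-0.20.2
ls bin conf                     # the scripts and config files edited below live here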


Next, configure a few things. First, a JDK is required, and one point deserves priority: the JDK is usually installed under C:\Program Files, and the space in that path will cause errors later. Articles online suggest creating soft links or escaping with backslashes; I tried those and they did not work well. My suggestion is simpler: move the JDK out of Program Files into a separate folder whose path contains no spaces.

I will not go over the JDK environment variable configuration here. Add the following to PATH, adjusted to your own installation directory:

; C:\cygwin64\bin; C:\cygwin64\usr\sbin;

Then, still in Environment Variables, create a new variable named CYGWIN and give it the value: ntsec tty
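A quick sanity check (my own addition, not part of the original steps) is to reopen a Cygwin terminal and confirm that everything is reachable:

java -version    # the JDK, now outside "Program Files"
ssh -V           # openssh installed from the Net category
which sed        # sed installed from the Base category
echo $CYGWIN     # should print: ntsec tty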


Modify some of the configuration files for Hadoop:

hadoop-env.sh: remove the leading # from the JAVA_HOME line and set it:

export JAVA_HOME=/java/jdk1.7.0_45

core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>


Open Cygwin again from its icon and change into the Hadoop directory.


Enter hadoop namenode -format to format the HDFS filesystem, and then start everything with start-all.sh (see the sketch below).
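Spelled out, assuming Hadoop was unpacked to ~/hadoop-0.20.2 as above, the two commands look like this:

cd ~/hadoop-0.20.2
bin/hadoop namenode -format   # format HDFS; only needed the first time
bin/start-all.sh              # start namenode, datanode, jobtracker and tasktracker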



I ran into a problem here: the NameNode did not come up, only the MapReduce daemons started, and its web address could not be reached. In that case look at the logs: under hadoop/logs there is a dedicated NameNode log.


The log said that port 9000 was already in use. I opened an administrator command prompt and ran netstat -aon | findstr "9000", which showed a process occupying port 9000. Opening Task Manager and locating that process showed it was a PPTV process, which I ended. You may not run into this exact situation, but if a daemon does not start, check the log first; if a port is occupied, end the offending process before starting Hadoop again.
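The equivalent commands can also be run from a Cygwin prompt opened as administrator, since netstat, findstr and taskkill are ordinary Windows tools; the PID below is a placeholder for whatever netstat reports:

netstat -aon | findstr "9000"   # find the PID that holds port 9000
taskkill /PID 1234 /F           # replace 1234 with that PID; same effect as ending it in Task Manager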

Start again



This time the NameNode came up. Entering http://localhost:50070 and http://localhost:50030 in a browser gives access to the HDFS and MapReduce web interfaces respectively.
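As a final check (my own addition, assuming the stock examples jar that ships with the 0.20.2 release), the standard grep example from the Hadoop quick start can be run against the conf directory:

cd ~/hadoop-0.20.2
bin/hadoop fs -put conf input                                    # copy the conf directory into HDFS as "input"
bin/hadoop jar hadoop-0.20.2-examples.jar grep input output 'dfs[a-z.]+'
bin/hadoop fs -cat output/*                                      # print the result of the MapReduce job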





With that, a simple Hadoop environment is up and running. For a beginner like me, getting this far deserves a small round of applause.








