Building a Hadoop Development Environment with Cygwin


This article does not go into the finer conceptual details; for background on Cygwin and Hadoop, refer to other articles. It walks through everything from downloading Cygwin to building the Hadoop environment. The screenshots are partly taken from online material, because I did not save my own while deploying, but the steps are the same.

Hadoop is a huge ecosystem covering dozens of technical topics, but every journey begins with a single step. For a technology novice like me, a fully distributed Hadoop environment is out of reach: I do not really know Linux, and I do not have enough machines to build a fully distributed cluster anyway. Cygwin, however, lets me do without a native Linux environment, since it simulates Linux on Windows. So the next step is to download Cygwin, which is available from its official website.

One: Installing Cygwin

Run the downloaded EXE file


Click Next



The first option downloads and installs directly; the others only download the packages locally without installing them. The recommended default is the first one. Click Next.



By default Cygwin is installed on the C drive, but you can put it elsewhere. Click Next.




This is where the downloaded packages are stored; you can point it at another disk. Click Next.



Keep the first option (the default) and click Next.



Here you choose the download mirror. The default works but can be very slow; depending on your network it may take half a day. I recommend http://mirrors.163.com, which was the fastest one for me. If it is not in the list, click the Add button to add it, then select it and click Next.




Here you select the specific components. Click the "Skip" label in front of a package and the current version number appears in its place, which means that package is selected. I selected:

From Devel: binutils, gcc-core, gcc-g++, gcc-mingw-core, gcc-mingw-g++, gdb

From Net: the openssh and openssl packages, which provide the SSH access Hadoop requires (used in the steps below)

From Base: sed, used later for Eclipse-based Hadoop development

You can also select other packages such as vim, depending on your needs. Even if a package is not selected now, you can add it after installation. It is best to download from the first mirror you chose; I have not tried the others, so I cannot say whether they cause problems. Once this is done, click Next to start downloading. With the 163 mirror the download took me less than 10 minutes.
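If you want to double-check later which packages actually ended up installed, a minimal check from the Cygwin prompt looks like this (using a few of the package names chosen above):

# report the installed version of some of the selected packages
cygcheck -c openssh openssl gcc-core sed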


After the installation completes, a shortcut appears on the desktop. Click the icon.

Execute ssh-host-config in Cygwin

Then follow the prompts step by step:
*** Info: Generating missing SSH host keys
ssh-keygen: generating new host keys: RSA1 RSA DSA ECDSA ED25519
*** Info: Creating default /etc/ssh_config file
*** Info: Creating default /etc/sshd_config file
*** Info: StrictModes is set to 'yes' by default.
*** Info: This is the recommended setting, but it requires that the POSIX
*** Info: permissions of the user's home directory, the user's .ssh
*** Info: directory, and the user's ssh key files are tight so that
*** Info: only the user has write permissions.
*** Info: On the other hand, StrictModes don't work well with default
*** Info: Windows permissions of a home directory mounted with the
*** Info: 'noacl' option, and they don't work at all if the home
*** Info: directory is on a FAT or FAT32 partition.
*** Query: Should StrictModes be used? (yes/no) no
*** Info: Privilege separation is set to 'sandbox' by default since
*** Info: OpenSSH 6.1.  This is unsupported by Cygwin and has to be set
*** Info: to 'yes' or 'no'.
*** Info: However, using privilege separation requires a non-privileged account
*** Info: called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.
*** Query: Should privilege separation be used? (yes/no) no
*** Info: Updating /etc/sshd_config file
*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes
*** Query: Enter the value of CYGWIN for the daemon: []
*** Info: On Windows Server 2003, Windows Vista, and above, the
*** Info: SYSTEM account cannot setuid to other users -- a capability
*** Info: sshd requires.  You need to have or to create a privileged
*** Info: account.  This script will help you do so.
*** Info: You appear to be running Windows XP 64bit, Windows 2003 Server,
*** Info: or later.  On these systems, it's not possible to use the LocalSystem
*** Info: account for services that can change the user ID without an
*** Info: explicit password (such as passwordless logins [e.g. public key
*** Info: authentication] via sshd).
*** Info: If you want to enable that functionality, it's required to create
*** Info: a new account with special privileges (unless a similar account
*** Info: already exists).  This account is then used to run these special
*** Info: servers.
*** Info: Note that creating a new user requires that the current account
*** Info: have Administrator privileges itself.
*** Info: No privileged account could be found.
*** Info: This script plans to use 'cyg_server'.
*** Info: 'cyg_server' will only be used by registered services.
*** Query: Do you want to use a different name? (yes/no) no
*** Query: Create new privileged user account 'cyg_server'? (yes/no) yes
*** Info: Please enter a password for new user cyg_server.  Please be sure
*** Info: that this password matches the password rules given on your system.
*** Info: Entering no password will exit the configuration.
*** Query: Please enter the password:
*** Query: Reenter:
*** Info: User 'cyg_server' has been created with password 'cyg_server'.
*** Info: If you change the password, please remember also to change the
*** Info: password for the installed services which use (or will soon use)
*** Info: the 'cyg_server' account.
*** Info: Also keep in mind that the user 'cyg_server' needs read permissions
*** Info: on all users' relevant files for the services running as 'cyg_server'.
*** Info: In particular, for the sshd server all users' .ssh/authorized_keys
*** Info: files must have appropriate permissions to allow public key
*** Info: authentication. (Re-)running ssh-user-config for each user will set
*** Info: these permissions correctly. [Similar restrictions apply, for
*** Info: instance, for .rhosts files if the rshd server is running, etc].
*** Info: The sshd service has been installed under the 'cyg_server'
*** Info: account.  To start the service now, call `net start sshd' or
*** Info: `cygrunsrv -S sshd'.  Otherwise, it will start automatically
*** Info: after the next reboot.
*** Info: Host configuration finished. Have fun!

The script above prompts you to create a user cyg_server and asks for that user's password. Here we enter a password identical to the user name, cyg_server; it will be needed later.
Note that creating the cyg_server user is mandatory. Without it, sshd will not work properly, and later use will fail with a "connection closed" error. I stumbled here and wasted a lot of time.

Now open the Windows Services panel and you will see a new service, CYGWIN sshd. You can set it to start manually, and then start it.
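If you prefer to start the service from a command line rather than from the Services panel, the two commands mentioned at the end of the ssh-host-config output do the same thing:

net start sshd
(or, from the Cygwin prompt)
cygrunsrv -S sshd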

Go back to the Cygwin environment and execute the ssh localhost command.



At the first prompt enter yes; at the second prompt enter the password, i.e. the user password that was set up earlier.

Enter ssh-keygen in Cygwin and press Enter at every prompt to accept the defaults.


Then, under Cygwin, execute the following command in turn:

cd ~/.ssh
cp id_rsa.pub authorized_keys

When that is done, exit the Cygwin environment, open it again, and run ssh localhost. If you can log in without being asked for a password, the setup succeeded.
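A small aside that is not part of the original steps: if an authorized_keys file already exists (for example because another key was added earlier), appending the public key is safer than overwriting the whole file. Assuming the default key file name id_rsa.pub:

cd ~/.ssh
# append rather than replace, so existing authorized keys are kept
cat id_rsa.pub >> authorized_keys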


Two: Deploying Hadoop

I use the first-generation Hadoop, with the familiar NameNode, DataNode, JobTracker, TaskTracker, and SecondaryNameNode daemons. I provide a downloaded 0.20.2 release here; you can also get it from the official Apache Hadoop website.

Unpack the Hadoop package into the Cygwin directory


Next comes some configuration. First, a JDK is required. One point up front: we usually install the JDK under C:\Program Files, but that path contains a space, which causes errors here. Online advice suggests symbolic links, escaping with backslashes, and so on; I tried these and they did not work for me. My suggestion is simply to move the JDK out of the Program Files folder into a separate folder whose path contains no spaces.

I will not go over the JDK environment variable configuration again; add it to PATH according to your own installation directory. Also append the Cygwin directories to PATH:

;C:\cygwin64\bin;C:\cygwin64\usr\sbin;

Then create a new environment variable named CYGWIN and set its value to: ntsec tty
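A quick sanity check of these settings can be done from a freshly opened Cygwin prompt; this is only a sketch and assumes the JDK and Cygwin paths configured above:

java -version       # should print the JDK version rather than "command not found"
which ssh sed gcc   # should resolve to the Cygwin binaries under /usr/bin
echo $CYGWIN        # should print: ntsec tty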


Modify some of the Hadoop configuration files:

In hadoop-env.sh, remove the leading # to uncomment the JAVA_HOME line:

export JAVA_HOME=/java/jdk1.7.0_45
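Note that inside Cygwin a path beginning with / is resolved against the Cygwin root (C:\cygwin64 by default), so the line above points at something like C:\cygwin64\java\jdk1.7.0_45. If you keep the JDK elsewhere on the Windows side, Cygwin also exposes the drives under /cygdrive; a hypothetical example, assuming the JDK was moved to C:\java\jdk1.7.0_45:

# /cygdrive/c is how Cygwin refers to the Windows C:\ drive
export JAVA_HOME=/cygdrive/c/java/jdk1.7.0_45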

core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>


Open the Cygwin shortcut again and change into the Hadoop directory.


Enter hadoop namenode -format to format the HDFS filesystem, then run start-all.sh to start everything.
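Spelled out as commands run from the Hadoop directory (a sketch assuming the 0.20.2 layout, where the scripts live under bin/):

# one-time formatting of the HDFS namespace
bin/hadoop namenode -format
# starts NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker
bin/start-all.sh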



Here I ran into a problem: the NameNode did not come up; only MapReduce did, and the web address could not be reached either. This is where you look at the logs; under hadoop/logs there is a dedicated NameNode log.
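In the 0.20.x releases the daemon logs are named after the daemon, the user and the host, so the NameNode log can be inspected with something along these lines (the exact file name varies per machine):

cd logs
# show the last lines of the NameNode log; the * parts depend on your user and host names
tail -n 50 hadoop-*-namenode-*.log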


The log showed that my port 9000 was already in use. I opened an administrator prompt and ran netstat -aon | findstr "9000", which showed a process occupying port 9000. In Task Manager I found it was a PPTV process and ended it. You will not necessarily hit the same situation I did, but if a daemon does not come up, look at its log first, and if a port is occupied, end the offending process before going on.
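The same check and cleanup can also be done entirely from an administrator command prompt; <pid> below is a placeholder for whatever PID the netstat output shows:

REM find which process owns port 9000 (the last column is the PID)
netstat -aon | findstr "9000"
REM forcibly end that process, replacing <pid> with the PID printed above
taskkill /PID <pid> /F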

Start again



This time the NameNode came up. Entering http://localhost:50070 and http://localhost:50030 in a browser gives access to the HDFS and MapReduce web interfaces respectively.
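Another way to confirm that all five daemons are running is the jps tool that ships with the JDK:

# jps lists the running Java processes; on a healthy single-node setup you should
# see NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker
jps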





That completes a simple Hadoop environment. For a novice like me, getting this far deserves a pat on the back!








