Hadoop (4): Using VMware to build your own Hadoop cluster


Objective:

Some time ago I learned how to deploy Hadoop in pseudo-distributed mode. Work got busy and my study stalled for a while, so today I am taking the time to write up my recent results and share them with you.

This article covers how to use VMware to build a Hadoop cluster of your own. If you want to read about pseudo-distributed deployment or Hadoop programming in Eclipse, refer to my previous three articles:

1. Deploying Hadoop in pseudo-distributed mode on Linux (with passwordless SSH login) and running the WordCount example: http://www.cnblogs.com/PurpleDream/p/4009070.html

2. Building the Hadoop Eclipse plugin yourself: http://www.cnblogs.com/PurpleDream/p/4014751.html

3. Accessing Hadoop from Eclipse and running WordCount: http://www.cnblogs.com/PurpleDream/p/4021191.html

===============================================================

Body:

In my previous Hadoop articles, written when I was first learning Hadoop, I focused on how to deploy Hadoop in pseudo-distributed mode on Linux, how to compile the Hadoop Eclipse plugin myself, and how to set up a Hadoop programming environment in Eclipse. If you need them, click the links to the three articles listed in the preface.

The goal this time is to use VMware to build a Hadoop cluster of your own. I chose VMware 10; for its installation steps you can search online, where resources are plentiful.

If during the installation you run into an error I have not mentioned, check the three problems listed at the bottom of this article to see whether the solution is there; if not, search the internet yourself.

Step 1: determine the target cluster layout:

Master   192.168.224.100   CentOS
Slave1   192.168.224.201   CentOS
Slave2   192.168.224.202   CentOS

Master will run the NameNode and JobTracker; Slave1 and Slave2 will run the DataNode and TaskTracker.

Step 2: configure the virtual network. Click "Edit" in the VMware menu bar, select "Virtual Network Editor", and set the options in the dialog that pops up. Then click "NAT Settings" and configure it as well, making the subnet match the addresses planned in step 1 (192.168.224.0, netmask 255.255.255.0).

Step 3: confirm that the VMware services have been started. This is very important; otherwise the later operations will fail.

Step 4: create a CentOS 6.5 virtual machine in VMware. For details, refer to my other article: http://www.cnblogs.com/PurpleDream/p/4263465.html

Step 5: after step 4, our first virtual machine, the Master, has been created. Now configure its network and hostname. The detailed steps are as follows:

(1). Turn off SELinux: vi /etc/selinux/config, set SELINUX=disabled, then save and exit.

(2). Turn off the firewall: /sbin/service iptables stop; chkconfig iptables off. Afterwards, run service iptables status to check the firewall's state.

(3). Change the IP address to a static address: vi /etc/sysconfig/network-scripts/ifcfg-eth0 and edit its contents as shown below. Note the HWADDR line: the value in the virtual machine you created is likely different, so keep the original value and do not modify it.
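
The screenshot that used to be here is gone, so below is a sketch of what the master's ifcfg-eth0 should look like. The NETMASK, GATEWAY, and DNS1 values are my assumptions based on VMware's usual NAT defaults (the NAT gateway is normally the .2 address); verify them against the "NAT Settings" from step 2.

    DEVICE=eth0
    HWADDR=00:0C:29:XX:XX:XX   # placeholder: keep the value already in your file
    TYPE=Ethernet
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=192.168.224.100
    NETMASK=255.255.255.0      # assumed /24 subnet from the plan in step 1
    GATEWAY=192.168.224.2      # assumed VMware NAT gateway; check "NAT Settings"
    DNS1=192.168.224.2         # assumption; any reachable DNS server will do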

(4). Modify the hostname: vi /etc/sysconfig/network, for example:
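
The screenshot is missing here as well; on CentOS 6 the file only needs two lines (hostname "master" matching our plan):

    NETWORKING=yes
    HOSTNAME=master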

(5). Modify the hosts mapping: vi /etc/hosts. Here we also add the IP mappings for slave1 and slave2, which will be convenient later, for example:
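
For reference, these are the mapping lines to add, matching the IP plan from step 1:

    192.168.224.100 master
    192.168.224.201 slave1
    192.168.224.202 slave2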

(6). Run service network restart to restart the network. This step is necessary, so please do not skip it.

Step 7: install the PuTTY tools. You can search for them directly on Baidu; download and extract them to a directory of your own. We will use pscp.exe from that directory.

Step 8: install the JDK. The detailed steps are as follows:

(1). I downloaded jdk-6u45-linux-i586.bin from the internet and put it in the directory D:\SettingUp\ITSettingUp\Java\JDK\JDK1.6 (linux32). You can choose the directory according to your own situation; I spell mine out here to make the pscp upload below easier to explain.

(2). Open cmd, change to the PuTTY directory, and run the command below; if prompted for a password, enter the password of the virtual machine's root account. The pscp command takes two arguments: the first is the local JDK path, the second is the destination path on the virtual machine. I created the nested folders /myself_settings/jdk1.6 on the virtual machine in advance:

pscp "D:\SettingUp\ITSettingUp\Java\JDK\JDK1.6 (linux32)\jdk-6u45-linux-i586.bin" root@192.168.224.100:/myself_settings/jdk1.6

(3). On the virtual machine, go to the directory holding the JDK, /myself_settings/jdk1.6, and run ./jdk-6u45-linux-i586.bin (you may need to run chmod +x jdk-6u45-linux-i586.bin first, since pscp does not preserve the execute bit). Wait for the installation to complete.

(4). Modify the environment variables: vi ~/.bash_profile and append at the end, for example:
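
The screenshot is missing, so here is a sketch of the lines to append. The unpacked directory name jdk1.6.0_45 is an assumption based on the installer version; check what the .bin actually created under /myself_settings/jdk1.6:

    export JAVA_HOME=/myself_settings/jdk1.6/jdk1.6.0_45
    export PATH=$JAVA_HOME/bin:$PATH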

(5). Run source ~/.bash_profile to make the configuration take effect, then run java -version to check whether the JDK is configured correctly.

Step 9: install Hadoop. The steps are as follows:

(1). Download Hadoop; I downloaded hadoop-1.0.1.tar.gz from the internet and put it on my machine at D:\SettingUp\ITSettingUp\Hadoop\hadoop-1.0.

(2). Open cmd, change to the PuTTY directory, and run the command below; if prompted for a password, enter the password of the virtual machine's root account.

pscp D:\SettingUp\ITSettingUp\Hadoop\hadoop-1.0\hadoop-1.0.1.tar.gz root@192.168.224.100:/myself_settings/hadoop1.0

(3). On the virtual machine, go to the directory holding Hadoop, /myself_settings/hadoop1.0, and run tar -xzvf hadoop-1.0.1.tar.gz to unpack the file.

(4). Enter the directory unpacked in (3), then go into the conf folder to do the configuration. Run vi hadoop-env.sh, uncomment the JAVA_HOME line, and change it to the following setting:
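
With the original image gone, the uncommented line should look roughly like this, pointing at the JDK installed in step 8 (the directory name is the same assumption as before):

    export JAVA_HOME=/myself_settings/jdk1.6/jdk1.6.0_45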

(5). Add environment variables: vi ~/.bash_profile, for example:
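
A sketch of the lines to append, assuming the tarball unpacked to hadoop-1.0.1 as in (3):

    export HADOOP_HOME=/myself_settings/hadoop1.0/hadoop-1.0.1
    export PATH=$HADOOP_HOME/bin:$PATH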

(6). Open the conf file core-site.xml with vi core-site.xml and edit it, for example:
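
The original screenshot is lost; for Hadoop 1.x a minimal core-site.xml conventionally looks like the sketch below. The port 9000 and the hadoop.tmp.dir path are my assumptions, not copied from the original:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/myself_settings/hadoop1.0/tmp</value>
      </property>
    </configuration>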

(7). Open the conf file hdfs-site.xml with vi hdfs-site.xml and edit it, for example:
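
Again a sketch in place of the lost screenshot; a replication factor of 2 matches our two DataNodes, but the value is my assumption:

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
    </configuration>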

(8). Open the conf file mapred-site.xml with vi mapred-site.xml and edit it, for example:
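
A sketch of the conventional Hadoop 1.x JobTracker setting; the port 9001 is an assumption:

    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>master:9001</value>
      </property>
    </configuration>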

(9). Open the conf file masters with vi masters and edit it, for example:
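
Given our plan, the masters file holds just the one hostname:

    master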

(10). Open the conf file slaves with vi slaves and edit it, for example:
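
And the slaves file lists the two DataNode hosts:

    slave1
    slave2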

Step 10: with the steps above, the first virtual machine is fully configured. Now we clone two virtual machines from it to serve as Slave1 and Slave2. The detailed steps are as follows (a condensed recap follows the list):

(1). In the virtual machine list on the left side of VMware, select the first virtual machine, right-click and choose "Manage" > "Clone". In the wizard, select "the current state in the virtual machine", click Next, choose "Create a full clone", click Next, set the virtual machine name and installation directory, and click Finish. Then perform the following on both new virtual machines.

(2). Run rm -f /etc/udev/rules.d/70-persistent-net.rules, so that the clone forgets the original machine's MAC-to-interface binding.

(3). Run reboot to restart the virtual machine.

(4). Run vi /etc/sysconfig/networking/devices/ifcfg-eth0 and change HWADDR to the new virtual machine's NIC MAC address. To see that address: select the virtual machine, right-click and choose "Settings", then find it in the network adapter's advanced settings in the panel that pops up.

(5). In the same file as in (4), change IPADDR to 192.168.224.201 (for Slave1) or 192.168.224.202 (for Slave2).

(6). Modify /etc/sysconfig/network on Slave1 and Slave2, changing HOSTNAME to slave1 or slave2 respectively.

(7). On both virtual machines, run service network restart to restart the network.
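
Condensed, the per-clone fix-up looks like the sketch below (shown for Slave1; for Slave2 use 192.168.224.202 and slave2):

    rm -f /etc/udev/rules.d/70-persistent-net.rules
    reboot
    # after the reboot:
    vi /etc/sysconfig/networking/devices/ifcfg-eth0   # HWADDR = new MAC, IPADDR=192.168.224.201
    vi /etc/sysconfig/network                         # HOSTNAME=slave1
    service network restart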

Step 11: after the steps above, the three virtual machines are basically configured, but one important step remains: passwordless SSH login. I ran into problems here myself, so I will explain it in detail.

Note: because I hit a problem here the first time and did not record it, I rebuilt two VMs for this demonstration, testone and testtwo. What I want to achieve is passwordless login from testone to testtwo; configuring our master to log in to Slave1 and Slave2 without a password works exactly the same way.

(1). First, on the testone virtual machine, enter the ~/.ssh directory with cd ~/.ssh; you will see a known_hosts file there.

(2). In the ~/.ssh folder, run ssh-keygen -t dsa; when asked for the file in which to save the key, I entered id_dsa.

(3). In the ~/.ssh folder, run cat id_dsa.pub >> authorized_keys.

(4). In the ~/.ssh folder, copy the key file you just generated to the testtwo machine with scp authorized_keys testtwo:~/.ssh; you will need to enter testtwo's password during this step.

(5). After the four steps above, run ssh testtwo; you should be able to log in from testone to testtwo without entering a password.
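
Putting the five substeps together, the whole exchange run from testone is short (assuming the same user, here root, on both machines, and that ~/.ssh already exists on testtwo):

    cd ~/.ssh
    ssh-keygen -t dsa                    # enter id_dsa as the file name, empty passphrase
    cat id_dsa.pub >> authorized_keys
    scp authorized_keys testtwo:~/.ssh   # enter testtwo's password one last time
    ssh testtwo                          # should now log in without a password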

Step 12: at this point the virtual machine configuration is complete. Run hadoop namenode -format on the master (only the NameNode needs formatting), then enter the bin directory under the Hadoop installation directory and run ./start-all.sh. You can then open a browser on the host and visit 192.168.224.100:50070; if the page displays normally, the cluster started correctly. You can also run the jps command on the master and on the slaves to verify that startup succeeded.
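
For reference, the commands on the master look like this (paths follow the step 9 install; since $HADOOP_HOME/bin is on the PATH after the .bash_profile change, the cd is optional):

    hadoop namenode -format
    cd /myself_settings/hadoop1.0/hadoop-1.0.1/bin
    ./start-all.sh
    jps   # master should show NameNode, SecondaryNameNode, and JobTracker;
          # the slaves should show DataNode and TaskTracker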

With the twelve steps above done, I believe your own Hadoop cluster is up. Afterwards, you can refer to the articles listed at the beginning and add a DFS Location in Eclipse pointing at this cluster. Along the way you may encounter some problems; refer to the articles I list below:

1. When accessing the Hadoop cluster from Eclipse, the error org.apache.hadoop.security.AccessControlException: Permission denied: user=DrWho, access=WRITE appears. Refer to:

Solution: http://www.cnblogs.com/acmy/archive/2011/10/28/2227901.html

2. When starting Hadoop, you see the hint Warning: $HADOOP_HOME is deprecated. This does not affect use; if you want to get rid of it, refer to:

Solution: http://chenzhou123520.iteye.com/blog/1826002

3. If service network restart fails after you finish setting up the network, with an error like "device eth0 does not seem to be present", do the following:

Solution: reopen vi /etc/sysconfig/network-scripts/ifcfg-eth0, change the value of DEVICE to eth1 or similar, and restart the network; the error should go away.

