Building a Hadoop cluster environment on Linux

Source: Internet
Author: User
Tags: hadoop, fs

A few words up front

"The World martial arts, only fast not broken", but if not clear principle, fast is also futile. In this age of material desire, data explosion, bigdata era, if you are familiar with the entire Hadoop building process, we can also grab a bucket of gold?!

Preparation

• Two Linux virtual machines (this article uses Red Hat 5; the IPs are 192.168.1.210 and 192.168.1.211)

• A JDK environment (this article uses JDK 1.6; there are plenty of configuration guides online, so it is omitted here)

• The Hadoop installation package (this article uses Hadoop 1.0.4)

Goal

192.168.1.210 acts as the master (and also as a node machine); 192.168.1.211 acts as a node machine.

Build steps

1 Modify the hosts file

Add the following to /etc/hosts:

192.168.1.210 hadoop1
192.168.1.211 hadoop2
2 Set up passwordless SSH login

2.1 Passwordless login from the master to itself
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

Just press Enter at the prompts. When the command finishes, two files are generated in ~/.ssh/: id_dsa and id_dsa.pub. They come as a pair, like a lock and its key.
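You can quickly confirm that both files exist (a trivial check, not part of the original steps):

ls ~/.ssh   # should list id_dsa and id_dsa.pub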

Append id_dsa.pub to the authorized keys (at this point there is no authorized_keys file yet):

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Test it:

ssh localhost hostname

If you are still prompted for a password, it is usually a directory or file permission problem; check the system log to confirm.

The authorized_keys file under ~/.ssh must have permission 600, and its parent and grandparent directories should be 755.
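For reference, those permissions can be applied with something like the following (a minimal sketch; adjust the paths if your home directory differs):

chmod 600 ~/.ssh/authorized_keys   # the key file itself
chmod 755 ~/.ssh                   # parent directory
chmod 755 ~                        # grandparent (home) directory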

2.2 Passwordless login to the node machine (slave)

Execute on the slave:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

This creates the ~/.ssh directory on the slave.

Copy the authorized_keys file from the master to the slave (run this on the master):

scp ~/.ssh/authorized_keys hadoop2:~/.ssh/

Test: on the master, run

ssh hadoop2

You should be logged in without being asked for a password.

3 Configuring Hadoop

3.1 Copy Hadoop

Copy hadoop-1.0.4.tar.gz to the /usr/local folder and extract it.

Extraction command:

tar -zxvf hadoop-1.0.4.tar.gz
3.2 Check /etc/hosts

cat /etc/hosts

192.168.1.210 hadoop1
192.168.1.211 hadoop2
3.3 Configure conf/masters and conf/slaves

conf/masters:

192.168.1.210

conf/slaves:

192.168.1.210
192.168.1.211
3.4 Configure conf/hadoop-env.sh

Add:

export JAVA_HOME=/home/elvis/soft/jdk1.7.0_17
3.5 Configure conf/core-site.xml

Add:

<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.1.210:9000</value>
</property>
3.6 Configure conf/hdfs-site.xml

Add:

<property>
  <name>dfs.http.address</name>
  <value>192.168.1.210:50070</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/usr/local/hadoop/namenode</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/usr/local/hadoop/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
3.7 Configure conf/mapred-site.xml

Add:

<property>
  <name>mapred.job.tracker</name>
  <value>192.168.1.210:8012</value>
</property>

3.8 Create the related directory

Create /usr/local/hadoop/ (the Hadoop data and namenode directories will live under it).

Note: create only the hadoop directory; do not create the data and namenode directories by hand.

This directory must also be created on the other node machines.
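A minimal sketch of this step (assuming you have write access to /usr/local), to be repeated on every node machine:

mkdir -p /usr/local/hadoop   # do NOT create data/ or namenode/ underneath it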

3.9 Copy the Hadoop files to the other node machines

Copy the Hadoop directory to the other nodes over the network, so that the configuration done above is carried over to them.

Command:

scp -r hadoop-1.0.4 192.168.1.211:/usr/local/
3.10 Format the namenode on the active master (192.168.1.210)

Command:

bin/hadoop namenode -format
3.11 Start the cluster

bin/start-all.sh

The cluster should now be up. Check it with:

bin/hadoop dfsadmin -report

The report should show 2 datanodes. You can also take a look at the web UI.

In a browser, open: 192.168.1.210:50070

Once that page loads, the cluster installation is complete!
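As an extra sanity check, not part of the original steps but available with any JDK, run jps on each machine to confirm the daemons are running:

jps   # master: NameNode, SecondaryNameNode, JobTracker (plus DataNode/TaskTracker if it is also a node); slave: DataNode, TaskTracker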

FAQ

1 Bad connection to FS. Command aborted

You need to look at the log; mine showed:

2013-06-09 15:56:39,790 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: NameNode is not formatted.

at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:330)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)

The NameNode is not formatted!

Workaround:

The cause was that I had created /usr/local/hadoop/data and /usr/local/hadoop/namenode by hand. Remove those two directories, then reformat the namenode.
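A minimal sketch of that fix, assuming the directory layout from section 3.8 and that you are in the Hadoop installation directory:

rm -rf /usr/local/hadoop/data /usr/local/hadoop/namenode   # remove the hand-made directories
bin/hadoop namenode -format                                # then reformat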

2 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /usr/local/hadoop/data, expected: rwxr-xr-x, while actual: rwxrwxrwx

Workaround:

The permissions on the /usr/local/hadoop/data directory are too open; change them to 755 with chmod.
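For reference, the fix looks like this:

chmod 755 /usr/local/hadoop/data
ls -ld /usr/local/hadoop/data   # should now show drwxr-xr-x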

3 Eclipse Plugin Issues

Exception 1: 2011-08-03 17:52:26,244 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9800, call getListing(/home/fish/tmp20/mapred/system) from 192.168.2.101:2936: error: org.apache.hadoop.security.AccessControlException: Permission denied: user=drwho, access=READ_EXECUTE, inode="system":root:supergroup:rwx-wx-wx

org.apache.hadoop.security.AccessControlException: Permission denied: user=drwho, access=READ_EXECUTE, inode="system":root:supergroup:rwx-wx-wx

at org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
at org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:111)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4514)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:4474)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:1989)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getListing(NameNode.java:556)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

Workaround: add the following to hdfs-site.xml:

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
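Note that the namenode only picks up this change after a restart; a quick sketch, assuming the default scripts under bin/:

bin/stop-all.sh
bin/start-all.sh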

Common HDFS commands

Create a folder:

./hadoop fs -mkdir /usr/local/hadoop/godlike

Upload a file:

./hadoop fs -put (or -copyFromLocal) 1.txt /usr/local/hadoop/godlike

List the files in a folder:

./hadoop fs -ls /usr/local/hadoop/godlike

View a file's contents:

./hadoop fs -cat (or -text, -tail) /usr/local/hadoop/godlike/1.txt

Delete files:

./hadoop fs -rm /usr/local/hadoop/godlike

Delete a folder:

./hadoop fs -rmr /usr/local/hadoop/godlike

