Pseudo-distributed Hadoop 2.6.4

Source: Internet
Author: User

Description

Task: Build a Hadoop pseudo-distributed version.

Objective: to build a learning environment quickly, skip dwelling on the environment itself, get into a working state fast, and use some Hadoop components for real tasks.

Hadoop 2.7 was not chosen because it had more bugs and was less stable at the time.

A pseudo-distributed setup was chosen because it is simple and fast to build.

Environment:

Host: Windows 7, 8 GB RAM, 4 cores

VMware Workstation 12, one virtual machine with 3 GB of RAM

Ubuntu (kernel 4.4.0, x86-64)

Hadoop 2.6.4

JDK 1.7.0_80

1. Virtual Machine Linux Preparation

Install the virtual machine (cloning an existing one is fine), with the network in NAT mode.

Create a user named hadoop and configure sudo for it; the file settings involved are to be refined (search online).

All subsequent operations are performed as the hadoop user, using sudo wherever that user lacks permission.

1.1 Network/IP configuration (being lazy here, the default DHCP-assigned address is used; with multiple nodes, static IPs should be set, to be refined)

hadoop@ssmaster:~$ ifconfig
ens33     Link encap:Ethernet  HWaddr xx:0c:xx:2e:0f:xx
          inet addr:192.168.249.144  Bcast:192.168.249.255  Mask:255.255.255.0
          inet6 addr: fe80::…:dd35:2b5d:4dba/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:145870 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12833 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:209812987 (209.8 MB)  TX bytes:1827590 (1.8 MB)
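To address the potential IP-change issue, a static address can be set. The stanza below is a hypothetical example for Ubuntu releases that still use ifupdown; the gateway and DNS values are assumptions and must be checked against the VMware NAT configuration (e.g. with route -n):

```text
# /etc/network/interfaces: hypothetical static-IP stanza for ens33
auto ens33
iface ens33 inet static
    address 192.168.249.144
    netmask 255.255.255.0
    gateway 192.168.249.2
    # gateway/dns assumed; verify against the VMware NAT settings
    dns-nameservers 192.168.249.2
```

Restart networking (or the VM) after editing for the change to take effect.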

1.2 Host name Settings

Modify the following three places:

A

sudo vi /etc/hostname

hadoop@ssmaster:~$ more /etc/hostname
ssmaster

B

The hostname was originally ubuntu; change it for the current session:

hadoop@ubuntu:~$ sudo hostname ssmaster
hadoop@ubuntu:~$ hostname
ssmaster

C

sudo vi /etc/hosts

After modification:

127.0.0.1 localhost
#127.0.1.1 Ubuntu
192.168.249.144 ssmaster

2. Installing the JDK

Configuring Environment variables

Edit /etc/profile with vi, add the following at the end, and save:

export JAVA_HOME=/home/szb/hadoop/jdk1.7.0_80
export JRE_HOME=$JAVA_HOME/jre
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=./:$JAVA_HOME/lib:$JAVA_HOME/jre/lib

Run source /etc/profile for the changes to take effect.

Output like the following indicates a successful installation:

hadoop@ssmaster:~$ java -version
java version "1.7.0_80"
… (build 24.80-b11, mixed mode)
3. SSH Settings

First test ssh ssmaster (the current hostname, set earlier).

If a password is requested, passwordless login has not been set up yet.

Run the following commands, pressing Enter at each prompt:

hadoop@ssmaster:~$ cd ~
hadoop@ssmaster:~$ ssh-keygen -t rsa
hadoop@ssmaster:~/.ssh$ cp id_rsa.pub authorized_keys
hadoop@ssmaster:~/.ssh$ ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
hadoop@ssmaster:~/.ssh$ more authorized_keys
ssh-rsa AAAAB3NzaC1yc2E… hadoop@ssmaster

Test again; login should now succeed without a password:

ssh ssmaster

exit
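The cp above overwrites authorized_keys, which is fine for a first key. A sketch of an idempotent version follows; SSH_DIR is a helper variable introduced here for illustration, defaulting to ~/.ssh:

```shell
# Idempotent passwordless-SSH setup sketch; SSH_DIR defaults to ~/.ssh
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"
# Generate a key pair only if none exists; -N '' means an empty passphrase
[ -f "$SSH_DIR/id_rsa" ] || ssh-keygen -t rsa -N '' -f "$SSH_DIR/id_rsa" -q
# Append (rather than overwrite) the public key; sshd requires mode 600 here
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
```

Run once, and ssh ssmaster should then log in without a password.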

3. Preparing the Hadoop installation package

Download to any directory

Extract

tar -zxvf hadoop-2.6.4.tar.gz

Move the extracted directory to /opt:

sudo mv hadoop-2.6.4 /opt/

4. Configure Hadoop

4.1 Adding Hadoop paths to environment variables

Edit with sudo vi /etc/profile; after modification:

export HADOOP_HOME=/opt/hadoop-2.6.4
export JAVA_HOME=/home/szb/hadoop/jdk1.7.0_80
export JRE_HOME=$JAVA_HOME/jre
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export CLASSPATH=./:$JAVA_HOME/lib:$JAVA_HOME/jre/lib

Run source /etc/profile for the changes to take effect.

4.2 Creating the HDFS data storage directories

Create dfs/name and dfs/data in the Hadoop installation directory:

hadoop@ssmaster:/opt/hadoop-2.6.4$ pwd
/opt/hadoop-2.6.4
hadoop@ssmaster:/opt/hadoop-2.6.4$ mkdir dfs
hadoop@ssmaster:/opt/hadoop-2.6.4$ ls
bin  dfs  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share  tmp
hadoop@ssmaster:/opt/hadoop-2.6.4$ cd dfs
hadoop@ssmaster:/opt/hadoop-2.6.4/dfs$ mkdir name data
hadoop@ssmaster:/opt/hadoop-2.6.4/dfs$ ls
data  name
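The two mkdir steps can be collapsed into one with -p, which creates parent directories and is a no-op if they already exist. A sketch; HADOOP_HOME is a helper variable introduced here (on the real install it is /opt/hadoop-2.6.4; the default below is a scratch location so the sketch runs anywhere):

```shell
# Create both HDFS directories in one step.
# On the real machine: HADOOP_HOME=/opt/hadoop-2.6.4
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop-2.6.4}"
mkdir -p "$HADOOP_HOME/dfs/name" "$HADOOP_HOME/dfs/data"
```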

4.3 Adding the JDK path to the Hadoop *-env.sh scripts

Location: /opt/hadoop-2.6.4/etc/hadoop

Add the following line to each of the files below:

export JAVA_HOME=/home/szb/hadoop/jdk1.7.0_80

hadoop-env.sh

yarn-env.sh

mapred-env.sh
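Since the same line goes into all three files, a loop can do it. A sketch; HADOOP_CONF_DIR is a helper variable introduced here (on the real install it is /opt/hadoop-2.6.4/etc/hadoop; the default below is a scratch location, and the mkdir only exists so the sketch runs outside a real install):

```shell
# Append the JAVA_HOME export to each of the three env scripts.
# On the real machine: HADOOP_CONF_DIR=/opt/hadoop-2.6.4/etc/hadoop
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$HOME/hadoop-2.6.4/etc/hadoop}"
mkdir -p "$HADOOP_CONF_DIR"   # no-op on a real install
for f in hadoop-env.sh yarn-env.sh mapred-env.sh; do
  echo 'export JAVA_HOME=/home/szb/hadoop/jdk1.7.0_80' >> "$HADOOP_CONF_DIR/$f"
done
```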

4.4 Modifying the slaves File

Location: /opt/hadoop-2.6.4/etc/hadoop

Change the contents of the slaves file to the hostname; after modification:

hadoop@ssmaster:/opt/hadoop-2.6.4/etc/hadoop$ more slaves
ssmaster

4.5 Configuring the XML Files

4.5.1 core-site.xml

Content after modification:

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ssmaster:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop-2.6.4/tmp</value>
</property>

</configuration>

Note:

fs.defaultFS: the NameNode URI

hadoop.tmp.dir: directory for intermediate and temporary data

The above is a minimal core-site.xml configuration; every core-site.xml option is documented at: http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/core-default.xml

4.5.2 hdfs-site.xml

Content after modification:

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hadoop-2.6.4/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/hadoop-2.6.4/dfs/data</value>
</property>

</configuration>

Note:

dfs.replication: number of block replicas; 1 for pseudo-distributed, typically 3 for a real cluster

dfs.namenode.name.dir: NameNode data directory

dfs.datanode.data.dir: DataNode data directory

The above is a minimal hdfs-site.xml configuration; every hdfs-site.xml option is documented at: http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

4.5.3 mapred-site.xml

First copy mapred-site.xml.template to mapred-site.xml, then add:

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

Note:

mapreduce.framework.name: the framework MapReduce jobs run on; values other than yarn exist (e.g. local)

The above is a minimal mapred-site.xml configuration; every mapred-site.xml option is documented at: http://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

4.5.4 yarn-site.xml

Content after modification:

<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>ssmaster</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

Note:
yarn.resourcemanager.hostname: the ResourceManager node. (Guess: in a fully distributed setup this could be a different node from the NameNode; to be verified.)

yarn.nodemanager.aux-services: auxiliary services run by the NodeManager; mapreduce_shuffle enables the shuffle service that MapReduce jobs need.

The above is a minimal yarn-site.xml configuration; every yarn-site.xml option is documented at: http://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

5. Starting Hadoop

5.1 Formatting HDFS

hadoop@ssmaster:/opt/hadoop-2.6.4$ bin/hdfs namenode -format

A line like the following near the end of the log indicates success:

16/10/22 19:40:40 INFO common.Storage: Storage directory /opt/hadoop-2.6.4/dfs/name has been successfully formatted.

5.2 Starting HDFS

hadoop@ssmaster:/opt/hadoop-2.6.4$ sbin/start-dfs.sh
Starting namenodes on [ssmaster]
ssmaster: starting namenode, logging to /opt/hadoop-2.6.4/logs/hadoop-hadoop-namenode-ssmaster.out
ssmaster: starting datanode, logging to /opt/hadoop-2.6.4/logs/hadoop-hadoop-datanode-ssmaster.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:… (fingerprint omitted).
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop-2.6.4/logs/hadoop-hadoop-secondarynamenode-ssmaster.out
hadoop@ssmaster:/opt/hadoop-2.6.4$ jps
11151 DataNode
11042 NameNode
11349 SecondaryNameNode
11465 Jps

http://192.168.249.144:50070/

Note:

Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.

The Secondary NameNode address shows as 0.0.0.0; at the yes/no prompt, answer yes.

Where to configure this was not known at the time; something to revisit. [minor open issue]
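On the 0.0.0.0 question above: this is not from the original author, but in Hadoop 2.x the Secondary NameNode address comes from the dfs.namenode.secondary.http-address property (default 0.0.0.0:50090), so pinning it in hdfs-site.xml should make start-dfs.sh address the host by name:

```xml
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>ssmaster:50090</value>
</property>
```

After restarting HDFS, the startup message should then show ssmaster rather than 0.0.0.0.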

5.3 Starting YARN

hadoop@ssmaster:/opt/hadoop-2.6.4$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.6.4/logs/yarn-hadoop-resourcemanager-ssmaster.out
ssmaster: starting nodemanager, logging to /opt/hadoop-2.6.4/logs/yarn-hadoop-nodemanager-ssmaster.out
hadoop@ssmaster:/opt/hadoop-2.6.4$ jps
11151 DataNode
11042 NameNode
11714 Jps
11349 SecondaryNameNode
11675 NodeManager
11540 ResourceManager

http://192.168.249.144:8042/
http://192.168.249.144:8088/

Summary of the Hadoop web console ports:

50070: HDFS file management (NameNode web UI)

8088: ResourceManager

8042: NodeManager

If jps shows all the daemons started and the web pages above open, the installation succeeded.
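The jps check can be scripted. A sketch (jps must be on PATH on the real machine; anything not running simply prints DOWN):

```shell
# List which of the five expected daemons appear in jps output.
# grep -w keeps "NameNode" from matching inside "SecondaryNameNode".
expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"
running="$(jps 2>/dev/null || true)"
for d in $expected; do
  if echo "$running" | grep -qw "$d"; then
    echo "$d: up"
  else
    echo "$d: DOWN"
  fi
done
```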

6. Save the virtual machine image

Summary: the Hadoop pseudo-distributed build is an initial success.

Remaining issues: [for later study]
    • The network configuration was not set deliberately; the IP is assigned automatically by the VM, so it may change later
    • The role of each file and command involved in setting the hostname needs a closer look
    • On HDFS startup the Secondary NameNode address shows as 0.0.0.0 and a connection prompt appears; there must be a place to configure this

Follow-up:

    • Focus on using Hadoop: install Eclipse, common operations, calling jars
    • Build a Spark environment, common operations
    • Study a fully distributed build when time allows
    • Study the meaning of each Hadoop configuration parameter when time allows

Other:

Copying files between Linux systems:

scp hadoop-2.6.4.tar.gz <user>@<host>:~/

The configuration files produced by this installation were packed into a .rar archive; uploading it somewhere and linking it here remains to be done. [to be completed]

References:

Ref 1 (the main reference for this tutorial):

Building a pseudo-distributed environment with Hadoop 2.6.0
http://blog.csdn.net/stark_summer/article/details/43484545
