Hadoop HDFS Distributed File System

Source: Internet
Author: User

Use three machines to build a fully distributed HDFS cluster: 201 (NameNode), 202 (DataNode), 203 (DataNode)

Overall architecture

NameNode (192.168.1.201)

DataNode (192.168.1.202, 192.168.1.203)

SecondaryNameNode (192.168.1.202)

1. Download the Hadoop package from the official website and upload it to the Linux system

hadoop-1.2.1.tar.gz

Extract

tar -zxvf hadoop-1.2.1.tar.gz

The Linux servers require a JDK environment.
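A quick way to confirm a JDK is present (a sketch; the path matches the JAVA_HOME used later in hadoop-env.sh):

# check the installed Java version and locate the JDK
java -version
ls /usr/java/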

Because the name is long, you can create a soft link:

ln -sf /root/hadoop-1.2.1 /home/hadoop-1.2

2. Modify the core-site.xml configuration file

vi /home/hadoop-1.2/conf/core-site.xml

Configure the NameNode host and port, and the working directory:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.201:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-1.2</value>
  </property>
</configuration>

The default working directory is under /tmp, and /tmp is emptied when the Linux system boots, so hadoop.tmp.dir should point somewhere else.

The default values can be seen in the documentation shipped with the extracted Hadoop archive:

/hadoop-1.2.1/docs/core-default.html

By default the HDFS working directories are derived from this temp directory.
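For reference, the corresponding defaults in Hadoop 1.x (from hdfs-default.xml; quoted from memory, so verify against the shipped docs) derive the NameNode and DataNode directories from hadoop.tmp.dir:

<property>
  <name>dfs.name.dir</name>
  <value>${hadoop.tmp.dir}/dfs/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>${hadoop.tmp.dir}/dfs/data</value>
</property>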

  

3. Configure conf/hdfs-site.xml

Configure dfs.replication, the number of block replicas. Since 202 and 203 are the only DataNodes, the replica count must be <= 2.

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

4. Configure the DataNode nodes

vi conf/slaves (host names can be used instead of IP addresses)
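With 202 and 203 as the DataNodes, conf/slaves would contain the following (host names would work equally well):

192.168.1.202
192.168.1.203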

5. Configure the SecondaryNameNode; note that it cannot be on the same machine as the NameNode

vi conf/masters

192.168.1.202

6. Configure Password-free login

With password-free login, you can enter a command on one machine to start processes on all machines.

Without password-free login, you would need to run the start command on each machine separately.

Configure password-free login on 201.

Generate a key pair on 201:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

The keys are generated under the .ssh directory:

# ls ~/.ssh
authorized_keys  id_dsa  id_dsa.pub

id_dsa is the private key; id_dsa.pub is the public key.

Configure password-free login for a single machine

Execute the following command

" -F ~/.ssh/~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Configure password-free login across nodes

First generate the id_dsa.pub public key with the ssh-keygen command shown above (if it has not been generated already).

Copy id_dsa.pub to the machine that you want to log on to without a password:

scp ~/.ssh/id_dsa.pub root@192.168.1.202:~

Append id_dsa.pub to the authorized_keys file on 192.168.1.202:

$ cat ~/id_dsa.pub >> ~/.ssh/authorized_keys

Use more authorized_keys to verify that the key was appended.

From 201, log on to 202 using ssh 192.168.1.202 (port 22).

You need to set up local password-free login first, then cross-node password-free login.

The result of this configuration is password-free login 201 --> 202 and 201 --> 203; if the reverse direction is also needed, repeat the steps above in the opposite direction.
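For reference, assuming the root user and the file locations above, the full forward sequence from 201 looks roughly like this (a sketch, not verbatim from the original setup):

# on 201: copy the public key to each DataNode
scp ~/.ssh/id_dsa.pub root@192.168.1.202:~
scp ~/.ssh/id_dsa.pub root@192.168.1.203:~
# on 202 and again on 203: append it to authorized_keys (create ~/.ssh first if it does not exist)
mkdir -p ~/.ssh
cat ~/id_dsa.pub >> ~/.ssh/authorized_keys
# back on 201: both logins should now work without a password
ssh 192.168.1.202
ssh 192.168.1.203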

7. All nodes are configured identically

Copy the compressed package to the other nodes (202 and 203):

scp ~/hadoop-1.2.1.tar.gz root@192.168.1.202:~/

Extract

tar -zxvf hadoop-1.2.1.tar.gz

Create a soft link:

ln -sf /root/hadoop-1.2.1 /home/hadoop-1.2

Format the NameNode (run on 201 from the bin directory):

./hadoop namenode -format
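If the format succeeds, the NameNode metadata should appear under the working directory set in core-site.xml (a sketch, assuming the default dfs.name.dir of ${hadoop.tmp.dir}/dfs/name):

ls /opt/hadoop-1.2/dfs/name/current
# expect files such as fsimage, edits and VERSION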

Configure JAVA_HOME in conf/hadoop-env.sh:

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.  Required.
export JAVA_HOME=/usr/java/jdk1.7.0_75

# Extra Java CLASSPATH elements.  Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HADOOP_HEAPSIZE=2000

# Extra Java runtime options.  Empty by default.
# export HADOOP_OPTS=-server

# Command specific options appended to HADOOP_OPTS when specified

Copy the finished configuration files to the other machines (run from /home/hadoop-1.2/conf on 201; copy to 202 and 203):

scp ./* root@192.168.1.202:/home/hadoop-1.2/conf/
scp ./* root@192.168.1.203:/home/hadoop-1.2/conf/

Start HDFS (on 201, from the bin directory):

./start-dfs.sh

The firewall needs to be shut down before starting:

service iptables stop

After starting, you can use jps to see whether everything started successfully.
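Given the architecture above, the expected jps output on each node would look roughly like this (process IDs omitted; a sketch for verification only):

# on 201
NameNode
Jps
# on 202
DataNode
SecondaryNameNode
Jps
# on 203
DataNode
Jps

The NameNode web UI (by default at http://192.168.1.201:50070) is another quick way to check that the DataNodes have registered.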

