Hadoop 2.6.0 Fully Distributed Installation

Source: Internet
Author: User
Tags: static class, openssh server, hadoop fs
1. Installation

Pre-installation preparation: prepare three Ubuntu 14.04 systems with OpenSSH server installed (you can also prepare just one and then clone the virtual machine, or use export/import). The three machines must be on the same network segment.

Start the installation.

1) Start the three virtual machines and modify each hostname:
sudo vim /etc/hostname

Name them, respectively:
Hadoopmaster
HadoopSlave1
HadoopSlave2

PS: The new hostname takes effect after a reboot.
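A quick way to confirm the change after the reboot (a simple check, not spelled out in the original steps):

cat /etc/hostname
hostname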

2) Install the JDK (on all 3 machines)

Here we install the Oracle JDK (via the webupd8team PPA):

sudo add-apt-repository ppa:webupd8team/java
sudo apt update
sudo apt install oracle-java7-installer

Configure the environment variables after installation:

vim ~/.bashrc

Append the following (the JDK installed via the PPA above lives in /usr/lib/jvm/java-7-oracle):

export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

source ~/.bashrc   (make the configuration take effect)
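To confirm the JDK and the variables are picked up (an extra check, not part of the original write-up):

java -version
echo $JAVA_HOME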
3) Modify the hosts file (the same change on all 3 machines)

sudo vim /etc/hosts

Append the following at the end:
10.13.7.10 hadoopmaster
10.13.7.11 HadoopSlave1
10.13.7.12 HadoopSlave2
Note: replace these IP addresses with the ones that actually correspond to your own hosts.
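Optionally verify that the names resolve and the machines can reach each other (an extra check, not in the original steps):

ping -c 3 HadoopSlave1
ping -c 3 HadoopSlave2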
4) Passwordless SSH login (the same operation on all three machines)

The following commands are entered on 10.13.7.10; adjust them accordingly on the other machines.

ssh-keygen   (press Enter at every prompt to accept the defaults)
ssh-copy-id persistence@10.13.7.10
ssh-copy-id persistence@10.13.7.11
ssh-copy-id persistence@10.13.7.12   (persistence is the user name, followed by each machine's IP)
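A quick way to confirm the keys work (an extra check, not in the original steps; each command should print the remote hostname without asking for a password):

ssh persistence@10.13.7.11 hostname
ssh persistence@10.13.7.12 hostname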

All three machines must do the above so that they can SSH to each other without passwords.

5) Download Hadoop 2.6.0 (on all three machines)

wget http://apache.fayea.com/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
6) Extract Hadoop and configure the related environment variables (on all three machines)

sudo tar -zxvf hadoop-2.6.0.tar.gz -C /usr/local   (extract to the /usr/local directory)
sudo mv /usr/local/hadoop-2.6.0 /usr/local/hadoop   (rename the directory)
sudo chown -R persistence:persistence /usr/local/hadoop   (change the owner and group; change persistence to your own user, same below)
/usr/local/hadoop/bin/hadoop   (check whether Hadoop is installed successfully)
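Running the binary with no arguments prints the usage help; to see the version instead (an optional check, not in the original):

/usr/local/hadoop/bin/hadoop version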
Add the following in ~/.bashrc (on all three machines):

vim ~/.bashrc

export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL

source ~/.bashrc

Validation: run hdfs; if you see the usage hint, the installation succeeded.

7) Create the directories Hadoop needs (on all three machines)

sudo mkdir /home/hadoop
sudo chown -R persistence:persistence /home/hadoop
mkdir /home/hadoop/hadoop-2.6.0
mkdir /home/hadoop/hadoop-2.6.0/tmp
mkdir /home/hadoop/hadoop-2.6.0/dfs
mkdir /home/hadoop/hadoop-2.6.0/dfs/name
mkdir /home/hadoop/hadoop-2.6.0/dfs/data
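An equivalent shortcut, if you prefer a single command for the same layout:

mkdir -p /home/hadoop/hadoop-2.6.0/{tmp,dfs/name,dfs/data}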
8) Modify the configuration files (important, do not make mistakes) (on all three machines)

vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Add: export JAVA_HOME=/usr/lib/jvm/java-7-oracle

vim /usr/local/hadoop/etc/hadoop/core-site.xml
Add the following inside <configuration></configuration>:
<property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoop-2.6.0/tmp</value>
        <description>A base for other temporary directories.</description>
</property>
<property>
        <name>fs.default.name</name>
        <value>hdfs://hadoopmaster:9000</value>
</property>
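Note (not in the original write-up): in Hadoop 2.x, fs.default.name is a deprecated alias; the same setting can also be written as:

<property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoopmaster:9000</value>
</property>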

vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Add the following inside <configuration></configuration>:
<property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/hadoop-2.6.0/dfs/name</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/hadoop-2.6.0/dfs/data</value>
    <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
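As with fs.default.name, dfs.name.dir and dfs.data.dir are the older property names; the Hadoop 2.x equivalents, if you prefer them, are dfs.namenode.name.dir and dfs.datanode.data.dir (an aside, not part of the original steps).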

vim /usr/local/hadoop/etc/hadoop/mapred-site.xml.template

Add the following inside <configuration></configuration>:

<property>
    <name>mapred.job.tracker</name>
    <value>hadoopmaster:9001</value>
    <description>Host or IP and port of the JobTracker.</description>
</property>

cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
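A side note, not part of the original write-up: mapred.job.tracker is the old MRv1 setting. Since the cluster is later started with start-yarn.sh and its web UI on port 8088 is used, a common Hadoop 2.x alternative is to run MapReduce on YARN, roughly like this:

In mapred-site.xml:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

In yarn-site.xml:
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoopmaster</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>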

vim /usr/local/hadoop/etc/hadoop/slaves
Delete localhost and add the following:
HadoopSlave1
HadoopSlave2

vim /usr/local/hadoop/etc/hadoop/masters
Add the following:
Hadoopmaster
9) Format the HDFS NameNode (on all three machines)
cd /usr/local/hadoop && bin/hdfs namenode -format
10) Start the Hadoop cluster (note: this step is done only on hadoopmaster)
/usr/local/hadoop/sbin/start-dfs.sh   (this starts it)
/usr/local/hadoop/sbin/stop-dfs.sh    (this stops it)
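The web UI on port 8088 mentioned below belongs to the YARN ResourceManager, so if that page is unreachable, also start YARN (this is done explicitly in the word-count section later):

/usr/local/hadoop/sbin/start-yarn.sh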

Run jps after startup completes to view the running processes.
If the master shows three processes and each slave shows two, the startup was successful.
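For reference, with only HDFS started the listing would look roughly like this (the process IDs are made up; jps lists itself as well):

On hadoopmaster:
  12345 NameNode
  12346 SecondaryNameNode
  12400 Jps

On each slave:
  23456 DataNode
  23500 Jps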

That concludes the Hadoop installation and configuration.

You can view Hadoop information in a browser through hadoopmaster's ip:8088
and hadoopmaster's ip:50070.

Below are a few simple HDFS operations (all executed on hadoopmaster):

hadoop fs -mkdir /input/   --> create a directory on HDFS
hadoop fs -rmdir /input/   --> remove a directory on HDFS
hadoop fs -ls /            --> list the files in the HDFS / directory
hadoop fs -rm /test.txt    --> delete a file
hadoop fs -put test.txt /  --> upload test.txt to the HDFS / directory
hadoop fs -get /test.txt   --> download a file from HDFS to the current directory
2. Simple application: counting words

1) Ensure the Hadoop cluster is started
/usr/local/hadoop/sbin/start-dfs.sh
/usr/local/hadoop/sbin/start-yarn.sh
2) Write the Java code

cd /home/hadoop && mkdir example
cd example && mkdir word_count_class jar
vim WordCount.java

The content is as follows:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    // Mapper: split each line into words and emit (word, 1)
    public static class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer token = new StringTokenizer(line);
            while (token.hasMoreTokens()) {
                word.set(token.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sum the counts for each word
    public static class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf);
        job.setJarByClass(WordCount.class);
        job.setJobName("WordCount");
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
3) Download the jar packages and place them in the /home/hadoop/example/jar directory

Download link: Common package
Download link: MapReduce

Download them to your machine and upload them to the /home/hadoop/example/jar directory.

4) Compile and run

javac -classpath /home/hadoop/example/jar/hadoop-common-2.6.0.2.2.9.9-2.jar:/home/hadoop/example/jar/hadoop-mapreduce-client-core-2.6.0.2.2.9.9-2.jar -d word_count_class WordCount.java   (compile)
cd word_count_class
jar -cvf wordcount.jar *.class   (package)

cd /home/hadoop/example
Create two files named file1 and file2 and add some words to them (a sample is shown below).

hadoop fs -mkdir /input/
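For instance (the contents here are just made-up samples; any words will do):

echo "hello hadoop hello world" > file1
echo "hadoop cluster word count" > file2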
hadoop fs -put file* /input/
hadoop jar word_count_class/wordcount.jar WordCount /input/ /output

When the job finishes, you can see the word-count statistics:

hadoop fs -ls /output   (the job writes its results to this directory; the counts we want are in part-r-00000)
hadoop fs -cat /output/part-r-00000
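With the sample files suggested above, the output would look something like this (one word and its count per line, tab-separated, sorted by key):

cluster 1
count   1
hadoop  2
hello   2
word    1
world   1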

That's all. Thanks.
