Hadoop 0.20.2+ubuntu13.04 Configuration and WordCount test

Source: Internet
Author: User
Tags: unpack, tmp file, hadoop fs

Honestly, this post is a bit late. I took some notes on this back at school and then forgot about them; these days I wanted to play with Hadoop again and found I couldn't remember parts of it, and when I checked my blog the notes weren't there either. So I'm hurrying to write down a copy and keep it for myself. There's a lot of this material online, but none of it was entirely applicable to my setup, so here is my own copy; end of rambling.

Because my reference notes were written in Notepad, there are no screenshots = =





Full name: myhadoop
User name: uit
Password: 000


System: ubuntu 13.04
Hadoop version: 0.20.2

Virtual machine:
This is the main prerequisite, and there isn't much to say about it: I'm using VMware with an ubuntu 13.04 image, nothing special. If you get stuck here, just ask someone who knows.



Install the Java environment first (this part is specifically for ubuntu 13.04; for other setups only the paths below differ)
1. First confirm that the JDK package is on your U disk or on your hard drive
2. Copy the archive from the U disk into this directory------cp -r /media/uit/mydisk/hadoop/jdk-7u45-linux-i586.tar.gz /usr/lib/jvm
(It is important to note that this version of Ubuntu ships its own JDK under /usr/lib/jvm. That's all it is; I only realized this after doing it once, because I didn't know where that path had come from. You can also put the archive somewhere of your own choosing.)
3. Enter the directory------cd /usr/lib/jvm
4. Unpack the archive------sudo tar -zxvf jdk-7u45-linux-i586.tar.gz (don't write another path; this extracts directly into the current folder. If you want to extract into another folder, append it at the end. If you need permission to extract, enter sudo -s and type your login password.)
5. After decompression------sudo gedit /etc/profile
6. Append the following at the end:
# set Java environment
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
(Fill in these paths according to your actual situation.)
7. Override the system defaults------sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/jdk1.7.0_45/bin/java 300

sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/jdk1.7.0_45/bin/javac 300

sudo update-alternatives --config java (if Ubuntu ships its own JDK this gives you a choice; pick your own JDK's entry)

sudo update-alternatives --config javac

8. Enter java -version to check the version information; you should see:
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) Client VM (build 24.45-b08, mixed mode)
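The step-6 additions to /etc/profile can be rehearsed without touching the real system. Here is a minimal sketch; the scratch directory stands in for the real JDK path (/usr/lib/jvm/jdk1.7.0_45 in this guide), which is the only assumption:

```shell
#!/bin/sh
# Sketch of the step-6 wiring, using a scratch directory in place of a real
# JDK so it can run anywhere; point JAVA_HOME at your actual install instead.
JAVA_HOME=$(mktemp -d)            # stand-in for /usr/lib/jvm/jdk1.7.0_45
mkdir -p "$JAVA_HOME/bin" "$JAVA_HOME/jre/bin"
export JAVA_HOME
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
# confirm that JAVA_HOME/bin is now searched for commands
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) echo "JAVA_HOME/bin is on PATH" ;;
  *)                    echo "JAVA_HOME/bin is missing" ;;
esac
```

If the last line prints "JAVA_HOME/bin is on PATH", the same export lines in /etc/profile will let the shell find java and javac after a `source /etc/profile`.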







/*
Install the Java environment (this variant is the Xubuntu 13.10 Java environment):
1. First create a directory------cd /usr; mkdir java
2. Copy the archive from the U disk into this directory------cp -r /media/uit/mydisk/hadoop/jdk-7u45-linux-i586.tar.gz /usr/java
3. Enter the directory------cd java
4. Unpack the archive------sudo tar -zxvf jdk-7u45-linux-i586.tar.gz (don't write another path; this extracts directly into the current folder. If you want to extract into another folder, append it at the end. If you need permission to extract, enter sudo -s and type your login password.)
5. After decompression------sudo gedit /etc/profile
6. Append the following at the end:
# set Java environment
export JAVA_HOME=/usr/java/jdk1.7.0_45
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
(Fill in these paths according to your actual situation.)

7. Override the system defaults------sudo update-alternatives --install /usr/bin/java java /usr/java/jdk1.7.0_45/bin/java 300

sudo update-alternatives --install /usr/bin/javac javac /usr/java/jdk1.7.0_45/bin/javac 300

sudo update-alternatives --config java (if Ubuntu ships its own JDK this gives you a choice; pick your own JDK's entry)

sudo update-alternatives --config javac

8. Enter java -version to check the version information; you should see:
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) Client VM (build 24.45-b08, mixed mode)
*/









Install SSH: (be careful not to install it under root, and SSH must be installed before Hadoop; don't install Hadoop first and then SSH, or Hadoop may not work)
1. sudo apt-get install ssh
sudo apt-get install rsync
2. Run ssh -V to check whether it is installed
3. Check SSH's status------ps -ef | grep ssh (if an sshd process shows up you're fine; if not, read on)
4. (if the check above shows ssh but no sshd process, do the following)--------
① Go to https://launchpad.net/ubuntu/lucid/i386/openssh-client/1:5.3p1-3ubuntu3 and download the file;
after the download, install it------cd /home/uit/Downloads
sudo dpkg -i openssh-client_5.3p1-3ubuntu3_i386.deb
② Go to https://launchpad.net/ubuntu/lucid/i386/openssh-server/1:5.3p1-3ubuntu3 and download the file;
after finishing, install------sudo dpkg -i openssh-server_5.3p1-3ubuntu3_i386.deb
③ Go to https://launchpad.net/ubuntu/lucid/i386/ssh/1:5.3p1-3ubuntu3 and download the file;
after finishing, install------sudo dpkg -i ssh_5.3p1-3ubuntu3_all.deb
It is important to note that steps ② and ③ sometimes error out during installation. I got an error and at first thought I had installed something wrong; you can fetch a dependency from the command line------apt-get install libssl0.9.8, then run the command above again (skip this if you didn't hit the error)
④ Once these are installed, start SSH-------cd /etc/init.d
./ssh start
⑤ OK, run ps -ef | grep ssh again; by now there should be an sshd process
⑥ Change the configuration file------gedit /etc/ssh/sshd_config
(make sure these lines have no leading "#")
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
PasswordAuthentication yes
PermitEmptyPasswords yes


5. Configure passwordless login
① First check whether there is a .ssh directory in the home directory--------ll /home/uit; if there is, go to the next step; if not, create it------mkdir ~/.ssh
② Create the key---------ssh-keygen -t rsa (this asks you for a file in which to save the key; just press Enter to accept the default)
③ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
④ chmod 600 ~/.ssh/authorized_keys
6. On the command line-----ssh localhost---------------the first time it will ask you to enter yes or no; type yes. The first time I logged in it still asked for a password and I just pressed Enter all the way through; then I closed the terminal and tried again, and it no longer asked----strange, but there it is.
7. Success----------------[email protected]:~/.ssh$ ssh localhost
Welcome to Ubuntu 13.04 (GNU/Linux 3.8.0-19-generic i686)

* Documentation: https://help.ubuntu.com/

288 packages can be updated.
178 updates are security updates.

New release '13.10' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Fri Dec 19:56:36 from localhost
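The heart of section 5 is just two things: append your public key to authorized_keys, and tighten its permissions (sshd refuses key files that are group- or world-writable). A sketch that rehearses this in a scratch directory instead of the real ~/.ssh, with fake key material, so it is safe to run anywhere:

```shell
#!/bin/sh
# Rehearsal of step 5 in a scratch directory; the key content is fake and
# only stands in for what ssh-keygen -t rsa would put in id_rsa.pub.
SSH_DIR=$(mktemp -d)/.ssh           # stand-in for /home/uit/.ssh
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"
echo "ssh-rsa AAAAB3-fake-key uit@myhadoop" > "$SSH_DIR/id_rsa.pub"
# append the public key to the list of keys allowed to log in
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
ls -l "$SSH_DIR/authorized_keys"
```

The 700 on the directory and 600 on authorized_keys matter: with sshd's default strict mode, looser permissions make it silently fall back to password prompts, which looks exactly like the "still asks for a password" symptom above.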












Install Hadoop: (the internet says Hadoop must be installed under /home/username/, the same place as SSH, or you will still be asked for a password even after configuring passwordless login for Hadoop. For now, just don't install the two into the same folder, and watch out for this when reinstalling)
(this installs the hadoop 0.20.2 version)
1. Copy the archive from the U disk into the /home/uit directory------cp -r /media/uit/mydisk/hadoop/hadoop.zip /home/uit
2. Unzip the zip file------unzip hadoop.zip; this produces a directory, which contains hadoop-0.20.2.tar.gz plus two files with garbled names
3. Take hadoop-0.20.2.tar.gz out------cp -r /home/uit/hadoop/hadoop-0.20.2.tar.gz /home/uit
4. Unpack hadoop-0.20.2.tar.gz--------tar -zxvf hadoop-0.20.2.tar.gz (I tried this on ubuntu 13.04: at first it would not extract directly in the home directory, I don't know why, but you can work around it by extracting inside a subdirectory and moving the result out. It seems this was because my current directory was /home/uit/hadoop; after switching back to the home directory the test ran with no problem.)
5. Delete the unused files after extracting-------rm -r /home/uit/hadoop.zip
rm -r /home/uit/hadoop
rm -r /home/uit/hadoop-0.20.2.tar.gz
6. Change the Java path in Hadoop---------cd /home/uit/hadoop-0.20.2/conf
gedit hadoop-env.sh-----------------append at the end: export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
(if gedit doesn't work, close the terminal and open a new one)
gedit core-site.xml-------------------add child elements:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/uit/hadoop-0.20.2/tmp</value>
<description>A base for other temporary directories.</description>
</property>
(If hadoop.tmp.dir is not configured, Hadoop becomes unusable after a reboot: you have to delete the original tmp files and reformat the file system before start-all.sh works again. This one tangled me up for a long time = = The tmp directory itself is created below, so no rush.)
gedit hdfs-site.xml-------------------add a child element:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
gedit mapred-site.xml-------------add a child element:
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
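Each of the three files above ends up with a small <configuration> body wrapping the listed <property> elements. A sketch that writes the core-site.xml from above into a scratch conf directory (the paths and ports are this guide's examples; point CONF at hadoop-0.20.2/conf to produce the real file):

```shell
#!/bin/sh
# Write the core-site.xml shown above with a here-document into a scratch
# directory, then spot-check that both properties landed in the file.
CONF=$(mktemp -d)                    # stand-in for /home/uit/hadoop-0.20.2/conf
cat > "$CONF/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/uit/hadoop-0.20.2/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
EOF
grep -c '<property>' "$CONF/core-site.xml"
```

The quoted 'EOF' keeps the shell from expanding anything inside the document, so the XML is written byte-for-byte as shown.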


7. Give the user access to the directory----------------sudo chown -R uit:uit /home/uit/hadoop-0.20.2
(Permissions genuinely pained me for a while. I kept thinking my configuration was wrong, when really it was the permissions that weren't right; later I found the extracted files were all fine and the problems came only from this missing line. Add it and everything belongs to the current user.)

8. Change the environment variables-------------------gedit /etc/profile---------------append the following:
# set hadoop path
export HADOOP_HOME=/home/uit/hadoop-0.20.2
export PATH=$PATH:$HADOOP_HOME/bin
(Note: once you add this, you no longer have to prefix commands with bin/. Some commands in books start with bin/ because the environment variable was not added there.)

9. Update after changing the environment variables----------------source /etc/profile

10. Create the buffer directory------------------mkdir /home/uit/hadoop-0.20.2/tmp

11. Format the Hadoop file system-----------------------hadoop namenode -format
12. Start the daemons----------------------start-all.sh
13. Configuration complete; to verify, enter http://localhost:50030/ (MapReduce page) and http://localhost:50070 (HDFS page) in the browser
(There was a bit of a problem when I installed SSH, so the first time I started the daemons there were errors, but the pages still came up = =!)

14. Finally, it is important to note that before shutting down or closing the virtual machine, be sure to run stop-all.sh, or... anyway, I learned that one the hard way.
(Sometimes the browser cannot open the HDFS page after booting. You can reformat the file system and reboot; if an online search still doesn't turn up anything, it is usually a namespace ID mismatch that needs to be fixed.)



15. Run the WordCount example
① First make sure you have the program's jar package; if you don't, search for one online
② Then create two files on the host------mkdir ~/input
vi ~/input/file01.txt--------write whatever content you like
vi ~/input/file02.txt--------type anything
③ Create a directory in Hadoop-------hadoop fs -mkdir /tmp/input
④ Put the two files from before into it-----hadoop fs -put ~/input/file0*.txt /tmp/input
⑤ Take a look-----------hadoop fs -ls /tmp/input; there should be two files
⑥ Run the program (the jar package)----hadoop jar ~/hadoop-0.20.2/hadoopfirstdemo.jar WordCount /tmp/input /tmp/output01
⑦ View the results----hadoop fs -ls /tmp/output01
hadoop fs -cat /tmp/output01/part-00000
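Before trusting the job output, you can predict what part-00000 should contain: WordCount simply counts each whitespace-separated token across all input files. A local sketch with plain shell tools, no Hadoop needed (the sample file contents are made up, since step ② said to type anything):

```shell
#!/bin/sh
# Compute the counts WordCount should produce for two sample input files:
# split every line into one word per line, then count duplicates.
IN=$(mktemp -d)                      # stand-in for ~/input
printf 'hello world\nhello hadoop\n' > "$IN/file01.txt"
printf 'hello again\n'               > "$IN/file02.txt"
cat "$IN"/file0*.txt | tr -s ' \t' '\n' | sort | uniq -c | sort -rn
```

For these sample files the pipeline reports hello 3 times and world, hadoop, again once each, which is exactly the word/count pairs that should appear in /tmp/output01/part-00000.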




Finally... I finally got this thing working, ╮(╯▽╰)╭. Open-source stuff... you get used to it.

= =, there were a great many twists and turns along the way, and I learned a lot from them. The relief when the results finally come out is a feeling
that only those who have been through it can know. If someone reads this in the future, all I can say is: good luck, this senior can only help you this far.
December 21, 2013 16:08:58 Wsr
