Hadoop installation (three VMs) FAQ


There are already plenty of articles on the Internet about installing Hadoop. I tried installing it by their methods and it simply would not work; whenever I hit a problem I could only ask Google one question at a time, and what turned up was a mess. In the end I did get it installed, so I wrote this article based on my own machine names. It is for reference only.

Machine name    IP address       Role
master          10.64.79.153     namenode
leon03          10.64.79.158     datanode
leon04          10.64.79.159     datanode

I won't cover installing the virtual machines themselves; there are many tutorials on the Internet you can refer to. A few notes:

(1) On a freshly installed Ubuntu system, apt-get install may fail for some packages because the package lists have never been updated. Run apt-get update first, then install.
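For example, on a fresh system the usual sequence is roughly the following (vim here is only a placeholder for whatever package you actually need):

root@master:~$ apt-get update
root@master:~$ apt-get install vim
// Refresh the package lists first, then install the package.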

(2) Turn off the firewall on all three machines with sudo ufw disable.

(3) If you want to change a machine's name, edit /etc/hostname and set the new name there.
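For example, to rename one of the datanodes to leon03 (only a sketch; a reboot also applies the change):

root@leon03:~$ vim /etc/hostname
// Replace the old name in the file with leon03.
root@leon03:~$ hostname leon03
// Apply the new name for the current session.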

(4) Modify /etc/hosts (vim /etc/hosts): delete all existing entries, then add

127.0.0.1 localhost


10.64.79.153 master

10.64.79.158 leon03

10.64.79.159 leon04

(5) Add the doop user on all three machines, that is:

root@master:~$ adduser doop

Enter a password and some user information; you can simply press Enter to accept the defaults.
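Optionally, if the doop user will also need to run administrative commands later, one way on Ubuntu is to add it to the sudo group (on older releases the group may be called admin instead); this is not required by the steps below:

root@master:~$ adduser doop sudo
// Add the existing doop user to the sudo group.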

(6)

root@master:~/home$ chown  -R  doop:doop  doop

// Change the owner of the doop home directory to the doop user. Otherwise, after the doop user is created with adduser, it cannot create folders in its own home directory (for example, $ mkdir .ssh); the system reports that it has no permission to create the folder.
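You can verify the ownership change before continuing; this is just a quick sanity check:

root@master:~/home$ ls -ld doop
// The owner and group columns of the output should now both read doop.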

(7) Configure SSH passwordless login on the three virtual machines. (openssh-server must also be installed.)

Step 1: root@master:~$ sudo apt-get install ssh

// Enable the SSH service.
Step 2: doop@master:~$ mkdir .ssh
// As the doop user, create a .ssh directory under /home/doop on all three VMs.
Step 3: doop@master:~/.ssh$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
// This command generates a key pair for the doop user on the master. The generated key files id_dsa and id_dsa.pub are stored in /home/doop/.ssh by default.
Step 4: doop@master:~/.ssh$ cat id_dsa.pub >> authorized_keys
// Append id_dsa.pub to the authorized keys file (if there is no authorized_keys file yet, you can also simply use the cp command). At this point passwordless login to the local machine is configured.
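If passwordless login still asks for a password later, sshd may be rejecting the key because the permissions on the .ssh directory are too loose; tightening them is a common fix and does no harm (an extra precaution, not part of the original steps):

doop@master:~$ chmod 700 ~/.ssh
doop@master:~$ chmod 600 ~/.ssh/authorized_keys
// Restrict the key directory and the authorized_keys file to the doop user.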

Run $ ssh localhost to test the configuration. Enter yes when you log in for the first time.

Step 5:

     doop@master:~/.ssh/$scp id_dsa.pub doop@leon03:/home/doop/.ssh/  

// Transfer the id_dsa.pub file to the VM leon03. The same applies to leon04.

Step 6: Perform Step 4 on the VM leon03, and likewise on leon04. Now the master can log in to leon03 and leon04 without a password. If you also want leon03 or leon04 to log in to the other machines without a password,
they must each generate their own key pair in the same way as on the master, and each public key must be appended to the authorized_keys file on all three machines.
Now the SSH configuration on each machine is complete. You can test it; for example, have the master initiate an SSH connection to leon03:

doop@master:~/.ssh$ ssh leon03

OpenSSH will tell you that it does not know this host; that is expected the first time you log in to it. Type "yes", and the host's identification is added to the ~/.ssh/known_hosts file.
The next time you connect to this host you will find that the SSH connection is established without a password. Congratulations, the configuration is successful. But do not forget to also test ssh localhost locally (the hadoop scripts below depend on this step).
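Once all three machines are configured, a quick way to test every connection from the master in one go is a small shell loop over the host names used in this article:

doop@master:~$ for h in localhost leon03 leon04; do ssh $h hostname; done
// Prints the name of each host; none of the three connections should ask for a password.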

Note that if it does not work, remember to delete the ~/.ssh/known_hosts file and start over.

(8) Install JDK 1.6 on the three virtual machines.

Step 1: Copy the jdk-6u13-linux-i586.bin file to the /home/doop directory.

Step 2:

        root@master:~/home/doop$chmod u+x jdk-6u13-linux-i586.bin

// Make the .bin file executable.

Step 3:

       root@master:~/home/doop/$./jdk-6u13-linux-i586.bin

// Run the installation file.

Step 4:

       root@master:~/$gedit /etc/profile

Add the following information to the file. (Add as needed)

              export JAVA_HOME=/home/doop/jdk1.6.0_13
              export JRE_HOME=/home/doop/jdk1.6.0_13/jre
              export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
              export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

Then run root@master:~$ source /etc/profile to make the changes take effect.

Step 5: Restart the computer. If, for any user, running $ java -version prints:

              java version "1.6.0_13"
              Java(TM) SE Runtime Environment (build 1.6.0_04-b12)
              Java HotSpot(TM) Client VM (build 10.0-b19, mixed mode, sharing)

then the JDK is installed successfully. You can repeat the same steps on leon03 and leon04, or use scp -r to copy the jdk1.6.0_13 folder to the other two virtual machines and then configure the environment variables there.

At this point the JDK is installed. Note one problem that JDK installation is prone to: after installation, Java may only be available to the user who was logged in during installation. In that case you also need to set the /etc/environment file; you can search online for the details.
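The exact lines depend on your setup, but a minimal sketch of /etc/environment for the JDK path used above looks roughly like this (note that /etc/environment is not a shell script, so there is no export keyword and variables are not expanded):

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/doop/jdk1.6.0_13/bin"
CLASSPATH=".:/home/doop/jdk1.6.0_13/lib:/home/doop/jdk1.6.0_13/jre/lib"
JAVA_HOME="/home/doop/jdk1.6.0_13"

Log out and back in (or reboot) for /etc/environment changes to apply to all users.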

(9) Install Hadoop 0.20.205.0 on the three virtual machines.

Step 1: Copy the hadoop-0.20.205.0.tar.gz file to the /home/doop directory.

Step 2:

          root@master:~/home/doop/$tar -xzvf hadoop-0.20.205.0.tar.gz

// Decompress the file.

Step 3:

          root@master:~/home/doop/$chown  doop:doop hadoop-0.20.205.0

// Change the owner of the extracted directory to doop.

Step 4:

         root@master:~/home/doop/$gedit /etc/profile

Enter the file and add the following information to the file.

         export HADOOP_HOME=/home/doop/hadoop-0.20.205.0
         export PATH=$HADOOP_HOME/bin:$PATH
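After saving, you can re-source the profile and check that the hadoop command is found on the PATH:

root@master:~$ source /etc/profile
root@master:~$ hadoop version
// Should report the version of the unpacked tarball (0.20.205.0 here).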

Step 5: Edit the core-site.xml, hdfs-site.xml, mapred-site.xml, hadoop-env.sh, masters, and slaves files under the conf directory.

root@master:~/home/doop/hadoop-0.20.205.0/conf/$ vim hadoop-env.sh

Enter the file and add the following information.

export   JAVA_HOME=/home/doop/jdk1.6.0_13 

root@master:~/home/doop/hadoop-0.20.205.0/conf/$ vim masters

Enter the file and add the following information.

10.64.79.153

root@master:~/home/doop/hadoop-0.20.205.0/conf/$ vim slaves

Enter the file and add the following information.

10.64.79.158
10.64.79.159

root@master:~/home/doop/hadoop-0.20.205.0/conf/$ vim core-site.xml
Enter the file and add the following information.

 

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/doop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <!-- file system properties -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://10.64.79.153:9000</value>
  </property>
</configuration>
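Since hadoop.tmp.dir points at /home/doop/tmp, it does no harm to create that directory as the doop user on all three machines before formatting; Hadoop can usually create it on its own, but creating it up front avoids permission surprises:

doop@master:~$ mkdir -p /home/doop/tmp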

root@master:~/home/doop/hadoop-0.20.205.0/conf/$ vim hdfs-site.xml
Enter the file and add the following information. (dfs.replication defaults to 3; if it is not lowered and there are fewer than three datanodes, errors will be reported.)

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

root@master:~/home/doop/hadoop-0.20.205.0/conf/$ vim mapred-site.xml
Enter the file and add the following information.

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>10.64.79.153:9001</value>
  </property>
</configuration>

Step 6: doop@master:~/$ scp -r hadoop-0.20.205.0 doop@leon03:/home/doop/
// Copy the hadoop-0.20.205.0 directory to the other two virtual machines (repeat for leon04).
Now the Hadoop installation is complete. Note: if a "main class not found" error appears when running jobs later, the environment variable settings in /etc/profile have not taken effect; you can add the same information to the /etc/environment file instead. Search online for details.
 
(10) Run the wordcount example program that ships with Hadoop.
Step 1: doop@master:~/hadoop-0.20.205.0/bin/$ hadoop namenode -format
// Format the file system and create a new file system.
Step 2: doop@master:~/hadoop-0.20.205.0/bin$ start-all.sh
// Start all the daemon processes of hadoop.
Step 3: doop@master:~/hadoop-0.20.205.0/$ jps
// View the Java processes running on the master VM. Note that jps is a small tool that ships with the JDK, in its bin directory, so it is best to have that directory on the PATH. Run the same check on the leon03 and leon04 virtual machines, for example doop@leon03:~/hadoop-0.20.205.0/$ jps
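With the masters/slaves layout above, jps on the master should roughly show the NameNode, SecondaryNameNode and JobTracker daemons, and on leon03/leon04 the DataNode and TaskTracker daemons; the process IDs below are only placeholders:

doop@master:~/hadoop-0.20.205.0/$ jps
12001 NameNode
12002 SecondaryNameNode
12003 JobTracker
12004 Jps

If one of the daemons is missing, check the corresponding log file under the logs directory of the Hadoop installation.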

Step 4: Create two input files, file01 and file02, on the local disk.

Step 5: doop@master:~/soft/$ echo "Hello World Bye World" > file01
doop@master:~/soft/$ echo "Hello Hadoop Goodbye Hadoop" > file02
// Create the two input files file01 and file02 on the local disk, containing the sentences "Hello World Bye World" and "Hello Hadoop Goodbye Hadoop" respectively.
Step 6: doop@master:~/hadoop-0.20.205.0/bin/$ ./hadoop fs -mkdir input
// Create an input directory in HDFS
Step 7: doop@master:~/hadoop-0.20.205.0/bin$ ./hadoop fs -copyFromLocal /home/doop/soft/file0* input
// Copy file01 and file02 to HDFS.
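Before running the job you can confirm that the two files actually landed in HDFS:

doop@master:~/hadoop-0.20.205.0/bin$ ./hadoop fs -ls input
// Should list file01 and file02 under the input directory.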
Step 8: doop@master:~/hadoop-0.20.205.0/bin$ ./hadoop jar ../hadoop-examples-0.20.205.0.jar wordcount input output
// Run wordcount. Pay attention to the path of the jar here; the ../ prefix is needed because the jar sits one level above bin. If the path is wrong, an error like the following appears:

at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:127)
at java.util.jar.JarFile.<init>(JarFile.java:135)
at java.util.jar.JarFile.<init>(JarFile.java:72)
at org.apache.hadoop.util.RunJar.main(RunJar.java:88)

Step 9: doop@master:~/hadoop-0.20.205.0/bin$ ./hadoop fs -cat output/part-r-00000
// View the result after the task is completed:
Bye 1
Goodbye 1
Hadoop 2
Hello 2
World 2
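
If you would rather have the result on the local disk than only printed to the terminal, you can also copy it out of HDFS (the local target path here is just an example):

doop@master:~/hadoop-0.20.205.0/bin$ ./hadoop fs -get output /home/doop/soft/wordcount-output
// Copies the whole output directory, including part-r-00000, to the local file system.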
