Installing Hadoop on a single machine on Ubuntu


Big Data has been a hot topic recently, so I wanted to learn a bit about it. I installed Ubuntu Server in a virtual machine and then installed Hadoop on it.

Here are the installation steps:

1. Installing Java

On a fresh machine, Java is not installed by default. Run java -version to check whether a Java version is available; if Java is not installed, run the following commands:

# Update the source list
$ sudo apt-get update

# The OpenJDK package is the default version of Java
# provided from a supported Ubuntu repository.
$ sudo apt-get install default-jdk

$ java -version

2. Setting up Hadoop users and groups

$ sudo addgroup hadoop

$ sudo adduser --ingroup hadoop hduser

3. Install and configure SSH

$ sudo apt-get install ssh

$ su hduser

$ ssh-keygen -t rsa -P ""

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Next, run the ssh command to verify that passwordless login works:

$ ssh localhost
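
On the first connection you will be asked to confirm the host's fingerprint; after that, the login should succeed without a password prompt. If it works, type exit to leave the test session:

$ exit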

4. Installing Hadoop

First, download the Hadoop archive:

$ wget http://apache.spinellicreations.com/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

The URL above points to Hadoop 2.6.0, the latest version at the time of writing. Check the official website for the current version and replace the URL accordingly.

After the download is complete, unzip:

$ tar xvzf hadoop-2.6.0.tar.gz

Then move the Hadoop folder to /usr/local/hadoop and give the hduser user ownership of it:

$ sudo mv hadoop-2.6.0 /usr/local/hadoop

$ cd /usr/local

$ sudo chown -R hduser:hadoop hadoop
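
To confirm the ownership change took effect (a quick check of my own, not part of the original steps), list the directory; the owner and group should show as hduser and hadoop:

$ ls -ld /usr/local/hadoop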

5. Configure Hadoop

Next, connect to Ubuntu over SSH (for example with PuTTY), switch the current user to hduser, and do the following:

5.1 Modify the ~/.bashrc File

First, run this command to find the path to Java:

$ update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
Nothing to configure.

The JAVA_HOME we need here is /usr/lib/jvm/java-7-openjdk-amd64. Now edit ~/.bashrc with vi and append the following:

#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
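
After saving the file, reload it so the variables take effect in the current shell, then verify that the hadoop command is on the PATH:

$ source ~/.bashrc
$ hadoop version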

5.2 Modify the hadoop-env.sh File

The file path is /usr/local/hadoop/etc/hadoop/hadoop-env.sh. Locate the JAVA_HOME line and change it to:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

5.3 Modify the core-site.xml File

Before modifying this file, create a directory as the superuser and give hduser permission on it:

$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp

Next, switch back to the hduser user and edit the configuration file at /usr/local/hadoop/etc/hadoop/core-site.xml with vi, changing the configuration to:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose
    scheme and authority determine the FileSystem implementation. The
    URI's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class. The URI's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
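
A side note of my own: in Hadoop 2.x, fs.default.name is a deprecated alias for fs.defaultFS; it still works but may trigger a deprecation warning in the logs. Once the environment variables from step 5.1 are loaded, you should be able to check that Hadoop picks up the value:

$ hdfs getconf -confKey fs.defaultFS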

5.4 Modify the mapred-site.xml File

By default there is only /usr/local/hadoop/etc/hadoop/mapred-site.xml.template; we need to copy it to a new file and then modify the copy:

$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

Open it with vi and modify the configuration as follows:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
  </property>
</configuration>

5.5 Modify the hdfs-site.xml File

Before modifying this file, again switch back to a superuser account to create the required directories:

$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
$ sudo chown -R hduser:hadoop /usr/local/hadoop_store

Then switch back to the hduser user and modify the configuration file /usr/local/hadoop/etc/hadoop/hdfs-site.xml as follows:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified at create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>

6. Format HDFS

$ hadoop namenode -format

If the hadoop command is not found, the environment variables have not been loaded yet; the simplest fix is to log out and log back in as hduser. Note that this command deletes all existing data in HDFS, so use it with caution if you already have data there.
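
One caveat from my side: on Hadoop 2.x this form of the format command is deprecated and prints a warning; if you prefer to avoid the warning, the equivalent current form should be:

$ hdfs namenode -format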

7. Start Hadoop

Run start-all.sh to start Hadoop. To determine whether it started successfully, run the jps command; output like the following indicates a successful start:

$ jps
2149 SecondaryNameNode
1805 NameNode
2283 ResourceManager
1930 DataNode
2410 NodeManager
2707 Jps
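
As an optional sanity check (my own addition, not part of the original walkthrough), you can try a couple of simple HDFS commands; if the NameNode and DataNode are running, both should succeed:

$ hdfs dfs -mkdir -p /user/hduser
$ hdfs dfs -ls /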
In addition, we can access Hadoop's web interface at:

http://serverIP:50070/

8. Stop Hadoop

Run the command:

$ stop-all.sh
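
To confirm the shutdown, you can run jps again; only the Jps process itself should remain in the output:

$ jps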

Well, Hadoop has finally been set up successfully in the virtual machine. The entire process follows another blog post:

http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php

I have just re-explained a few of the places that need attention; all credit goes to the original author.
