Big data has been pretty hot recently, so I wanted to learn a bit about it too. I installed Ubuntu Server on a virtual machine and then installed Hadoop on it.
Here are the installation steps:
1. Installing Java
If it is a fresh machine, Java is not installed by default. Run the java -version command to see whether a Java version is reported; if Java is not installed, run the following commands:
# Update the source list
$ sudo apt-get update
# The OpenJDK package is the default version of Java
# that is provided from a supported Ubuntu repository.
$ sudo apt-get install default-jdk
$ java -version
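If the installation succeeded, java -version should report the OpenJDK version. The output below is only illustrative; the exact version strings depend on what apt installed:
$ java -version
java version "1.7.0_75"
OpenJDK Runtime Environment (IcedTea 2.5.4) (7u75-2.5.4-1~trusty1)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)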
2. Setting up Hadoop users and groups
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
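To confirm that the account and group were created correctly, you can inspect them with the id command. The numeric uid/gid values below are illustrative and will differ on your machine:
$ id hduser
uid=1001(hduser) gid=1001(hadoop) groups=1001(hadoop)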
3. Install and configure SSH
$ sudo apt-get install ssh
$ su hduser
$ ssh-keygen -t rsa -P ""
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Next, run the ssh command to check that key-based login works:
$ ssh localhost
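If ssh localhost still prompts for a password, the most common cause is overly permissive modes on the key files. A quick fix, assuming the default ~/.ssh location:
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys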
4. Installing Hadoop
First you need to download and extract the Hadoop archive. Run the command:
$ wget http://apache.spinellicreations.com/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
The URL above points to Hadoop 2.6.0, which was the latest version at the time of writing. Before installing, check the official website for the current version and replace the URL accordingly.
After the download is complete, extract it:
$ tar xvzf hadoop-2.6.0.tar.gz
Then move the Hadoop folder to its new location and give the hduser user ownership of it:
$ sudo mv hadoop-2.6.0 /usr/local/hadoop
$ cd /usr/local
$ sudo chown -R hduser:hadoop hadoop
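To double-check the move and the ownership change, list the directory; the owner and group should now be hduser and hadoop:
$ ls -ld /usr/local/hadoop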
5. Configure Hadoop
Next, we can use PuTTY to connect to Ubuntu via SSH, switch the current user to hduser, and do the following:
5.1 Modify ~/.bashrc File
First, run this command to find the path to Java:
$ update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
Nothing to configure.
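Alternatively, you can resolve the real path of the java binary directly; stripping the trailing /jre/bin/java from the result gives the same JAVA_HOME:
$ readlink -f /usr/bin/java
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java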
The JAVA_HOME we need here is /usr/lib/jvm/java-7-openjdk-amd64. Next, use vi to edit ~/.bashrc and append the following:
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
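After saving the file, reload it in the current shell and confirm that the hadoop command is now on the PATH (hadoop version simply prints the release information):
$ source ~/.bashrc
$ hadoop version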
5.2 Modify hadoop-env.sh File
The file path is /usr/local/hadoop/etc/hadoop/hadoop-env.sh. Locate the corresponding line and change it to:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
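If you prefer not to open the file in vi, a one-line sed edit achieves the same thing. This is just a sketch that rewrites whatever export JAVA_HOME= line is already there:
$ sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64|' /usr/local/hadoop/etc/hadoop/hadoop-env.sh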
5.3 Modify core-site.xml File
Before modifying this file, we need to create a directory as the superuser and give hduser permission on it:
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp
Next, switch back to hduser and open the configuration file at /usr/local/hadoop/etc/hadoop/core-site.xml with vi, changing the configuration to:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose
    scheme and authority determine the FileSystem implementation. The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class. The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
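A side note on the key name: in Hadoop 2.x, fs.default.name is deprecated in favor of fs.defaultFS, though the old name is still honored through a deprecation mapping. Once the environment variables from step 5.1 are loaded, you should be able to confirm the value Hadoop actually sees:
$ hdfs getconf -confKey fs.defaultFS
hdfs://localhost:54310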
5.4 Modify mapred-site.xml File
By default, there is only /usr/local/hadoop/etc/hadoop/mapred-site.xml.template; we need to copy it to a new file based on the template and then modify it.
$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
Open it with vi and modify the configuration as follows:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
  </property>
</configuration>
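A single stray character in these XML files can prevent the daemons from starting, so it is worth verifying that the file is well-formed. xmllint (from the libxml2-utils package on Ubuntu) does this; no output means the XML parses cleanly:
$ sudo apt-get install libxml2-utils
$ xmllint --noout /usr/local/hadoop/etc/hadoop/mapred-site.xml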
5.5 Modify hdfs-site.xml File
Before modifying, we again need to switch back to the superuser to create the required directories:
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
$ sudo chown -R hduser:hadoop /usr/local/hadoop_store
Then switch back to hduser and modify the configuration file /usr/local/hadoop/etc/hadoop/hdfs-site.xml as follows:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
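Before moving on, confirm that both storage directories exist and belong to hduser:
$ ls -ld /usr/local/hadoop_store/hdfs/namenode /usr/local/hadoop_store/hdfs/datanode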
6. Format HDFS
$ hadoop namenode -format
If the hadoop command is not found, it is because the environment variables have not been loaded; the simplest fix is to log out and log back in as hduser. Note that formatting deletes all existing data in HDFS, so use this command with caution if you already have data there.
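On success, the format output should include a line similar to the following (the exact log prefix varies between Hadoop versions):
common.Storage: Storage directory /usr/local/hadoop_store/hdfs/namenode has been successfully formatted.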
7. Start Hadoop
Run start-all.sh to start Hadoop. To check whether it started successfully, run the jps command; if you see results like the following, everything started successfully:
$ jps
2149 SecondaryNameNode
1805 NameNode
2283 ResourceManager
1930 DataNode
2410 NodeManager
2707 Jps
In addition, we can access Hadoop's web interface; the address is:
http://serverIP:50070/
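As a final smoke test, you can create a home directory in HDFS and copy a file into it. This is a minimal sketch; /tmp/hello.txt is just a throwaway test file:
$ hdfs dfs -mkdir -p /user/hduser
$ echo "hello hadoop" > /tmp/hello.txt
$ hdfs dfs -put /tmp/hello.txt /user/hduser/
$ hdfs dfs -ls /user/hduser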
8. Stop Hadoop
Run the command:
$ stop-all.sh
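Note that in Hadoop 2.x, start-all.sh and stop-all.sh are deprecated (they print a warning pointing at the per-service scripts), so you can also stop HDFS and YARN separately:
$ stop-dfs.sh
$ stop-yarn.sh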
Well, Hadoop is finally set up successfully in the virtual machine. The whole process follows another blog post:
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
I have just re-explained a few of the points that need attention; the credit belongs to the original author, whose flowers I am merely borrowing to offer to the Buddha, as the saying goes.