Hadoop installation is fairly simple. You can download the latest release from the official website; it is best to use a stable version. In this example, a three-machine cluster is set up, using hadoop-0.20.2.
- 1
Install Ubuntu
Ubuntu 12.10, with 4 GB of swap space (2 GB of memory). The installation process itself is not described in detail here.
Uniform user name: hadoop;
Group: hadoop;
Machine names: namenode (master node), datanode1 (slave node 1), datanode2 (slave node 2).
- 2
Create a hadoop user group and user in Ubuntu (you can also configure it through the interface when installing the system)
1. Create a hadoop user group;
$ sudo addgroup hadoop
2. Create a hadoop user and add it to the hadoop group;
$ sudo adduser --ingroup hadoop hadoop
Write down the password set for the hadoop user, which will be needed later when logging on as that user.
3. Grant permissions to the hadoop user by opening the /etc/sudoers file;
$ sudo gedit /etc/sudoers
4. Give the hadoop user the same permissions as the root user.
Add the following line below root ALL=(ALL:ALL) ALL:
hadoop ALL=(ALL:ALL) ALL
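To confirm that the entry took effect, sudo itself can list a user's privileges (a quick optional check):
$ sudo -l -U hadoop
Note that editing /etc/sudoers directly in gedit is risky; sudo visudo is the conventional, safer way, since it validates the syntax before saving.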
- 3
Install JDK in Ubuntu
Configure environment variables:
$ sudo gedit /etc/profile
Add the following content at the end of the file, then save and close it.
# for java
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH
Make the settings take effect:
$ source /etc/profile
The configuration of the JDK is now complete.
To ensure correctness, perform a simple test:
$ java -version
Output:
java version "1.8.0"
Java(TM) SE Runtime Environment (build 1.8.0)
Java HotSpot(TM) Server VM
Modify the default JDK:
$ sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/jdk1.8.0/bin/java 300
$ sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/jdk1.8.0/bin/javac 300
$ sudo update-alternatives --install /usr/bin/jar jar /usr/lib/jvm/jdk1.8.0/bin/jar 300
The following two commands are optional (but preferably executed):
$ sudo update-alternatives --config java
$ sudo update-alternatives --config javac
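To see which java the system currently points to, update-alternatives can also display the registered alternatives (the exact output varies by system):
$ update-alternatives --display java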
- 4
Modify the machine name (this step can be omitted)
After Ubuntu is installed, the default machine name is ubuntu. To make it easy to distinguish the servers in the cluster, however, each machine should be given a different name. The machine name is determined by the /etc/hostname file.
1. Open the /etc/hostname file;
$ sudo gedit /etc/hostname
2. Change "ubuntu" in the /etc/hostname file to the corresponding machine name; for the master node, for example, use "namenode". The change takes effect after a restart.
3. Modify the/etc/hosts file
$ sudo gedit /etc/hosts
As shown below, give each machine a fixed IP address and map every host name to its IP:
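A minimal sketch of the /etc/hosts entries, assuming the three machines sit on a 192.168.1.x subnet (substitute your own fixed IP addresses):
127.0.0.1    localhost
192.168.1.100    namenode
192.168.1.101    datanode1
192.168.1.102    datanode2
This file should be identical on all three machines.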
- 5
Install the ssh service
SSH here does not refer to the three Java frameworks (Spring, Struts, and Hibernate) but to the remote login protocol.
Generally, only the ssh-agent is installed on Ubuntu by default. You can check with the following command:
$ sudo ps -ef | grep ssh
If only the ssh-agent is present, you need to install openssh-server:
$ sudo apt-get install ssh openssh-server
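After the installation, repeating the check above should additionally show an sshd process; if it is missing, the service can usually be started by hand:
$ sudo service ssh start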
- 6
Create an ssh password-less login to the local machine
First, switch to the hadoop user and execute the following command:
$ sudo su - hadoop
SSH keys can be generated with either rsa or dsa; by default, rsa is used.
1. Create an ssh-key, here using the rsa method;
$ ssh-keygen -t rsa -P ""
(The -P "" sets an empty passphrase, which is what allows logging in without a password.)
Note that ssh-keygen is a single command joined by a hyphen and must not be split. A reader following these steps typed it with a space, as in ssh -keygen -t rsa -P "", and the terminal reported bad escape character "ygen": the space separates ssh-keygen, so ssh parses "-keygen" as its own options (-k, then -e with the leftover "ygen" as its escape-character argument).
(Note: after you press Enter, two files are generated under ~/.ssh/: id_rsa and id_rsa.pub)
2. Create the authorization file authorized_keys
Go to the ~/.ssh/ directory; it does not yet contain the authorized_keys file, which you can create in either of the following two ways:
(1) Append id_rsa.pub to the authorized_keys authorization file;
$ cd ~/.ssh
$ cat id_rsa.pub >> authorized_keys
(2) Copy id_rsa.pub to authorized_keys.
$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
3. Log on to localhost;
$ ssh localhost
(On the first connection, ssh asks you to confirm the host key fingerprint; answer yes. If the key setup succeeded, no password is requested.)
4. Execute the exit command (exit or Ctrl+D);
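If ssh still prompts for a password, the usual cause is overly loose file permissions, which make sshd ignore the key. The conventional fix is to tighten them:
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys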
- 7
Install hadoop
1. With hadoop-0.20.2.tar.gz on the desktop, copy it to the installation directory /usr/local/;
$ sudo cp hadoop-0.20.2.tar.gz /usr/local/
2. Decompress hadoop-0.20.2.tar.gz;
$ cd /usr/local
$ sudo tar -zxvf hadoop-0.20.2.tar.gz
3. Rename the decompressed folder to hadoop;
$ sudo mv hadoop-0.20.2 hadoop
4. Set the owner of the hadoop folder to the hadoop user;
$ sudo chown -R hadoop:hadoop hadoop
5. Open the hadoop/conf/hadoop-env.sh file;
$ sudo gedit hadoop/conf/hadoop-env.sh
6. Configure conf/hadoop-env.sh (find the line # export JAVA_HOME=..., remove the #, and set it to the local JDK path);
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0
7. Open the conf/core-site.xml file;
$ sudo gedit hadoop/conf/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
8. Open the conf/mapred-site.xml file;
$ sudo gedit hadoop/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode:9001</value>
  </property>
</configuration>
9. Open the conf/hdfs-site.xml file;
$ sudo gedit hadoop/conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/datalog1,/usr/local/hadoop/datalog2</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/data1,/usr/local/hadoop/data2</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
Note: the datalog1, datalog2, data1, and data2 folders should be created in advance, for example as follows.
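A minimal sketch, using the paths from the hdfs-site.xml above and making sure the hadoop user owns them:
$ sudo mkdir -p /usr/local/hadoop/datalog1 /usr/local/hadoop/datalog2 /usr/local/hadoop/data1 /usr/local/hadoop/data2
$ sudo chown -R hadoop:hadoop /usr/local/hadoop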
10. Open the conf/masters file and add the host name of the secondarynamenode.
11. Open the conf/slaves file and add the slave host names, one per line.
The author's settings, given the machine names from section 1, are as follows (assuming the secondary namenode runs on the master machine itself):
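conf/masters:
namenode
conf/slaves:
datanode1
datanode2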
- 8
Run hadoop
1. Go to the hadoop directory and format the hdfs file system. This operation is required when you first run hadoop;
$ cd /usr/local/hadoop/
$ bin/hadoop namenode -format
2. Start the cluster with bin/start-all.sh
Go to the bin directory and run $ ./start-all.sh; to shut down, run ./stop-all.sh in the same directory.
3. Check whether hadoop started successfully;
$ jps
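If the cluster started correctly, jps on the master node should list roughly the following (process IDs are illustrative):
1234 NameNode
1256 SecondaryNameNode
1278 JobTracker
1300 Jps
On datanode1 and datanode2, DataNode and TaskTracker should appear instead.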
4. Web view
http://localhost:50030/ - the JobTracker's web interface, which makes it convenient to view job information.
Similarly:
http://localhost:50070/ - web UI of the NameNode daemon
http://localhost:50030/ - web UI of the JobTracker daemon
http://localhost:50060/ - web UI of the TaskTracker daemon
END