1. Install Ubuntu 14.04 Desktop LTS
- Download the ISO file, ubuntu-14.04.1-desktop-amd64.iso.
- Create a new virtual machine in VirtualBox or VMware and set the ISO file as the startup disc.
- For installing Ubuntu 14.04 in a VMware Workstation 10 virtual machine under Windows 7, see http://www.linuxidc.com/Linux/2014-04/100473.htm
- When prompted for a user name, enter linuxidc, then continue until the installation completes.
- Restart the virtual machine and log in as linuxidc. To set the root password, run the sudo passwd command:
- linuxidc@ubuntu:~$ sudo passwd
- [sudo] password for linuxidc: (enter your own login password)
- Enter new UNIX password: (set the root password)
- Retype new UNIX password: (enter the root password again)
- passwd: password updated successfully
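- To confirm the new root password works (a quick check, not part of the original tutorial), switch to root and back:
su -       # enter the root password just set
exit       # return to the linuxidc session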
- Install the guest additions in Ubuntu (VirtualBox Guest Additions or VMware Tools) to make the later steps easier.
- To add the user to the sudo group:
- Command: sudo adduser linuxidc sudo. If the user is already in the group, a message will tell you so.
- Add a group hadoop and add linuxidc to it (use the groups command to see which groups a user belongs to):
- su
- addgroup hadoop
- usermod -a -G hadoop linuxidc
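- To confirm the group changes (a quick check, not part of the original steps; you may need to log out and back in for them to show up):
groups linuxidc      # should list sudo and hadoop among the groups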
2. Configure SSH password-free login
- Install the SSH server: sudo apt-get install openssh-server
- cd ~/.ssh/                             # if this directory does not exist, run ssh localhost once first
ssh-keygen -t rsa                        # press Enter at every prompt
cat ./id_rsa.pub >> ./authorized_keys    # add the key to the list of authorized keys
- Now try ssh localhost; you should be able to log in directly without a password.
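- To confirm passwordless login works (a quick check, not from the original; BatchMode makes ssh fail instead of asking for a password):
ssh -o BatchMode=yes localhost 'echo passwordless SSH OK'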
3. Configuring the JDK
- Download JDK 1.7 from the Oracle website, http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html. The file is jdk-7u79-linux-x64.tar.gz.
- Because the JDK will be installed under /usr/lib/jvm, first create the jvm folder under /usr/lib/:
- cd /usr/lib/
- sudo mkdir jvm
- Extract the downloaded package into the jvm folder (it unpacks as jdk1.7.0_79):
- sudo tar zxvf /home/linuxidc/Desktop/jdk-7u79-linux-x64.tar.gz -C /usr/lib/jvm
- Set JAVA_HOME:
- nano ~/.bashrc
- Add before the first line: export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_79, then save and exit.
- Make the setting take effect immediately: source ~/.bashrc
- Test whether JAVA_HOME is set correctly with echo $JAVA_HOME; if it prints the path set above, it succeeded.
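- A fuller check (a sketch; assumes the jdk1.7.0_79 path used above):
echo $JAVA_HOME                  # should print /usr/lib/jvm/jdk1.7.0_79
$JAVA_HOME/bin/java -version     # should report java version "1.7.0_79"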
4. Installing Hadoop 2.7.1
- Download the file from http://hadoop.apache.org/releases.html, choosing the 2.7.1 binary release. The file name is hadoop-2.7.1.tar.gz.
- We choose to install Hadoop under /usr/local/:
- sudo tar -zxvf /home/linuxidc/Desktop/hadoop-2.7.1.tar.gz -C /usr/local
- cd /usr/local/
sudo mv ./hadoop-2.7.1/ ./hadoop      # rename the folder to hadoop
sudo chown -R linuxidc ./hadoop       # change the owner of the files
- Test whether the installation succeeded: run /usr/local/hadoop/bin/hadoop; if the command help appears, it worked.
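- Printing the version is another quick check (not part of the original steps):
/usr/local/hadoop/bin/hadoop version     # should report Hadoop 2.7.1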
- Configure the environment variables
- nano ~/.bashrc
- After the file opens, add the following after the JAVA_HOME line configured earlier:
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
- Note: also configure export HADOOP_HOME=/usr/local/hadoop
- For the "ssh: Could not resolve hostname" and similar error prompts when starting Hadoop (seen since Hadoop 2.6.0), see the fix at http://shisiyou.blog.51cto.com/9864147/1608060
Adding the following lines to ~/.bash_profile or /etc/profile also works:
# vi /etc/profile   (or vi ~/.bash_profile)
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Then reload the file with source so it takes effect:
# source /etc/profile   (or source ~/.bash_profile)
- Make the settings take effect immediately: source ~/.bashrc
- Check whether the installation succeeded by running hdfs; if the command help appears, it worked.
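- A short sanity check of the environment (a sketch, not from the original):
source ~/.bashrc
which hadoop hdfs      # both should point into /usr/local/hadoop/bin
hadoop version         # should report Hadoop 2.7.1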
5. Configuring pseudo-distributed mode
- Switch to the configuration directory: cd /usr/local/hadoop/etc/hadoop
- Configure core-site.xml, execute the command: nano core-site.xml. Replace the <configuration></configuration> section in the file with:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
- Configure hdfs-site.xml, execute the command: nano hdfs-site.xml, and modify it to:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
</configuration>
A note on the Hadoop configuration items: although only fs.defaultFS and dfs.replication are strictly required to run (this is what the official tutorial configures), if you do not set the hadoop.tmp.dir parameter, the default temporary directory is /tmp/hadoop-hadoop. That directory may be cleaned out by the system on reboot, which would force you to reformat the NameNode. So we set it explicitly, and we also specify dfs.namenode.name.dir and dfs.datanode.data.dir, otherwise you may run into errors in the following steps.
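Optionally, these directories can be created ahead of time (a sketch, not in the original; Hadoop will otherwise create them during formatting and startup):
mkdir -p /usr/local/hadoop/tmp/dfs/name /usr/local/hadoop/tmp/dfs/data   # linuxidc already owns /usr/local/hadoop after the chown step above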
- Configure yarn-site.xml, execute the command: nano yarn-site.xml, and modify it to:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
- Configure mapred-site.xml
- Create the file from the template, execute the command: mv mapred-site.xml.template mapred-site.xml
- Execute the command: nano mapred-site.xml
- Change the <configuration> section to:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
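- To catch typos before starting Hadoop, the edited files can be checked for well-formed XML (a sketch; assumes xmllint from the libxml2-utils package is installed):
xmllint --noout core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml   # prints nothing if every file parses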
- After the configuration is complete, format the NameNode: hdfs namenode -format. If "Exiting with status 0" appears near the end of the output, the format succeeded; "Exiting with status 1" indicates an error.
- Start all of the Hadoop daemons: start-all.sh
- To check whether each process started normally, execute: jps. If everything is OK, you will see output like the following:
2583 DataNode
2970 ResourceManager
3461 Jps
3177 NodeManager
2361 NameNode
2840 SecondaryNameNode
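- If you prefer a scripted check (a sketch, not part of the original tutorial):
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  jps | grep -qw "$d" && echo "$d is running" || echo "$d is NOT running"
done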
- "Note" If you are prompted after performing JPS:
The
program ' JPS ' is already included in the following packages: * default-jdk* ecj* gcj-4.6-jdk* openjdk-6-jdk* gcj-4.5-jdk* openjdk-7-jdk try: sudo apt-get insta ll < selected packages >
then, manually set the system default JDK by executing the following command:sudo update-alternatives--install/usr/bin/jps JPS/usr/lib/jvm/jdk1.7.0_79/bin/jps 1 sudo update-alternatives--install/usr/bin/javac javac/usr/lib/jvm/Jdk1.7.0_79/bin/javac sudo update-alternatives--install/usr/bin/java java/usr/lib/jvm/Jdk1.7.0_79/bin/java
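- Afterwards, a quick check that the commands now resolve to the Oracle JDK (not part of the original):
jps              # should now run without the package prompt
java -version    # should report java version "1.7.0_79"
javac -version   # should report javac 1.7.0_79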
This completes the deployment of Hadoop 2.7.1 on Ubuntu 14.04 LTS.