1. Download the latest binary release from the Hadoop website.
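For example, assuming you want the 2.9.0 release used below, the download from the Apache archive might look like this (the mirror URL is an assumption; pick whichever version and mirror suit you):
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.9.0/hadoop-2.9.0.tar.gz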
2. Upload the package to your Linux server, unpack it, and configure the environment variables. (I use CentOS 6.9 here; other Linux distributions, such as Ubuntu, work as well.)
Decompression command: tar -zxvf hadoop-2.9.0.tar.gz
Folder renaming: mv hadoop-2.9.0 hadoop
Configuring environment variables: vim /etc/profile
export HADOOP_HOME=/home/hadoop
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
alias cdha='cd /home/hadoop'
Let the changes take effect:
source /etc/profile
Here we set up an alias, cdha, that quickly changes to the Hadoop directory.
(Note that you need to have Java 8 installed beforehand.)
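To confirm the variables took effect, a quick check (assuming the paths above):
echo $HADOOP_HOME
hadoop version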
3. Modify the Hadoop configuration files
The configuration files that need to be modified are located in the $HADOOP_HOME/etc/hadoop directory.
1) vim hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_131
Note that this must be the absolute path to the JDK; it cannot be replaced with $JAVA_HOME.
2) vim core-site.xml
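The snippet itself did not survive in this copy, so here is a minimal sketch of what a pseudo-distributed core-site.xml typically contains, matching the parameters explained in step 4 below; the localhost:9000 address and the /home/hadoop/tmp path are assumptions to adapt to your setup:
<configuration>
    <!-- Assumed NameNode address for pseudo-distributed mode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <!-- Base temporary directory; see the note in step 4 about the /tmp default -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
    </property>
</configuration>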
3) vim hdfs-site.xml
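Likewise, a minimal sketch of hdfs-site.xml with the three properties discussed in step 4 (the file: paths are assumptions):
<configuration>
    <!-- One replica is enough on a single node -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Assumed location for NameNode metadata -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/tmp/dfs/name</value>
    </property>
    <!-- Assumed location for DataNode block data -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/tmp/dfs/data</value>
    </property>
</configuration>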
4. Notes on the configuration file changes:
Hadoop's run mode is determined by its configuration files (they are read when Hadoop starts), so if you need to switch back from pseudo-distributed mode to standalone (non-distributed) mode, you need to remove the configuration items from core-site.xml.
In addition, although pseudo-distributed mode can run with only fs.defaultFS and dfs.replication configured (as in the official tutorial), if the hadoop.tmp.dir parameter is not set, the default temporary directory is /tmp/hadoop-hadoop. This directory may be wiped by the system on reboot, forcing you to run the format step again. So we set it explicitly, and we also specify dfs.namenode.name.dir and dfs.datanode.data.dir; otherwise you may get an error in the next step.
5. After the configuration is complete, execute the following statement to format the NameNode:
./bin/hdfs namenode -format
On success, the end of the output includes messages such as "has been successfully formatted" and "Exitting with status 0".
6. Then execute the following command to start the NameNode and DataNode daemons:
./sbin/start-dfs.sh
If SSH asks whether you want to continue connecting, enter yes.
When startup is complete, run the jps command to check whether it succeeded.
If successful, the following processes are listed: "NameNode", "DataNode", and "SecondaryNameNode".
After a successful start, you can open the web interface at http://localhost:50070 to view NameNode and DataNode information and browse the files in HDFS online.
If the NameNode overview page loads, congratulations, the installation succeeded!
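As a quick smoke test of HDFS, you can create a directory and upload a file, then confirm both in the web interface; the paths below are only examples:
./bin/hdfs dfs -mkdir -p /user/hadoop
./bin/hdfs dfs -put etc/hadoop/core-site.xml /user/hadoop
./bin/hdfs dfs -ls /user/hadoop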
Reference: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html