Environment: CentOS 6.9, Hadoop 2.8.3, running in a VMware Workstation virtual machine.
This article covers installing the Linux virtual machine, configuring the environment, and installing Hadoop in local mode. Pseudo-distributed mode and installation under Windows will be added later.
There are many Hadoop installation tutorials online; this is a summary of the installation process and the problems you may run into.
- Linux Environment installation
  - NAT
  - Linux
  - Set up a network
  - HOST
  - Other environment settings
- Java Environment Configuration
- Hadoop local mode installation
Linux Environment installation
NAT
- Disable the DHCP service for the NAT network;
- Set the subnet. For example, I use the default 192.168.126 segment, so the virtual machine's IP must be set within that segment;
- In the NAT settings, change the DNS address to the DNS of the network you are on. The gateway address defaults to the .2 address of the current segment (here, 192.168.126.2).
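As a quick sanity check on the addressing above, here is a minimal sketch that derives the default gateway from a three-octet prefix and tests whether an address falls inside that segment. The function names `nat_gateway` and `in_segment` are hypothetical, and the sketch simply applies the ".2" default described above; it does not query VMware.

```shell
# Hypothetical helper: print the default NAT gateway (.2) for a /24 prefix.
nat_gateway() {
    prefix="$1"                # first three octets, e.g. 192.168.126
    printf '%s.2\n' "$prefix"
}

# Hypothetical helper: succeed if an IP address starts with the given prefix.
in_segment() {
    ip="$1"; prefix="$2"
    case "$ip" in
        "$prefix".*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `nat_gateway 192.168.126` prints the gateway for the segment used in this article, and `in_segment 192.168.126.128 192.168.126` succeeds for an (assumed) virtual machine address.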
Linux
Installing Linux on VMware is not difficult, so I will skip the virtual machine settings and only list some issues you might encounter during installation.
- A media test screen appears at the start of the CentOS 6.9 installation. Probably because no physical disc is in use, choosing OK leads to an error, so choose Skip instead;
- During installation you will be asked about storage space twice; you can simply choose to ignore (discard) all data and then use all space;
- In the last step, choose Desktop.
Set up a network
Since DHCP automatic IP assignment was turned off earlier, you need to set the network parameters yourself.
- Right-click the network icon and edit the IPv4 settings; any address in the 126 segment will do;
- After saving the change, click the eth0 connection; it should connect successfully. Then run ping to confirm.
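The same settings can also be written to the interface file instead of using the desktop dialog. The sketch below is one possible way to do that, not the method used above: `make_ifcfg` is a hypothetical helper that prints a static configuration for eth0, and the address 192.168.126.128 and the DNS value in the usage comment are assumptions chosen only for illustration.

```shell
# Hypothetical helper: print a static-IP ifcfg-eth0 file.
# Arguments: IP address, gateway, DNS server.
make_ifcfg() {
    cat <<EOF
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
IPADDR=$1
NETMASK=255.255.255.0
GATEWAY=$2
DNS1=$3
EOF
}

# Example usage as root (addresses are assumptions, adjust to your segment):
#   make_ifcfg 192.168.126.128 192.168.126.2 192.168.126.2 \
#       > /etc/sysconfig/network-scripts/ifcfg-eth0
#   service network restart
#   ping -c 3 192.168.126.2
```

Printing to standard output first makes it easy to review the file before overwriting the real one.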
HOST
- Change the hostname so the machine is easy to identify: open /etc/sysconfig/network and add HOSTNAME (NETWORKING=yes comes first);
- Open /etc/hosts and add the IP address together with the newly set hostname.
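The two edits above can be sketched as a small script. `set_host` is a hypothetical helper, and the hostname `hadoop01` and the address in the usage line are assumptions; the file paths are passed as arguments so the function can be tried on copies first. It relies on GNU `sed -i`, which is what CentOS ships.

```shell
# Hypothetical helper: set HOSTNAME in a network file and map it in a hosts file.
# Arguments: hostname, IP address, network file, hosts file.
set_host() {
    name="$1"; ip="$2"; netfile="$3"; hostsfile="$4"
    # Replace an existing HOSTNAME line, or append one if none exists.
    if grep -q '^HOSTNAME=' "$netfile"; then
        sed -i "s/^HOSTNAME=.*/HOSTNAME=$name/" "$netfile"
    else
        echo "HOSTNAME=$name" >> "$netfile"
    fi
    # Add the IP-to-hostname mapping only if the name is not mapped yet.
    grep -q "[[:space:]]$name\$" "$hostsfile" || echo "$ip $name" >> "$hostsfile"
}

# Example usage as root (hostname and address are assumptions):
#   set_host hadoop01 192.168.126.128 /etc/sysconfig/network /etc/hosts
```

The hostname in /etc/sysconfig/network takes effect after a reboot.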
Other environment settings
Because this setup is only for learning, you can simply turn off the Linux firewall and SELinux.
- Firewall: chkconfig iptables off
- SELinux: in /etc/sysconfig/selinux, set SELINUX=disabled.
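A minimal sketch of these two steps as commands follows. `disable_selinux_in` is a hypothetical helper that takes the file path as an argument; the `service` and `setenforce` lines in the comments are additions that apply the change immediately rather than after a reboot, and are not part of the steps above. The `--follow-symlinks` option is a GNU sed option, used here on the assumption that /etc/sysconfig/selinux may be a symbolic link.

```shell
# Hypothetical helper: set SELINUX=disabled in the given config file.
disable_selinux_in() {
    # --follow-symlinks edits the link target instead of replacing the link.
    # The pattern matches only the SELINUX= line, not SELINUXTYPE=.
    sed -i --follow-symlinks 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Example usage as root on CentOS 6:
#   service iptables stop       # stop the firewall now
#   chkconfig iptables off      # keep it off after reboot
#   disable_selinux_in /etc/sysconfig/selinux
#   setenforce 0                # permissive mode until the next reboot
```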
Java Environment Configuration
CentOS usually ships with OpenJDK. It is best to uninstall OpenJDK and use Oracle's JDK instead (I may write a separate post about uninstalling it).
Installation:
Download the version you need (confirm that it is compatible with the Hadoop version you use), then choose a directory and unpack it (basic operating-system tasks like this are left to the reader).
Set environment variables:
Many tutorials tell you to add CLASSPATH as well, but in my own testing it was not needed, and Oracle's instructions for Linux do not ask for it either.
That said, some people report that leaving it out causes errors in certain situations, so we will find out if something goes wrong. On Windows it is definitely not needed.
The specific steps: edit the configuration file /etc/profile and add
export JAVA_HOME="<JDK path>"
export PATH=$JAVA_HOME/bin:$PATH
Then run source /etc/profile
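The profile edit can be sketched as follows. `add_java_env` is a hypothetical helper, and the JDK directory in the usage comment is an assumed location; substitute the directory where you unpacked the JDK.

```shell
# Hypothetical helper: append JAVA_HOME and PATH exports to a profile file.
# Arguments: JDK directory, profile file (normally /etc/profile).
add_java_env() {
    jdk="$1"; profile="$2"
    {
        echo "export JAVA_HOME=\"$jdk\""
        # Single quotes keep $JAVA_HOME and $PATH unexpanded in the file.
        echo 'export PATH=$JAVA_HOME/bin:$PATH'
    } >> "$profile"
}

# Example usage as root (the JDK path is an assumption):
#   add_java_env /usr/local/jdk1.8.0 /etc/profile
#   source /etc/profile
#   java -version
```

Putting `$JAVA_HOME/bin` at the front of PATH means this JDK is found before any other `java` already on the system.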
Hadoop local mode installation
A freshly downloaded Hadoop with no configuration changes runs in local mode by default.
- Download the required version of Hadoop and unpack it;
- Verify that the JAVA_HOME environment variable is configured correctly: echo $JAVA_HOME.
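Beyond printing the variable, a slightly stricter check is to confirm that it points at a directory containing an executable `bin/java`. The sketch below does that; `check_java_home` is a hypothetical helper, not a Hadoop command.

```shell
# Hypothetical helper: report whether a directory looks like a usable JDK home.
check_java_home() {
    if [ -z "$1" ]; then
        echo "JAVA_HOME is empty"
        return 1
    fi
    if [ ! -x "$1/bin/java" ]; then
        echo "no executable at $1/bin/java"
        return 1
    fi
    echo "JAVA_HOME looks valid: $1"
}

# Example usage:
#   check_java_home "$JAVA_HOME"
```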
You can try a test run with a small input file:
# test.input
hadoop mapreduce hive
hbase spark storm
sqoop hadoop hive
spark hadoop
Then run
<path to bin/hadoop> jar share/hadoop/mapreduce/hadoop-mapreduce-examples-x.x.x.jar wordcount <input directory> <output directory>
The job ID contains "local", which shows that the job ran in local mode.
The run succeeded if a _SUCCESS file appears in the output directory.
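To check the job's result, you can compute the expected counts locally and compare them with the job output. The sketch below assumes the input file contains only the four lines of words (not the filename comment) and that the job wrote its result to `part-r-00000` in the output directory; `local_wordcount` is a hypothetical helper that approximates the example's whitespace splitting with standard tools.

```shell
# Hypothetical helper: count whitespace-separated words in a file,
# printing "word<TAB>count" sorted by word.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Example usage (paths are assumptions):
#   local_wordcount input/test.input > expected.txt
#   diff expected.txt output/part-r-00000 && echo "counts match"
```

For the sample input above, `hadoop` appears three times, `hive` and `spark` twice each, and every other word once.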
The above is only a local mode installation, so it is fairly simple. Pseudo-distributed, fully distributed, and HA installations are much more cumbersome, and their details will be described in later posts.
Hadoop installation under Linux (local mode)