[Hadoop] How to Install Hadoop
Hadoop is a distributed system infrastructure that lets users develop distributed programs without understanding the details of the underlying distributed layer.
The core of Hadoop consists of HDFS and MapReduce: HDFS is responsible for storage, while MapReduce is responsible for computation.
The following describes how to install Hadoop:
In fact, installing Hadoop is not troublesome. It mainly requires the following prerequisites; once they are met, it is very easy to get Hadoop running by following the configuration guide on the official website.
1. A Java runtime environment. We recommend the Sun (Oracle) JDK release.
2. SSH public key password-free Authentication
Once the above environment is ready, all that remains is the Hadoop configuration itself. These configuration items may differ between versions; refer to the official documentation for details.
Environment
Virtual Machine: VMWare10.0.1 build-1379776
Operating System: 64-bit CentOS7
Install the Java environment
Download the JDK from: http://www.oracle.com/technetwork/cn/java/javase/downloads/jdk8-downloads-2133151-zhs.html
rpm -ivh http://download.oracle.com/otn-pub/java/jdk/8u20-b26/jdk-8u20-linux-x64.rpm
The JDK is updated continuously; to install the latest version, get the rpm URL of the latest installation package from the official website.
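Once the rpm is installed, a quick sanity check confirms the JDK is on the PATH. A minimal sketch (the guard keeps it harmless on machines where Java is not yet installed):

```shell
# Print the installed Java version, or a notice if java is missing from PATH.
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1
else
    echo "java not found on PATH"
fi
```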
Configure SSH public key password-free Authentication
By default, CentOS ships with openssh-server, openssh-clients, and rsync; if your system does not, install them first (for example, with yum).
Create a Common Account
Create a hadoop account (the name is your choice) on all machines, and set its password to hadoop.
useradd -d /home/hadoop -s /usr/bin/bash -g wheel hadoop
passwd hadoop
SSH Configuration
vi /etc/ssh/sshd_config
Find the following three configuration items and change them to the settings below. If an item is commented out, remove the leading # so the setting takes effect.
RSAAuthentication yes
PubkeyAuthentication yes
# The default is to check both .ssh/authorized_keys and .ssh/authorized_keys2
# but this is overridden so installations will only check .ssh/authorized_keys
AuthorizedKeysFile .ssh/authorized_keys
.ssh/authorized_keys is the path where authorized public keys are stored.
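The same edits can also be applied non-interactively with sed instead of vi. The sketch below runs against a scratch copy of the file so it is safe to try anywhere; on a real node, point CFG at /etc/ssh/sshd_config and run as root:

```shell
# Demo: enable public-key auth in a scratch copy of sshd_config.
CFG="$(mktemp)"
printf '%s\n' '#RSAAuthentication yes' \
              '#PubkeyAuthentication yes' \
              '#AuthorizedKeysFile .ssh/authorized_keys' > "$CFG"

# Uncomment the three items and force the desired values.
sed -i -E 's/^#?(RSAAuthentication|PubkeyAuthentication)\b.*/\1 yes/' "$CFG"
sed -i -E 's|^#?(AuthorizedKeysFile)\b.*|\1 .ssh/authorized_keys|' "$CFG"
cat "$CFG"
```

On a real system, restart sshd afterwards (systemctl restart sshd on CentOS 7) for the change to take effect.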
Public Key Generation
Log on with the hadoop account.
cd ~
ssh-keygen -t rsa -P ''
This generates a key pair under ~/.ssh. Save the generated ~/.ssh/id_rsa.pub file as ~/.ssh/authorized_keys:
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
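The whole key setup can be scripted non-interactively. The sketch below works in a throwaway directory so it is safe to run as-is; on a real node, use ~/.ssh instead. Note that appending with >> preserves any keys that are already authorized, whereas cp overwrites the file:

```shell
# Demo: generate a key pair and authorize it, in a scratch directory.
DIR="$(mktemp -d)"
ssh-keygen -q -t rsa -P '' -f "$DIR/id_rsa"      # no passphrase, non-interactive
cat "$DIR/id_rsa.pub" >> "$DIR/authorized_keys"  # append, don't clobber
chmod 700 "$DIR"                                 # sshd refuses looser permissions
chmod 600 "$DIR/id_rsa" "$DIR/authorized_keys"
```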
Use the scp command to copy the .ssh directory to the other machines, so that all machines share the same key pair and the same authorized public key.
scp ~/.ssh/* hadoop@slave1:~/.ssh/
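With more than one slave, the copy can be wrapped in a loop. The hostnames below are placeholders for your own nodes; the DRY_RUN flag just prints the commands so the sketch can be previewed without a working network:

```shell
# Hypothetical slave hostnames -- replace with your own.
SLAVES="slave1 slave2"
DRY_RUN=1   # set to 0 to actually copy

for host in $SLAVES; do
    cmd="scp -r $HOME/.ssh hadoop@$host:"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$cmd"   # preview only
    else
        $cmd
    fi
done
```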
Note: the access permission on ~/.ssh/id_rsa must be 600; access by other users is prohibited.
Hadoop Installation
Refer to the official Hadoop configuration documentation.