Installing and Deploying Hadoop
I. Install the virtual machines
1. Server configuration information

| Host             | Host Name | IP              | HDFS     | YARN (MapReduce) |
| Linux host one   | master    | 192.168.100.100 | NameNode | ResourceManager  |
| Linux host two   | host1     | 192.168.100.101 | DataNode | NodeManager      |
| Linux host three | host2     | 192.168.100.102 | DataNode | NodeManager      |
II. Modify the host name
- Command: [chybin@master ~]$ vim /etc/sysconfig/network
  After opening the file, set:
  NETWORKING=yes   # enable networking
  HOSTNAME=master  # set the host name
- Command: [chybin@master ~]$ hostname master   # takes effect immediately
- View the host name: [chybin@master ~]$ hostname
III. Set the network parameters
- Command: [chybin@master ~]$ vim /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0               # corresponds to the first network card
TYPE=Ethernet
ONBOOT=yes                # start the interface at boot
NM_CONTROLLED=yes
BOOTPROTO=static          # use a static IP instead of assigning one via DHCP
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"        # name
HWADDR=00:50:56:94:04:3c  # must match eth0's MAC address (see /etc/udev/rules.d/70-persistent-net.rules)
PEERDNS=yes
PEERROUTES=yes
IPADDR=192.168.1.128      # this machine's IP address
NETMASK=255.255.255.0     # subnet mask
GATEWAY=192.168.1.2       # gateway
DNS1=192.168.1.2
- Restart the network:
  [chybin@master ~]$ service network restart
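To confirm the new settings took effect after the restart, a quick check (assuming the interface is eth0, as configured above):

[chybin@master ~]$ ifconfig eth0   # the inet addr field should show the static IP configured above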
IV. Modify the hosts file on the virtual machines
- Configure the hosts file on all three machines: master, host1, and host2.
- Command: [chybin@master ~]$ vim /etc/hosts
- The hosts file is configured as:
192.168.100.100 master
192.168.100.101 host1
192.168.100.102 host2
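Before moving on, it is worth confirming that the names actually resolve; a quick check from master:

[chybin@master ~]$ ping -c 1 host1   # should resolve to 192.168.100.101
[chybin@master ~]$ ping -c 1 host2   # should resolve to 192.168.100.102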
V. Configure passwordless SSH login
1. Generate the public/private key pair
Execute the key-generation command on all three hosts:
[chybin@master ~]$ ssh-keygen -t rsa
Press Enter at the passphrase prompt (leave it empty); the keys are stored in the /home/chybin/.ssh folder.
id_rsa is the private key
id_rsa.pub is the public key
2. Copy master's public key
[chybin@master ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
3. Send the public key on master to host1 and host2
[chybin@master ~]$ sudo scp ~/.ssh/authorized_keys chybin@host1:~/.ssh
[chybin@master ~]$ sudo scp ~/.ssh/authorized_keys chybin@host2:~/.ssh
4. Modify authorized_keys permissions
Modify the permissions of the authorized_keys file on master, host1, and host2:
[chybin@master ~]$ chmod 644 ~/.ssh/authorized_keys
[chybin@host1 ~]$ chmod 644 ~/.ssh/authorized_keys
[chybin@host2 ~]$ chmod 644 ~/.ssh/authorized_keys
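sshd is also strict about the directory itself: if passwordless login still prompts for a password, a common fix is to make the .ssh directory accessible only by its owner, on each machine:

[chybin@master ~]$ chmod 700 ~/.ssh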
5. Test that it works
Connect to host1 and host2 from master with SSH:
[chybin@master ~]$ ssh host1
You have to enter a password the first time only; type exit to leave the SSH session, run ssh host1 again, and no password is required.
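To check both slaves in one pass, a small loop works (hostnames as configured in the hosts file above):

[chybin@master ~]$ for h in host1 host2; do ssh $h hostname; done

If this prints host1 and host2 without any password prompt, the SSH setup is complete.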
VI. Install the Java JDK
L First determine if the Java JDK has been installed, use the command [[email protected] ~] Java–version
L copy jdk-7u79-linux-x64.rpm to master in a random directory
L Install JDK, command: [[email protected] ~] # RPM–IVH ~/jdk-7u79-linux-x64.rpm.rpm
L install directory in/usr/java, this time with java-version can verify whether the installation is successful.
VII. Set environment variables
L Open config file: sudo vim/etc/profile
L Append variable:
JAVA_HOME=/usr/java/jdk1.7.0_79
PATH=$PATH:$JAVA_HOME/bin
CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export JAVA_HOME CLASSPATH PATH
- Make the configuration file take effect immediately after modification: [chybin@master ~]$ source /etc/profile
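To confirm the variables are in effect (expected values follow from the JDK installed above):

[chybin@master ~]$ echo $JAVA_HOME   # should print /usr/java/jdk1.7.0_79
[chybin@master ~]$ java -version     # should report java version "1.7.0_79"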
VIII. Install Hadoop
1. Version selection

| Software | Version |
| JDK      | v1.7.0  |
| Hadoop   | v2.7.2  |
| HBase    | v1.2    |
| Hive     | v1.2.1  |
2. Install Hadoop on Master
(1) Put the hadoop-2.7.2.tar.gz file on master; here the root directory is used.
(2) Extract hadoop-2.7.2.tar.gz:
[chybin@master ~]$ tar -zxvf hadoop-2.7.2.tar.gz
(3) Create a Hadoop folder on master:
[chybin@master ~]$ sudo mkdir /usr/hadoop
(4) Move hadoop-2.7.2 into the Hadoop folder:
[chybin@master ~]$ sudo mv /hadoop-2.7.2 /usr/hadoop
3. Configure Hadoop on Master
(1) Configure the Hadoop environment variables
- Open the configuration file:
  [chybin@master ~]$ sudo vim /etc/profile
- Append the variables:
  HADOOP_HOME=/usr/hadoop/hadoop-2.7.2
  PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
  export HADOOP_HOME PATH
(2) Configure hadoop-env.sh
The configuration files in Hadoop 2.x versions are in the ${HADOOP_HOME}/etc/hadoop directory.
- hadoop-env.sh is configured as:
  export JAVA_HOME=${JAVA_HOME}
  export HADOOP_HOME=/usr/hadoop/hadoop-2.7.2
  export PATH=$HADOOP_HOME/bin:$PATH
- Make hadoop-env.sh take effect:
  [chybin@master ~]$ source ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh
(3) Configure the slaves file
[chybin@master ~]$ sudo vim ${HADOOP_HOME}/etc/hadoop/slaves
File contents:
host1
host2
(4) Configure core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/hadoop-2.7.2/hadoop_tmp</value>
  </property>
</configuration>
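Note that fs.default.name is the deprecated Hadoop 1.x name for this property; it still works in 2.7.2 but logs a deprecation warning. The 2.x equivalent, should you prefer it, is:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master</value>
</property>

With no port given, the NameNode listens on the default RPC port 8020.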
(5) Configuration Hdfs-site.xml
< Configuration> <property> <name>dfs.replication</name> <value>1</value> </property> <property> <name>heartbeat.recheckinterval</name> <value>10</value> </property> <property> <name>dfs.name.dir</name> <value>file:/usr/hadoop/hadoop-2.7.2/hdfs/name</value> </property> <property> <name>dfs.data.dir</name> <value>file:/usr/hadoop/hadoop-2.7.2/hdfs/data</value> </property> </configuration> |
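dfs.name.dir and dfs.data.dir are likewise the deprecated 1.x names (the 2.x names are dfs.namenode.name.dir and dfs.datanode.data.dir; both forms work in 2.7.2, and the 2.7.2 name for the heartbeat re-check setting is dfs.namenode.heartbeat.recheck-interval, in milliseconds). The directories are created automatically when the NameNode is formatted and the DataNodes first start, but creating them up front avoids permission surprises:

[chybin@master ~]$ sudo mkdir -p /usr/hadoop/hadoop-2.7.2/hdfs/name /usr/hadoop/hadoop-2.7.2/hdfs/data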
(6) Configure yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
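One file this walkthrough does not touch is mapred-site.xml; to actually run MapReduce jobs on YARN, the standard additional step in 2.7.2 is to create it from the shipped template and point the framework at YARN:

[chybin@master ~]$ cp ${HADOOP_HOME}/etc/hadoop/mapred-site.xml.template ${HADOOP_HOME}/etc/hadoop/mapred-site.xml

Then add to mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>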
4. Other necessary configuration
(7) Configure read and write permissions on the Hadoop folder on all three machines:
[chybin@master ~]$ sudo chmod -R 777 /usr/hadoop/hadoop-2.7.2
5. Copy the Hadoop folder to host1 and host2
(8) Copy hadoop-2.7.2 to host1 and host2:
[chybin@master ~]$ sudo scp -r /usr/hadoop/hadoop-2.7.2 chybin@host1:/usr/hadoop
[chybin@master ~]$ sudo scp -r /usr/hadoop/hadoop-2.7.2 chybin@host2:/usr/hadoop
6. Format the NameNode
[chybin@master ~]$ hadoop namenode -format
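Formatting only initializes the NameNode's metadata; before the verification below, the daemons need to be started. A minimal sketch, assuming the standard scripts shipped in ${HADOOP_HOME}/sbin (already on PATH from the environment variables above):

[chybin@master ~]$ start-dfs.sh    # starts the NameNode/SecondaryNameNode on master and a DataNode on each host in slaves
[chybin@master ~]$ start-yarn.sh   # starts the ResourceManager on master and a NodeManager on each slave

These scripts ssh into every host listed in the slaves file, which is why passwordless SSH was configured first.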
7. Verify that the installation succeeded

[chybin@master ~]$ hadoop version
Hadoop 2.7.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41
Compiled by jenkins on 2016-01-26T00:08Z
Compiled with protoc 2.5.0
From source with checksum d0fda26633fa762bff87ec759ebe689c
This command was run using /usr/hadoop/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar

[chybin@master ~]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x   - chybin supergroup          0 2016-04-05 05:19 /demo1

On the master machine:
[chybin@master ~]$ jps -l
13121 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
13275 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
15271 sun.tools.jps.Jps
12924 org.apache.hadoop.hdfs.server.namenode.NameNode
On the host1 and host2 machines, jps -l should similarly show the org.apache.hadoop.hdfs.server.datanode.DataNode and org.apache.hadoop.yarn.server.nodemanager.NodeManager processes.
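The web interfaces are a convenient final check (default ports for Hadoop 2.7.2):

NameNode UI: http://master:50070
ResourceManager UI: http://master:8088   (matching yarn.resourcemanager.webapp.address above)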
"Original" Installing and deploying Hadoop