Distributed Cluster Environment: Hadoop, HBase, and ZooKeeper (Full)


1. Environment Description

The cluster environment requires at least three nodes (that is, three server machines): one Master node and two Slave nodes. The nodes must be able to ping each other over the LAN. The following example configures the node IP addresses as follows:

Hostname   IP             User     Password
Master     10.10.10.213   hadoop   123456
Slave1     10.10.10.214   hadoop   123456
Slave2     10.10.10.215   hadoop   123456

All three nodes run CentOS 6.3. To simplify maintenance, it is best to use the same user name, the same password, and the same hadoop, hbase, and zookeeper directory structure on every node in the cluster.

2. Preparations

2.1 Modify the Hostname

To ensure normal and stable operation of the cluster, configure the hostname of each node as Master, Slave1, or Slave2, as appropriate.

(1) Execute the following commands on the Master server:

hostname Master               // takes effect immediately
vi /etc/sysconfig/network     // takes effect after restart
HOSTNAME=Master

(2) Run the following commands on the Slave1 server:

hostname Slave1               // takes effect immediately
vi /etc/sysconfig/network     // takes effect after restart
HOSTNAME=Slave1

(3) Run the following commands on the Slave2 server:

hostname Slave2               // takes effect immediately
vi /etc/sysconfig/network     // takes effect after restart
HOSTNAME=Slave2
2.2 Add Hosts Mapping

Run the following command on each of the three nodes to modify the hosts mapping:

vi /etc/hosts

Add the following content:

10.10.10.213 Master
10.10.10.214 Slave1
10.10.10.215 Slave2
2.3 Configure the JDK Environment

The Hadoop cluster depends on the JDK, so we need to configure the JDK environment first. For easier cluster management, we recommend installing the JDK at the same path on every node.

2.3.1 Unpack the Installation Package

Copy the JDK file jdk-6u25-linux-x64.bin to the /usr/lib/java directory (this directory can be customized) and run the self-extracting installer to unpack it. If file permissions prevent execution, grant them with the following command:

chmod u+x jdk-6u25-linux-x64.bin
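
For reference, the full sequence might look like this (a sketch; /usr/lib/java is the customizable directory mentioned above):

mkdir -p /usr/lib/java
cp jdk-6u25-linux-x64.bin /usr/lib/java/
cd /usr/lib/java
chmod u+x jdk-6u25-linux-x64.bin
./jdk-6u25-linux-x64.bin      // self-extracts into jdk1.6.0_25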
2.3.2 Modify the Environment Configuration
vi /etc/profile

Add:

export JAVA_HOME=/usr/lib/java/jdk1.6.0_25
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/rt.jar

Log out and back in, or run the following command, to make the change take effect:

source /etc/profile
2.3.3 Check the Current JDK Version
java -version
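
If the setup is correct, the output reports the version configured above, roughly like this (illustrative output; exact build numbers vary by platform):

java version "1.6.0_25"
Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode)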
2.3.4 Supplement (Optional)

If the reported JDK version is not the one you just configured, set the default JDK explicitly:

sudo update-alternatives --install /usr/bin/java java /usr/lib/java/jdk1.6.0_25/bin/java 300
sudo update-alternatives --install /usr/bin/javac javac /usr/lib/java/jdk1.6.0_25/bin/javac 300
sudo update-alternatives --config java      // select the number corresponding to jdk1.6.0_25
2.4 Install SSH

SSH is installed by default when you install CentOS. On Ubuntu, you can install it with the following commands (an Internet connection is required):

sudo apt-get install ssh
sudo apt-get install rsync
2.5 Create a User

To keep the Hadoop cluster secure and easy to manage, we create a dedicated user and set its password. The commands are as follows:

sudo adduser hadoop
sudo passwd hadoop

The first command creates a user named hadoop, and the second sets a password for that user. For convenience, keep the user name and password consistent across all servers.

2.6 Configure Password-less SSH Login Between Nodes

The nodes of the cluster must be reachable over SSH without a password: each machine must be able to log on to itself without a password, and the Master and each Slave must be able to log on to each other without a password; no such requirement exists between Slaves. As an example, the steps for setting up password-free login between the Master and Slave1 are as follows:

(1) Go to the Master server and set up password-free login to itself.

ssh hadoop@Master                                  // log on to the Master
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa           // generate the key pair
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    // authorize the key
chmod 700 ~/.ssh && chmod 600 ~/.ssh/*             // set permissions

If you are unsure whether the configuration succeeded, verify it with the following command:

ssh localhost
If the preceding command does not require a password, the configuration is successful.

Go to Slave1 and set up password-free login to itself in the same way; simply replace Master with Slave1 in the commands above, so the details are omitted here.

(2) Log on to the Master server and set up password-free login from the Master to Slave1.

ssh hadoop@Master                                               // log on to the Master
cat ~/.ssh/id_rsa.pub | ssh hadoop@Slave1 'cat - >> ~/.ssh/authorized_keys'
ssh hadoop@Slave1                                               // if no password is required here, the configuration succeeded

(3) Log on to the Slave1 server and set up password-free login from Slave1 to the Master.

ssh hadoop@Slave1                                               // log on to Slave1
cat ~/.ssh/id_rsa.pub | ssh hadoop@Master 'cat - >> ~/.ssh/authorized_keys'
ssh hadoop@Master                                               // if no password is required here, the configuration succeeded

The above establishes two-way password-less login between the Master and Slave1. The configuration between the Master and Slave2 follows the same steps and is not repeated here.
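
Where it is available, ssh-copy-id is a shorter way to append a public key to a remote authorized_keys file (a sketch assuming the default key path; it is equivalent to the cat pipelines above):

ssh-copy-id hadoop@Slave1      // run on the Master
ssh-copy-id hadoop@Master      // run on Slave1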

3. Hadoop Cluster Installation and Configuration

3.1 Modify the Hadoop Configuration Files

Unpack the Hadoop installation package hadoop-1.0.3.tar.gz on CentOS and modify the following six files in the conf directory:

(1) core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://Master:9000</value>
  </property>
</configuration>
(2) hadoop-env.sh

Add the following code to the file:

export JAVA_HOME=/usr/lib/java/jdk1.6.0_25      // the JDK path you configured earlier

(3) hdfs-site.xml

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/temp/hadoop</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/temp/hadoop</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>

(4) mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>Master:9001</value>
  </property>
  <property>
    <name>mapred.acls.enabled</name>
    <value>false</value>
  </property>
</configuration>

(5) masters

Master

(6) slaves

Slave1
Slave2
3.2 Synchronize the Installation Package

Copy the unpacked and modified hadoop-1.0.3 folder to the same Hadoop installation path on Master, Slave1, and Slave2.
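
For example, assuming the installation lives under /home/hadoop (an assumed path; substitute your own), the copy can be done from the Master with scp:

scp -r /home/hadoop/hadoop-1.0.3 hadoop@Slave1:/home/hadoop/
scp -r /home/hadoop/hadoop-1.0.3 hadoop@Slave2:/home/hadoop/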

3.3 Start the Hadoop Cluster

Go to the Master's hadoop-1.0.3 directory and do the following:

bin/hadoop namenode -format    // format the namenode; run this only before the first start, not on later starts
bin/start-all.sh               // start hadoop
jps                            // across the cluster, jps should show 5 daemons besides jps itself (NameNode, SecondaryNameNode, and JobTracker on the Master; DataNode and TaskTracker on each Slave)

At this point, the Hadoop cluster configuration is complete. You can open http://10.10.10.213:50070 in a browser to check the live-node status and verify that the configuration succeeded.
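
If you prefer the command line, the standard Hadoop 1.x admin report also shows how many datanodes are live (run from the hadoop-1.0.3 directory on the Master):

bin/hadoop dfsadmin -report      // lists configured capacity and the available datanodes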

4. Install and Configure the ZooKeeper Cluster

4.1 Modify the ZooKeeper Configuration File zoo.cfg

Unpack the ZooKeeper installation package zookeeper-3.4.3.tar.gz on CentOS, go to the conf directory, and copy zoo_sample.cfg to a file named zoo.cfg (ZooKeeper reads this file as its default configuration at startup). Open the file and modify it to the following format (watch out for permission problems; if the configuration fails at the end, check whether the permissions along the way are correct).

dataDir=/home/hadoop/temp/zookeeper/data
server.0=10.10.10.213:2888:3888
server.1=10.10.10.214:2888:3888
server.2=10.10.10.215:2888:3888
4.2 Create the Data Directory and the myid File

(The myid file lives in the /home/hadoop/temp/zookeeper/data directory configured above.)

mkdir -p /home/hadoop/temp/zookeeper/data      // the dataDir directory
vi /home/hadoop/temp/zookeeper/data/myid

Note that the content of the myid file is 0 on the Master, 1 on Slave1, and 2 on Slave2, matching the server.0, server.1, and server.2 entries in zoo.cfg.
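
Instead of editing with vi, the file can also be written in one line per node (a sketch; run the matching line on each machine):

echo 0 > /home/hadoop/temp/zookeeper/data/myid      // on the Master
echo 1 > /home/hadoop/temp/zookeeper/data/myid      // on Slave1
echo 2 > /home/hadoop/temp/zookeeper/data/myid      // on Slave2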

4.3 Synchronize the Installation Package

Copy the unpacked and modified zookeeper-3.4.3 folder to the same ZooKeeper installation path on Master, Slave1, and Slave2. Note: the content of the myid file is not the same everywhere; each server keeps its own value matching zoo.cfg.

4.4 Start ZooKeeper

Unlike Hadoop, ZooKeeper must be started on every node. Go to the zookeeper-3.4.3 directory on each of the three nodes and start ZooKeeper:

bin/zkServer.sh start
Note: if an error is reported at this point, ignore it and perform the same operation on the other two servers.
4.5 Check Whether ZooKeeper Is Configured Successfully

After all three servers are started, if the processes are healthy, ZooKeeper should have automatically elected a leader. Go to the zookeeper-3.4.3 directory on each server and check the ZooKeeper status as follows:

bin/zkServer.sh status

If output like the following is displayed, the installation succeeded.

JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.3/bin/../conf/zoo.cfg
Mode: follower      // or leader; exactly one node reports leader
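
As an extra liveness check, ZooKeeper answers the four-letter command ruok with imok (assuming nc is installed and the default client port 2181 from zoo_sample.cfg is in use):

echo ruok | nc 10.10.10.213 2181      // prints imok if the server is running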
5. Install and Configure the HBase Cluster

5.1 Modify the HBase Configuration Files

Unpack the HBase installation package hbase-0.94.1-security.tar.gz on CentOS and modify the following three files in the conf directory:

(1) hbase-env.sh

export JAVA_HOME=/usr/lib/java/jdk1.6.0_25               // JDK installation directory
export HBASE_CLASSPATH=/home/hadoop/hadoop-1.0.3/conf    // hadoop conf directory
export HBASE_MANAGES_ZK=true

(2) hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://Master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>Master</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/temp/zookeeper</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>

(3) regionservers

Slave1
Slave2
5.2 Synchronize the Installation Package

Copy the unpacked and modified hbase-0.94.1-security folder to the same HBase installation path on Master, Slave1, and Slave2.

5.3 Start HBase

Go to the Master's hbase-0.94.1-security directory and do the following:

bin/start-hbase.sh      // then use jps to check whether all processes started

At this point, the HBase service configuration is complete. You can open http://10.10.10.213:60010 in a browser to check whether HBase is available.

You can also access the HBase shell with the following command:

bin/hbase shell
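
Inside the shell, a few basic commands confirm that the cluster responds (a sketch; 't1' and 'cf1' are example names):

status                  // reports the number of live region servers
create 't1', 'cf1'      // create a test table with one column family
list                    // the new table should appear
disable 't1'
drop 't1'               // clean up the test table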

6. Conclusion

Regarding the startup and shutdown order of hadoop, zookeeper, and hbase: hadoop and zookeeper can be started in either order, but hbase must be started last. When shutting down, hbase must be stopped first; after that, hadoop and zookeeper can be stopped in either order. Otherwise, exceptions occur.
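
Expressed as commands, one valid sequence is (a sketch following the rules above; run each from the corresponding installation directory):

// start
bin/zkServer.sh start        // on every node
bin/start-all.sh             // on the Master (hadoop)
bin/start-hbase.sh           // on the Master (hbase), always last

// stop
bin/stop-hbase.sh            // always first
bin/stop-all.sh              // hadoop
bin/zkServer.sh stop         // on every node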

The installation packages can be downloaded from the official websites. Installation and configuration details may change a little between versions, so using different versions can cause problems. If problems occur, troubleshoot them against the matching documentation; working through such issues is a good way to learn.
