Mainstream cloud computing: Hadoop architecture and configuration


Hadoop is an open-source distributed computing framework from the Apache open source organization, and it has already been applied at many large web sites such as Amazon, Facebook and Yahoo. For me, the most recent use case is log analysis for a service integration platform. The platform's log volume is very large, which fits the classic scenarios for distributed computing (log analysis and indexing being the two major ones).

Today we will build a hands-on environment for Hadoop 2.2.0 on CentOS 5.8, a mainstream server operating system.

I. Environment

System version: CentOS 5.8 x86_64
Java version: JDK 1.7.0_25
Hadoop version: hadoop-2.2.0

192.168.149.128  namenode (acts as NameNode, Secondary NameNode, and ResourceManager)
192.168.149.129  datanode1 (acts as DataNode and NodeManager)
192.168.149.130  datanode2 (acts as DataNode and NodeManager)

II. System Preparation

1. The latest Hadoop 2.2 release can be downloaded directly from the official Apache website. The project currently only provides executables built for 32-bit Linux, so if you need to deploy on a 64-bit system you must download the source (src) package and build it yourself. (For a real online environment, use a 64-bit Hadoop build, which avoids a lot of problems; in this experiment I used the 32-bit version.)

Hadoop download address: http://apache.claz.org/hadoop/common/hadoop-2.2.0/
Java download address: http://www.oracle.com/technetwork/java/javase/downloads/index.html
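For example, on a machine with internet access the Hadoop archive can be fetched directly with wget (a minimal sketch using the address above; the JDK download usually has to go through Oracle's license page in a browser, so it is not shown here):

wget http://apache.claz.org/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz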

2. We use three CentOS servers to build the Hadoop cluster, with the roles of each already noted above.

Step one: set the corresponding host names in /etc/hosts on all three servers (in a real environment you can use internal DNS resolution instead).

[root@node1 hadoop]# cat /etc/hosts

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       localhost.localdomain localhost
192.168.149.128 node1
192.168.149.129 node2
192.168.149.130 node3

(Note: the hosts resolution must be configured on all three servers, the NameNode and both DataNodes.)

Step two: set up passwordless SSH login from the NameNode to the DataNode servers with the following configuration:

Execute ssh-keygen on the NameNode (192.168.149.128) and press Enter at every prompt. Then copy the public key /root/.ssh/id_rsa.pub to the DataNode servers as follows:

ssh-copy-id -i .ssh/id_rsa.pub root@192.168.149.129
ssh-copy-id -i .ssh/id_rsa.pub root@192.168.149.130
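To confirm that passwordless login works before continuing, a quick check from the NameNode (a minimal sketch) is:

ssh root@192.168.149.129 hostname
ssh root@192.168.149.130 hostname

Each command should print the remote host name without prompting for a password.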

III. Java Installation and Configuration

tar -xvzf jdk-7u25-linux-x64.tar.gz && mkdir -p /usr/java/ && mv jdk1.7.0_25 /usr/java/

Then configure the Java environment variables by appending the following to the end of /etc/profile:

export JAVA_HOME=/usr/java/jdk1.7.0_25/
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:./

Save and exit, then execute source /etc/profile. If java -version runs on the command line and prints the version, the Java installation was successful.

[root@node1 ~]# java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

(Note: the Java JDK must be installed on all three servers, the NameNode and both DataNodes.)

IV. Hadoop Installation

The official hadoop-2.2.0 download can be used directly after extraction, without compiling, as follows:

Step one, extract the archive:

tar -xzvf hadoop-2.2.0.tar.gz && mv hadoop-2.2.0 /data/hadoop/

(Note: install Hadoop on the NameNode server first; the DataNodes get their copy after the configuration has been modified.)

Step two, configure the environment variables:

Continue by appending the following to the end of /etc/profile, then execute source /etc/profile:

export HADOOP_HOME=/data/hadoop/
export PATH=$PATH:$HADOOP_HOME/bin/
export JAVA_LIBRARY_PATH=/data/hadoop/lib/native/

(Note: these Hadoop-related variables must be configured on all three servers, the NameNode and both DataNodes.)
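After re-sourcing /etc/profile you can verify that the variables took effect; a simple check (assuming the archive was extracted to /data/hadoop/ as above) is:

hadoop version

which should report Hadoop 2.2.0.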

V. Configuring Hadoop

On the NameNode, we need to modify the following files:

1. Modify /data/hadoop/etc/hadoop/core-site.xml so its contents are as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.149.128:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

2. Modify /data/hadoop/etc/hadoop/mapred-site.xml so its contents are as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.149.128:9001</value>
  </property>
</configuration>

3. Modify /data/hadoop/etc/hadoop/hdfs-site.xml so its contents are as follows:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/data_name1,/data/hadoop/data_name2</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop/data_1,/data/hadoop/data_2</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

4. Append the JAVA_HOME variable to the end of the /data/hadoop/etc/hadoop/hadoop-env.sh file:

echo "Export java_home=/usr/java/jdk1.7.0_25/" >>/data/hadoop/etc/hadoop/hadoop-env.sh

5. Modify /data/hadoop/etc/hadoop/masters so its contents are as follows:

192.168.149.128

6. Modify /data/hadoop/etc/hadoop/slaves so its contents are as follows:

192.168.149.129
192.168.149.130

With the settings above, the configuration is complete. The specific meaning of each property is not explained in detail here; if anything is unclear during the build, consult the relevant official documentation.

With the NameNode basically set up, we next need to deploy the DataNodes. Deploying the DataNodes is relatively simple; just run the following.

for i in `seq 129 130`; do scp -r /data/hadoop/ root@192.168.149.$i:/data/; done
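To confirm the copy reached both DataNodes, a quick spot check from the NameNode (a minimal sketch) is:

ssh root@192.168.149.129 ls /data/hadoop/etc/hadoop/core-site.xml
ssh root@192.168.149.130 ls /data/hadoop/etc/hadoop/core-site.xml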

At this point the entire cluster has basically been built; the next step is to start the Hadoop cluster.

VI. Starting Hadoop and Testing

Before we start Hadoop, there is a very important step: execute the following commands on the NameNode to initialize (format) the name directory and data directory.

cd /data/hadoop/
./bin/hadoop namenode -format

So how do you know whether initialization succeeded? If the output shows the name directory being created successfully, the format is normal.
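The result can also be checked on disk: after a successful format, the name directories configured in hdfs-site.xml contain a current/ subdirectory holding an fsimage and a VERSION file (a sketch, assuming the dfs.name.dir values above):

ls -l /data/hadoop/data_name1/current/
ls -l /data/hadoop/data_name2/current/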

Then start all of the Hadoop services with the following command:

[root@node1 hadoop]# ./sbin/start-all.sh

We can also check whether the corresponding ports are listening with: netstat -ntpl
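For example, you can filter for the HDFS and web UI ports used in this setup (a sketch; process names in the output may differ):

netstat -ntpl | grep -E ':(9000|50070|8088)'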

Visit the NameNode web interface at: http://192.168.149.128:50070/

Visit the ResourceManager web interface at: http://192.168.149.128:8088/

After the build is complete, we can try a simple hands-on operation, as sketched below.
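As a minimal smoke test, you can put a file into HDFS, list it, and run the bundled wordcount example (a sketch; the examples jar path assumes the stock Hadoop 2.2.0 layout under /data/hadoop/):

cd /data/hadoop/
./bin/hadoop fs -mkdir /input
./bin/hadoop fs -put /etc/profile /input/
./bin/hadoop fs -ls /input/
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /input /output
./bin/hadoop fs -cat /output/part-r-00000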

This covers the basic setup of Hadoop; there is much more to study in depth beyond this, and I hope we can keep learning, progressing, and sharing together.

Original link: http://chinaapp.sinaapp.com/thread-3224-1-1.html
