Hadoop+hive+mysql Installation Documentation


Software version

Software                          Version
Red Hat Enterprise Linux Server   5.5 (64-bit)
Hadoop                            1.0.0
Hive                              0.8.1
MySQL                             5
JDK                               1.6

Overall architecture

There are 7 machines in total: 4 are data nodes, and the name node, JobTracker, and secondary name node are placed on separate machines. The machines are divided as follows:

Machine IP        Host Name       Use                Note
123.456.789.30    Master.hadoop   Name node          Master node
123.456.789.31    Slave1.hadoop   Data node 1
123.456.789.32    Slave2.hadoop   Data node 2
123.456.789.33    Slave3.hadoop   Data node 3
123.456.789.34    Slave4.hadoop   Data node 4
123.456.789.35    Job.hadoop      JobTracker
123.456.789.36    Sec.hadoop      Second name node

Required installation packages

hadoop-1.0.0-bin.tar.gz

mysql-5.1.52.tar.gz

jdk-6u31-linux-x64.bin

hive-0.8.1.tar.gz

Preparation (using the root user)

Hosts file configuration

Configure the /etc/hosts file on all machines (this must be done on every machine, none skipped).

vi /etc/hosts

Add the following lines:

123.456.789.30 Master.hadoop

123.456.789.31 Slave1.hadoop

123.456.789.32 Slave2.hadoop

123.456.789.33 Slave3.hadoop

123.456.789.34 Slave4.hadoop

123.456.789.35 Job.hadoop

123.456.789.36 Sec.hadoop

Modify the hostname of each host

When Linux is installed, the default hostname is localhost. Modify the /etc/sysconfig/network configuration file:

vi /etc/sysconfig/network

For example: on the 123.456.789.30 machine, set the hostname to Master.hadoop;

on the 123.456.789.31 machine, set it to Slave1.hadoop; and so on.

Special note: this must be done on all machines.

Create the hduser user and the hadoop group

groupadd hadoop

useradd -g hadoop hduser

passwd hduser

Uploading files

Create a tools folder under /home/hduser and, as the hduser user, FTP all the installation packages into this folder.

Passwordless SSH authentication

To avoid password prompts during Hadoop operation, passwordless SSH must be set up from the master node and the JobTracker node to the other machines.

On the master node, execute as the hduser user:

ssh-keygen -t rsa -P ""

cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

ssh localhost

ssh Master.hadoop

Copy the key to the other hosts:

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@Slave1.hadoop

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@Slave2.hadoop

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@Slave3.hadoop

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@Slave4.hadoop

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@Job.hadoop

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@Sec.hadoop

Special note: when executing ssh-copy-id, you need to enter "yes" and the target host's password.

Install Hadoop (using the hduser user)

Install the JDK on each host and unpack the other software

a) Switch to /home/hduser/tools and unpack the JDK:

./jdk-6u31-linux-x64.bin

Special note: if a permission error is reported, grant execute permission first:

chmod 777 *

b) Unpack Hadoop and rename it:

tar -xvf hadoop-1.0.0-bin.tar.gz

mv hadoop-1.0.0 hadoop

c) Unpack MySQL:

tar -xvf mysql-5.1.52.tar.gz

d) Unpack Hive and rename it:

tar -xvf hive-0.8.1.tar.gz

mv hive-0.8.1 hive

Modify the configuration files

vi /etc/profile   (add the environment variables)

export JAVA_HOME=/home/hduser/jdk1.6.0_31
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export HADOOP_HOME=/home/hduser/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
export HIVE_HOME=/home/hduser/hive
export PATH=$HIVE_HOME/bin:$PATH

Special note: add these on all machines.

Execute source /etc/profile to make the environment variables take effect immediately.

Modify hadoop-env.sh

vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Make the following modifications:

# The java implementation to use. Required.
export JAVA_HOME=/home/hduser/jdk1.6.0_31
export HADOOP_PID_DIR=/home/hduser/pids

Create the required folder

Create a tmp folder under /home/hduser/hadoop:

mkdir /home/hduser/hadoop/tmp

Upload the configuration files

Upload all the files in the Hadoop_conf folder to the /home/hduser/hadoop/etc/hadoop folder, overwriting the originals.

Modify hdfs-site.xml

Add the following property:

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>

Modify core-site.xml

Set the hadoop.tmp.dir property to /home/hduser/hadoop/tmp.

Set the dfs.http.address property to sec.hadoop:50070 (points to the second name node).

Set the dfs.hosts.exclude property to /home/hduser/hadoop/etc/hadoop/excludes (used when decommissioning nodes).
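A minimal sketch of how these entries might look inside the <configuration> element of core-site.xml, using the values above (only the properties named in this document are shown):

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hduser/hadoop/tmp</value>
</property>
<property>
  <name>dfs.http.address</name>
  <value>sec.hadoop:50070</value>
</property>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/home/hduser/hadoop/etc/hadoop/excludes</value>
</property>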

Modify mapred-site.xml

Set the mapred.job.tracker property to job.hadoop:54311 (points to the JobTracker).
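As a sketch, the corresponding entry in mapred-site.xml would look like this, using the value above:

<property>
  <name>mapred.job.tracker</name>
  <value>job.hadoop:54311</value>
</property>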

Modify the masters file

Modify the content to: Sec.hadoop

Modify the slaves file

Modify the content to:

Slave1.hadoop

Slave2.hadoop

Slave3.hadoop

Slave4.hadoop

Transfer the JDK and Hadoop to the other hosts

scp -r jdk1.6.0_31 hduser@Slave1.hadoop:/home/hduser

scp -r jdk1.6.0_31 hduser@Slave2.hadoop:/home/hduser

scp -r jdk1.6.0_31 hduser@Slave3.hadoop:/home/hduser

scp -r jdk1.6.0_31 hduser@Slave4.hadoop:/home/hduser

scp -r jdk1.6.0_31 hduser@Job.hadoop:/home/hduser

scp -r jdk1.6.0_31 hduser@Sec.hadoop:/home/hduser

scp -r hadoop hduser@Slave1.hadoop:/home/hduser

scp -r hadoop hduser@Slave2.hadoop:/home/hduser

scp -r hadoop hduser@Slave3.hadoop:/home/hduser

scp -r hadoop hduser@Slave4.hadoop:/home/hduser

scp -r hadoop hduser@Job.hadoop:/home/hduser

scp -r hadoop hduser@Sec.hadoop:/home/hduser

Start Hadoop

Because this is a test setup, start-all.sh is not used directly. The startup order is as follows (steps a and b are executed on the master node):

a) Log in to the master node as hduser, enter the hadoop/sbin directory, and start the name node:

./hadoop-daemon.sh start namenode

b) Start the data nodes (this starts the DataNode on each data node):

./start-dfs.sh

c) Log in to Job.hadoop and start the JobTracker (the TaskTracker on each node is also started):

./start-mapred.sh

Shutdown order (executed under sbin on the master node):

Stop the DataNodes and the NameNode:

./stop-dfs.sh

Log in to the job host (stops the TaskTrackers and the JobTracker):

./stop-mapred.sh

An insufficient-permissions error may be reported at startup; in that case, grant execute permission to all the files under hadoop/sbin on each host, as shown below.
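For example (a minimal sketch; the path assumes the layout used in this document):

chmod +x /home/hduser/hadoop/sbin/*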

Verify that startup succeeded

Use jps on each host to check whether the expected processes are running. Under the current plan, the results should be as follows:

Node name           Running processes        Note
Master node         NameNode
Data nodes          DataNode, TaskTracker
Job node            JobTracker
Second name node    SecondaryNameNode

Special note: you can also check the log files under hadoop/logs/ to verify the startup.
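For example, running jps on the master node might print something like the following (the PID is illustrative):

jps
12306 NameNode
12498 Jps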

Install MySQL (using root)

1. Switch to the MySQL extraction directory:

cd /home/hduser/tools/mysql

2. ./configure --prefix=/usr/local/mysql --sysconfdir=/etc --localstatedir=/data/mysql

Note: set localstatedir to the location where you want to place the database files.

3. make

4. make install

5. make clean

6. groupadd mysql

7. useradd -g mysql mysql (the first mysql is the group, the second is the user)

8. cd /usr/local/mysql

9. cp /usr/local/mysql/share/mysql/my-medium.cnf /etc/my.cnf

10. bin/mysql_install_db --user=mysql    # initialize the base databases; the mysql user must be specified; only after this step does the var directory appear under /usr/local/mysql

11. bin/mysqld_safe --user=mysql &

bin/mysqladmin -u root password oracle

12. Starting and stopping MySQL (optional)

Start MySQL:

bin/mysqld_safe &    or    /usr/local/mysql/share/mysql/mysql.server start

Stop MySQL, method 1:

/usr/local/mysql/share/mysql/mysql.server stop

Stop MySQL, method 2:

ps -aux | grep mysql    # view the process

kill <PID>    # kill the MySQL process; the PID comes from the ps output above

13. Register MySQL as a service

cp /usr/local/mysql/share/mysql/mysql.server /etc/init.d/mysqld

Add MySQL to the system services as the root user.

/sbin/chkconfig --add mysqld    # add MySQL as a service

/sbin/chkconfig --del mysqld    # delete the MySQL service

/sbin/service mysqld restart    # restart the service to check whether it takes effect

/sbin/chkconfig --list mysqld    # check that MySQL is enabled for run levels 3, 4, and 5

14. Create a MySQL account for Hive and grant it sufficient privileges

As root, under /usr/local/mysql/bin, execute: ./mysql -u root -p

Create the hive database: create database hive;

Create the hive user, which can only connect to the database from localhost, and grant it all privileges on the hive database: grant all on hive.* to hive@localhost identified by 'oracle';
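Putting those statements together, the session looks roughly like this (the password and host restriction are the ones above; flush privileges is added here so the grant takes effect immediately):

create database hive;
grant all on hive.* to 'hive'@'localhost' identified by 'oracle';
flush privileges;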

Install Hive (as the hduser user)

1. Modify HADOOP_HOME in conf/hive-env.sh.template under the hive directory to the actual Hadoop installation directory: /home/hduser/hadoop (see the sketch below).
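A minimal sketch of this step, assuming the template is first copied to hive-env.sh (the copy is a common convention and is not stated in the original):

cd /home/hduser/hive/conf
cp hive-env.sh.template hive-env.sh
# in hive-env.sh, set:
HADOOP_HOME=/home/hduser/hadoop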

2. Create the tmp and warehouse folders under the hive directory:

mkdir /home/hduser/hive/tmp

mkdir /home/hduser/hive/warehouse

3. Create the same directories in HDFS and set group write permission:

hadoop fs -mkdir /home/hduser/hive/tmp

hadoop fs -mkdir /home/hduser/hive/warehouse

hadoop fs -chmod g+w /home/hduser/hive/tmp

hadoop fs -chmod g+w /home/hduser/hive/warehouse

4. Copy the configuration files from the hive_conf folder to /home/hduser/hive/conf.

5. Modify the hive-site.xml file

Set the hive.metastore.warehouse.dir property to /home/hduser/hive/warehouse.

Set the hive.exec.scratchdir property to /home/hduser/hive/tmp.
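A sketch of these two entries in hive-site.xml, using the values above (the MySQL metastore connection properties, javax.jdo.option.*, are assumed to already be present in the uploaded hive_conf files):

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/home/hduser/hive/warehouse</value>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/home/hduser/hive/tmp</value>
</property>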

6. Copy the MySQL JDBC driver package mysql-connector-java-5.0.7-bin.jar to the lib directory of Hive.

7. Start Hive and execute show tables;

8. If step 7 raises no exceptions, the installation is largely successful. Create a few tables and load some data to test:

hive>

CREATE TABLE test_src (
  account1 STRING,
  url STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|';

Create a test data file (touch a.txt) with contents such as:

123412342|http://www.sohu.com
454534534|http://qww.cocm.ccc

hive>

load data local inpath '/data/myfile/a.txt' into table test_src;

This is the most basic installation, with no parameters tuned; for production use, the relevant parameters need to be configured.

Sqoop installation (as the hduser user)

1. Unpack sqoop1.4.tar.gz

2. Rename it to sqoop

3. Modify the sqoop file bin/configure-sqoop and comment out everything related to HBase and ZooKeeper.

4. Copy ojdbc6.jar and hadoop-core-1.0.0.jar to sqoop/lib.

5. Add the environment variables:

export SQOOP_HOME=xxxx

export PATH=$SQOOP_HOME/bin:$PATH
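A minimal sketch of a quick check that Sqoop can reach a database, assuming an Oracle instance is available (the host, SID, username, and password below are hypothetical placeholders, not from this document):

sqoop list-tables \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username scott \
  --password tiger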
