First, Introduction
Hive is a Hadoop-based data warehousing tool for querying and managing large datasets in distributed storage, well suited to statistical analysis over a data warehouse.
Hive is not suitable for online transaction processing or real-time queries; it is better suited to batch jobs over large volumes of immutable data.
Second, Download and Install
1. Download the Hive tarball and copy it to the /opt/modules directory of the CentOS system.
2. Extract the archive with `tar -zxvf apache-hive-1.0.1-bin.tar.gz`, then rename the extracted folder to `hive`.
3. Add Hive to the environment variables. This assumes the Hadoop runtime environment is already configured (Hadoop 2.2 here).
Edit /etc/profile with `vi /etc/profile` and append: `export HIVE_HOME=/opt/modules/hive` and `export PATH=$PATH:$HIVE_HOME/bin`.
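Step 3 can be sketched as a short shell session. The path /opt/modules/hive is the assumed install location from the steps above; adjust it if Hive was unpacked elsewhere:

```shell
# Assumed install path from the steps above (adjust to your layout).
export HIVE_HOME=/opt/modules/hive
# Append Hive's bin directory so the `hive` launcher is found on the PATH.
export PATH=$PATH:$HIVE_HOME/bin
echo "$HIVE_HOME"
```

When these lines live in /etc/profile they only take effect for new login shells; run `source /etc/profile` to apply them to the current session.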
4. Configure the hive-default.xml and hive-site.xml files.
Go to /opt/modules/hive/conf, copy hive-default.xml.template to both hive-default.xml and hive-site.xml, and make hive-env.sh executable:
chmod u+x hive-env.sh
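Step 4 can be sketched as follows; a scratch directory stands in for /opt/modules/hive/conf so the commands are safe to try anywhere:

```shell
# Scratch directory standing in for /opt/modules/hive/conf (hypothetical).
CONF_DIR=$(mktemp -d)
# Simulate the template files that ship with the Hive tarball.
touch "$CONF_DIR/hive-default.xml.template" "$CONF_DIR/hive-env.sh.template"
cd "$CONF_DIR"
# Copy the default template to both configuration files, as in step 4.
cp hive-default.xml.template hive-default.xml
cp hive-default.xml.template hive-site.xml
# Restore the env script from its template and make it executable.
cp hive-env.sh.template hive-env.sh
chmod u+x hive-env.sh
ls
```

In a real installation the templates already exist in the conf directory, so only the `cp` and `chmod` lines are needed.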
5. Typing `hive` at the shell prompt enters the Hive command-line shell.
(Many problems came up during configuration, but each could be solved step by step by reading the logs.)
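Once the `hive` shell opens, a quick smoke test confirms the installation works end to end; the table name `t_test` here is just an illustrative choice, not from the original setup:

```sql
-- Hypothetical smoke test inside the hive shell
CREATE TABLE t_test (id INT, name STRING);
SHOW TABLES;
DESCRIBE t_test;
DROP TABLE t_test;
```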
Third, Architecture
The architecture of Hive can be divided into four parts:
- User interface
  - These include the CLI, Client, and WUI. When the CLI starts, a copy of Hive starts along with it.
- Metadata storage
  - Hive's metadata is stored in an RDBMS such as MySQL. It includes table names, table columns, table properties, and the directory where each table's data resides.
  - In Hive, each created database corresponds to a directory in the HDFS file system, and each table's directory is a subdirectory of its database's directory.
- Interpreter, compiler, optimizer
  - These carry out lexical analysis, parsing, compilation, optimization, and query-plan generation for an HQL statement; the generated plan is stored in HDFS and then executed by MapReduce.
- Data storage
  - Hive's data is stored in HDFS. Most queries are translated into MapReduce jobs; only a small portion read the files directly.
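The distinction in the last bullet can be seen from the queries themselves: a full scan with no filtering or aggregation can be served by reading the underlying files directly, while anything that groups or aggregates is compiled into a MapReduce job. A HiveQL illustration (the table and column names are hypothetical):

```sql
-- Served by a direct file read: no filter, no aggregation, no shuffle.
SELECT * FROM page_views;

-- Compiled into a MapReduce job: requires grouping and aggregation.
SELECT user_id, COUNT(*) FROM page_views GROUP BY user_id;
```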
The architecture diagram is shown below.
Fourth, Storing Metadata in a MySQL Database
A. Install the MySQL database
yum install -y mysql-server mysql mysql-devel
B. Restart the MySQL service
service mysqld restart
C. Log in to MySQL and grant permissions
mysql -u root -p***
Grant permissions: grant all privileges on *.* to 'root'@'hadoop-yarn' identified by 'root123';
Refresh permissions: flush privileges;
D. Create a dedicated metastore database for Hive, named "hive":
create database hive;
E. Add the following configuration to the hive-site.xml file in Hive's conf directory:
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://hadoop-yarn:3306/hive?characterEncoding=UTF-8</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>root123</value>
</property>
F. Copy the MySQL JDBC driver jar into the lib directory of the Hive installation.
G. Start Hive with the `hive` command.
H. Enter the MySQL database and verify the metastore:
use hive;
show tables;
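Inside MySQL, the metastore schema can be inspected directly. `TBLS` and `DBS` are tables of the Hive metastore schema itself; the session below assumes the `hive` database created in step D:

```sql
USE hive;
SHOW TABLES;                           -- lists metastore tables such as TBLS, DBS, COLUMNS_V2
SELECT TBL_NAME, TBL_TYPE FROM TBLS;   -- one row per Hive table
SELECT NAME, DB_LOCATION_URI FROM DBS; -- one row per Hive database, with its HDFS path
```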
At this point the Hive environment is built, with its metadata (the metastore) stored in the MySQL database.
(1) Hive Framework Setup and Architecture Introduction