Introduction of Big Data Offline Analysis Tool Hive

Last Update:2020-06-22 Source: Internet

Author: User

Keywords: big data, hive, hive introduction

Hive was developed at Facebook to analyze massive volumes of log data. It was later open-sourced and donated to the Apache Software Foundation, which also hosts many of the other open source tools covered earlier.

Official website definition:

The Apache Hive ™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL.

The version used in this article is Hive 1.0.0.

Several features of Hive

Hive's defining feature is that it analyzes big data through a SQL-like language (HiveQL), so you do not have to write MapReduce programs by hand, which makes analysis much easier.
Data is stored on HDFS; Hive itself does not provide data storage.
Hive maps data onto databases and tables, and the metadata for those databases and tables is generally kept in a relational database (such as MySQL).
Data storage: Hive can work with large data sets and is not strict about data integrity or format.
Data processing: Because a Hive statement is ultimately turned into MapReduce tasks, Hive is not suited to real-time computing; it is suited to offline analysis.
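
To make the first point concrete, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps that a single SQL-like statement such as SELECT level, COUNT(*) FROM logs GROUP BY level; stands in for. The table name logs, the column level, and the sample lines are hypothetical, and this is an in-memory illustration, not how Hive or Hadoop is implemented.

```python
from collections import defaultdict

# Hypothetical sample log lines in the form "<level> <message>".
LOG_LINES = [
    "ERROR disk full",
    "INFO job started",
    "ERROR timeout",
    "WARN slow response",
    "INFO job finished",
]

def map_phase(lines):
    """Emit one (key, 1) pair per record, as a mapper would."""
    for line in lines:
        level = line.split(" ", 1)[0]
        yield level, 1

def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the values for each key, as a reducer would."""
    return {key: sum(values) for key, values in groups.items()}

def count_by_level(lines):
    """Chain the three phases, mirroring a GROUP BY ... COUNT(*) query."""
    return reduce_phase(shuffle_phase(map_phase(lines)))

counts = count_by_level(LOG_LINES)
print(counts)
```

With Hive, the three functions above collapse into the one-line query, and the framework generates the equivalent distributed job.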

The core of Hive
The core of Hive is the driver, which consists of four parts:

Interpreter: The interpreter converts a HiveQL statement into an abstract syntax tree (AST).
Compiler: The compiler compiles the syntax tree into a logical execution plan.
Optimizer: The optimizer optimizes the logical execution plan.
Executor: The executor calls the underlying execution framework to run the logical execution plan.
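
The four stages can be sketched as a toy pipeline. This is an illustration only: the function names, the tiny supported grammar (SELECT one column FROM one table with an optional equality filter), and the choice of column pruning as the optimization are all assumptions made for the example and do not reflect Hive's actual classes or plan format.

```python
import re

def parse(sql):
    """Interpreter step: turn a statement into a tiny syntax tree (a dict)."""
    pattern = (r"SELECT\s+(\w+)\s+FROM\s+(\w+)"
               r"(?:\s+WHERE\s+(\w+)\s*=\s*'([^']*)')?\s*;?\s*$")
    match = re.match(pattern, sql.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("unsupported statement: " + sql)
    column, table, where_col, where_val = match.groups()
    tree = {"select": column, "from": table}
    if where_col is not None:
        tree["where"] = (where_col, where_val)
    return tree

def compile_plan(tree):
    """Compiler step: turn the tree into an ordered list of logical operators."""
    plan = [("scan", tree["from"], None)]  # None means "read every column"
    if "where" in tree:
        plan.append(("filter",) + tree["where"])
    plan.append(("project", tree["select"]))
    return plan

def optimize(plan):
    """Optimizer step: column pruning, so the scan reads only needed columns."""
    needed = {op[1] for op in plan if op[0] in ("filter", "project")}
    return [("scan", op[1], sorted(needed)) if op[0] == "scan" else op
            for op in plan]

def execute(plan, tables):
    """Executor step: run each operator in order against in-memory tables."""
    rows = []
    for op in plan:
        if op[0] == "scan":
            columns = op[2]
            rows = [dict(row) if columns is None else {c: row[c] for c in columns}
                    for row in tables[op[1]]]
        elif op[0] == "filter":
            rows = [row for row in rows if row[op[1]] == op[2]]
        elif op[0] == "project":
            rows = [row[op[1]] for row in rows]
    return rows

# Hypothetical in-memory table standing in for data on HDFS.
TABLES = {"logs": [
    {"level": "ERROR", "msg": "disk full", "host": "a"},
    {"level": "INFO", "msg": "job started", "host": "b"},
    {"level": "ERROR", "msg": "timeout", "host": "c"},
]}

def run(sql, tables):
    return execute(optimize(compile_plan(parse(sql))), tables)

result = run("SELECT msg FROM logs WHERE level = 'ERROR'", TABLES)
print(result)
```

In the real system the executor hands the plan to a cluster framework rather than looping over rows locally; the sketch only shows how the output of each stage feeds the next.
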
Hive's underlying storage
Hive data is stored on HDFS, and the databases and tables in Hive can be seen as a mapping onto that data. Hive must therefore run on a Hadoop cluster.
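
As a rough sketch of that mapping, the helper below computes where a managed table's files would normally live, assuming the default warehouse directory /user/hive/warehouse (the hive.metastore.warehouse.dir setting) and no explicit LOCATION on the table. The function name table_location and the database and table names are hypothetical.

```python
import posixpath

# Default value of hive.metastore.warehouse.dir; a cluster may override it.
DEFAULT_WAREHOUSE = "/user/hive/warehouse"

def table_location(database, table, warehouse_dir=DEFAULT_WAREHOUSE):
    """Return the HDFS directory a managed table maps to by default.

    Tables in the built-in "default" database sit directly under the
    warehouse directory; other databases get a "<name>.db" subdirectory.
    """
    if database == "default":
        return posixpath.join(warehouse_dir, table)
    return posixpath.join(warehouse_dir, database + ".db", table)

print(table_location("default", "logs"))
print(table_location("sales", "orders"))
```

The files under that directory are the table's rows, which is why dropping files into the directory makes them visible to queries on the table.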

Hive statement execution process

The executor submits the final MapReduce program to YARN, where it runs as a series of jobs.

Hive's metadata storage
Hive metadata is generally stored in a relational database such as MySQL, and Hive interacts with that database through the MetaStore service.
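
A minimal sketch of what this wiring typically looks like: the script below generates a hive-site.xml fragment that points the MetaStore at a MySQL database. The property names are standard Hive settings, but every host name, database name, user, and password here is a placeholder assumption, and the driver class shown is the one used by older MySQL Connector/J releases.

```python
import xml.etree.ElementTree as ET

def build_hive_site(properties):
    """Render a dict of settings in Hadoop's <configuration> XML layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")

# All values below are placeholders for illustration, not real endpoints.
METASTORE_PROPERTIES = {
    "javax.jdo.option.ConnectionURL":
        "jdbc:mysql://metastore-db.example.com:3306/hive_metastore",
    "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
    "javax.jdo.option.ConnectionUserName": "hive",
    "javax.jdo.option.ConnectionPassword": "change-me",
    # Clients reach the MetaStore service itself over Thrift.
    "hive.metastore.uris": "thrift://metastore.example.com:9083",
}

hive_site_xml = build_hive_site(METASTORE_PROPERTIES)
print(hive_site_xml)
```

The javax.jdo.option.* settings tell the MetaStore service how to reach MySQL, while hive.metastore.uris tells Hive clients how to reach the MetaStore service.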

Hive client
Hive can be accessed through several kinds of clients.

CLI client: Run the hive command to open an interactive window and communicate with Hive from the command line.
HiveServer2 client: Communicates over Thrift, a cross-language RPC protocol that lets programs written in different languages talk to each other, so Hive can be accessed through JDBC or ODBC.
HWI client: A web client that ships with Hive, but it is rudimentary and rarely used.
HUE client: Interacts with Hive through a web page and is the more commonly used option.
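
For the HiveServer2 route, a small sketch of how a connection is usually addressed: the helpers below build a jdbc:hive2:// URL and the matching argument list for the beeline command-line client. Port 10000 is HiveServer2's usual default; the host, user, and query are hypothetical, and the helper names are invented for this example. The script only builds the command and does not run it.

```python
def hiveserver2_jdbc_url(host, port=10000, database="default"):
    """Build the JDBC URL that HiveServer2 clients connect to."""
    return "jdbc:hive2://{}:{}/{}".format(host, port, database)

def beeline_command(host, user, query, port=10000, database="default"):
    """Build an argument list for beeline: -u URL, -n user, -e query."""
    url = hiveserver2_jdbc_url(host, port, database)
    return ["beeline", "-u", url, "-n", user, "-e", query]

# Hypothetical host, user, and query.
command = beeline_command("hive.example.com", "analyst", "SHOW TABLES;")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run on a machine where beeline is installed and the HiveServer2 host is reachable.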

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
