Hive brief and several access methods

Last Update:2015-07-13 Source: Internet

Author: User


What is Hive?

Hive is a data warehouse infrastructure built on Hadoop. It provides a set of tools for extracting, transforming, and loading data (ETL), and a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. Hive defines a simple SQL-like query language called HQL, which lets users who are familiar with SQL query the data. The language also lets developers who are familiar with MapReduce plug in custom mappers and reducers to handle complex analysis that the built-in mappers and reducers cannot do.

Hive is the data warehouse component of the Hadoop ecosystem. It can manage the data in Hadoop and query the data in Hadoop.

Advantages and Disadvantages

Low cost and quick to get started.
Simple MapReduce statistics can be implemented quickly with SQL-like statements, without developing a dedicated MapReduce application.
Real-time queries are not supported.
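
To make the second point concrete, here is a minimal in-memory sketch of a group-and-count statistic, the kind of job that would otherwise be hand-written as a MapReduce program. In Hive the same statistic could be expressed as one statement along the lines of SELECT city, COUNT(*) FROM visits GROUP BY city. The table visits, the column city, and the class GroupCountSketch are hypothetical names used only for illustration; the code does not run on Hadoop and is not what Hive generates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration only: not Hadoop code and not Hive's generated plan.
class GroupCountSketch {

    // Conceptually: the "map" step emits each key, the "reduce" step sums per key.
    static Map<String, Integer> countByKey(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample values standing in for a "city" column.
        List<String> cities = Arrays.asList("Hangzhou", "Beijing", "Hangzhou");
        System.out.println(countByKey(cities));
    }
}
```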

Hive System Architecture

Metadata storage: typically kept in a relational database such as MySQL or Derby. The metadata in Hive includes the name of each table, its columns and partitions and their properties, the properties of the table itself (for example, whether it is an external table), the directory where the table's data resides, and so on.
Driver: interpreter, compiler, optimizer, executor
Query Compiler:
Execution Engine:
Server:
Client components:
Extensible Interface Section:
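
The metadata items listed under "Metadata storage" above can be pictured as a small record per table. The following is a sketch for illustration only: the class TableMetadataSketch and its field names are hypothetical and are not Hive's actual metastore classes or schema.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the per-table metadata described in the text.
class TableMetadataSketch {
    final String name;                // table name
    final List<String> columns;       // column names
    final List<String> partitionKeys; // partition columns
    final boolean external;           // external table or managed table
    final String location;            // directory holding the table's data

    TableMetadataSketch(String name, List<String> columns, List<String> partitionKeys,
                        boolean external, String location) {
        this.name = name;
        this.columns = Collections.unmodifiableList(columns);
        this.partitionKeys = Collections.unmodifiableList(partitionKeys);
        this.external = external;
        this.location = location;
    }

    // One-line summary of the stored metadata.
    String describe() {
        return name + " (" + (external ? "external" : "managed") + ")"
                + " columns=" + columns
                + " partitions=" + partitionKeys
                + " location=" + location;
    }

    public static void main(String[] args) {
        // All values below are made up for the example.
        TableMetadataSketch t = new TableMetadataSketch(
                "logs", Arrays.asList("ip", "url"), Arrays.asList("dt"), true, "/data/logs");
        System.out.println(t.describe());
    }
}
```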

Hive metadata store

Derby (embedded, the default)
Single session only
Creates the metadata files in the directory from which the terminal session was started
Cannot be shared by multiple users

MySQL
Install MySQL and configure accounts and permissions
Copy mysql-connector-java-5.1.22-bin.jar into the lib directory under the Hive installation directory
Modify hive-site.xml
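
The original does not show which entries to change in hive-site.xml. As a hedged sketch, the four javax.jdo.option.* connection properties below are the ones usually set for a MySQL-backed metastore, and com.mysql.jdbc.Driver is the driver class shipped in the 5.1.x connector. The host, database name, user, and password are placeholders, and the helper class HiveSiteSketch is hypothetical: it only prints a fragment that could be pasted into hive-site.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Renders Hadoop-style <property> elements; illustration only.
class HiveSiteSketch {

    // Escape the characters that are not allowed raw in XML text.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Placeholder values: adjust host, database, user and password to your setup.
    static Map<String, String> mysqlMetastoreProps(String host, String db,
                                                   String user, String password) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("javax.jdo.option.ConnectionURL",
                "jdbc:mysql://" + host + ":3306/" + db + "?createDatabaseIfNotExist=true");
        props.put("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver");
        props.put("javax.jdo.option.ConnectionUserName", user);
        props.put("javax.jdo.option.ConnectionPassword", password);
        return props;
    }

    static String render(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("  <property>\n")
              .append("    <name>").append(escapeXml(e.getKey())).append("</name>\n")
              .append("    <value>").append(escapeXml(e.getValue())).append("</value>\n")
              .append("  </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(render(mysqlMetastoreProps("localhost", "hive", "hive", "changeme")));
    }
}
```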

Hive Client access mode

1. CLI command line

[root@hadoop1 ~]# hive

2. HWI (Hive Web Interface)

[root@hadoop1 ~]# hive --service hwi

Then open http://localhost:9999/hwi in a browser.

3. HiveServer

Start HiveServer:

[root@hadoop1 ~]# hive --service hiveserver

If the following exception appears:

org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:10000.

Workaround: the port is already in use. Kill the process that occupies the port, or start HiveServer on a different port:

[root@hadoop1 ~]# hive --service hiveserver -p 10001

HiveServer is then accessed through Hive JDBC:

    private static String hiveDriver = "org.apache.hadoop.hive.jdbc.HiveDriver";
    private static String url = "jdbc:hive://hadoop1:10001/default";
    private static String name = "";
    private static String password = "";

    Class.forName(hiveDriver);
    Connection conn = DriverManager.getConnection(url, name, password);
    Statement stat = conn.createStatement();
    String sql = "show tables";
    ResultSet rs = stat.executeQuery(sql);
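
The port conflict described above can be detected before HiveServer is started. The sketch below simply tries to bind a local port and reports whether that succeeds. The class PortCheckSketch and its methods are hypothetical helpers, not part of Hive, and the range 10000 to 10010 is an arbitrary choice for the example.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper: checks whether a local TCP port can be bound.
class PortCheckSketch {

    // True if the port could be bound right now; the socket is released again.
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return socket.isBound();
        } catch (IOException e) {
            return false; // typically "address already in use"
        }
    }

    // First bindable port in [start, end], or -1 if none is free.
    static int firstFreePort(int start, int end) {
        for (int port = start; port <= end; port++) {
            if (isPortFree(port)) {
                return port;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int port = firstFreePort(10000, 10010);
        if (port < 0) {
            System.out.println("No free port between 10000 and 10010");
        } else {
            System.out.println("hive --service hiveserver -p " + port);
        }
    }
}
```

A port that is free at check time can still be taken before HiveServer binds it, so this is only a convenience check.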

Demo:

    package example;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class HiveJdbc {
        private static String hiveDriver = "org.apache.hadoop.hive.jdbc.HiveDriver";
        private static String url = "jdbc:hive://hadoop1:10001/default";
        private static String name = "";
        private static String password = "";

        public static void main(String[] args) {
            // Alternative query: "show tables"
            String sqlString = "select * from addressall_2015_07_09";
            try {
                // Register the Hive JDBC driver.
                Class.forName(hiveDriver);
                // try-with-resources closes the connection, statement and result set.
                try (Connection conn = DriverManager.getConnection(url, name, password);
                     Statement stat = conn.createStatement();
                     ResultSet rs = stat.executeQuery(sqlString)) {
                    while (rs.next()) {
                        // Column indexes start from 1.
                        // System.out.println(rs.getString(1));
                        System.out.println(rs.getString(1) + " " + rs.getInt(2)
                                + " " + rs.getInt(3) + " " + rs.getInt(4));
                    }
                }
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }

Operation Result:

2015_07_09 536 488 493

