1. Operating Hive from Java via JDBC

Last Update:2017-04-10 Source: Internet

Author: User



0. Overview

The CLI and hive -e only allow queries, updates, and other operations to be issued as HiveQL from the command line. Hive also provides a server process, HiveServer or HiveServer2, through which clients can manipulate data in Hive without starting the CLI. Both allow remote clients written in languages such as Java or Python to submit requests to Hive and retrieve the results.

What are the similarities and differences between HiveServer and HiveServer2?

Both HiveServer and HiveServer2 are based on Thrift. Why is HiveServer2 needed when HiveServer already exists? Because HiveServer cannot handle concurrent requests from more than one client. This limitation comes from the Thrift interface that HiveServer uses and cannot be fixed by modifying the HiveServer code. HiveServer was therefore rewritten as HiveServer2 in Hive 0.11.0, which solves the problem. HiveServer2 supports multi-client concurrency and authentication, and provides better support for open API clients such as JDBC and ODBC.

1. Start the service

1) Key configuration in hive-site.xml

<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/hive/warehouse</value>
<!-- location in HDFS of the Hive databases and of the folders that hold the tables -->
<description>location of default database for the warehouse</description>
</property>

<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<!-- HiveServer2 remote connection port; the default is 10000 -->
<description>port number of HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
</property>

<property>
<name>hive.server2.thrift.bind.host</name>
<value>**.**.**.**</value>
<!-- IP address of the Hive cluster -->
<description>bind host on which to run the HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
</property>

<property>
<name>hive.server2.long.polling.timeout</name>
<value>5000</value>
<!-- the default is 5000L; it is changed to 5000 here, otherwise the program reports an error -->
<description>time in milliseconds that HiveServer2 would wait, before responding to asynchronous calls that use long polling</description>
</property>

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<!-- Hive's metastore database; a local MySQL instance is used here -->
<description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<!-- driver class used to connect to the metastore database -->
<value>com.mysql.jdbc.Driver</value>
<description>driver class name for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<!-- user name for connecting to the metastore database -->
<description>username to use against metastore database</description>
</property>

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<!-- password for connecting to the metastore database -->
<description>password to use against metastore database</description>
</property>
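A client needs the same host and port that hive-site.xml gives to HiveServer2. As a minimal sketch, the snippet below reads name/value pairs from hive-site.xml content with the JDK's built-in XML parser and assembles a JDBC URL from them. The class name HiveSiteReader, the SAMPLE string, and the address 192.0.2.10 are hypothetical and only stand in for a real configuration file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class HiveSiteReader {

    // Hypothetical sample content; a real hive-site.xml would be read from disk.
    static final String SAMPLE =
            "<configuration>"
            + "<property><name>hive.server2.thrift.port</name><value>10000</value></property>"
            + "<property><name>hive.server2.thrift.bind.host</name><value>192.0.2.10</value></property>"
            + "<property><name>javax.jdo.option.ConnectionUserName</name></property>"
            + "</configuration>";

    // Parses the XML into a name -> value map. Properties without a <value> are skipped.
    static Map<String, String> parse(String xml) {
        Map<String, String> props = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                NodeList names = property.getElementsByTagName("name");
                NodeList values = property.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0) {
                    props.put(names.item(0).getTextContent().trim(),
                              values.item(0).getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse hive-site.xml content", e);
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = parse(SAMPLE);
        // Fall back to the defaults mentioned in the article when a key is absent.
        String host = props.getOrDefault("hive.server2.thrift.bind.host", "localhost");
        String port = props.getOrDefault("hive.server2.thrift.port", "10000");
        System.out.println("jdbc:hive2://" + host + ":" + port + "/default");
    }
}
```

Reading the values from the file keeps the client and the server configuration from drifting apart, at the cost of requiring the client to have access to hive-site.xml.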

2) Start the metastore

Start the metastore service first. At the command line, type: hive --service metastore &

3) Start the service

# hive --service hiveserver2 >/dev/null &

The above command starts the HiveServer2 service.

Hive provides a JDBC driver, so Java code can connect to Hive and run SQL queries much as it would against a relational database. The Hive service must be running before a client can connect. To start HiveServer instead of HiveServer2, change the above command to

# hive --service hiveserver >/dev/null &
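Because the service is started in the background with its output discarded, it can be useful to confirm that something is actually listening before running a JDBC client. The following is a minimal sketch using only java.net; the class name PortCheck and the 2000 ms timeout are assumptions, and the default port 10000 comes from the configuration above. An open port only shows that a process accepts TCP connections, not that it is HiveServer2.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions: local host and the HiveServer2 port from hive-site.xml.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 10000;
        System.out.println(host + ":" + port + " listening: " + isListening(host, port, 2000));
    }
}
```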

2. Add the required JAR packages

$HADOOP_HOME/share/hadoop/common/hadoop-common-2.8.0.jar

$HIVE_HOME/lib/hive-exec-2.1.1.jar

$HIVE_HOME/lib/hive-jdbc-2.1.1.jar

$HIVE_HOME/lib/hive-metastore-2.1.1.jar

$HIVE_HOME/lib/hive-service-2.1.1.jar

$HIVE_HOME/lib/libfb303-0.9.3.jar

$HIVE_HOME/lib/commons-logging-1.2.jar

$HIVE_HOME/lib/slf4j-api-1.6.1.jar
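A missing JAR usually surfaces later as a ClassNotFoundException when the driver is loaded. As a rough sketch, the check below tests whether a class can be found on the classpath before any connection is attempted. The class name DriverCheck is hypothetical; the driver class name is the HiveServer2 driver used in the connection program of this article.

```java
final class DriverCheck {

    // HiveServer2 JDBC driver class, shipped in the hive-jdbc JAR.
    static final String HIVESERVER2_DRIVER = "org.apache.hive.jdbc.HiveDriver";

    // Returns true if the named class can be located without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, DriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isOnClasspath(HIVESERVER2_DRIVER)) {
            System.out.println("Hive JDBC driver found");
        } else {
            System.out.println("Hive JDBC driver missing: check that the JARs listed above are on the classpath");
        }
    }
}
```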

3. Java connection program

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class HiveClientUtils {

    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    // Fill in the IP address of Hive, as configured earlier in hive-site.xml
    private static String url = "jdbc:hive2://localhost:10000/default";

    private static Connection conn;
    private static PreparedStatement ps;
    private static ResultSet rs;

    // Create the connection
    public static Connection getConnection() {
        try {
            Class.forName(driverName);
            // The user name here must belong to a user with permission to operate on HDFS,
            // or the program will fail with a "Permission denied" exception
            conn = DriverManager.getConnection(url, "vagrant", "vagrant");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return conn;
    }

    // Prepare a statement; returns null if preparation fails
    public static PreparedStatement prepare(Connection conn, String sql) {
        PreparedStatement ps = null;
        try {
            ps = conn.prepareStatement(sql);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return ps;
    }

    // Print every row of the given table, columns separated by tabs
    public static void getAll(String tableName) {
        conn = getConnection();
        String sql = "select * from " + tableName;
        System.out.println(sql);
        try {
            ps = prepare(conn, sql);
            rs = ps.executeQuery();
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                for (int i = 1; i <= columns; i++) {
                    System.out.print(rs.getString(i));
                    System.out.print("\t\t");
                }
                System.out.println();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        String tableName = "test1";
        getAll(tableName);
    }
}
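The program above concatenates the table name into the query text and never closes its statement or result set. One possible variation, shown here only as a sketch, validates the table name against a simple pattern and uses try-with-resources so JDBC objects are released. The class name SafeSelect and the identifier pattern are assumptions: the pattern accepts plain names such as test1 or default.test1 and is stricter than what Hive itself allows.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Pattern;

final class SafeSelect {

    // Assumed rule: letters, digits, and underscores, optionally prefixed by "database.".
    private static final Pattern TABLE_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?");

    // Builds the query text, rejecting anything that is not a plain table name.
    static String selectAllSql(String tableName) {
        if (tableName == null || !TABLE_NAME.matcher(tableName).matches()) {
            throw new IllegalArgumentException("Unexpected table name: " + tableName);
        }
        return "select * from " + tableName;
    }

    // Prints all rows; the statement and result set are closed automatically.
    static void printAll(Connection conn, String tableName) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(selectAllSql(tableName));
             ResultSet rs = ps.executeQuery()) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    row.append(rs.getString(i)).append('\t');
                }
                System.out.println(row);
            }
        }
    }

    public static void main(String[] args) {
        // Only the query text is built here; printAll needs a live Hive connection.
        System.out.println(selectAllSql("test1"));
    }
}
```

With a connection obtained as in the program above, printAll(conn, "test1") would replace the body of the original query method.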

The above code is for HiveServer2. For HiveServer, two changes are needed:

The driver class org.apache.hive.jdbc.HiveDriver becomes org.apache.hadoop.hive.jdbc.HiveDriver

The URL jdbc:hive2://localhost:10000/default becomes jdbc:hive://localhost:10000/default

Here localhost is the host address, 10000 is the port, and default, after the port, is the database name.
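The two differences can be captured in one small helper so the rest of the client code stays the same for either service. This is a sketch only; the class name HiveJdbcTarget and its factory methods are hypothetical, while the driver class names and URL schemes are the ones given above.

```java
final class HiveJdbcTarget {

    final String driverClass;
    final String url;

    private HiveJdbcTarget(String driverClass, String url) {
        this.driverClass = driverClass;
        this.url = url;
    }

    // HiveServer2: org.apache.hive.jdbc.HiveDriver with the jdbc:hive2 scheme.
    static HiveJdbcTarget forHiveServer2(String host, int port, String database) {
        return new HiveJdbcTarget("org.apache.hive.jdbc.HiveDriver",
                "jdbc:hive2://" + host + ":" + port + "/" + database);
    }

    // HiveServer: org.apache.hadoop.hive.jdbc.HiveDriver with the jdbc:hive scheme.
    static HiveJdbcTarget forHiveServer(String host, int port, String database) {
        return new HiveJdbcTarget("org.apache.hadoop.hive.jdbc.HiveDriver",
                "jdbc:hive://" + host + ":" + port + "/" + database);
    }

    public static void main(String[] args) {
        HiveJdbcTarget v2 = forHiveServer2("localhost", 10000, "default");
        HiveJdbcTarget v1 = forHiveServer("localhost", 10000, "default");
        System.out.println(v2.driverClass + " -> " + v2.url);
        System.out.println(v1.driverClass + " -> " + v1.url);
    }
}
```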


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
