Operating Hive from Java via JDBC

Source: Internet
Author: User

0. Overview

The Hive CLI and hive -e only let you run HiveQL queries, updates, and other operations from the command line. Hive also provides client access through HiveServer or HiveServer2, so that a client can manipulate data in Hive without starting the CLI. Both allow remote clients written in languages such as Java or Python to submit requests to Hive and retrieve the results.

What are the similarities and differences between HiveServer and HiveServer2?

Both HiveServer and HiveServer2 are based on Thrift. Why is HiveServer2 needed when HiveServer already exists? HiveServer cannot handle concurrent requests from more than one client. This limitation comes from the Thrift interface that HiveServer uses and cannot be fixed by modifying the HiveServer code. The server was therefore rewritten as HiveServer2 in Hive 0.11.0, which solves the problem. HiveServer2 supports multi-client concurrency and authentication, and provides better support for open API clients such as JDBC and ODBC.

1. Start the service

1) Key settings in hive-site.xml

<property>

<name>hive.metastore.warehouse.dir</name>

<value>/usr/hive/warehouse</value> <!-- HDFS directory that holds Hive's databases and tables -->

<description>location of default database for the warehouse</description>

</property>

<property>

<name>hive.server2.thrift.port</name>

<value>10000</value> <!-- HiveServer2 remote connection port; the default is 10000 -->

<description>Port number of HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>

</property>

<property>

<name>hive.server2.thrift.bind.host</name>

<value>**.**.**.**</value> <!-- IP address of the Hive cluster -->

<description>Bind host on which to run the HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>

</property>

<property>

<name>hive.server2.long.polling.timeout</name>

<value>5000</value> <!-- the shipped default is 5000L; it is changed to 5000 here, otherwise the program reports an error -->

<description>Time in milliseconds that HiveServer2 would wait before responding to asynchronous calls that use long polling</description>

</property>

<property>

<name>javax.jdo.option.ConnectionURL</name>

<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value> <!-- Hive's metastore database; a local MySQL instance is used here -->

<description>JDBC connect string for a JDBC metastore</description>

</property>

<property>

<name>javax.jdo.option.ConnectionDriverName</name> <!-- driver class used to connect to the metastore database -->

<value>com.mysql.jdbc.Driver</value>

<description>driver class name for a JDBC metastore</description>

</property>

<property>

<name>javax.jdo.option.ConnectionUserName</name> <!-- user name for the metastore database -->

<value>hive</value>

<description>username to use against Metastore database</description>

</property>

<property>

<name>javax.jdo.option.ConnectionPassword</name> <!-- password for the metastore database -->

<value>hive</value>

<description>password to use against Metastore database</description>

</property>
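The settings above determine the address that a JDBC client later connects to. As a rough illustration, the sketch below reads `<property>` entries from hive-site.xml-style text and derives a HiveServer2 JDBC URL from them. `HiveSiteReader` and its methods are hypothetical names, not part of Hive, and the fallback to `localhost` when no bind host is configured is an assumption made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hive): reads <property> entries from
// hive-site.xml-style text and derives the HiveServer2 JDBC URL.
class HiveSiteReader {

    // Collects every <property><name>..</name><value>..</value></property> pair.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = childText(property, "name");
                String value = childText(property, "value");
                if (name != null && value != null) {
                    props.put(name, value);
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag, or null.
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    // Assumption: fall back to localhost and port 10000 when a setting is absent.
    static String jdbcUrl(Map<String, String> props, String database) {
        String host = props.getOrDefault("hive.server2.thrift.bind.host", "localhost");
        String port = props.getOrDefault("hive.server2.thrift.port", "10000");
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }
}
```

The input is expected to be a well-formed XML document, so the properties need a single root element such as `<configuration>` around them.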

2) Start the metastore

Start the metastore first. At the command line, type: hive --service metastore &

3) Start the service

#hive --service hiveserver2 >/dev/null &

The above command starts the HiveServer2 service.

Hive provides a JDBC driver, so Java code can connect to Hive and run SQL queries much as it would against a relational database. The Hive service must be running before a client can connect. To start the original HiveServer instead, change the above command to:

#hive --service hiveserver >/dev/null &
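Because the service is started in the background with its output discarded, it can be useful to confirm that something is actually listening before running a JDBC client. The sketch below does this with a plain TCP connection attempt. `PortCheck` is a hypothetical helper, and the host, port, and timeout in `main` are assumptions that should match your own configuration. An open port only shows that a process is listening, not that it is HiveServer2.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP listener accepts connections on host:port.
class PortCheck {

    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws if the port is closed or the timeout expires
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the service runs locally on the default port 10000
        System.out.println("Port 10000 open: " + isListening("localhost", 10000, 2000));
    }
}
```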

2. Required JAR packages

$HADOOP_HOME/share/hadoop/common/hadoop-common-2.8.0.jar

$HIVE_HOME/lib/hive-exec-2.1.1.jar

$HIVE_HOME/lib/hive-jdbc-2.1.1.jar

$HIVE_HOME/lib/hive-metastore-2.1.1.jar

$HIVE_HOME/lib/hive-service-2.1.1.jar

$HIVE_HOME/lib/libfb303-0.9.3.jar

$HIVE_HOME/lib/commons-logging-1.2.jar

$HIVE_HOME/lib/slf4j-api-1.6.1.jar
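A missing JAR usually shows up as a ClassNotFoundException when the driver is loaded. As a quick sanity check, the sketch below reports whether a given class can be loaded from the current classpath. `ClasspathCheck` is a hypothetical helper written for this illustration.

```java
// Hypothetical helper: reports whether a class can be loaded from the classpath.
class ClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies are missing
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = "org.apache.hive.jdbc.HiveDriver";
        System.out.println(driver + " available: " + isOnClasspath(driver));
    }
}
```

Run it with the same classpath as the client program; a `false` result for the driver class suggests that one of the JARs listed above is not being picked up.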

3. Java connection program

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class HiveClientUtils {

    // JDBC driver class for HiveServer2
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    // Use the Hive host IP that was set earlier in the configuration file
    private static String url = "jdbc:hive2://localhost:10000/default";

    private static Connection conn;
    private static PreparedStatement ps;
    private static ResultSet rs;

    // Create the connection
    public static Connection getConnection() {
        try {
            Class.forName(driverName);
            // The user name must belong to a user with permission to operate on HDFS,
            // or the program fails with a "permission denied" exception
            conn = DriverManager.getConnection(url, "vagrant", "vagrant");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return conn;
    }

    // Prepare a statement on the given connection
    public static PreparedStatement prepare(Connection conn, String sql) {
        PreparedStatement ps = null;
        try {
            ps = conn.prepareStatement(sql);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return ps;
    }

    // Print every row of the given table, one row per line
    public static void getAll(String tableName) {
        conn = getConnection();
        // Note the trailing space after FROM
        String sql = "SELECT * FROM " + tableName;
        System.out.println(sql);
        try {
            ps = prepare(conn, sql);
            rs = ps.executeQuery();
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                // JDBC column indexes start at 1
                for (int i = 1; i <= columns; i++) {
                    System.out.print(rs.getString(i));
                    System.out.print("\t\t");
                }
                System.out.println();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        String tableName = "Test1";
        getAll(tableName);
    }
}
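The program above keeps its connection, statement, and result set open, and builds the query by concatenating the table name into the SQL string. One possible variant, sketched below, closes the JDBC resources with try-with-resources and rejects table names that contain anything other than letters, digits, and underscores. `HiveQuerySketch` and its methods are hypothetical, and the naming rule is an assumption for this example rather than Hive's full identifier syntax.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Pattern;

// Hypothetical variant of the query method: validates the table name and
// closes the statement and result set automatically.
class HiveQuerySketch {

    // Assumption: plain table names made of letters, digits, and underscores
    private static final Pattern TABLE_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static String selectAllSql(String tableName) {
        if (tableName == null || !TABLE_NAME.matcher(tableName).matches()) {
            throw new IllegalArgumentException("Unexpected table name: " + tableName);
        }
        return "SELECT * FROM " + tableName;
    }

    static void printAll(Connection conn, String tableName) throws SQLException {
        // Both resources are closed when the block exits, even on an exception
        try (PreparedStatement ps = conn.prepareStatement(selectAllSql(tableName));
             ResultSet rs = ps.executeQuery()) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    row.append(rs.getString(i)).append("\t\t");
                }
                System.out.println(row);
            }
        }
    }
}
```

The caller still owns the connection here, so it should close it once all queries are done.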

The above code is for HiveServer2. For the original HiveServer, two changes are needed:

Replace the driver class org.apache.hive.jdbc.HiveDriver with org.apache.hadoop.hive.jdbc.HiveDriver

Replace the URL jdbc:hive2://localhost:10000/default with jdbc:hive://localhost:10000/default

Here localhost is the host address, 10000 is the port, and default, which follows the port, is the database name.
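The two differences can be kept in one place so that a client switches between the servers with a single flag. The sketch below is one way to do that; `HiveJdbcSettings` is a hypothetical helper, while the driver class names and URL schemes are the ones given above.

```java
// Hypothetical helper: pairs each server type with its driver class and URL scheme.
class HiveJdbcSettings {

    static String driverClass(boolean hiveServer2) {
        return hiveServer2
                ? "org.apache.hive.jdbc.HiveDriver"
                : "org.apache.hadoop.hive.jdbc.HiveDriver";
    }

    static String url(boolean hiveServer2, String host, int port, String database) {
        String scheme = hiveServer2 ? "jdbc:hive2" : "jdbc:hive";
        return scheme + "://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        System.out.println(driverClass(true) + " -> " + url(true, "localhost", 10000, "default"));
        System.out.println(driverClass(false) + " -> " + url(false, "localhost", 10000, "default"));
    }
}
```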
