0. Overview
With the CLI or hive -e, you can only use HiveQL to run queries, updates, and other operations. Hive also provides a service, HiveServer or HiveServer2, through which a client can work with data in Hive without starting the CLI. Both allow remote clients written in languages such as Java or Python to submit requests to Hive and retrieve the results.
What are the similarities and differences between HiveServer and HiveServer2?
Both HiveServer and HiveServer2 are based on Thrift. Why is HiveServer2 needed when HiveServer already exists? HiveServer cannot handle concurrent requests from more than one client. This limitation comes from the Thrift interface that HiveServer uses and cannot be fixed by modifying the HiveServer code. HiveServer was therefore rewritten as HiveServer2 in Hive 0.11.0, which solves the problem. HiveServer2 supports multi-client concurrency and authentication, and provides better support for open API clients such as JDBC and ODBC.
1. Start the service
1) Key configuration in hive-site.xml
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/hive/warehouse</value> <!-- HDFS location of the folders that hold Hive's databases and tables -->
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value> <!-- HiveServer2 remote connection port; the default is 10000 -->
<description>Port number of HiveServer2 Thrift interface.
Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>**.**.**.**</value> <!-- IP address of the Hive cluster -->
<description>Bind host on which to run the HiveServer2 Thrift interface. Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
</property>
<property>
<name>hive.server2.long.polling.timeout</name>
<value>5000</value> <!-- the default is 5000L; it is changed to 5000 here, otherwise the program reports an error -->
<description>Time in milliseconds that HiveServer2 will wait before responding to asynchronous calls that use long polling</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value> <!-- Hive's metastore database; a local MySQL instance is used here -->
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name> <!-- driver class used to connect to the metastore database -->
<value>com.mysql.jdbc.Driver</value>
<description>driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name> <!-- user name for the metastore database -->
<value>hive</value>
<description>username to use against Metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name> <!-- password for the metastore database -->
<value>hive</value>
<description>password to use against Metastore database</description>
</property>
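To double-check which port or host a client should use, the values can be read back from the configuration. The following is a minimal sketch using only the JDK's XML parser; HiveSiteReader and readProperty are hypothetical names for illustration, not part of Hive, and the sketch parses XML text passed in as a string rather than locating hive-site.xml on disk.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hive): looks up one <property> in hive-site.xml-style text.
final class HiveSiteReader {

    /** Returns the trimmed <value> of the named property, or null if the property is absent. */
    static String readProperty(String xml, String name) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                NodeList names = property.getElementsByTagName("name");
                NodeList values = property.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && name.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            // Malformed XML or parser setup failure
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>hive.server2.thrift.port</name><value>10000</value></property>"
                + "</configuration>";
        System.out.println(readProperty(xml, "hive.server2.thrift.port"));
    }
}
```

Because XML comments are ignored by getTextContent on the value element, annotations placed after a value do not affect the result.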
2) Start the metastore
Start the metastore first. At the command line, type: hive --service metastore &
3) Start the service
#hive --service hiveserver2 >/dev/null &
The above command starts the HiveServer2 service.
Hive provides a JDBC driver so that Java code can connect to Hive and run SQL queries, much as it would against a relational database. The Hive service has to be running first. To start the original HiveServer instead, change the above command to
#hive --service hiveserver >/dev/null &
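Since the service is started in the background with its output discarded, it can be useful to confirm that something is listening before running a client. The sketch below uses only the JDK; PortCheck and isListening are hypothetical names, and localhost and port 10000 are assumed from the configuration above. A successful connection only shows that some process accepts TCP connections on that port, not that HiveServer2 is fully initialized.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hive): checks whether a TCP port accepts connections.
final class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed values: adjust to the bind host and port in hive-site.xml
        System.out.println(isListening("localhost", 10000, 2000));
    }
}
```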
2. Required jar packages
$HADOOP_HOME/share/hadoop/common/hadoop-common-2.8.0.jar
$HIVE_HOME/lib/hive-exec-2.1.1.jar
$HIVE_HOME/lib/hive-jdbc-2.1.1.jar
$HIVE_HOME/lib/hive-metastore-2.1.1.jar
$HIVE_HOME/lib/hive-service-2.1.1.jar
$HIVE_HOME/lib/libfb303-0.9.3.jar
$HIVE_HOME/lib/commons-logging-1.2.jar
$HIVE_HOME/lib/slf4j-api-1.6.1.jar
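A missing jar usually shows up as a ClassNotFoundException when the driver is loaded. As a rough sanity check, the sketch below tests whether a class can be found on the classpath; DriverCheck and isOnClasspath are hypothetical names. It only verifies the one class it is asked about, so it will not catch every missing dependency.

```java
// Hypothetical helper (not part of Hive): reports whether a class can be loaded.
final class DriverCheck {

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class itself, or something it depends on, is missing
            return false;
        }
    }

    public static void main(String[] args) {
        // Prints true only when the hive-jdbc jar and its dependencies are on the classpath
        System.out.println(isOnClasspath("org.apache.hive.jdbc.HiveDriver"));
    }
}
```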
3. Java connection program
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class HiveClientUtils {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    // Fill in the Hive host here: the IP configured earlier in the configuration file
    private static String url = "jdbc:hive2://localhost:10000/default";
    private static Connection conn;
    private static PreparedStatement ps;
    private static ResultSet rs;

    // Create connection
    public static Connection getConnection() {
        try {
            Class.forName(driverName);
            // The user name here must belong to a user with permission to operate on HDFS,
            // or the program fails with a "Permission denied" exception
            conn = DriverManager.getConnection(url, "vagrant", "vagrant");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return conn;
    }

    // Prepare a statement on the given connection; returns null if preparation fails
    public static PreparedStatement prepare(Connection conn, String sql) {
        PreparedStatement ps = null;
        try {
            ps = conn.prepareStatement(sql);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return ps;
    }

    // Print every row of the table, with columns separated by tabs
    public static void getAll(String tableName) {
        conn = getConnection();
        String sql = "SELECT * FROM " + tableName;
        System.out.println(sql);
        try {
            ps = prepare(conn, sql);
            rs = ps.executeQuery();
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                for (int i = 1; i <= columns; i++) {
                    System.out.print(rs.getString(i));
                    System.out.print("\t\t");
                }
                System.out.println();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        } finally {
            // Release the connection when done
            try {
                if (conn != null) {
                    conn.close();
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        String tableName = "test1";
        getAll(tableName);
    }
}
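The program above builds its query by concatenating the table name into the SQL text, and JDBC parameters cannot stand in for identifiers. If the table name ever comes from outside the program, one option is to validate it before building the statement. The sketch below is one way to do that; HiveQueries and selectAll are hypothetical names, and the allowed pattern (letters, digits, and underscores, with an optional database prefix) is an assumption that is stricter than what Hive itself accepts.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hive): builds a SELECT only for plain identifiers.
final class HiveQueries {

    // Assumed rule: name or db.name, each made of letters, digits, and underscores
    private static final Pattern TABLE_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?");

    /** Returns "SELECT * FROM <tableName>", rejecting names that do not match the pattern. */
    static String selectAll(String tableName) {
        if (tableName == null || !TABLE_NAME.matcher(tableName).matches()) {
            throw new IllegalArgumentException("Unexpected table name: " + tableName);
        }
        return "SELECT * FROM " + tableName;
    }

    public static void main(String[] args) {
        System.out.println(selectAll("test1"));
    }
}
```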
The above code is for HiveServer2. For the original HiveServer, two things need to change, as follows:
The driver class org.apache.hive.jdbc.HiveDriver becomes org.apache.hadoop.hive.jdbc.HiveDriver
The URL jdbc:hive2://localhost:10000/default becomes jdbc:hive://localhost:10000/default
where 'localhost' is the host address, 10000 is the port, and the 'default' after the port is the database name.
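The two differences can be kept in one place so that the rest of the client code does not care which server it talks to. The sketch below is only an illustration: HiveJdbcSettings and forServer are hypothetical names, while the driver class names and URL schemes are the ones listed above.

```java
// Hypothetical holder class (not part of Hive): pairs a driver class with its JDBC URL.
final class HiveJdbcSettings {
    final String driverClass;
    final String url;

    private HiveJdbcSettings(String driverClass, String url) {
        this.driverClass = driverClass;
        this.url = url;
    }

    /** Builds the settings for HiveServer2 (hiveServer2 = true) or the original HiveServer. */
    static HiveJdbcSettings forServer(boolean hiveServer2, String host, int port, String database) {
        String driver = hiveServer2
                ? "org.apache.hive.jdbc.HiveDriver"
                : "org.apache.hadoop.hive.jdbc.HiveDriver";
        String scheme = hiveServer2 ? "jdbc:hive2" : "jdbc:hive";
        return new HiveJdbcSettings(driver, scheme + "://" + host + ":" + port + "/" + database);
    }

    public static void main(String[] args) {
        HiveJdbcSettings settings = forServer(true, "localhost", 10000, "default");
        System.out.println(settings.driverClass);
        System.out.println(settings.url);
    }
}
```

The resulting driverClass can be passed to Class.forName and the url to DriverManager.getConnection, as in the connection program.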
1. Java Operation Hive via JDBC