In earlier posts I described two ways to import Hive analysis results into MySQL tables: Sqoop import, and using Hive together with the MySQL JDBC driver. In this post I introduce a third approach, one that is used far more often: using a Hive custom function (UDF or GenericUDF) to insert each record into a database table.
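For reference, the target MySQL table jtxx2 must already exist before either function is run. A minimal sketch of its definition is below; the original post only shows the column names (cllxbh and hphm), so the varchar types and lengths here are my assumptions:

mysql> create table jtxx2(cllxbh varchar(50), hphm varchar(30));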
Part One: Using a UDF
The UDF approach is simple to implement: you only need to extend the UDF class and override the evaluate method, as follows.
1. Write the implementation class
package com.gxnzx.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;

import com.gxnzx.hive.util.DBSqlHelper;

public class AnalyzeStatistics extends UDF {

    public String evaluate(String clxxbh, String hphm) {
        // jtxx2 is the target table in the MySQL database
        String sql = "insert into jtxx2 values(?,?)";
        // Insert the record into the database
        if (DBSqlHelper.addBatch(sql, clxxbh, hphm)) {
            return clxxbh + " SUCCESS " + hphm;
        } else {
            return clxxbh + " faile " + hphm;
        }
    }
}
2. The database operation method
public static boolean addBatch(String sql, String clxxbh, String hphm) {
    boolean flag = false;
    try {
        conn = DBSqlHelper.getConn();  // Open a database connection
        ps = (PreparedStatement) conn.prepareStatement(sql);
        ps.setString(1, clxxbh);
        ps.setString(2, hphm);
        System.out.println(ps.toString());
        ps.execute();
        flag = true;
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        try {
            ps.close();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
    return flag;
}
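The DBSqlHelper utility class itself is not shown in the original post. Below is a minimal sketch of what it could look like, assuming it keeps the connection and statement in static fields and connects with the same URL and credentials that appear later in the GenericUDF test; the real implementation may differ.

package com.gxnzx.hive.util;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Minimal sketch of the helper used by AnalyzeStatistics; not the original implementation.
public class DBSqlHelper {

    // Shared JDBC objects used by getConn() and addBatch()
    static Connection conn;
    static PreparedStatement ps;

    // Open (or reuse) a connection to the target MySQL database.
    // The URL, user and password below are assumptions taken from the
    // connection parameters used later in the GenericUDF example.
    public static Connection getConn() throws Exception {
        if (conn == null || conn.isClosed()) {
            Class.forName("com.mysql.jdbc.Driver");
            conn = DriverManager.getConnection(
                    "jdbc:mysql://192.168.2.133:3306/transport", "hive", "hive");
        }
        return conn;
    }

    // The addBatch(sql, clxxbh, hphm) method from step 2 also belongs to this class.
}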
3. Use Eclipse to export the project as a jar file, then add it to the Hive classpath
hive> add jar Hiveudf2.jar;
4. Add the MySQL JDBC driver jar to the Hive classpath
hive> add jar /home/hadoopuser/cloud/hive/apache-hive-0.13.1-bin/lib/mysql-connector-java-5.1.18-bin.jar;
5. Create a Hive temporary function
hive> create temporary function analyze as 'com.gxnzx.hive.udf.AnalyzeStatistics';
6. Testing
hive> select analyze(clxxbh, hphm) from transjtxx_hbase limit 10;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1428394594787_0034, Tracking URL = http://secondmgt:8088/proxy/application_1428394594787_0034/
Kill Command = /home/hadoopuser/cloud/hadoop/programs/hadoop-2.2.0/bin/hadoop job -kill job_1428394594787_0034
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-04-23 10:15:34,355 Stage-1 map = 0%, reduce = 0%
2015-04-23 10:15:51,032 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 7.14 sec
MapReduce Total cumulative CPU time: 7 seconds 140 msec
Ended Job = job_1428394594787_0034
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 7.14 sec   HDFS Read: 256 HDFS Write: 532 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 140 msec
OK
32100017000000000220140317000015 SUCCESS Lu Q58182
32100017000000000220140317000016 SUCCESS Lu QV4662
32100017000000000220140317000019 SUCCESS Su LL8128
32100017000000000220140317000020 SUCCESS Su CAH367
32100017000000000220140317000023 SUCCESS Lu Q7899W
32100017000000000220140317000029 SUCCESS Su HN3819
32100017000000000220140317000038 SUCCESS Lu C01576
32100017000000000220140317000044 SUCCESS Su DT9178
32100017000000000220140317000049 SUCCESS Su LZ1112
32100017000000000220140317000052 SUCCESS Su K9795 police
Time taken: 35.815 seconds, Fetched: 10 row(s)
7. View the data in the MySQL table
mysql> select * from jtxx2;
+----------------------------------+-----------------+
| cllxbh                           | hphm            |
+----------------------------------+-----------------+
| 32100017000000000220140317000015 | Lu Q58182       |
| 32100017000000000220140317000016 | Lu QV4662       |
| 32100017000000000220140317000019 | Su LL8128       |
| 32100017000000000220140317000020 | Su CAH367       |
| 32100017000000000220140317000023 | Lu Q7899W       |
| 32100017000000000220140317000029 | Su HN3819       |
| 32100017000000000220140317000038 | Lu C01576       |
| 32100017000000000220140317000044 | Su DT9178       |
| 32100017000000000220140317000049 | Su LZ1112       |
| 32100017000000000220140317000052 | Su K9795 police |
+----------------------------------+-----------------+
10 rows in set (0.00 sec)
Part Two: Using a GenericUDF
The GenericUDF approach is more complex to implement. I adapted the code below from someone else's example:
1. Write the GenericUDF implementation class
package com.gxnzx.hive.main;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde.Constants;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
import org.apache.hadoop.io.IntWritable;

/**
 * AnalyzeGenericUDFDBOutput is designed to output data directly from Hive to a
 * JDBC datastore. This UDF is useful for exporting small to medium summaries
 * that have a unique key.
 *
 * Due to the nature of Hadoop, individual mappers, reducers or entire jobs can
 * fail. If a failure occurs, a mapper or reducer may be retried. This UDF has
 * no way of detecting failures or rolling back a transaction. Consequently,
 * you should only use this to export to a table with a unique key; the unique
 * key should safeguard against duplicate data.
 *
 * Usage: first, package the UDF into a jar file; secondly, use Hive's "add jar"
 * feature to add the UDF jar to the current class path; thirdly, use "add jar"
 * again to add the JDBC driver jar to the current class path; fourthly, use
 * "create temporary function" to create a temporary function for the UDF class.
 *
 * Example for MySQL:
 *   hive> add jar udf.jar;
 *   hive> add jar mysql-connector-java-5.1.18-bin.jar;
 *   hive> create temporary function analyzedboutput as 'com.gxnzx.hive.main.AnalyzeGenericUDFDBOutput';
 */
@Description(name = "analyzedboutput",
    value = "_FUNC_(jdbcstring, username, password, preparedstatement, [arguments])"
        + " - sends data to a JDBC driver",
    extended = "argument 0 is the JDBC connection string\n"
        + "argument 1 is the database user name\n"
        + "argument 2 is the database user's password\n"
        + "argument 3 is an SQL query to be used in the PreparedStatement\n"
        + "argument (4-n) the remaining arguments must be primitive and are"
        + " passed to the PreparedStatement object\n")
@UDFType(deterministic = false)
public class AnalyzeGenericUDFDBOutput extends GenericUDF {

    private static final Log LOG = LogFactory.getLog(AnalyzeGenericUDFDBOutput.class.getName());

    private transient ObjectInspector[] argumentOI;
    private transient Connection connection = null;
    private String url;
    private String user;
    private String pass;
    private final IntWritable result = new IntWritable(-1);

    /**
     * @param arguments
     *            argument 0 is the JDBC connection string, argument 1 is the
     *            user name, argument 2 is the password, argument 3 is an SQL
     *            query to be used in the PreparedStatement, and arguments 4-n
     *            must be primitive and are passed to the PreparedStatement object.
     */
    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentTypeException {
        argumentOI = arguments;

        // The first four arguments should be connection url, username, password, query
        for (int i = 0; i < 4; i++) {
            if (arguments[i].getCategory() == ObjectInspector.Category.PRIMITIVE) {
                PrimitiveObjectInspector poi = (PrimitiveObjectInspector) arguments[i];
                if (!(poi.getPrimitiveCategory() == PrimitiveObjectInspector.PrimitiveCategory.STRING)) {
                    throw new UDFArgumentTypeException(i,
                        "The argument of function should be \"" + Constants.STRING_TYPE_NAME
                            + "\", but \"" + arguments[i].getTypeName() + "\" is found");
                }
            }
        }
        // The remaining arguments are the values bound to the PreparedStatement: column1[,columnN]*
        for (int i = 4; i < arguments.length; i++) {
            if (arguments[i].getCategory() != ObjectInspector.Category.PRIMITIVE) {
                throw new UDFArgumentTypeException(i,
                    "The argument of function should be primitive, but \""
                        + arguments[i].getTypeName() + "\" is found");
            }
        }
        return PrimitiveObjectInspectorFactory.writableIntObjectInspector;
    }

    /**
     * @return 0 on success, -1 on failure
     */
    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        url = ((StringObjectInspector) argumentOI[0]).getPrimitiveJavaObject(arguments[0].get());
        user = ((StringObjectInspector) argumentOI[1]).getPrimitiveJavaObject(arguments[1].get());
        pass = ((StringObjectInspector) argumentOI[2]).getPrimitiveJavaObject(arguments[2].get());

        try {
            connection = DriverManager.getConnection(url, user, pass);
        } catch (SQLException ex) {
            LOG.error("Driver loading or connection issue", ex);
            result.set(2);
        }

        if (connection != null) {
            try {
                PreparedStatement ps = connection.prepareStatement(
                    ((StringObjectInspector) argumentOI[3]).getPrimitiveJavaObject(arguments[3].get()));
                for (int i = 4; i < arguments.length; ++i) {
                    PrimitiveObjectInspector poi = (PrimitiveObjectInspector) argumentOI[i];
                    ps.setObject(i - 3, poi.getPrimitiveJavaObject(arguments[i].get()));
                }
                ps.execute();
                ps.close();
                result.set(0);
            } catch (SQLException e) {
                LOG.error("Underlying SQL exception", e);
                result.set(1);
            } finally {
                try {
                    connection.close();
                } catch (Exception ex) {
                    LOG.error("Underlying SQL exception during close", ex);
                }
            }
        }
        return result;
    }

    @Override
    public String getDisplayString(String[] children) {
        StringBuilder sb = new StringBuilder();
        sb.append("dboutput(");
        if (children.length > 0) {
            sb.append(children[0]);
            for (int i = 1; i < children.length; i++) {
                sb.append(",");
                sb.append(children[i]);
            }
        }
        sb.append(")");
        return sb.toString();
    }
}
2. Package the program into a jar file and add it to the Hive classpath
hive> add jar Hivegenericudf.jar;
3. Add the MySQL JDBC driver jar
hive> add jar /home/hadoopuser/cloud/hive/apache-hive-0.13.1-bin/lib/mysql-connector-java-5.1.18-bin.jar;
4. Create a temporary function
hive> create temporary function analyzedboutput as 'com.gxnzx.hive.main.AnalyzeGenericUDFDBOutput';
5. Testing
hive> select analyzedboutput('jdbc:mysql://192.168.2.133:3306/transport', 'hive', 'hive', 'insert into jtxx2 values(?,?)', clxxbh, hphm) from transjtxx_hbase limit 5;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1428394594787_0043, Tracking URL = http://secondmgt:8088/proxy/application_1428394594787_0043/
Kill Command = /home/hadoopuser/cloud/hadoop/programs/hadoop-2.2.0/bin/hadoop job -kill job_1428394594787_0043
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-04-23 22:01:46,205 Stage-1 map = 0%, reduce = 0%
2015-04-23 22:02:01,985 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.37 sec
MapReduce Total cumulative CPU time: 9 seconds 370 msec
Ended Job = job_1428394594787_0043
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 9.37 sec   HDFS Read: 256 HDFS Write: 10 SUCCESS
Total MapReduce CPU Time Spent: 9 seconds 370 msec
OK
0
0
0
0
0
Time taken: 32.118 seconds, Fetched: 5 row(s)
The six arguments to analyzedboutput are: the MySQL JDBC connection string, the MySQL user name, the password, the SQL insert statement, and the clxxbh and hphm fields queried from the Hive table.
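Because the connection string, credentials, and insert statement are all passed in as arguments, the same temporary function can be reused for any target table without recompiling. A hypothetical template call (the host, database, table and column names below are placeholders, not from the original post):

hive> select analyzedboutput('jdbc:mysql://<mysql_host>:3306/<database>', '<user>', '<password>', 'insert into <target_table> values(?,?,?)', col1, col2, col3) from <hive_table>;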
6. View the data in the MySQL table
mysql> select * from jtxx2;
Empty set (0.00 sec)

mysql> select * from jtxx2;
+----------------------------------+-----------+
| cllxbh                           | hphm      |
+----------------------------------+-----------+
| 32100017000000000220140317000015 | Lu Q58182 |
| 32100017000000000220140317000016 | Lu QV4662 |
| 32100017000000000220140317000019 | Su LL8128 |
| 32100017000000000220140317000020 | Su CAH367 |
| 32100017000000000220140317000023 | Lu Q7899W |
+----------------------------------+-----------+
5 rows in set (0.00 sec)
That's all for this post.