from:http://hugh-wangp.iteye.com/blog/1472371
Templates to use when writing your own Hive functions
UDF steps:
1. The class must extend org.apache.hadoop.hive.ql.exec.UDF.
2. The evaluate method must be implemented. evaluate supports overloading.
Java code
    package com.alibaba.hive.udf;

    import org.apache.hadoop.hive.ql.exec.UDF;

    public class Helloword extends UDF {
        // Overload with no arguments.
        public String evaluate() {
            return "Hello world!";
        }

        // Overload with one String argument.
        public String evaluate(String str) {
            return "Hello World:" + str;
        }
    }
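Because evaluate is overloaded, which version runs depends on the arguments of the call. The following is a minimal plain-Java sketch of that idea with no Hive dependency. `HelloUdfSketch` is a hypothetical stand-in for the class above (it does not extend UDF), and `invoke` is a hypothetical helper that looks up the matching evaluate overload by reflection. It only illustrates selecting an overload by argument types and is not Hive's actual resolution code.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the UDF above; it does not extend Hive's UDF class.
class HelloUdfSketch {
    public String evaluate() {
        return "Hello world!";
    }

    public String evaluate(String str) {
        return "Hello World:" + str;
    }
}

class EvaluateDispatchSketch {
    // Hypothetical helper: finds the evaluate overload whose parameter types
    // equal the runtime classes of the given arguments, then calls it.
    static Object invoke(Object udf, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        try {
            Method m = udf.getClass().getMethod("evaluate", types);
            return m.invoke(udf, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no matching evaluate overload", e);
        }
    }

    public static void main(String[] argv) {
        HelloUdfSketch udf = new HelloUdfSketch();
        System.out.println(invoke(udf));         // prints "Hello world!"
        System.out.println(invoke(udf, "Hive")); // prints "Hello World:Hive"
    }
}
```

The lookup here requires an exact match on parameter types, which is enough for the two overloads shown.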
UDAF steps:
1. The function class must extend org.apache.hadoop.hive.ql.exec.UDAF, and its inner evaluator class must implement the org.apache.hadoop.hive.ql.exec.UDAFEvaluator interface.
2. The evaluator needs to implement the init, iterate, terminatePartial, merge, and terminate functions:
init(): similar to a constructor; used to initialize the UDAF.
iterate(): receives the incoming arguments and accumulates them into the internal state. Its return type is boolean.
terminatePartial(): takes no arguments; once iterate has finished, it returns the partially aggregated data. iterate and terminatePartial together resemble a Hadoop combiner (iterate -- mapper; terminatePartial -- reducer).
merge(): receives the return value of terminatePartial and merges it into the current state. Its return type is boolean.
terminate(): returns the final result of the aggregation function.
Java code
    package com.alibaba.hive;

    import org.apache.hadoop.hive.ql.exec.UDAF;
    import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;

    public class MyAvg extends UDAF {

        // Partial aggregation state: running sum and row count.
        public static class AvgScore {
            private double pSum;  // double, so fractional inputs are not truncated
            private long pCount;
        }

        public static class AvgEvaluator implements UDAFEvaluator {
            AvgScore score;

            public AvgEvaluator() {
                score = new AvgScore();
                init();
            }

            /*
             * init is similar to a constructor and initializes the UDAF.
             */
            public void init() {
                score.pSum = 0;
                score.pCount = 0;
            }

            /*
             * iterate receives the incoming argument and accumulates it into
             * the internal state. Its return type is boolean.
             * Similar to the mapper in a combiner.
             */
            public boolean iterate(Double in) {
                if (in != null) {
                    score.pSum += in;
                    score.pCount++;
                }
                return true;
            }

            /*
             * terminatePartial takes no arguments. It is called after iterate
             * has finished and returns the partially aggregated data.
             * Similar to the reducer in a combiner.
             */
            public AvgScore terminatePartial() {
                return score.pCount == 0 ? null : score;
            }

            /*
             * merge receives the return value of terminatePartial and merges
             * it into the current state. Its return type is boolean.
             */
            public boolean merge(AvgScore in) {
                if (in != null) {
                    score.pSum += in.pSum;
                    score.pCount += in.pCount;
                }
                return true;
            }

            /*
             * terminate returns the final result of the aggregation function.
             */
            public Double terminate() {
                return score.pCount == 0 ? null : Double.valueOf(score.pSum / score.pCount);
            }
        }
    }
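To see how the five methods fit together, the sketch below drives the same averaging logic by hand in plain Java, with no Hive dependency. `AvgLifecycleSketch`, its nested `Partial` and `Evaluator` classes, and the split into two partitions are hypothetical stand-ins. One evaluator per partition plays the map side (iterate, then terminatePartial), and a separate evaluator plays the reduce side (merge, then terminate). In a real job Hive makes these calls itself.

```java
import java.util.Arrays;
import java.util.List;

class AvgLifecycleSketch {
    // Partial aggregation state handed from the map side to the reduce side.
    static class Partial {
        double sum;
        long count;
    }

    // Stand-in evaluator with the same five methods as the UDAF above.
    static class Evaluator {
        private final Partial score = new Partial();

        Evaluator() { init(); }

        void init() { score.sum = 0; score.count = 0; }

        boolean iterate(Double in) {
            if (in != null) { score.sum += in; score.count++; }
            return true;
        }

        Partial terminatePartial() { return score.count == 0 ? null : score; }

        boolean merge(Partial in) {
            if (in != null) { score.sum += in.sum; score.count += in.count; }
            return true;
        }

        Double terminate() {
            return score.count == 0 ? null : Double.valueOf(score.sum / score.count);
        }
    }

    // Map side: feed one partition through iterate, then return the partial state.
    static Partial aggregatePartition(List<Double> rows) {
        Evaluator e = new Evaluator();
        for (Double row : rows) { e.iterate(row); }
        return e.terminatePartial();
    }

    // Reduce side: merge every partial state, then produce the final average.
    static Double average(List<List<Double>> partitions) {
        Evaluator reducer = new Evaluator();
        for (List<Double> p : partitions) { reducer.merge(aggregatePartition(p)); }
        return reducer.terminate();
    }

    public static void main(String[] args) {
        List<List<Double>> partitions = Arrays.asList(
                Arrays.<Double>asList(1.0, 2.0, null), // the null row is skipped by iterate
                Arrays.<Double>asList(3.0, 6.0));
        System.out.println(average(partitions)); // prints 3.0
    }
}
```

The partial state carries both the sum and the count because averaging the per-partition averages would give the wrong result when partitions differ in size.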
UDTF steps:
1. The class must extend org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.
2. Implement the three methods initialize, process, and close.
3. The UDTF methods are called in this order:
A. The initialize method is called first. It returns information about the rows the UDTF produces (number of columns and their types).
B. Once initialization is complete, the process method is called. It handles the passed-in arguments and can return results through the forward() method.
C. Finally, the close() method is called to clean up anything that needs cleaning up.
Java code
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
    import org.apache.hadoop.hive.ql.metadata.HiveException;
    import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
    import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;

    public class GenericUDTFExplode extends GenericUDTF {
        private ListObjectInspector listOI = null;

        @Override
        public void close() throws HiveException {
            // Nothing to clean up.
        }

        @Override
        public StructObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
            if (args.length != 1) {
                throw new UDFArgumentException("explode() takes only one argument");
            }
            if (args[0].getCategory() != ObjectInspector.Category.LIST) {
                throw new UDFArgumentException("explode() takes an array as a parameter");
            }
            listOI = (ListObjectInspector) args[0];

            // Describe the output: a single column named "col" whose type is
            // the element type of the input array.
            ArrayList<String> fieldNames = new ArrayList<String>();
            ArrayList<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
            fieldNames.add("col");
            fieldOIs.add(listOI.getListElementObjectInspector());
            return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames,
                    fieldOIs);
        }

        // Reused for every forwarded row.
        private final Object[] forwardObj = new Object[1];

        @Override
        public void process(Object[] o) throws HiveException {
            List<?> list = listOI.getList(o[0]);
            if (list == null) {
                return;
            }
            // Emit one output row per array element.
            for (Object r : list) {
                forwardObj[0] = r;
                forward(forwardObj);
            }
        }

        @Override
        public String toString() {
            return "explode";
        }
    }
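The call order described in the steps (initialize once, process once per input row with forward called for each output row, then close) can be mimicked in plain Java. The sketch below has no Hive dependency. `ExplodeLifecycleSketch`, its simplified method signatures, and the `output` list standing in for whatever Hive does with forwarded rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExplodeLifecycleSketch {
    // Stand-in for the rows that would be collected from forward().
    private final List<Object> output = new ArrayList<>();

    // Stand-in for initialize: checks the argument count and names the output column.
    String initialize(int argCount) {
        if (argCount != 1) {
            throw new IllegalArgumentException("explode() takes only one argument");
        }
        return "col";
    }

    // Stand-in for process: emits one output row per element of the input list.
    void process(List<?> list) {
        if (list == null) {
            return; // a null array produces no rows
        }
        for (Object r : list) {
            forward(r);
        }
    }

    // Stand-in for forward: records one output row.
    private void forward(Object row) {
        output.add(row);
    }

    // Stand-in for close: nothing to clean up in this sketch.
    void close() {
    }

    // Drives the three methods in the order given in the steps.
    static List<Object> explodeAll(List<List<?>> inputRows) {
        ExplodeLifecycleSketch udtf = new ExplodeLifecycleSketch();
        udtf.initialize(1);
        for (List<?> row : inputRows) {
            udtf.process(row);
        }
        udtf.close();
        return udtf.output;
    }

    public static void main(String[] args) {
        List<List<?>> rows = new ArrayList<>();
        rows.add(Arrays.asList("a", "b"));
        rows.add(null);
        rows.add(Arrays.asList("c"));
        System.out.println(explodeAll(rows)); // prints [a, b, c]
    }
}
```

One input row can yield zero, one, or many output rows, which is what separates a UDTF from a UDF.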
Reposted: MapReduce code framework templates for Hive UDF/UDAF/UDTF