Hive Custom Function UDAF development

Last Update: 2014-12-14 Source: Internet

Author: User

Hive supports custom functions. A UDAF (user-defined aggregate function) accepts multiple rows as input and outputs a single row, so it is usually used together with GROUP BY.
The best learning materials are the official examples. I am using Hive 0.10 here, so the relevant examples are at https://github.com/apache/hive/tree/branch-0.10/contrib/src/java/org/apache/hadoop/hive/contrib/udaf/example

The functional requirement here is a function that is called as ActionCount(act_code, act_times, '1').
If act_code equals '1', the act_times values of the rows in a group are added together.
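
To make the requirement concrete before looking at the Hive code, here is a minimal plain-Java sketch of the intended result. It does not use Hive at all. The class name ActionCountSketch, the Row type, and the groupKey field (standing in for whatever column the query groups by) are hypothetical names chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of the requirement; no Hive classes are involved.
final class ActionCountSketch {

    /** One input row. groupKey is a hypothetical stand-in for the GROUP BY column. */
    static final class Row {
        final String groupKey;
        final String actCode;
        final long actTimes;

        Row(String groupKey, String actCode, long actTimes) {
            this.groupKey = groupKey;
            this.actCode = actCode;
            this.actTimes = actTimes;
        }
    }

    /** For each group, sums actTimes over the rows whose actCode equals actType. */
    static Map<String, Long> sumByGroup(List<Row> rows, String actType) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Row r : rows) {
            // Non-matching rows contribute 0, so every group still appears in the result.
            boolean matches = r.actCode != null && r.actCode.equals(actType);
            totals.merge(r.groupKey, matches ? r.actTimes : 0L, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("u1", "1", 2),
            new Row("u1", "2", 10),
            new Row("u1", "1", 3),
            new Row("u2", "2", 7));
        System.out.println(sumByGroup(rows, "1")); // prints {u1=5, u2=0}
    }
}
```

Many rows go in and one value per group comes out, which is the shape of result the UDAF has to produce.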

package hive.udaf;

import org.apache.hadoop.hive.ql.exec.UDAF;
import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;

/**
 * Sums act_times over the rows in a group whose act_code equals act_type.
 *
 * It should be very easy to follow and can be used as an example for writing
 * new UDAFs.
 *
 * Note that Hive internally uses a different mechanism (called GenericUDAF) to
 * implement built-in aggregation functions, which is harder to program but
 * more efficient.
 */
public final class ActionCount extends UDAF {

  /**
   * The internal state of an aggregation.
   *
   * Note that this is only needed if the internal state cannot be represented
   * by a primitive.
   *
   * The internal state can also contain fields with types like
   * ArrayList<String> and HashMap<String,Double> if needed.
   */
  public static class UDAFState {
    private long mCount;
    private long mSum;
  }

  /**
   * The actual class for doing the aggregation. Hive will automatically look
   * for all internal classes of the UDAF that implement UDAFEvaluator.
   */
  public static class UDAFExampleAvgEvaluator implements UDAFEvaluator {

    UDAFState state;

    public UDAFExampleAvgEvaluator() {
      super();
      state = new UDAFState();
      init();
    }

    /**
     * Reset the state of the aggregation.
     */
    public void init() {
      state.mSum = 0;
      state.mCount = 0;
    }

    /**
     * Iterate through one row of original data.
     *
     * The number and type of arguments need to be the same as when we call
     * this UDAF from the Hive command line.
     *
     * This function should always return true.
     */
    public boolean iterate(String act_code, long act_times, String act_type) {
      // Called once per input row. The null check avoids a
      // NullPointerException when act_code is NULL.
      if (act_code != null && act_code.equals(act_type)) {
        state.mSum += act_times;
        state.mCount++;
      }
      return true;
    }

    /**
     * Terminate a partial aggregation and return the state. If the state is a
     * primitive, just return primitive Java classes like Integer or String.
     */
    public UDAFState terminatePartial() {
      // Pass the partial state on. Return null if no row matched,
      // so that merge() can skip this partial result.
      return state.mCount == 0 ? null : state;
    }

    /**
     * Merge with a partial aggregation.
     *
     * This function should always have a single argument which has the same
     * type as the return value of terminatePartial().
     */
    public boolean merge(UDAFState o) {
      // Merge the partial result of a sub-task.
      if (o != null) {
        state.mSum += o.mSum;
        state.mCount += o.mCount;
      }
      return true;
    }

    /**
     * Terminate the aggregation and return the final result.
     */
    public long terminate() {
      // Return the final result: 0 if no row matched, otherwise the sum.
      return state.mCount == 0 ? 0 : state.mSum;
    }
  }

  private ActionCount() {
    // Prevent instantiation.
  }
}

The key to using Hive well is a thorough understanding of the MapReduce execution model.
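
As a rough illustration of how the evaluator methods line up with that model, the following plain-Java sketch simulates the lifecycle without any Hive classes. It assumes the common case where iterate() and terminatePartial() run on the map side and merge() and terminate() run on the reduce side. LifecycleSketch, mapSide, and reduceSide are hypothetical names used only here, and each row is simplified to a pair of strings {act_code, act_times}.

```java
// Plain-Java stand-in for the evaluator lifecycle; not Hive's actual runtime.
final class LifecycleSketch {

    /** Partial aggregation state, mirroring the count and sum fields of the UDAF. */
    static final class State {
        long count;
        long sum;
    }

    /** Same method set as the evaluator above, without the Hive interface. */
    static final class Evaluator {
        private final State state = new State();

        void init() {
            state.sum = 0;
            state.count = 0;
        }

        boolean iterate(String actCode, long actTimes, String actType) {
            if (actCode != null && actCode.equals(actType)) {
                state.sum += actTimes;
                state.count++;
            }
            return true;
        }

        State terminatePartial() {
            return state.count == 0 ? null : state;
        }

        boolean merge(State other) {
            if (other != null) {
                state.sum += other.sum;
                state.count += other.count;
            }
            return true;
        }

        long terminate() {
            return state.count == 0 ? 0 : state.sum;
        }
    }

    /** Simulates one map task: feed its rows to iterate(), then emit a partial state. */
    static State mapSide(String[][] rows, String actType) {
        Evaluator e = new Evaluator();
        e.init();
        for (String[] r : rows) {
            e.iterate(r[0], Long.parseLong(r[1]), actType);
        }
        return e.terminatePartial();
    }

    /** Simulates the reduce task: merge all partial states, then produce the result. */
    static long reduceSide(State... partials) {
        Evaluator e = new Evaluator();
        e.init();
        for (State p : partials) {
            e.merge(p);
        }
        return e.terminate();
    }

    public static void main(String[] args) {
        State p1 = mapSide(new String[][] {{"1", "2"}, {"2", "10"}}, "1");
        State p2 = mapSide(new String[][] {{"1", "3"}}, "1");
        State p3 = mapSide(new String[][] {{"2", "7"}}, "1"); // nothing matched: null partial
        System.out.println(reduceSide(p1, p2, p3)); // prints 5
    }
}
```

The point of the split is that each map task only sees part of a group, so the partial state must carry everything merge() needs to combine the pieces correctly.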
The author of this article: Linger
This article link: http://blog.csdn.net/lingerlanlan/article/details/41920151
