I have been using Hive for a while but never wrote anything down about it, because I was mainly using it for creating tables, loading data, and basic CRUD. Recently I needed some frequently used helper methods in my work and learned that Hive supports UDFs (user-defined functions). After reading a few articles, I found that writing a UDF is quite simple: extend the UDF class and override the evaluate method. The following uses an ip2long method as an example.
1. Write a UDF class
import org.apache.hadoop.hive.ql.exec.UDF;

public class NewIP2Long extends UDF {

    // Convert a dotted-quad IPv4 string into its numeric (long) value.
    public static long ip2long(String ip) {
        String[] ips = ip.split("[.]");
        long ipNum = 0;
        for (int i = 0; i < ips.length; i++) {
            ipNum = ipNum << Byte.SIZE | Long.parseLong(ips[i]);
        }
        return ipNum;
    }

    // Hive calls evaluate() once per row; return 0 for malformed input.
    public long evaluate(String ip) {
        if (ip.matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}")) {
            try {
                long ipNum = ip2long(ip);
                return ipNum;
            } catch (Exception e) {
                return 0;
            }
        }
        return 0;
    }

    // Quick local check outside of Hive.
    public static void main(String[] argvs) {
        NewIP2Long ipl = new NewIP2Long();
        System.out.println(ip2long("112.64.106.238"));
        System.out.println(ipl.evaluate("58.35.186.62"));
    }
}
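The shift-and-OR in the loop is simply the base-256 expansion of the address: for 112.64.106.238 it computes 112*2^24 + 64*2^16 + 106*2^8 + 238 = 1883269870, which is what main() prints for the first test address.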
2. Compile the class and package it into ip2long.jar.
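A minimal sketch of this step, assuming hive-exec-0.8.1.jar lives under /hadoop/hive/lib and that the jar should end up at /tmp/NEWIP2Long.jar to match the add jar line further below (both paths are assumptions about your setup):

# compile against the Hive UDF base class, then package and drop the jar in /tmp
javac -cp /hadoop/hive/lib/hive-exec-0.8.1.jar NewIP2Long.java
jar cvf NEWIP2Long.jar NewIP2Long.class
cp NEWIP2Long.jar /tmp/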
3. When you need to use ip2long:
add jar /tmp/NEWIP2Long.jar;
drop temporary function ip2long;
create temporary function ip2long as 'NewIP2Long';  -- if the class has a package, use the fully qualified class name
select ip2long(ip) from XXX;
This approach requires an add jar and a create temporary function every time it is used, which is tedious, so the UDF can instead be compiled directly into the Hive source code.
Advanced: compile the custom UDF into Hive
Recompile Hive:
1) Copy the Java file you wrote into ~/install/hive-0.8.1/src/ql/src/java/org/apache/hadoop/hive/ql/udf/

cd ~/install/hive-0.8.1/src/ql/src/java/org/apache/hadoop/hive/ql/udf/
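For reference, the copy itself might look like the line below (the source file location is an assumption; if the class declares a package such as com.meilishuo.hive.udf, place the file under a matching directory of the source tree instead):

cp UDFIp2Long.java ~/install/hive-0.8.1/src/ql/src/java/org/apache/hadoop/hive/ql/udf/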
2) Modify ~/install/hive-0.8.1/src/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java: add an import for the UDF class and register it (the registerUDF call goes alongside the existing registrations in FunctionRegistry's static initializer).

import com.meilishuo.hive.udf.UDFIp2Long;   // add the import

registerUDF("ip2long", UDFIp2Long.class, false);   // add the registration
3) Run ant -Dhadoop.version=1.0.1 package in ~/install/hive-0.8.1/src.

cd ~/install/hive-0.8.1/src

ant -Dhadoop.version=1.0.1 package
4) Replace the hive-exec jar. The newly built jar is in ~/install/hive-0.8.1/src/build/ql; copy it into the Hive lib directory and repoint the hive-exec-0.8.1.jar link there.

cp hive-exec-0.8.1.jar /hadoop/hive/lib/hive-exec-0.8.1.jar.0628

ln -s hive-exec-0.8.1.jar.0628 hive-exec-0.8.1.jar
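Spelled out with the working directories made explicit, the replacement might look like this (the paths and the dated .0628 name follow the snippets above; ln -sf overwrites the existing link):

# copy the freshly built jar into the Hive lib directory under a dated name
cd ~/install/hive-0.8.1/src/build/ql
cp hive-exec-0.8.1.jar /hadoop/hive/lib/hive-exec-0.8.1.jar.0628
# repoint the link that is actually on Hive's classpath
cd /hadoop/hive/lib
ln -sf hive-exec-0.8.1.jar.0628 hive-exec-0.8.1.jar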
5) Restart the Hive service.
6) Test.
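A quick way to verify, assuming some existing table to select from (the table name is a placeholder, since this Hive version needs a FROM clause):

-- ip2long should now work without add jar / create temporary function
select ip2long('112.64.106.238') from some_existing_table limit 1;
-- expected value: 1883269870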