A HelloWorld-level UDF is developed in Java, packaged into udf.jar, and stored under /home/hadoop/lib. The code is as follows:
package com.luogankun.udf;

import org.apache.hadoop.hive.ql.exec.UDF;

public class HelloUDF extends UDF {
    public String evaluate(String str) {
        try {
            return "HelloWorld " + str;
        } catch (Exception e) {
            return null;
        }
    }
}
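Because the class above depends on the Hive jar, its core logic can also be checked standalone. The sketch below mirrors evaluate in a plain class (HelloUDFSketch is a hypothetical name used only for this check; the real class must extend org.apache.hadoop.hive.ql.exec.UDF):

```java
// Standalone sketch of the UDF logic, so it compiles without
// Hive on the classpath. Mirrors HelloUDF.evaluate exactly.
public class HelloUDFSketch {
    // Prefixes the input with "HelloWorld ", returning null on error.
    public String evaluate(String str) {
        try {
            return "HelloWorld " + str;
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        HelloUDFSketch udf = new HelloUDFSketch();
        System.out.println(udf.evaluate("Spark")); // prints "HelloWorld Spark"
    }
}
```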
Using the UDF in Hive (the function name hello below is chosen for illustration):

add jar /home/hadoop/lib/udf.jar;
CREATE TEMPORARY FUNCTION hello AS 'com.luogankun.udf.HelloUDF';
select hello("World");
Using the UDF in Spark SQL

Mode one: pass --jars when starting spark-sql:

cd $SPARK_HOME/bin
spark-sql --jars /home/hadoop/lib/udf.jar
CREATE TEMPORARY FUNCTION hello AS 'com.luogankun.udf.HelloUDF';
select hello("World");
Mode two: start spark-sql first, then add jar:

cd $SPARK_HOME/bin
spark-sql
add jar /home/hadoop/lib/udf.jar;
CREATE TEMPORARY FUNCTION hello AS 'com.luogankun.udf.HelloUDF';
select hello("World");
Testing shows that this mode is not supported; it fails with java.lang.ClassNotFoundException: com.luogankun.udf.HelloUDF.
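The exception simply means the driver JVM cannot see the class on its classpath. Roughly the same lookup that happens when the function is registered can be reproduced with Class.forName (a minimal sketch; ClassLookupDemo is a hypothetical name):

```java
// Demonstrates the classpath lookup that fails inside spark-sql:
// Class.forName throws ClassNotFoundException when udf.jar is not
// on the JVM classpath.
public class ClassLookupDemo {
    public static void main(String[] args) {
        try {
            Class.forName("com.luogankun.udf.HelloUDF");
            System.out.println("class found");
        } catch (ClassNotFoundException e) {
            // With udf.jar absent from the classpath, this branch runs,
            // matching the error spark-sql reports.
            System.out.println("java.lang.ClassNotFoundException: " + e.getMessage());
        }
    }
}
```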
How to solve it?
1) First add the path of udf.jar to SPARK_CLASSPATH in spark-env.sh, as follows:

export SPARK_CLASSPATH=$SPARK_CLASSPATH:/home/hadoop/software/mysql-connector-java-5.1.x-bin.jar:/home/hadoop/lib/udf.jar
2) Then restart spark-sql and create the temporary function directly:

cd $SPARK_HOME/bin
spark-sql
CREATE TEMPORARY FUNCTION hello AS 'com.luogankun.udf.HelloUDF';
select hello("World");
Mode three: using the UDF in the Thrift JDBC server

Execute on the beeline command line:

add jar /home/hadoop/lib/udf.jar;
CREATE TEMPORARY FUNCTION hello AS 'com.luogankun.udf.HelloUDF';
select hello("World");