The list of hive pre-defined UDF functions is as follows, mainly from the translation of describe function XXX and code Annotations :)
- ABS (x)-returns the absolute value of X
- ACOs (x)-returns the arc cosine of X if-1 <= x <= 1 or null otherwise
- ASCII (STR)-returns the numeric value of the first character of STR
- Asin (x)-returns the arc sine of X if-1 <= x <= 1 or null otherwise
- Bin (N)-returns N in binary
Convert to binary
- Ceil (x)-find the smallest integer not smaller than X
- Ceiling (x)-find the smallest integer not smaller than X
- Coalesce (A1, A2,...)-returns the first non-null argument
Returns the first non-null parameter.
- Concat (str1, str2)-returns the concatenation of str1 and str2
- Conv (Num, from_base, to_base)-convert num from from_base to to_base
Convert to base, from from_base to to_base
- Cos (x)-returns the cosine of X (X is in radians)
- Date_add (start_date, num_days)-returns the date that is num_days after start_date.
Add date by day. The format of date is 'yyyy-mm-dd hh: mm: ss' or 'yyyy-mm-dd', the same below
- Date_sub (start_date, num_days)-returns the date that is num_days before start_date.
Minus by day
- Datediff (date1, date2)-returns the number of days between date1 and date2
Date difference by day
- Day (date)-returns the date of the Month of date
Same as dayofmonth
- Dayofmonth (date)-returns the date of the Month of date
- ELT (n, str1, str2,...)-returns the n-th string
- Exp (x)-returns e to the power of X
- Floor (x)-find the largest integer not greater than X
- From_unixtime (unix_time, format)-returns unix_time in the specified format
- Get_json_object (json_txt, PATH)-extract a JSON object from path
Parses JSON objects. Path supports a subset of jsonpath, including: * $: root object *.: Child operator * []: subscript operator for Array **: wildcard for [] For example, The src_json table contains only one column of JSON, and the content of one row is:
+-------------------------------------------------------------------+ json +-------------------------------------------------------------------+ {"store": {"fruit":[{"weight":8,"type":"apple"},{"weight":9,"type":"pear"}], "bicycle":{"price":19.95,"color":"red"} }, "email":"amy@only_for_json_udf_test.net", "owner":"amy" } +-------------------------------------------------------------------+
JSON content can be parsed using the following query statement:
hive> SELECT get_json_object(src_json.json, '$.owner') FROM src_json; amy hive> SELECT get_json_object(src_json.json, '$.store.fruit[0]') FROM src_json; {"weight":8,"type":"apple"} hive> SELECT get_json_object(src_json.json, '$.non_exist_key') FROM src_json; NULL
- Hash (A1, A2,...)-returns a hash value of the arguments
- Hex (N or Str)-convert the argument to hexadecimal
- Index (A, n)-returns the n-th element of
- Instr (STR, substr)-returns the index of the first occurance of substr in Str
- Isnotnull A-returns true if A is not null and false otherwise
- Isnull A-returns true if A is null and false otherwise
- Lcase (STR)-returns STR with all characters changed to lowercase
- Length (STR)-returns the length of STR
- Ln (x)-returns the natural logarithm of X
- Locate (substr, STR [, POS])-returns the position of the first occurance of substr in STR after position POS
- Log ([B], x)-returns the logarithm of X with base B
- Log10 (x)-returns the logarithm of X with base 10
- Log2 (x)-returns the logarithm of X with base 2
- Lower (STR)-returns STR with all characters changed to lowercase
- Lpad (STR, Len, pad)-returns STR, left-padded with pad to a length of Len
- Ltrim (STR)-removes the leading space characters from Str
- Month (date)-returns the month of date
- Parse_url (URL, parttoextract [, key])-extracts a part from a URL
Parse the URL string. The parttoextract options include [host, path, query, ref, protocol, file, authority, userinfo].
For example, * parse_url ('HTTP: // facebook.com/path/p1.php? Query = 1', 'host') returns 'Facebook. com' * parse_url ('HTTP: // facebook.com/path/p1.php? Query = 1', 'path') returns '/path/p1.php' * parse_url ('HTTP: // facebook.com/path/p1.php? Query = 1', 'query') returns 'query = 1'. You can specify the key to return specific parameters. The key format is query: <key_name>, for example, query: K1.
* Parse_url ('HTTP: // facebook.com/path/p1.php? Query = 1 # ref ', 'ref') returns 'ref '* parse_url ('HTTP: // facebook.com/path/p1.php? Query = 1 # ref ', 'protocol') return 'http'
- A pmod B-compute the positive modulo
- Pow (x1, x2)-raise X1 to the power of X2
- Power (x1, x2)-raise X1 to the power of X2
- Rand ([seed])-returns a random udorandom number between 0 and 1
- Regexp_extract (STR, Regexp [, idx])-extracts a group that matches Regexp
- Regexp_replace (STR, Regexp, Rep)-replace all substrings of STR that match Regexp with rep
- Repeat (STR, n)-repeat STR n times
- Reverse (STR)-reverse Str
- Round (X [, d])-round X to d decimal places
- Rpad (STR, Len, pad)-returns STR, right-padded with pad to a length of Len
- Rtrim (STR)-removes the trailing space characters from Str
- Sin (x)-returns the sine of X (X is in radians)
- Size (a)-returns the size of
- Space (N)-returns n Spaces
- Split (STR, RegEx)-splits STR around occurances that match RegEx
- SQRT (x)-returns the square root of X
- Substr (STR, POS [, Len])-returns the substring of STR that starts at POs and is of length Len
- Substring (STR, POS [, Len])-returns the substring of STR that starts at POs and is of length Len
- To_date (expr)-extracts the date part of the date or datetime expression expr
Returns the date string ('yyyy-mm-dd '). Expr can be date or datetime ('yyyy-mm-dd hh: mm: ss ').
- Trim (STR)-removes the leading and trailing space characters from Str
- Ucase (STR)-returns STR with all characters changed to uppercase
- Unix_timestamp ([date [, pattern])-returns the Unix Timestamp
Returns the Unix-style time epoch (the number of seconds since 00:00:00 ). The time epoch of the current time is returned by default.
Returns the time epoch of the specified time when the date is set. You can use pattern to format a date string. For example, unix_timestamp ('20140901', 'yyyymmdd') returns 20121228
- Upper (STR)-returns STR with all characters changed to uppercase
- Year (date)-returns the year of date
From: http://tedxu.is-programmer.com/posts/17272.html