Hive and Oozie methods for adding third-party jar packages

Source: Internet
Author: User

Source of this article: http://blog.csdn.net/bluishglc/article/details/46005269. Reprinting in any form is prohibited; otherwise CSDN will be entrusted to defend the author's rights.

We often need to bring a third-party jar package into Hive, or a UDF jar package that we wrote ourselves. Hive has two settings for specifying an external jar package:

    1. The configuration item hive.aux.jars.path in hive-site.xml
    2. The environment variable HIVE_AUX_JARS_PATH

Experiments so far establish two points:

    1. The hive.aux.jars.path configuration item takes effect for the Hive server but not for the hive shell. Even if you configure it on a Hive node, the hive shell on that node ignores it.
    2. The environment variable HIVE_AUX_JARS_PATH takes effect for the hive shell.

hive-site.xml configuration item: hive.aux.jars.path

For hive.aux.jars.path, it is recommended to specify an HDFS path, because the jar packages then only need to be uploaded to HDFS. If you specify a local path, you must make sure the required jar packages are placed at the corresponding location on every node, which is cumbersome.
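As an illustration of the HDFS-path recommendation, here is a minimal sketch of a helper that prints the hive-site.xml fragment. The function name aux_jars_property and the HDFS directory /apps/hive/aux-jars are hypothetical, and joining several locations with commas is assumed to match how your Hive version parses the value.

```shell
# Print a hive-site.xml <property> fragment for hive.aux.jars.path.
# Each argument is one JAR location; the arguments are joined with commas.
# The function body runs in a subshell so the IFS change does not leak out.
aux_jars_property() (
  IFS=,
  printf '<property>\n  <name>hive.aux.jars.path</name>\n  <value>%s</value>\n</property>\n' "$*"
)

# Hypothetical HDFS location, for illustration only.
aux_jars_property "hdfs:///apps/hive/aux-jars/elasticsearch-hadoop-2.1.0.Beta4.jar"
```

The printed fragment would be pasted inside the <configuration> element of hive-site.xml on the Hive server node, after the jar itself has been uploaded to that HDFS directory.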

Environment variable HIVE_AUX_JARS_PATH

The usual advice is that setting the environment variable HIVE_AUX_JARS_PATH is enough to bring in the corresponding jars, but in the current version of Hive the way this variable gets its value has a problem. Take a look at hive-env.sh, the shell script used when starting Hive, which contains this section:

    # Folder containing extra libraries required for hive compilation/execution can be controlled by:
    if [ "${HIVE_AUX_JARS_PATH}" != "" ]; then
      export HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH}
    elif [ -d "/usr/hdp/current/hive-webhcat/share/hcatalog" ]; then
      export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog
    fi

This script is problematic: once we set a value for HIVE_AUX_JARS_PATH, /usr/hdp/current/hive-webhcat/share/hcatalog is ignored. This looks odd, but Hive reads only a single HIVE_AUX_JARS_PATH value, which is the main reason the code is written this way. A workable approach is to keep our shared jar packages in one place and create symbolic links to them under /usr/hdp/current/hive-webhcat/share/hcatalog. For example, put the jars under /usr/lib/share-lib and then create the symbolic link:

    ln -s /usr/lib/share-lib/elasticsearch-hadoop-2.1.0.Beta4.jar /usr/hdp/current/hive-webhcat/share/hcatalog/elasticsearch-hadoop-2.1.0.Beta4.jar
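
When the shared directory holds more than one jar, the same linking step can be looped. This is a minimal sketch rather than part of the original setup: the function name link_shared_jars is hypothetical, and the sample call is left commented out because it needs write permission on the target directory.

```shell
# Link every JAR from a shared directory into the directory Hive falls back to.
link_shared_jars() {
  src_dir=$1   # directory holding the shared JARs
  dst_dir=$2   # directory that HIVE_AUX_JARS_PATH ends up pointing to
  for jar in "$src_dir"/*.jar; do
    # Skip the unexpanded pattern when the directory contains no JARs
    [ -e "$jar" ] || continue
    # -s: symbolic link, -f: replace an existing link with the same name
    ln -sf "$jar" "$dst_dir/$(basename "$jar")"
  done
}

# Paths as used in this article; adjust them for your distribution.
# link_shared_jars /usr/lib/share-lib /usr/hdp/current/hive-webhcat/share/hcatalog
```

Because only links are created, replacing a jar in the shared directory under the same file name requires no further change on the Hive side.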

How to specify a third-party jar package in Oozie

If the Hive script that relies on a third-party jar is itself a step in an Oozie workflow, the work is not finished yet: unless the third-party jar is also configured in Oozie, the workflow will still fail. For Oozie, third-party jars are introduced through the option oozie.service.WorkflowAppService.system.libpath in oozie-site.xml. We need to configure this option and upload the corresponding jar packages to the directory it names. Note that this is also an HDFS path!
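
The upload step can be sketched as a dry run that only prints the commands, so they can be reviewed before anything is written to HDFS. The function name print_upload_commands and the default LIBPATH value are hypothetical; use whatever oozie.service.WorkflowAppService.system.libpath is set to on your cluster. Whether the jars belong directly in that directory or in a subdirectory depends on the Oozie version, so treat the target as an assumption.

```shell
# Dry run: print an upload command for every JAR in a local directory.
# LIBPATH is a hypothetical default; it should match the value of
# oozie.service.WorkflowAppService.system.libpath in oozie-site.xml.
LIBPATH="${LIBPATH:-/user/oozie/share/lib}"

print_upload_commands() {
  # $1: local directory holding the shared JARs
  for jar in "$1"/*.jar; do
    # Skip the unexpanded pattern when the directory contains no JARs
    [ -e "$jar" ] || continue
    # -f overwrites a file of the same name already present in HDFS
    echo "hdfs dfs -put -f $jar $LIBPATH/"
  done
}

print_upload_commands /usr/lib/share-lib
```

Once the printed commands look right, they can be run as a user with write access to that HDFS directory.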
