In Oozie, you cannot execute Impala SQL scripts directly the way you can Hive SQL scripts. There is currently no Impala action, so you must use a shell action that calls impala-shell. The shell script that calls impala-shell must also set the PYTHON_EGG_CACHE environment variable, which tells Python where to unpack its egg files. This is an example of such a shell script (impala_overwrite.sh):
export PYTHON_EGG_CACHE=./myeggs
/usr/bin/kinit -kt YourKeytabFile.keytab -V <your username> # Optional: only needed on a Kerberized cluster
impala-shell -q "invalidate metadata"
# You can also use the -f parameter to execute an Impala SQL file
# impala-shell -f "impala_test.sql"
Note: If you do not set PYTHON_EGG_CACHE, the job fails. When the workflow job is executed, it reports the error Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1].
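The egg cache requirement can be checked before impala-shell is called. The following is a minimal sketch, not part of the original script: the helper name setup_egg_cache is hypothetical, and it assumes the action's working directory is writable, as it normally is for a container's scratch directory.

```shell
#!/bin/bash
# Hypothetical helper: create the egg cache directory, confirm it is
# writable, and only then export PYTHON_EGG_CACHE.
setup_egg_cache() {
    cache_dir="${1:-./myeggs}"
    mkdir -p "$cache_dir" || return 1
    if [ ! -w "$cache_dir" ]; then
        echo "egg cache $cache_dir is not writable" >&2
        return 1
    fi
    export PYTHON_EGG_CACHE="$cache_dir"
}

setup_egg_cache ./myeggs
echo "PYTHON_EGG_CACHE=$PYTHON_EGG_CACHE"
```

Failing early with a clear message on stderr makes the cause easier to find in the action log than the generic ShellMain exit code.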
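On a Kerberized cluster, impala_overwrite.sh also runs kinit to obtain a ticket before calling impala-shell. This is the workflow that uses the script. Note that the file name in <exec> and <file> must match the name of the script you uploaded; this example refers to it as shell-impala-invalidate.sh: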
<workflow-app name="shell-impala-invalidate-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="shell-impala-invalidate"/>
    <action name="shell-impala-invalidate">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>shell-impala-invalidate.sh</exec>
            <file>shell-impala-invalidate.sh#shell-impala-invalidate.sh</file>
            <file>YourKeytabFile.keytab#YourKeytabFile.keytab</file>
        </shell>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
The <file> tag for the shell script is required, because it ships the script to the node that runs the action. The <file> tag for the keytab is needed only when Kerberos is used.
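The workflow reads ${jobTracker}, ${nameNode}, and ${queueName} from the job configuration. Hue fills these in when it submits the workflow; when submitting from the command line they are typically supplied in a job.properties file. The following is a minimal sketch under stated assumptions: the host names and ports (namenode.example.com, resourcemanager.example.com, oozie.example.com) are hypothetical, and the application path assumes workflow.xml is stored in the workspace directory /user/hue/oozie/workspaces/hue-oozie-1519636855.0.

```shell
#!/bin/bash
# Hypothetical values; replace them with your cluster's addresses.
# The quoted heredoc keeps ${nameNode} literal so Oozie resolves it.
cat > job.properties <<'EOF'
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/hue/oozie/workspaces/hue-oozie-1519636855.0
EOF

# Confirm that every parameter referenced by the workflow is defined.
for key in nameNode jobTracker queueName; do
    grep -q "^${key}=" job.properties || echo "missing: ${key}" >&2
done

# Submitting needs a reachable Oozie server, so it is left commented out:
# oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run
```

The loop is only a sanity check; an undefined parameter would otherwise surface as an EL error when the job is submitted.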
The graphical configuration in the Hue Web UI is as follows:
1. Create a new workflow and choose the shell type
Execute Impala Shell script with Oozie in Hue
2. Select the location of the shell script
You need to upload the shell script from the local file system to the Oozie workflow's workspace directory on HDFS, for example:
/user/hue/oozie/workspaces/hue-oozie-1519636855.0
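The upload can also be done from the command line instead of the Hue file browser. The following is a minimal sketch, assuming the hdfs client is installed on the machine that holds the script and that your user can write to the workspace. The variable names are illustrative, and the command is only printed when hdfs or the script is not available.

```shell
#!/bin/bash
# Workspace directory and script name taken from this article.
WORKSPACE=/user/hue/oozie/workspaces/hue-oozie-1519636855.0
SCRIPT=impala_overwrite.sh

# -f overwrites an existing copy so that re-uploads do not fail.
upload_cmd="hdfs dfs -put -f $SCRIPT $WORKSPACE/"

if command -v hdfs >/dev/null 2>&1 && [ -f "$SCRIPT" ]; then
    # Upload, then list the workspace to confirm the script is there.
    $upload_cmd && hdfs dfs -ls "$WORKSPACE"
else
    echo "would run: $upload_cmd"
fi
```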
If the shell script is not in the correct directory, the workflow job fails with the error Cannot run program "impala_overwrite.sh"… java.io.IOException: error=2, No such file or directory.
3. After completing the configuration, run the workflow job and confirm that it succeeds
4. View workflow configuration