Create a database named test in Hive, create a table named user in that database, and read the table from a Spark program with Spark SQL:
"select * from test.user"
The program runs correctly when the deployment mode is Spark standalone or yarn-client, but in yarn-cluster mode it fails with an error saying that the table "test.user" cannot be found.
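Solution:
Spark is integrated with Hive by placing hive-site.xml in the conf directory under the Spark root directory. In yarn-cluster mode the driver runs inside the cluster rather than on the submitting machine, so it may not see that local file. You therefore need to pass hive-site.xml with --files when submitting the Spark job.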
Example (Spark version 2.0.1):
spark-submit --class cn.xxx.xx.xxx --master yarn --deploy-mode cluster --executor-memory 5g --num-executors 5 --files /data/hive-1.2.1/conf/hive-site.xml
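If the job is submitted from a script, the same command can be assembled as an argument list, which avoids the spacing mistakes that are easy to make when typing the flags by hand. The following is only a minimal sketch: the helper build_submit_command and the jar name app.jar are hypothetical (the example above does not show the application jar), while the flags and the hive-site.xml path are taken from the command above.

```python
import shlex


def build_submit_command(main_class, app_jar, hive_site):
    """Return a spark-submit argument list for a yarn-cluster submission."""
    return [
        "spark-submit",
        "--class", main_class,
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--executor-memory", "5g",
        "--num-executors", "5",
        # Ship hive-site.xml with the job so the driver running in the
        # cluster can read the Hive configuration.
        "--files", hive_site,
        # The application jar goes last, after all spark-submit options.
        app_jar,
    ]


command = build_submit_command(
    main_class="cn.xxx.xx.xxx",
    app_jar="app.jar",  # hypothetical placeholder, not from the original post
    hive_site="/data/hive-1.2.1/conf/hive-site.xml",
)

# Print the command as a single shell-quoted line for inspection.
print(shlex.join(command))
```

The list can be passed to subprocess.run(command, check=True) on a machine where spark-submit is on the PATH.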
Reference:
http://blog.csdn.net/baiyangfu_love/article/details/40402743
In yarn-cluster mode, the application name set in the code with setAppName does not take effect. Adding the --name parameter solves this:
spark-submit --class cn.xxx.xx.xxx --master yarn --deploy-mode cluster --executor-memory 5g --num-executors 5 --name test --files /data/hive-1.2.1/conf/hive-site.xml
The main cause of this problem is that yarn-client and yarn-cluster modes handle setAppName at different points during job submission. In yarn-client mode, the value from setAppName is read before the application is registered with YARN; in yarn-cluster mode, it is read only after the application has been registered with YARN. As a result, the application name set in the code does not take effect in yarn-cluster mode.
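The ordering described above can be illustrated with a small toy model. This is not Spark's code: the function name_registered_with_yarn, its parameters, and the fallback string are all hypothetical. It only encodes the rule stated in this post, that the name from setAppName is visible at registration time in client mode but not in cluster mode.

```python
def name_registered_with_yarn(deploy_mode, submit_name=None, code_name=None,
                              fallback="<default name>"):
    """Toy model of which application name YARN sees at registration time.

    submit_name: value passed with --name on the spark-submit command line
    code_name:   value passed to setAppName inside the program
    fallback:    hypothetical stand-in for whatever default is used otherwise
    """
    if deploy_mode == "client":
        # The driver code has already run when the application registers,
        # so the setAppName value is known; values set in code take
        # priority over command-line flags.
        return code_name or submit_name or fallback
    if deploy_mode == "cluster":
        # Registration happens before the user code runs, so only the
        # name given at submission time is known.
        return submit_name or fallback
    raise ValueError("unknown deploy mode: " + deploy_mode)


# setAppName("test") alone is enough in client mode ...
print(name_registered_with_yarn("client", code_name="test"))
# ... but has no effect in cluster mode unless --name is also given.
print(name_registered_with_yarn("cluster", code_name="test"))
print(name_registered_with_yarn("cluster", submit_name="test", code_name="test"))
```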
Reference: http://support.hwclouds.com/usermanual-mrs/zh-cn_topic_0036027341.html
Spark SQL cannot find a table in Yarn-cluster mode