1. The program cannot load the Hive packages. Spark has to be compiled with Hive support (a build that can access Hive tables directly when started through spark-shell or spark-sql). Take the assembly jar produced in the lib directory, create a Maven repository for it, and add it to the program's dependencies. The crudest way to create the repository is to create the directory path by hand, then copy the spark-core .pom file into it under a changed name.
2. When submitting in yarn-cluster mode, the job fails with: Spark SQL java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive. The cause is that the program does not import the DataNucleus jars in $SPARK_HOME/lib. Adding them through --jars fixes it: $(echo $SPARK_HOME/lib/*.jar | tr ' ' '\n' | grep datanucleus | tr '\n' ',')
3. Once the jars load, Spark fails with a metastore.RetryingHMSHandler: NoSuchObjectException error. This happens because Spark cannot find the hive-site.xml file; pass it through --files.
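Points 2 and 3 can be combined into one submit command. The following is only a sketch under assumptions: the function names `datanucleus_jars` and `build_submit_command` are made up for illustration, and the hive-site.xml path, main class, and application jar in the usage comment are placeholders that do not come from the notes above. It prints the command instead of running it, so the jar list can be checked first.

```shell
#!/bin/sh
# Sketch only: function names, paths, class and jar names are illustrative assumptions.

# Print a comma-separated list of the DataNucleus jars found in a directory.
datanucleus_jars() (
  list=""
  for jar in "$1"/*.jar; do
    case "$(basename "$jar")" in
      # Keep only the DataNucleus jars; join them with commas, no trailing comma.
      datanucleus*) list="${list:+$list,}$jar" ;;
    esac
  done
  printf '%s\n' "$list"
)

# Print the spark-submit command for yarn-cluster mode.
# Arguments: Spark home, path to hive-site.xml, main class, application jar.
build_submit_command() (
  echo "$1/bin/spark-submit" \
    --master yarn-cluster \
    --jars "$(datanucleus_jars "$1/lib")" \
    --files "$2" \
    --class "$3" \
    "$4"
)

# Example usage (all four arguments are placeholders):
# build_submit_command "$SPARK_HOME" "$SPARK_HOME/conf/hive-site.xml" com.example.MyApp my-app.jar
```

Unlike the one-liner in point 2, the loop does not leave a trailing comma at the end of the --jars value.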
Problems and solutions when running Spark SQL on Hive tables in yarn-cluster mode