Connecting spark-sql to Hive

Source: Internet
Author: User

Step one: modify the Hive configuration file hive-site.xml

Add the following property to disable the local metastore service:

<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>

Then set the address and port of the Hive metastore service:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://192.168.10.10:9083</value>
  <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>

Then copy the configuration file hive-site.xml to Spark's conf directory.
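
Before copying, it can be worth confirming that the file parses and carries the values set in step one. The following is a minimal sketch using only Python's standard library. The XML is inlined as a sample string rather than read from disk, and `read_hive_properties` is a hypothetical helper name, not part of Hive or Spark.

```python
import xml.etree.ElementTree as ET

# Sample content mirroring the two properties from step one.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.10.10:9083</value>
  </property>
</configuration>"""


def read_hive_properties(xml_text):
    """Return a {name: value} dict for every <property> in a hive-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # Strip stray whitespace, which is easy to introduce when editing by hand.
            props[name.strip()] = (value or "").strip()
    return props


props = read_hive_properties(SAMPLE_HIVE_SITE)
print(props["hive.metastore.uris"])
```

To check a real file, pass its text to the helper instead of the sample string.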

Step two: copy mysql-connector-java-5.1.41-bin.jar to Spark's jars directory, because MySQL is used as the Hive metastore database.
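
The two copy operations from steps one and two can be sketched as a small Python script. This is an illustration under assumptions: `install_into_spark` is a hypothetical helper, the `conf` and `jars` subdirectory names assume a Spark 2.x style layout, and every path in the commented example call is a placeholder.

```python
import shutil
from pathlib import Path


def install_into_spark(hive_site, connector_jar, spark_home):
    """Copy hive-site.xml into <spark_home>/conf and the JDBC jar into <spark_home>/jars."""
    spark_home = Path(spark_home)
    conf_dir = spark_home / "conf"   # assumed layout
    jars_dir = spark_home / "jars"   # assumed layout
    for directory in (conf_dir, jars_dir):
        if not directory.is_dir():
            # Fail early instead of silently creating a wrong directory tree.
            raise FileNotFoundError(f"expected directory not found: {directory}")
    copied_conf = shutil.copy2(hive_site, conf_dir / "hive-site.xml")
    copied_jar = shutil.copy2(connector_jar, jars_dir / Path(connector_jar).name)
    return Path(copied_conf), Path(copied_jar)


# Example call with placeholder paths (adjust to your installation):
# install_into_spark("/opt/hive/conf/hive-site.xml",
#                    "/opt/downloads/mysql-connector-java-5.1.41-bin.jar",
#                    "/opt/spark")
```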

It is now possible to query the Hive databases from the Scala shell.

But the original requirement was to query Hive with spark-sql.

So I started spark-sql, and for a whole day it kept reporting the following error:

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:...)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:...)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:...)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:...)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:...)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
    ... more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:...)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:...)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
    ... more
Caused by: MetaException(message:Version information not found in metastore.)
    at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:6664)
    at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:6645)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:...)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:...)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:...)
    at com.sun.proxy.$Proxy6.verifySchema(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:572)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:...)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:...)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:...)
    ... more

At first I searched on the first line of the error, without success. I then searched on the last error message in the trace:

message:Version information not found in metastore.
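
Reading a Java stack trace from the bottom and searching on the innermost cause, rather than on the first line, can be sketched in Python. `root_cause` is a hypothetical helper, and the sample trace is abbreviated from the error reported by spark-sql.

```python
SAMPLE_TRACE = """\
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
Caused by: java.lang.reflect.InvocationTargetException
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
Caused by: MetaException(message:Version information not found in metastore.)
    at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:6664)
"""


def root_cause(trace):
    """Return the text of the last 'Caused by:' line, or the first line if there is none."""
    lines = [line.strip() for line in trace.splitlines() if line.strip()]
    causes = [line for line in lines if line.startswith("Caused by:")]
    if causes:
        # The last "Caused by:" entry is the innermost exception in the chain.
        return causes[-1][len("Caused by:"):].strip()
    return lines[0] if lines else ""


print(root_cause(SAMPLE_TRACE))
```

The outer exceptions here only say that a client could not be instantiated; the innermost one names the actual condition, which makes it the better search term.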

I finally found a solution: change the value of hive.metastore.schema.verification in hive-site.xml to false.

<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>
    Enforce metastore schema version consistency.
    True: Verify that version information stored in metastore is compatible with one from Hive jars. Also disable automatic schema migration attempt. Users are required to manually migrate schema after Hive upgrade, which ensures proper metastore schema migration. (Default)
    False: Warn if the version information stored in metastore doesn't match with one from Hive jars.
  </description>
</property>

The likely cause is that the version of the Hive jars is inconsistent with the schema version information stored in the metastore. With this setting, the mismatch is no longer enforced and only produces a warning.
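
For repeatable setups, the change to hive.metastore.schema.verification can be scripted. The following is a minimal sketch using Python's standard library. `set_property` is a hypothetical helper, and the input is an inlined sample string rather than a real hive-site.xml.

```python
import xml.etree.ElementTree as ET


def set_property(xml_text, name, value):
    """Set <value> of the named <property>, appending the property if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            node = prop.find("value")
            if node is None:
                node = ET.SubElement(prop, "value")
            node.text = value
            break
    else:
        # No existing property with this name: add a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


before = (
    "<configuration><property>"
    "<name>hive.metastore.schema.verification</name><value>true</value>"
    "</property></configuration>"
)
after = set_property(before, "hive.metastore.schema.verification", "false")
print(after)
```

Note that ElementTree's default parser drops XML comments, so for a heavily commented hive-site.xml a manual edit may be preferable.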

Reference Blog: http://www.cnblogs.com/rocky-AGE-24/p/7345417.html

http://blog.csdn.net/jyl1798/article/details/41087533

http://dblab.xmu.edu.cn/blog/1086-2/

http://blog.csdn.net/youngqj/article/details/19987727
