- Configuring Spark SQL to access HBase
- Test validation
Configuring Spark SQL to access HBase:
- Copy the HBase-related jars to the $SPARK_HOME/lib directory on each Spark node. The required jars are:
  - guava-14.0.1.jar
  - htrace-core-3.1.0-incubating.jar
  - hbase-common-1.1.2.2.4.2.0-258.jar
  - hbase-common-1.1.2.2.4.2.0-258-tests.jar
  - hbase-client-1.1.2.2.4.2.0-258.jar
  - hbase-server-1.1.2.2.4.2.0-258.jar
  - hbase-protocol-1.1.2.2.4.2.0-258.jar
  - hive-hbase-handler-1.2.1000.2.4.2.0-258.jar
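The copy step above can be sketched as follows. The demo runs in a temporary sandbox so it is safe to execute anywhere; on a real node you would instead set HBASE_LIB and SPARK_LIB to the cluster paths shown in the comments (those paths are assumptions based on the HDP 2.4.2 layout).

```shell
# Sandbox demo of the jar-copy step. On a real node the directories would be:
#   HBASE_LIB=/usr/hdp/2.4.2.0-258/hbase/lib   SPARK_LIB=/usr/hdp/2.4.2.0-258/spark/lib
HBASE_LIB=$(mktemp -d)
SPARK_LIB=$(mktemp -d)

JARS="guava-14.0.1.jar
htrace-core-3.1.0-incubating.jar
hbase-common-1.1.2.2.4.2.0-258.jar
hbase-common-1.1.2.2.4.2.0-258-tests.jar
hbase-client-1.1.2.2.4.2.0-258.jar
hbase-server-1.1.2.2.4.2.0-258.jar
hbase-protocol-1.1.2.2.4.2.0-258.jar
hive-hbase-handler-1.2.1000.2.4.2.0-258.jar"

# Stand-in files so the demo is self-contained; on a real node HBase ships these jars.
for jar in $JARS; do touch "$HBASE_LIB/$jar"; done

# The actual step: copy every required jar into Spark's lib directory.
for jar in $JARS; do cp "$HBASE_LIB/$jar" "$SPARK_LIB/"; done

echo "copied $(ls "$SPARK_LIB" | wc -l) jars"
```

On a real cluster the same loop would be run (or scripted via Ambari) on every node that hosts a Spark client.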
- In Ambari, edit $SPARK_HOME/conf/spark-env.sh on each Spark node and add the jars above to SPARK_CLASSPATH, for example:
- The configuration line is shown below. Note: there must be no spaces or line breaks between the jar paths.
export SPARK_CLASSPATH=/usr/hdp/2.4.2.0-258/spark/lib/guava-11.0.2.jar:/usr/hdp/2.4.2.0-258/spark/lib/hbase-client-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/hbase-common-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/hbase-protocol-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/hbase-server-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/hive-hbase-handler-1.2.1000.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/htrace-core-3.1.0-incubating.jar:/usr/hdp/2.4.2.0-258/spark/lib/protobuf-java-2.5.0.jar:${SPARK_CLASSPATH}
- Copy hbase-site.xml to ${HADOOP_CONF_DIR}. Because the Hadoop configuration directory ${HADOOP_CONF_DIR} is already referenced in spark-env.sh, hbase-site.xml is then loaded automatically; it carries the HBase client connection parameters.
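A sketch of that copy step. On an HDP node the usual source is /etc/hbase/conf/hbase-site.xml (an assumption about the layout, not stated in the original); the demo below uses temporary directories so it can run anywhere.

```shell
# Demo: place hbase-site.xml into the Hadoop configuration directory.
# Real-node values (assumed HDP layout): SRC=/etc/hbase/conf, HADOOP_CONF_DIR=/etc/hadoop/conf
SRC=$(mktemp -d)
HADOOP_CONF_DIR=$(mktemp -d)

# Stand-in file for the demo; on a real node HBase already provides this file.
cat > "$SRC/hbase-site.xml" <<'EOF'
<configuration>
  <!-- HBase client connection settings live here -->
</configuration>
EOF

cp "$SRC/hbase-site.xml" "$HADOOP_CONF_DIR/"
echo "hbase-site.xml installed in $HADOOP_CONF_DIR"
```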
- After modifying the configuration in Ambari, restart the affected component services.
Test validation:
- Validate on any Spark client node:
- Command: cd /usr/hdp/2.4.2.0-258/spark/bin (the Spark installation directory)
- Command: ./spark-sql
- Execute: select * from stocksinfo; (stocksinfo is a Hive external table associated with HBase)
- The query returns the expected results.
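For reference, a Hive external table like stocksinfo is typically mapped onto HBase via the hive-hbase-handler jar installed above. The DDL below is a hypothetical sketch: the column names, the info column family, and the HBase table name are illustrative assumptions, not taken from the original.

```sql
-- Hypothetical mapping of Hive table stocksinfo onto an HBase table.
-- Columns and the 'info' column family are illustrative assumptions.
CREATE EXTERNAL TABLE stocksinfo (
  rowkey STRING,
  name   STRING,
  price  DOUBLE
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:price")
TBLPROPERTIES ("hbase.table.name" = "stocksinfo");
```

With a table defined this way, the select statement above reads HBase rows through Spark SQL via the Hive metastore mapping.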
Spark (IV): Spark SQL reads HBase