Spark SQL and Hive Integration

Hive Configuration

Edit $HIVE_HOME/conf/hive-site.xml and add the following:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://master:9083</value>
  <description>Thrift URI for the remote metastore. Used by the metastore client to connect to the remote metastore.</description>
</property>
Start Hive Metastore
Start the metastore:

$ hive --service metastore &

Check that it is running:

$ jobs
[1]+  Running                 hive --service metastore &

Stop the metastore:

$ kill %1        (kill %jobid; here 1 is the job id)
Spark Configuration
Copy or symlink $HIVE_HOME/conf/hive-site.xml to $SPARK_HOME/conf/. Also copy or symlink $HIVE_HOME/lib/mysql-connector-java-5.1.12.jar to $SPARK_HOME/lib/; keeping the connector in $SPARK_HOME/lib/ is convenient when running Spark in standalone mode.
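With hive-site.xml and the MySQL connector in place, the integration can be sanity-checked before going further. The following is a minimal sketch, assuming a spark-shell built with Hive support (started as shown in the sections below), where sc is the pre-defined SparkContext and the val name is illustrative:

import org.apache.spark.sql.hive.HiveContext

// HiveContext picks up hive.metastore.uris from hive-site.xml on the classpath
// and connects to the remote metastore at thrift://master:9083.
val hiveContext = new HiveContext(sc)

// Listing the tables confirms that the metastore connection works.
hiveContext.sql("show tables").collect().foreach(println)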
Start Spark-sql
    1. Standalone mode

./bin/spark-sql --master spark://master:7077 --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

    2. Yarn-client mode

$ ./bin/spark-sql --master yarn-client --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

Execute SQL: select count(*) from o2o_app;

Result:

302
Time taken: 0.828 seconds, Fetched 1 row(s)
2015-09-14 18:27:43,158 INFO [main] CliDriver (SessionState.java:printInfo(536)) - Time taken: 0.828 seconds, Fetched 1 row(s)
spark-sql>

[StatsReportListener then logs percentile statistics for the finished stage: task runtime (242.0 ms), fetch wait time (0.0 ms), remote bytes read (31.0 B), task result size (1228.0 B), executor (non-fetch) time pct (70 %), and other time pct (30 %).]
    3. Yarn-cluster mode

./bin/spark-sql --master yarn-cluster --jars /home/dp/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

Error: Cluster deploy mode is not applicable to Spark SQL shell.
Run with --help for usage help or --verbose for debug output
2015-09-14 18:28:28,291 INFO [Thread-0] util.Utils (Logging.scala:logInfo) - Shutdown hook called

Cluster deploy mode is not supported by the spark-sql shell.
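Although the shell refuses to run in yarn-cluster mode, the same query can be run in that mode by packaging it into a small application and submitting it with spark-submit. Below is a minimal sketch under that assumption; the object name O2oAppCount and the jar name are illustrative, not part of the original post:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical driver object; build it into a jar before submitting.
object O2oAppCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("O2oAppCount"))
    val hiveContext = new HiveContext(sc)  // reads hive-site.xml from the classpath

    // Same query as the spark-sql example above.
    hiveContext.sql("select count(*) from o2o_app").collect().foreach(println)

    sc.stop()
  }
}

It could then be submitted with something like ./bin/spark-submit --master yarn-cluster --class O2oAppCount --jars /home/dp/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar o2o-app-count.jar (the jar name is hypothetical).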
Start Spark-shell
    1. Standalone mode

./bin/spark-shell --master spark://master:7077 --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar
    2. Yarn-client mode

./bin/spark-shell --master yarn-client --jars /home/dp/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

scala> sqlContext.sql("from o2o_app select count(appkey, name1, name2)").collect().foreach(println)
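In Spark builds that include Hive support, the sqlContext pre-defined by spark-shell is already a HiveContext, so the one-liner above talks to the Hive metastore directly. For clarity, here is a minimal equivalent sketch that constructs the HiveContext explicitly and keeps the result as a DataFrame (the val names are illustrative):

import org.apache.spark.sql.hive.HiveContext

// sc is the SparkContext that spark-shell provides.
val hiveContext = new HiveContext(sc)

// Same HiveQL as above; sql() returns a DataFrame.
val counts = hiveContext.sql("from o2o_app select count(appkey, name1, name2)")

counts.show()  // prints the single-row result as a small table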

Please respect the original work and do not reprint without permission: http://blog.csdn.net/stark_summer/article/details/48443147

