Prerequisites:
1. Build Spark 1.0 with Hive support at compile time: ./make-distribution.sh --hadoop 2.3.0-cdh5.0.0 --with-yarn --with-hive --tgz
2. Install the resulting Spark 1.0 distribution;
3. Install the CDH version of Hive that matches the Hadoop version;
Spark SQL with Hive support, example:
1. Copy the hive-site.xml configuration file to $SPARK_HOME/conf
The hive-site.xml file contents are as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop000:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>
</configuration>
2. Start the Spark shell: spark-shell
The example comes from Spark's official documentation: http://spark.apache.org/docs/latest/sql-programming-guide.html
// Create a HiveContext
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
// Implicit conversions
import hiveContext._
// Create a Hive table
hql("CREATE TABLE IF NOT EXISTS hive.kv_src (key INT, value STRING)")
// Load data into the Hive table
hql("LOAD DATA LOCAL INPATH '/home/spark/app/spark-1.0.0-bin-2.3.0-cdh5.0.0/examples/src/main/resources/kv1.txt' INTO TABLE hive.kv_src")
// Query with HQL
hql("FROM hive.kv_src SELECT key, value").collect().foreach(println)
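The last line collects the query result back to the driver as a local array and prints one row per line. As a rough, cluster-free sketch of what that projection and collect step do, here is plain Scala over a hypothetical in-memory stand-in for kv_src (the sample rows below are invented for illustration, not taken from kv1.txt):

```scala
// Hypothetical (key, value) rows standing in for the hive.kv_src table.
val kvSrc = Seq((238, "val_238"), (86, "val_86"), (311, "val_311"))

// "FROM hive.kv_src SELECT key, value" projects both columns of every row;
// collect() would return them to the driver as a local collection, which
// foreach(println) then prints one row per line.
val rows = kvSrc.map { case (key, value) => (key, value) }
rows.foreach(println)
```

On a real cluster the rows live in the Hive warehouse and are only materialized on the driver when collect() is called, so this pattern should be reserved for small result sets.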