This article continues from the previous article in this series.
One: Program
1. The program
package com.scala.it

import java.util.Properties

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.{SparkConf, SparkContext}

object HiveToMysql {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("hive-to-mysql")
    val sc = SparkContext.getOrCreate(conf)
    val sqlContext = new HiveContext(sc)
    val (url, username, password) = ("jdbc:mysql://linux-hadoop01.ibeifeng.com:3306/hadoop09", "root", "123456")
    val props = new Properties()
    props.put("user", username)
    props.put("password", password)

    // ==================================
    // Step 1: synchronize Hive's dept table to MySQL
    sqlContext
      .read
      .table("hadoop09.dept") // database.tablename
      .write
      .mode(SaveMode.Overwrite) // overwrite the table if it already exists
      .jdbc(url, "mysql_dept", props)

    // Step 2: join the Hive table with the MySQL table ==> implemented with an HQL statement
    // 2.1 Register the MySQL data as a temporary table
    sqlContext
      .read
      .jdbc(url, "mysql_dept", props)
      .registerTempTable("temp_mysql_dept") // a temp table name may not contain "."

    // Step 3: join the data
    sqlContext.sql(
      """
        |SELECT a.*, b.dname, b.loc
        |FROM hadoop09.emp a JOIN temp_mysql_dept b ON a.deptno = b.deptno
      """.stripMargin)
      .write
      .format("org.apache.spark.sql.execution.datasources.parquet")
      .mode(SaveMode.Overwrite)
      .save("/spark/join/parquet")

    // Check whether the data was joined successfully
    sqlContext
      .read
      .format("parquet")
      .load("/spark/join/parquet")
      .show()
  }
}
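For completeness, the program needs Spark SQL with Hive support plus the MySQL JDBC driver on the classpath. Below is a minimal sbt sketch; the version numbers are assumptions chosen to match the Spark 1.x-era API (HiveContext, registerTempTable) used above, not taken from the original article:

// build.sbt (hypothetical; versions are assumptions for the Spark 1.x API used above)
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.0",
  "org.apache.spark" %% "spark-sql"  % "1.6.0",
  "org.apache.spark" %% "spark-hive" % "1.6.0", // provides HiveContext
  "mysql"            %  "mysql-connector-java" % "5.1.38" // JDBC driver for the MySQL reads/writes
)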
2. Result
Running the program writes the joined data to /spark/join/parquet and prints it with show().
Two: Knowledge points
1. format
The format(...) call accepts not only a short data source name such as "parquet" but also the fully qualified package name of the data source, as the program above does with "org.apache.spark.sql.execution.datasources.parquet". See the sketch below.
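A minimal sketch of the equivalence, assuming the sqlContext from the program above and the path it wrote; both calls read the same Parquet data:

// Short name and fully qualified package name resolve to the same built-in Parquet source.
sqlContext.read.format("parquet").load("/spark/join/parquet")
sqlContext.read.format("org.apache.spark.sql.execution.datasources.parquet").load("/spark/join/parquet")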
2. Joining the Hive and MySQL data sources
A join across the two sources works by registering the MySQL table as a temporary table alongside the Hive table, then joining them in a single HQL statement; the pattern is distilled in the sketch below.
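The sketch below restates the cross-source pattern from the program above (sqlContext, url, and props as defined there): read the MySQL table over JDBC, register it as a temporary table, and join it with the Hive table in one statement.

// Make the MySQL table visible to SQL under a temporary name.
sqlContext.read.jdbc(url, "mysql_dept", props).registerTempTable("temp_mysql_dept")

// One statement now joins the Hive table hadoop09.emp with the MySQL-backed temp table.
sqlContext.sql(
  """
    |SELECT a.*, b.dname, b.loc
    |FROM hadoop09.emp a JOIN temp_mysql_dept b ON a.deptno = b.deptno
  """.stripMargin).show()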