The MySQL data source connection is clearly configured in the spark-defaults.conf file.
Then start spark-shell and execute the following test code:
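For reference, the relevant entries would look roughly like this (a sketch: the property names are standard Spark settings, and the jar path is the one used by the spark-shell command later in this post):

# Hypothetical spark-defaults.conf entries that put the MySQL JDBC driver
# on both the driver and executor classpaths
spark.driver.extraClassPath    /usr/local/spark-1.4.0-bin-2.5.0-cdh5.2.1/lib/mysql-connector-java-5.1.30-bin.jar
spark.executor.extraClassPath  /usr/local/spark-1.4.0-bin-2.5.0-cdh5.2.1/lib/mysql-connector-java-5.1.30-bin.jar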
import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql.{SaveMode, DataFrame}
import org.apache.spark.sql.hive.HiveContext

val mysqlUrl = "jdbc:mysql://localhost:3306/yangsy?user=root&password=yangsiyi"

// Register the MySQL table as a temporary table via the JDBC data source
val people_ddl = s"""
  |CREATE TEMPORARY TABLE people
  |USING org.apache.spark.sql.jdbc
  |OPTIONS (
  |  url     '${mysqlUrl}',
  |  dbtable 'person'
  |)""".stripMargin
sqlContext.sql(people_ddl)

val person = sqlContext.sql("SELECT * FROM people").cache()
val name = "name"
// Quote the value so it is compared as a string literal, not a column name
val targets = person.filter(s"name = '$name'").collect()
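As a side note, the same table can also be loaded without DDL through the DataFrameReader API introduced in Spark 1.4 (a sketch, assuming the same url and table as above):

// Equivalent load via the DataFrameReader API (sketch)
val people: DataFrame = sqlContext.read
  .format("jdbc")
  .option("url", mysqlUrl)
  .option("dbtable", "person")
  .load()
people.registerTempTable("people")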
When collect() is executed, it throws an error saying the JDBC driver could not be found.
This is a very strange problem. The data source connection settings are correct; after all, Hive's metastore uses the same MySQL connection. In the end, it turned out the driver jar has to be passed when starting spark-shell:
./spark-shell --jars /usr/local/spark-1.4.0-bin-2.5.0-cdh5.2.1/lib/mysql-connector-java-5.1.30-bin.jar
Then it executes fine. Weird..
Alternatively, adding the MySQL driver jar before calling collect() also works:
sqlContext.sql("ADD JAR /usr/local/spark-1.4.0-bin-2.5.0-cdh5.2.1/lib/mysql-connector-java-5.1.30-bin.jar")
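Another workaround that is sometimes suggested (untested here) is to name the driver class explicitly in the OPTIONS clause, so that Spark's JDBC data source loads it itself instead of relying on DriverManager to discover it:

// Sketch: the same DDL as above, plus an explicit 'driver' option
sqlContext.sql(s"""
  |CREATE TEMPORARY TABLE people
  |USING org.apache.spark.sql.jdbc
  |OPTIONS (
  |  url     '${mysqlUrl}',
  |  dbtable 'person',
  |  driver  'com.mysql.jdbc.Driver'
  |)""".stripMargin)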
But all of this still feels like a workaround. If anyone knows the proper solution, please advise ~