Spark connects MySQL and MongoDB


When working with Spark, you often need to connect to different kinds of databases to read or write data. This post shows how to connect Spark to MySQL and MongoDB.

1. Connecting MySQL. Spark 1.3 introduced the DataFrame API, so the code below reads the MySQL query result as a DataFrame; if you need a JavaRDD instead, convert it with JavaRDD<Row> rows = jdbcDF.toJavaRDD().

import java.io.Serializable;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

public class Main implements Serializable {

    private static final org.apache.log4j.Logger logger = org.apache.log4j.Logger.getLogger(Main.class);

    private static final String MYSQL_DRIVER = "com.mysql.jdbc.Driver";
    private static final String MYSQL_USERNAME = "expertuser";
    private static final String MYSQL_PWD = "expertuser123";
    private static final String MYSQL_CONNECTION_URL =
            "jdbc:mysql://localhost:3306/employees?user=" + MYSQL_USERNAME + "&password=" + MYSQL_PWD;

    private static final JavaSparkContext sc =
            new JavaSparkContext(new SparkConf().setAppName("SparkJdbcDs").setMaster("local[*]"));
    private static final SQLContext sqlContext = new SQLContext(sc);

    public static void main(String[] args) {
        // Data source options
        Map<String, String> options = new HashMap<>();
        options.put("driver", MYSQL_DRIVER);
        options.put("url", MYSQL_CONNECTION_URL);
        // dbtable can be a table name or a subquery wrapped in parentheses with an alias.
        options.put("dbtable",
                "(select emp_no, concat_ws(' ', first_name, last_name) as full_name from employees) as employees_name");
        // partitionColumn is the table column used to split the result set into partitions.
        options.put("partitionColumn", "emp_no");
        // lowerBound, upperBound and numPartitions control how the partition column is split.
        // For example, lowerBound 1, upperBound 20 and numPartitions 2 produce the ranges (1, 10) and (11, 20).
        options.put("lowerBound", "10001");
        options.put("upperBound", "499999");
        options.put("numPartitions", "10");

        // Load the MySQL query result as a DataFrame
        DataFrame jdbcDF = sqlContext.load("jdbc", options);
        JavaRDD<Row> rows = jdbcDF.toJavaRDD();

        List<Row> employeeFullNameRows = jdbcDF.collectAsList();
        for (Row employeeFullNameRow : employeeFullNameRows) {
            logger.info(employeeFullNameRow);
        }
    }
}
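If you prefer to work with the converted JavaRDD<Row> rather than the DataFrame, the usual RDD operators apply. The fragment below is only a sketch that would sit inside main() after the conversion above; it assumes the rows variable and the column order (emp_no, full_name) produced by the query, and needs import org.apache.spark.api.java.function.Function;.

        // A minimal sketch: pull the full_name column (index 1 in the query above) out of each Row.
        JavaRDD<String> fullNames = rows.map(new Function<Row, String>() {
            @Override
            public String call(Row row) {
                return row.getString(1); // columns are emp_no, full_name
            }
        });
        logger.info("Distinct full names: " + fullNames.distinct().count());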

2. Connecting MongoDB

Refer to https://github.com/mongodb/mongo-hadoop/wiki/Spark-Usage
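For reference, here is a minimal Java sketch in the style of that wiki page, using the mongo-hadoop connector. It assumes the connector jar is on the classpath and a local mongod; the database and collection names (mydb.input, mydb.output) are just placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.bson.BSONObject;

import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.MongoOutputFormat;

public class MongoExample {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setAppName("SparkMongoDs").setMaster("local[*]"));

        // Read: each document in the input collection becomes an (_id, BSONObject) pair.
        Configuration inputConfig = new Configuration();
        inputConfig.set("mongo.input.uri", "mongodb://localhost:27017/mydb.input");
        JavaPairRDD<Object, BSONObject> documents = sc.newAPIHadoopRDD(
                inputConfig, MongoInputFormat.class, Object.class, BSONObject.class);

        // Write: save the pairs to another collection.
        Configuration outputConfig = new Configuration();
        outputConfig.set("mongo.output.uri", "mongodb://localhost:27017/mydb.output");
        documents.saveAsNewAPIHadoopFile(
                "file:///this-is-not-used",            // the path is ignored by MongoOutputFormat
                Object.class, BSONObject.class,
                MongoOutputFormat.class, outputConfig);
    }
}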
