Java and Spark 2.x methods for connecting to MongoDB 3.x standalone or cluster (with and without authentication)


First, to be clear: there is no difference between connecting to a mongos and connecting to a standalone mongod, so the methods below work for both.

Also, the code is Scala written against the Java API; everyone should be able to read it anyway... probably?

1. Connection method without authentication (Java/Scala):

First, create the connection: declare a client, then specify the database and collection to access:

    import com.mongodb.MongoClient

    // client for a standalone mongod (or a mongos; the API is the same)
    private lazy val mongo = new MongoClient("192.168.2.51", 27017)
    private lazy val db = mongo.getDatabase("test")
    private lazy val dbcoll = db.getCollection("origin2")

Then we read the data:

    import com.mongodb.client.model.Filters.{eq => eqq}

    val docs = dbcoll.find(eqq("basicLabel.procedure", "second")).iterator()

Note: the code above reads data with a filter. First import com.mongodb.client.model.Filters.eq and rename eq to eqq (to avoid clashing with Scala's built-in eq), then fetch the matching data with the dbcoll.find(bson) method. The rest is ordinary iterator usage; what docs gets is an Iterator[Document].
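For completeness, here is a minimal sketch of walking that cursor; the "name" field printed below is purely illustrative, not a field from this post:

    import org.bson.Document

    // the driver's iterator() yields a java.util.Iterator[Document]
    while (docs.hasNext) {
      val doc: Document = docs.next()
      // "name" is a hypothetical field, shown only to illustrate the accessors
      println(doc.getObjectId("_id") + " -> " + doc.getString("name"))
    }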

Then we update the data:

    import com.mongodb.client.model.Updates.set

    dbcoll.updateOne(eqq("_id", x.get("_id")), set("segData", fenduan(str, name)))

The code above finds the document whose _id matches and sets one of its fields to a new value; the value can be a Document, String, Int, List, or any of a range of data structures. In my case the fenduan method returns a Document, which nests one more layer.
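To make the pattern concrete, here is a hedged sketch of the whole update loop; fenduan below is a stand-in for the author's segmentation helper (its real body is not shown in the post) and is assumed to return an org.bson.Document:

    import com.mongodb.client.model.Filters.{eq => eqq}
    import com.mongodb.client.model.Updates.set
    import org.bson.Document

    // hypothetical stand-in for the author's fenduan helper: builds the
    // nested document that is stored under "segData"
    def fenduan(str: String, name: String): Document =
      new Document("source", str).append("name", name)

    // walk the matching documents and rewrite their "segData" field
    val cursor = dbcoll.find(eqq("basicLabel.procedure", "second")).iterator()
    while (cursor.hasNext) {
      val x = cursor.next()
      dbcoll.updateOne(eqq("_id", x.get("_id")), set("segData", fenduan("someStr", "someName")))
    }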

Inserting data is even easier:

    dbcoll.insertOne(doc)
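And a small sketch of building the document being inserted (the field names are illustrative only):

    import org.bson.Document

    // Document is a mutable map-like structure; append returns this, so calls chain
    val doc = new Document("basicLabel", new Document("procedure", "second"))
      .append("count", 42)
    dbcoll.insertOne(doc)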

2. Spark read method without authentication (Scala, of course)

There are two ways. The first is to create a SparkSession (if you are still on a bare SparkContext, use the second method instead; wake up, it's 2017) and specify "spark.mongodb.input.uri" directly, then read the data with the usual MongoSpark helper. (The pipeline is where the filter goes; those who want to can try other aggregation operators there besides the filter.) An RDD is used here because RDDs suit fine-grained transformations such as map and flatMap; if you only need to read the data, you can call MongoSpark.read(spark) to get a DataFrameReader directly (sketched after the code below).

    import com.mongodb.client.model.Aggregates.`match`
    import com.mongodb.client.model.Filters.{eq => eqq}
    import com.mongodb.spark.MongoSpark
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .master("spark://192.168.2.51:7077")
      .config(new SparkConf().setJars(Array(
        "hdfs://192.168.2.51:9000/mongolib/mongo-spark-connector_2.11-2.0.0.jar",
        "hdfs://192.168.2.51:9000/mongolib/bson-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongo-java-driver-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongodb-driver-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongodb-driver-core-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/commons-io-2.5.jar",
        "hdfs://192.168.2.51:9000/segwithorigin2.jar")))
      .config("spark.cores.max", 80)
      .config("spark.executor.cores", 16)
      .config("spark.executor.memory", "32g")
      .config("spark.mongodb.input.uri", "mongodb://192.168.2.51:27017/test.origin2")
      // .config("spark.mongodb.output.uri", "mongodb://192.168.12.161:27017/test.origin2")
      .getOrCreate()

    val rdd = MongoSpark.builder()
      .sparkSession(spark)
      .pipeline(Seq(`match`(eqq("basicLabel.procedure", "second"))))
      .build()
      .toRDD()
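If a DataFrame is all you need, the MongoSpark.read(spark) route mentioned above would look roughly like this (a sketch; it assumes spark.mongodb.input.uri is already set, as in the config above):

    // MongoSpark.read returns a DataFrameReader preconfigured for the connector;
    // load() then yields a DataFrame with the collection's inferred schema
    val df = MongoSpark.read(spark).load()
    df.printSchema()
    df.show(5)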

The second approach is also simple: create a ReadConfig, a configuration class the connector provides that can take many parameters ("spark.mongodb.input.uri" does not need to be specified in this case). The snippet below shows both ways of reading data, via SparkSession and via SparkContext (a note on their return types follows it):

    import com.mongodb.spark.config.ReadConfig

    val readConfig = ReadConfig(Map(
      "uri" -> "mongodb://192.168.2.48:27017/",
      "database" -> "test",
      "collection" -> "test"))
    val rdd = MongoSpark.load(spark, readConfig).rdd
    // val r2 = MongoSpark.load(spark.sparkContext, readConfig)
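Note the difference between the two paths (a sketch based on the 2.0 connector's types): loading through the SparkSession gives a DataFrame, while loading through the SparkContext gives a MongoRDD[Document]:

    import com.mongodb.spark.rdd.MongoRDD
    import org.apache.spark.sql.DataFrame
    import org.bson.Document

    val df: DataFrame = MongoSpark.load(spark, readConfig)
    val docsRdd: MongoRDD[Document] = MongoSpark.load(spark.sparkContext, readConfig)
    println(docsRdd.count())   // MongoRDD is a full RDD[Document]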

3. Java read method with authentication:

With authentication you first need to create a MongoClientURI, spelling out the user name, password, and authentication database in the URI. This approach is the more general one: it is what Spark uses as well, whereas approaches that require the database you access to be the same as the database you authenticate against have no generality. With the URI you can authenticate against admin and then read the data in test, which is just what we want. (As for why the authentication database has to be specified, a separate blog post is recommended.)
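For contrast, the credential-object alternative this paragraph alludes to might look like the sketch below (Java driver 3.x API; host and credentials reuse values from elsewhere in this post, so treat them as placeholders):

    import com.mongodb.{MongoClient, MongoCredential, ServerAddress}
    import java.util.Collections

    // authenticate user "gaoze" explicitly against the "admin" database
    val credential = MongoCredential.createScramSha1Credential(
      "gaoze", "admin", "gaolaoban".toCharArray)
    val client = new MongoClient(
      new ServerAddress("192.168.2.48", 27017),
      Collections.singletonList(credential))
    // having authenticated against admin, we can still read from test
    val testDb = client.getDatabase("test")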
    import com.mongodb.{MongoClient, MongoClientURI}

    val mongoURI = new MongoClientURI("mongodb://gaoze:gaolaoban@192.168.2.48:27017/?authSource=admin")
    // val mongoURI = new MongoClientURI("mongodb://192.168.2.48:27017/")   // unauthenticated variant
    private lazy val mongo = new MongoClient(mongoURI)
    private lazy val db = mongo.getDatabase("test")
    private lazy val dbcoll = db.getCollection("test")
After that, everything is the same as in section 1.

4. Spark read method with authentication:

As in section 3, add the user name, password, and authentication database to the URI:

    val spark = SparkSession.builder()
      .master("spark://192.168.2.51:7077")
      .config(new SparkConf().setJars(Array(
        "hdfs://192.168.2.51:9000/mongolib/mongo-spark-connector_2.11-2.0.0.jar",
        "hdfs://192.168.2.51:9000/mongolib/bson-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongo-java-driver-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongodb-driver-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongodb-driver-core-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/commons-io-2.5.jar",
        "hdfs://192.168.2.51:9000/segwithorigin2.jar")))
      .config("spark.cores.max", 80)
      .config("spark.executor.cores", 16)
      .config("spark.executor.memory", "32g")
      // this entry specifies user name gaoze, password gaolaoban, authentication database admin
      .config("spark.mongodb.input.uri", "mongodb://gaoze:gaolaoban@192.168.2.51:27017/test.origin2?authSource=admin")
      .getOrCreate()

    val rdd = MongoSpark.builder()
      .sparkSession(spark)
      .pipeline(Seq(`match`(eqq("basicLabel.procedure", "second"))))
      .build()
      .toRDD()

Or:

This one specifies user name rw, password 1, and authentication database test:
    val readConfig = ReadConfig(Map(
      "uri" -> "mongodb://rw:1@192.168.2.48:27017/?authSource=test",
      "database" -> "test",
      "collection" -> "test"))

    val rdd = MongoSpark.builder().sparkSession(spark).readConfig(readConfig).build().toRDD()
    val r2 = MongoSpark.load(spark.sparkContext, readConfig)

