Java and Spark 2.x methods for connecting to MongoDB 3.x standalone or cluster (with and without authentication)


First, to be clear: there is no difference between accessing a mongos and accessing a standalone mongod, so the methods below work for both.

Also, this is Scala written the Java way; everyone can follow it anyway... probably?

1. Connection method for a cluster without authentication (Java/Scala):

First create the connection: declare a client, then specify the database and collection to access:

    import com.mongodb.MongoClient

    private lazy val mongo = new MongoClient("192.168.2.51", 27017)
    private lazy val db = mongo.getDatabase("test")
    private lazy val dbColl = db.getCollection("origin2")

Then we read the data:

    import com.mongodb.client.model.Filters.{eq => eqq}

    val docs = dbColl.find(eqq("basicLabel.procedure", "second")).iterator()

Er: the code above reads data with a filter. First import com.mongodb.client.model.Filters.eq and rename eq to eqq (eq clashes with Scala's built-in AnyRef.eq), then read the matching data through the dbColl.find(Bson) method. The rest is ordinary iterator usage; docs here is an Iterator[Document].
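For instance, a minimal sketch of walking that cursor (assuming, as in the example above, each document has a basicLabel sub-document with a procedure field):

    import org.bson.Document

    // walk the cursor returned by find(...).iterator() and read a nested field
    while (docs.hasNext) {
      val doc = docs.next()
      val basicLabel = doc.get("basicLabel").asInstanceOf[Document]
      println(doc.get("_id") + " -> " + basicLabel.getString("procedure"))
    }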

Then we update the data:

    import com.mongodb.client.model.Updates.set

    dbColl.updateOne(eqq("_id", x.get("_id")), set("segData", fenduan(str, name)))

The code above finds the document whose _id matches and sets one of its fields to a new value; the value can be a Document, String, Int, List, or a range of other data structures. In my case the fenduan method returns a Document, so the field nests one more level.
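The original post doesn't show fenduan; a hypothetical stand-in that returns a Document might look like this (the field names are invented for illustration):

    import org.bson.Document

    // hypothetical stand-in for the fenduan (segmentation) helper; the real
    // implementation is not shown in the original post
    def fenduan(str: String, name: String): Document =
      new Document("name", name).append("content", str)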

Inserting data is even easier:

    dbColl.insertOne(doc)
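Here doc is an org.bson.Document built beforehand; building one with nested fields looks roughly like this (a sketch; the field names are illustrative):

    import org.bson.Document

    // build a document with a nested sub-document using the fluent append API
    val doc = new Document("basicLabel", new Document("procedure", "second"))
      .append("segData", new Document())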

2. The Spark read method without authentication (Scala, naturally)

There are two ways. The first is to create a SparkSession (SparkContext users can use the second method; wake up, brother, it's 2017) and specify "spark.mongodb.input.uri" directly, then read the data with the usual MongoSpark. (The pipeline argument is the filter; if you're willing, you can try other aggregation stages besides $match.) I use an RDD because RDDs suit fine-grained transformations such as map and flatMap; if you only need to read the data, you can call MongoSpark.read(spark) to get a DataFrameReader directly.

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession
    import com.mongodb.spark.MongoSpark
    import com.mongodb.client.model.Aggregates.`match`

    val spark = SparkSession.builder()
      .master("spark://192.168.2.51:7077")
      .config(new SparkConf().setJars(Array(
        "hdfs://192.168.2.51:9000/mongolib/mongo-spark-connector_2.11-2.0.0.jar",
        "hdfs://192.168.2.51:9000/mongolib/bson-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongo-java-driver-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongodb-driver-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongodb-driver-core-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/commons-io-2.5.jar",
        "hdfs://192.168.2.51:9000/segwithorigin2.jar")))
      .config("spark.cores.max", 80)
      .config("spark.executor.cores", 16)
      .config("spark.executor.memory", "32g")
      .config("spark.mongodb.input.uri", "mongodb://192.168.2.51:27017/test.origin2")
      // .config("spark.mongodb.output.uri", "mongodb://192.168.12.161:27017/test.origin2")
      .getOrCreate()

    val rdd = MongoSpark.builder()
      .sparkSession(spark)
      .pipeline(Seq(`match`(eqq("basicLabel.procedure", "second"))))
      .build()
      .toRDD()
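If you only need to read the data, the MongoSpark.read(spark) route mentioned above gives a DataFrameReader; roughly like this (a sketch, filtering on the same example field):

    import com.mongodb.spark.MongoSpark

    // DataFrame route: MongoSpark.read returns a DataFrameReader already wired
    // to the connector; load() then yields a DataFrame
    val df = MongoSpark.read(spark).load()
    df.filter("basicLabel.procedure = 'second'").show()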

The second approach is also simple: create a ReadConfig (provided by the connector), which can carry many parameters; in this case you don't need to specify "spark.mongodb.input.uri". Below are two ways to read the data, via SparkSession and via SparkContext:

    import com.mongodb.spark.config.ReadConfig

    val readConfig = ReadConfig(Map(
      "uri" -> "mongodb://192.168.2.48:27017/",
      "database" -> "test",
      "collection" -> "test"))

    val rdd = MongoSpark.load(spark, readConfig).rdd
    // val r2 = MongoSpark.load(spark.sparkContext, readConfig)
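Note the difference between the two variants: MongoSpark.load(spark, readConfig) returns a DataFrame, so .rdd yields an RDD[Row], while MongoSpark.load(spark.sparkContext, readConfig) yields org.bson.Document elements directly. A sketch with the SparkContext variant:

    import org.bson.Document

    // the SparkContext variant gives an RDD of Document, convenient for map/flatMap
    val r2 = MongoSpark.load(spark.sparkContext, readConfig)
    val ids = r2.map((doc: Document) => doc.get("_id").toString).take(5)
    ids.foreach(println)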

3. Java Read method with authentication:

With authentication you first need to create a MongoClientURI, spelling out the user name, password, and authentication database in the URI. This approach is the more versatile one, since Spark uses it as well; approaches that require the data database to equal the authentication database have no generality. With this method you can authenticate against admin and then read the test data, which is very convenient. As for why the authentication database needs to be specified, a separate blog post is recommended reading.
    import com.mongodb.{MongoClient, MongoClientURI}

    val mongoUri = new MongoClientURI("mongodb://gaoze:gaolaoban@192.168.2.48:27017/?authSource=admin")
    // val mongoUri = new MongoClientURI("mongodb://192.168.2.48:27017/")
    private lazy val mongo = new MongoClient(mongoUri)
    private lazy val db = mongo.getDatabase("test")
    private lazy val dbColl = db.getCollection("test")
The rest is the same as in section 1.
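As an aside (not what the original uses), the Java driver can also take the same credentials as a MongoCredential object instead of a URI; a sketch:

    import com.mongodb.{MongoClient, MongoCredential, ServerAddress}
    import java.util.Collections

    // credential-object form, equivalent to the authSource=admin URI above
    val credential = MongoCredential.createScramSha1Credential(
      "gaoze", "admin", "gaolaoban".toCharArray)
    val mongoAuth = new MongoClient(
      new ServerAddress("192.168.2.48", 27017), Collections.singletonList(credential))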

4. Spark Read method with authentication:

As in section 3, the user name, password, and authentication database are added to the URI:

    val spark = SparkSession.builder()
      .master("spark://192.168.2.51:7077")
      .config(new SparkConf().setJars(Array(
        "hdfs://192.168.2.51:9000/mongolib/mongo-spark-connector_2.11-2.0.0.jar",
        "hdfs://192.168.2.51:9000/mongolib/bson-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongo-java-driver-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongodb-driver-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/mongodb-driver-core-3.4.2.jar",
        "hdfs://192.168.2.51:9000/mongolib/commons-io-2.5.jar",
        "hdfs://192.168.2.51:9000/segwithorigin2.jar")))
      .config("spark.cores.max", 80)
      .config("spark.executor.cores", 16)
      .config("spark.executor.memory", "32g")
      // this configuration entry specifies user name gaoze, password gaolaoban,
      // authentication database admin
      .config("spark.mongodb.input.uri", "mongodb://gaoze:gaolaoban@192.168.2.51:27017/test.origin2?authSource=admin")
      .getOrCreate()

    val rdd = MongoSpark.builder()
      .sparkSession(spark)
      .pipeline(Seq(`match`(eqq("basicLabel.procedure", "second"))))
      .build()
      .toRDD()

Or:

This specifies user name rw, password 1, authentication database test:

    val readConfig = ReadConfig(Map(
      "uri" -> "mongodb://rw:1@192.168.2.48:27017/?authSource=test",
      "database" -> "test",
      "collection" -> "test"))

    val rdd = MongoSpark.builder().sparkSession(spark).readConfig(readConfig).build().toRDD()
    val r2 = MongoSpark.load(spark.sparkContext, readConfig)
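For writing back, the earlier config hints at spark.mongodb.output.uri (commented out above); the symmetric path goes through WriteConfig and MongoSpark.save, roughly like this (a sketch; the output collection name "testOut" is invented):

    import com.mongodb.spark.MongoSpark
    import com.mongodb.spark.config.WriteConfig

    // write an RDD[Document] back to MongoDB; "testOut" is a hypothetical target
    val writeConfig = WriteConfig(Map(
      "uri" -> "mongodb://rw:1@192.168.2.48:27017/?authSource=test",
      "database" -> "test",
      "collection" -> "testOut"))
    MongoSpark.save(rdd, writeConfig)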

