Using MongoDB to Implement MapReduce

Source: Internet
Author: User

MapReduce is a software framework that Google introduced in 2004 to support distributed computing over large-scale data.

MongoDB is an open source, document-oriented NoSQL database system written in C++.

1. Install MongoDB

First, follow the official documentation to install the MongoDB database. For this article, we installed and tested it on Mac OS X.

I installed MongoDB with the command sudo port install mongodb. The only problem I ran into was an outdated Xcode version; upgrading Xcode to the latest version fixed it.

2. Run MongoDB

Starting MongoDB is very simple: just run mongod in a terminal window.

By default, MongoDB listens on port 27017 and stores its data in /data/db (we already created this directory in the first step).

To change these defaults, pass command-line arguments:

mongod --port [your_port] --dbpath [your_db_file_path]

Make sure the data directory already exists and contains no other files the first time MongoDB starts.

3. Start MongoDB Interactive Environment

We can start the MongoDB interactive environment to connect to the MongoDB server and run MongoDB commands directly from the command line.

On the same machine, simply run mongo to enter the interactive environment. To connect to a MongoDB server on a different machine, specify the IP address and port of the target server:

mongo [ip_address]:[port]

For example: mongo localhost:4000

4. Create a Database

Next, execute the following command in the interactive environment to create the database:

use library

The above command selects a database named library, which MongoDB will create on first use.

We can then list all the databases in the system with the following command:

show dbs

You'll notice that the database you just selected isn't listed. MongoDB only creates a database when it's needed, so you first need to add some data to it.

5. Inserting data into the database

First we create two books with the following commands:

> book1 = {name: "Understanding JAVA", pages: 100}
> book2 = {name: "Understanding JSON", pages: 200}

Then we save the two books in a collection named books:

> db.books.save(book1)
> db.books.save(book2)

The above commands create a collection called books in the library database (the equivalent of a table in an SQL database). The following command lists the two books we just added:

> db.books.find()

{"_id": ObjectId("4f365b1ed6d9d6de7c7ae4b1"), "name": "Understanding JAVA", "pages": 100}
{"_id": ObjectId("4f365b28d6d9d6de7c7ae4b2"), "name": "Understanding JSON", "pages": 200}

To add more records:

> book = {name: "Understanding XML", pages: 300}
> db.books.save(book)
> book = {name: "Understanding Web Services", pages: 400}
> db.books.save(book)
> book = {name: "Understanding Axis2", pages: 150}
> db.books.save(book)

6. Write the Map function

Next we write a map function that sorts the books into two categories, depending on whether a book has at least 250 pages:

> var map = function () {
    var category;
    if (this.pages >= 250)
      category = 'Big Books';
    else
      category = "Small Books";
    emit(category, {name: this.name});
  };

The map function emits the following groups:

{"Big Books", [{name: "Understanding XML"}, {name: "Understanding Web Services"}]}
{"Small Books", [{name: "Understanding JAVA"}, {name: "Understanding JSON"}, {name: "Understanding Axis2"}]}
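The grouping performed by the map phase can be reproduced without a server. The following is a minimal sketch in plain JavaScript, not MongoDB's actual implementation: runMapPhase is a hypothetical helper, and the map function here receives emit as a parameter, whereas inside MongoDB emit is provided as a global.

```javascript
// The five book documents inserted earlier (without their _id fields).
const books = [
  { name: "Understanding JAVA", pages: 100 },
  { name: "Understanding JSON", pages: 200 },
  { name: "Understanding XML", pages: 300 },
  { name: "Understanding Web Services", pages: 400 },
  { name: "Understanding Axis2", pages: 150 },
];

// Hypothetical helper: calls mapFn once per document and collects
// every emitted value under its key.
function runMapPhase(docs, mapFn) {
  const groups = {};
  const emit = (key, value) => {
    (groups[key] = groups[key] || []).push(value);
  };
  for (const doc of docs) {
    mapFn.call(doc, emit); // `this` inside mapFn is the current document
  }
  return groups;
}

// Same logic as the map function above; emit is passed in (an assumption
// made for this sketch) instead of being a global.
const map = function (emit) {
  const category = this.pages >= 250 ? "Big Books" : "Small Books";
  emit(category, { name: this.name });
};

const groups = runMapPhase(books, map);
console.log(JSON.stringify(groups, null, 2));
```

Each key ends up with the list of values emitted for it, which is the input that the reduce function receives in the next step.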

7. Write the Reduce function

> var reduce = function (key, values) {
    var sum = 0;
    values.forEach(function (doc) {
      sum += 1;
    });
    return {books: sum};
  };
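One caveat worth knowing: MongoDB may call reduce more than once for the same key, feeding the result of an earlier call back in as one of the values. A reduce that counts its inputs can then undercount on larger collections. The sketch below, plain JavaScript with no server involved, contrasts the counting style with a summing variant. The names countByLength and countBySum are hypothetical, and the summing variant assumes that map emits {books: 1} rather than {name: ...}.

```javascript
// Counting style: returns how many values were passed in.
const countByLength = function (key, values) {
  let sum = 0;
  values.forEach(function () {
    sum += 1;
  });
  return { books: sum };
};

// Summing variant (assumes map emits {books: 1}): adds up partial counts,
// so its output can safely be fed back in as an input.
const countBySum = function (key, values) {
  let sum = 0;
  values.forEach(function (doc) {
    sum += doc.books;
  });
  return { books: sum };
};

// Three emitted values for one key, reduced in two steps to mimic a re-reduce.
const emitted = [{ books: 1 }, { books: 1 }, { books: 1 }];

const partialA = countByLength("Small Books", emitted.slice(0, 2));
const rereducedA = countByLength("Small Books", [partialA, emitted[2]]);

const partialB = countBySum("Small Books", emitted.slice(0, 2));
const rereducedB = countBySum("Small Books", [partialB, emitted[2]]);

console.log(rereducedA.books); // 2: counts its two inputs, not the three books
console.log(rereducedB.books); // 3: partial counts are summed correctly
```

With only five documents the counting style still gives the right totals, so this matters mainly when the same pattern is applied to larger data.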

8. Run MapReduce on the books collection

> var count = db.books.mapReduce(map, reduce, {out: "book_results"});
> db[count.result].find()

{"_id": "Big Books", "value": {"books": 2}}
{"_id": "Small Books", "value": {"books": 3}}

The above results show that we have two big books and three small books.
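To see how the two functions combine into this result, the whole run can be imitated in memory. This is only an illustrative sketch under stated assumptions: mapReduceInMemory is a hypothetical helper rather than a MongoDB API, emit is passed to map as a parameter, and results are sorted by key for readability.

```javascript
const books = [
  { name: "Understanding JAVA", pages: 100 },
  { name: "Understanding JSON", pages: 200 },
  { name: "Understanding XML", pages: 300 },
  { name: "Understanding Web Services", pages: 400 },
  { name: "Understanding Axis2", pages: 150 },
];

// Hypothetical helper: groups emitted values by key, then reduces each group.
function mapReduceInMemory(docs, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => mapFn.call(doc, emit));

  return [...groups.keys()].sort().map((key) => {
    const values = groups.get(key);
    // MongoDB does not call reduce for a key with a single value; mirror that.
    const value = values.length === 1 ? values[0] : reduceFn(key, values);
    return { _id: key, value: value };
  });
}

const map = function (emit) {
  const category = this.pages >= 250 ? "Big Books" : "Small Books";
  emit(category, { name: this.name });
};

const reduce = function (key, values) {
  let sum = 0;
  values.forEach(function () {
    sum += 1;
  });
  return { books: sum };
};

const results = mapReduceInMemory(books, map, reduce);
results.forEach((r) => console.log(JSON.stringify(r)));
// {"_id":"Big Books","value":{"books":2}}
// {"_id":"Small Books","value":{"books":3}}
```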

Anything you can do in the MongoDB interactive environment you can also do in Java, but you first need to download the necessary JAR packages.

The following is the Java source code:

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MapReduceCommand;
import com.mongodb.MapReduceOutput;
import com.mongodb.Mongo;

public class MongoClient {

    /**
     * @param args
     */
    public static void main(String[] args) {

        Mongo mongo;

        try {
            // Connect to the local server and select the library database.
            mongo = new Mongo("localhost", 27017);
            DB db = mongo.getDB("library");

            DBCollection books = db.getCollection("books");

            // Insert the same five books used in the shell example.
            BasicDBObject book = new BasicDBObject();
            book.put("name", "Understanding JAVA");
            book.put("pages", 100);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding JSON");
            book.put("pages", 200);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding XML");
            book.put("pages", 300);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding Web Services");
            book.put("pages", 400);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding Axis2");
            book.put("pages", 150);
            books.insert(book);

            // The map function is passed to the server as JavaScript source.
            // Trailing spaces keep tokens such as "else" from fusing with
            // the next fragment when the strings are concatenated.
            String map = "function () {" +
                    "var category; " +
                    "if (this.pages >= 250) " +
                    "category = 'Big Books'; " +
                    "else " +
                    "category = 'Small Books'; " +
                    "emit(category, {name: this.name});}";