MongoDB Map Reduce

Source: Internet
Author: User
Tags: emit

Map-reduce is a computational model: a large job (data) is decomposed into smaller pieces (map), and the partial results are then combined into a final result (reduce).

MongoDB's map-reduce implementation is very flexible and is useful for large-scale data analysis.

MapReduce command

The basic syntax of the mapReduce command is:

>db.collection.mapReduce(
   function() { emit(key, value); },                  // map function
   function(key, values) { return reduceFunction; },  // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)

Using mapReduce requires implementing two functions: a map function and a reduce function. The map function is run on every document in the collection that matches the query and calls emit(key, value); the emitted values are grouped by key and passed to the reduce function for processing.

The map function calls emit(key, value) to output a key-value pair.
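The flow described above can be sketched in plain JavaScript, without a MongoDB server. This is only an illustration under stated assumptions: simulateMapReduce, the sample documents, and their category and amount fields are hypothetical, and emit is passed to the map function as an argument here, whereas MongoDB makes it available to the map function directly.

```javascript
// Plain-JavaScript sketch of the map -> group -> reduce flow (no MongoDB needed).
// simulateMapReduce and the sample documents are hypothetical, for illustration only.
function simulateMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map(); // key -> array of emitted values

  // Stand-in for the emit(key, value) that MongoDB provides to the map function.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  // Map phase: run the map function once per document, with `this` bound to it.
  for (const doc of docs) {
    mapFn.call(doc, emit);
  }

  // Reduce phase: turn each key's array of values into a single value.
  const results = [];
  for (const [key, values] of groups) {
    results.push({ _id: key, value: reduceFn(key, values) });
  }
  return results;
}

// Hypothetical sample data.
const docs = [
  { category: "a", amount: 5 },
  { category: "b", amount: 7 },
  { category: "a", amount: 3 },
];

const totals = simulateMapReduce(
  docs,
  function (emit) { emit(this.category, this.amount); },
  function (key, values) { return values.reduce((sum, v) => sum + v, 0); }
);

console.log(totals); // [ { _id: 'a', value: 8 }, { _id: 'b', value: 7 } ]
```

The two values emitted under key "a" are collapsed into a single sum, while key "b" keeps its one value.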

Parameter description:

    • map: The map function (generates a sequence of key-value pairs that serve as input to the reduce function).
    • reduce: The aggregation function. Its task is to turn each key-values pair into a key-value pair, that is, to reduce the array of values for a key to a single value.
    • out: The collection that stores the results (if not specified, a temporary collection is used, which is automatically deleted after the client disconnects).
    • query: A filter condition; only documents that match it are passed to the map function. (query, limit, and sort can be combined freely.)
    • sort: Sorts the documents before they are sent to the map function; combined with limit, it can optimize the grouping mechanism.
    • limit: The maximum number of documents sent to the map function (without a limit, using sort alone is not very useful).
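
How query, sort, and limit narrow the input before the map phase can be sketched as follows. This is a simplified illustration, not MongoDB's implementation: selectInput is a hypothetical helper that supports only equality filters and single-direction sorts on top-level fields, and the views field in the sample posts is an assumption made for the example.

```javascript
// Sketch of how query, sort, and limit select the documents sent to map.
// selectInput is a hypothetical helper; it supports only equality filters
// and ascending (1) / descending (-1) sorts on top-level fields.
function selectInput(docs, { query = {}, sort = {}, limit = Infinity } = {}) {
  // query: keep only documents whose fields equal the requested values.
  const matches = (doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value);

  // sort: compare field by field, honoring the direction of each one.
  const sortKeys = Object.entries(sort);
  const compare = (a, b) => {
    for (const [field, dir] of sortKeys) {
      if (a[field] < b[field]) return -dir;
      if (a[field] > b[field]) return dir;
    }
    return 0;
  };

  // filter() returns a new array, so the caller's array is not reordered.
  return docs.filter(matches).sort(compare).slice(0, limit);
}

// Hypothetical sample data.
const posts = [
  { user_name: "Mark", status: "Active", views: 10 },
  { user_name: "Tom", status: "Inactive", views: 50 },
  { user_name: "Tom", status: "Active", views: 30 },
  { user_name: "Mark", status: "Active", views: 20 },
];

const input = selectInput(posts, {
  query: { status: "Active" },
  sort: { views: -1 },
  limit: 2,
});

console.log(input.map((p) => p.views)); // [ 30, 20 ]
```

Only the two most-viewed active posts would reach the map function in this sketch, which is why sort is most useful together with limit.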

Using MapReduce

Consider the following document structure, which stores a user's article along with the user's user_name and the article's status field:

{
   "post_text": "w3cschool.cn w3cschool tutorials, the most comprehensive technical documentation.",
   "user_name": "Mark",
   "status": "Active"
}

Now we'll use the mapReduce function on the posts collection to select the published articles (status: "Active"), group them by user_name, and count the number of articles per user:

>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values); },
   {
      query: { status: "Active" },
      out: "Post_total"
   }
)

The mapReduce call above outputs:

{
   "result": "Post_total",
   "timeMillis": 9,
   "counts": {
      "input": 4,
      "emit": 4,
      "reduce": 2,
      "output": 2
   },
   "ok": 1
}

The results show that 4 documents match the query criteria (status: "Active"), that the map function generated 4 key-value pairs, and that the reduce function then combined the pairs with the same key into two groups.
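
To see where these numbers come from, the counting can be traced in plain JavaScript. This is a minimal sketch under stated assumptions: countPostsPerUser and the sample posts are hypothetical, and the reduce counter is simplified to one reduce call per key, which matches this example but is not a description of how MongoDB schedules reduce calls in general.

```javascript
// Sketch that traces the input, emit, reduce, and output counts (no MongoDB needed).
// countPostsPerUser and the sample posts are hypothetical.
function countPostsPerUser(posts) {
  const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
  const groups = new Map(); // user_name -> array of emitted 1s

  for (const post of posts) {
    if (post.status !== "Active") continue; // query: { status: "Active" }
    counts.input += 1;

    // map: emit(this.user_name, 1)
    if (!groups.has(post.user_name)) groups.set(post.user_name, []);
    groups.get(post.user_name).push(1);
    counts.emit += 1;
  }

  const results = [];
  for (const [key, values] of groups) {
    // reduce: sum the emitted 1s for this key (simplified to one call per key)
    const value = values.reduce((sum, v) => sum + v, 0);
    counts.reduce += 1;
    results.push({ _id: key, value });
  }
  counts.output = results.length;
  return { counts, results };
}

// Hypothetical sample data: four active posts and one inactive post.
const samplePosts = [
  { user_name: "Tom", status: "Active" },
  { user_name: "Mark", status: "Active" },
  { user_name: "Tom", status: "Active" },
  { user_name: "Mark", status: "Inactive" },
  { user_name: "Mark", status: "Active" },
];

const { counts, results } = countPostsPerUser(samplePosts);
console.log(counts);  // { input: 4, emit: 4, reduce: 2, output: 2 }
console.log(results); // [ { _id: 'Tom', value: 2 }, { _id: 'Mark', value: 2 } ]
```

The inactive post is filtered out before the map step, so it contributes to none of the counts.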

Description of the output fields:

    • result: The name of the collection that stores the results. When no out collection is named, this is a temporary collection that is automatically deleted when the mapReduce connection is closed.
    • timeMillis: The time spent on execution, in milliseconds.
    • input: The number of documents that satisfy the query and are sent to the map function.
    • emit: The number of times emit is called in the map function, that is, the total number of key-value pairs generated.
    • output: The number of documents in the result collection (this count is helpful for debugging).
    • ok: Whether the operation succeeded; 1 means success.
    • err: If the operation fails, the reason may be reported here, but in practice the message tends to be vague and not very useful.

Use the find() method to view the results of the mapReduce query:

>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values); },
   {
      query: { status: "Active" },
      out: "Post_total"
   }
).find()

The above query returns the following results, showing that the two users Tom and Mark each have two published articles:

{ "_id": "Tom", "value": 2 }
{ "_id": "Mark", "value": 2 }

In a similar way, mapReduce can be used to build large, complex aggregation queries.

The map and reduce functions are written in JavaScript, which makes mapReduce very flexible and powerful.
