Powerful aggregation tools in MongoDB

Source: Internet
Author: User

1. count

count returns the number of documents in a collection.


db.refactor.count()

No matter how large the collection is, the number of documents can be returned quickly.

You can also pass a query, and MongoDB will count the documents that match it.

db.refactor.count({"username": "Refactor"})

However, adding a query condition makes count slower.

2. distinct

distinct finds all the different values for a given key. You must specify both the collection and the key when you use it.

For example:

db.runCommand({"distinct": "refactor", "key": "username"})
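To make the result concrete, here is a minimal plain-JavaScript sketch of what distinct computes. It does not talk to MongoDB: the docs array and the distinctValues helper are hypothetical stand-ins for the refactor collection and the command.

```javascript
// Hypothetical in-memory stand-in for the "refactor" collection.
const docs = [
  {username: "refactor", age: 42},
  {username: "alice", age: 35},
  {username: "refactor", age: 51},
];

// Collect each different value of `key`, in order of first appearance.
function distinctValues(documents, key) {
  const seen = new Set();
  for (const doc of documents) {
    if (key in doc) {
      seen.add(doc[key]);
    }
  }
  return Array.from(seen);
}

const usernames = distinctValues(docs, "username");
console.log(usernames);
```

The duplicate "refactor" value appears only once in the output, which is the point of distinct.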

3. group

group first selects the key to group by, and MongoDB divides the collection into groups according to the distinct values of that key. You can then aggregate the documents within each group to produce one result document per group.

For example:

db.runCommand({"group": {"ns": "refactor", "key": {"username": true}, "initial": {"count": 0}, "$reduce": function(doc, prev) {prev.count++}, "condition": {"age": {"$gt": 40}}}})

"ns": "refactor"

Specifies the collection to group.

"key": {"username": true}

Specifies the key to group the documents by, here the username key. All documents with the same username value are placed in the same group; true means the value of username is returned in the result.

"initial": {"count": 0}

The initial value of the accumulator passed to the reduce function for each group. All members of a group share the same accumulator.

"$reduce": function(doc, prev) {...}

Called once for each document. The system passes two parameters: the current document and the accumulator document.

"condition": {"age": {"$gt": 40}}

A filter condition: only documents whose age value is greater than 40 are processed.
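The roles of key, initial, $reduce, and condition can be sketched in plain JavaScript. This is only an in-memory emulation under simplifying assumptions, not how the server implements the command: the group helper and the sample documents are hypothetical, and condition is a plain predicate function rather than a query document.

```javascript
// Hypothetical in-memory stand-in for the "refactor" collection.
const docs = [
  {username: "refactor", age: 45},
  {username: "refactor", age: 50},
  {username: "alice", age: 41},
  {username: "bob", age: 30}, // filtered out by the condition
];

// Simplified emulation of group: `condition` is a plain predicate here,
// not a MongoDB query document.
function group(documents, {key, initial, reduce, condition}) {
  const groups = new Map();
  for (const doc of documents) {
    if (condition && !condition(doc)) continue;
    const groupKey = doc[key];
    if (!groups.has(groupKey)) {
      // Each group starts from its own copy of `initial`.
      groups.set(groupKey, {[key]: groupKey, ...initial});
    }
    reduce(doc, groups.get(groupKey)); // called once per matching document
  }
  return Array.from(groups.values());
}

const result = group(docs, {
  key: "username",
  initial: {count: 0},
  reduce: (doc, prev) => { prev.count++; },
  condition: (doc) => doc.age > 40,
});
console.log(result);
```

Each result document carries the group key plus the accumulator fields, one document per distinct username that passed the condition.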

4. Using a finalizer

A finalizer is used to trim the data that is sent from the database to the user. This matters because the output of the group command must fit in a single database response.

"finalize" takes a function that is called once for each group, just before the result array is passed to the client.

db.runCommand({"group": {"ns": "refactor", "key": {"username": true}, "initial": {"count": 0}, "$reduce": function(doc, prev) {prev.count++}, "finalize": function(doc) {doc.num = doc.count; delete doc.count;}}})

finalize can modify the argument it is passed or return a new value.
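As a rough illustration, the effect of the finalizer on already grouped results can be sketched in plain JavaScript. The grouped array is hypothetical sample data, and the map call only imitates the "once per group" behavior described above.

```javascript
// Hypothetical grouped results, shaped like the output of a group by username.
const grouped = [
  {username: "refactor", count: 2},
  {username: "alice", count: 1},
];

// Same body as the finalize function in the text.
function finalize(doc) {
  doc.num = doc.count;
  delete doc.count;
}

// Run the finalizer once per group; keep its return value if it has one,
// otherwise keep the document it modified in place.
const finalized = grouped.map((doc) => {
  const returned = finalize(doc);
  return returned === undefined ? doc : returned;
});
console.log(finalized);
```

Here the finalizer renames count to num, so the client never sees the intermediate count field.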

5. Using a function as a key

Sometimes grouping is based on conditions that are more complex than a single key. For example, suppose you use group to count the blog posts in each category. Because there are many authors, capitalization may be inconsistent when articles are categorized. So, if you group by category name, "MongoDB" and "mongodb" end up as different groups.

To eliminate the effect of case, define a function that determines the key each document is grouped by.

To define the grouping function, use $keyf:

db.runCommand({"group": {"ns": "refactor", "$keyf": function(doc) {return {"username": doc.username.toLowerCase()}}, "initial": {"count": 0}, "$reduce": function(doc, prev) {prev.count++;}}})
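The effect of a key function can be sketched in plain JavaScript. This is an in-memory emulation only: groupByFunction and the sample documents are hypothetical, and serializing the key document with JSON.stringify is just one simple way to identify a group.

```javascript
// Hypothetical documents whose usernames differ only by case.
const docs = [
  {username: "Refactor"},
  {username: "refactor"},
  {username: "REFACTOR"},
  {username: "alice"},
];

// Same idea as $keyf: derive the grouping key from the document.
const keyf = (doc) => ({username: doc.username.toLowerCase()});

function groupByFunction(documents, keyFunction, initial, reduce) {
  const groups = new Map();
  for (const doc of documents) {
    const keyDoc = keyFunction(doc);
    const id = JSON.stringify(keyDoc); // serialized key identifies the group
    if (!groups.has(id)) {
      groups.set(id, {...keyDoc, ...initial});
    }
    reduce(doc, groups.get(id));
  }
  return Array.from(groups.values());
}

const counts = groupByFunction(docs, keyf, {count: 0}, (doc, prev) => { prev.count++; });
console.log(counts);
```

Without the key function the three spellings would form three groups; with it they collapse into one.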

6. MapReduce

Anything that count, distinct, and group can do, MapReduce can do as well. It is an aggregation method that can easily be parallelized across multiple servers. It splits the problem up and sends the parts to different machines, so that each machine completes one part. When all the machines are finished, the results are merged into the final complete result.

MapReduce requires several steps:

1. Map: apply an operation to each document in the collection. The operation either does nothing or emits keys with values.

2. Shuffle: group by key, and collect the emitted values into a list for each key.

3. Reduce: reduce each list of values to a single value, which is returned.

4. Return to the shuffle step until each key's list has only one value, which is the final result.
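The steps above can be sketched on a single machine in plain JavaScript. This is a simplified illustration, not MongoDB's implementation: mapStep, shuffleStep, reduceStep, and the sample blog posts are all hypothetical, and one reduce pass per key is enough here because nothing is distributed.

```javascript
// Hypothetical blog posts; the task is to count posts per category.
const docs = [
  {title: "a", category: "mongodb"},
  {title: "b", category: "sql"},
  {title: "c", category: "mongodb"},
  {title: "d"}, // no category: the map step emits nothing for it
];

// Map: emit zero or more [key, value] pairs per document.
function mapStep(doc) {
  return doc.category === undefined ? [] : [[doc.category, {count: 1}]];
}

// Shuffle: collect the emitted values into one list per key.
function shuffleStep(pairs) {
  const lists = new Map();
  for (const [key, value] of pairs) {
    if (!lists.has(key)) lists.set(key, []);
    lists.get(key).push(value);
  }
  return lists;
}

// Reduce: collapse a list of values into a single value.
function reduceStep(key, values) {
  let total = 0;
  for (const value of values) total += value.count;
  return {count: total};
}

const emitted = docs.flatMap(mapStep);
const results = {};
for (const [key, values] of shuffleStep(emitted)) {
  // A key with a single value needs no reduce call.
  results[key] = values.length > 1 ? reduceStep(key, values) : values[0];
}
console.log(results);
```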

MapReduce is slower than group, and group is not fast either. It is best not to use MapReduce in real-time application code. Instead, run MapReduce in the background to create a collection of saved results that you can then query in real time.

Find all keys in the collection

MongoDB has no schema, so it does not keep track of which keys each document has. A good way to find all the keys of a collection is to use MapReduce.

In the map phase, we want to get every key in the document. The map function uses emit to return the values to be processed; emit gives MapReduce a key and a value. Here, emit returns the count of a key in a document ({count: 1}). We count each key separately, so emit is called once for each key in the document. this refers to the current document:

map = function() {for (var key in this) {emit(key, {count: 1})}};

This produces a large number of {count: 1} documents, each associated with a key in the collection. An array of one or more of these {count: 1} documents is passed to the reduce function. The reduce function takes two parameters: the key, which is the first value passed to emit, and an array of one or more {count: 1} documents for that key.

reduce = function(key, emits) {var total = 0; for (var i in emits) {total += emits[i].count;} return {count: total};}

reduce must be able to be called repeatedly, on results from either the map phase or a previous reduce phase. The document that reduce returns must therefore be usable as an element of its own second argument. Suppose the key x is mapped to 3 documents {"count": 1, id: 1}, {"count": 1, id: 2}, {"count": 1, id: 3}, where the id key is only there to tell them apart. MongoDB might call reduce like this:

> r1 = reduce("x", [{"count": 1, id: 1}, {"count": 1, id: 2}])
{count: 2}
> r2 = reduce("x", [{"count": 1, id: 3}])
{count: 1}
> reduce("x", [r1, r2])
{count: 3}

reduce should be able to handle any combination of emit documents and earlier reduce results.
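This property can be checked in plain JavaScript without a database. The sketch below uses the reduce function from the text, with the loop and return written out, and compares one pass over all emitted documents against reducing in two stages; the variable names are only illustrative.

```javascript
// The reduce function from the text, with the loop and return written out.
function reduce(key, emits) {
  let total = 0;
  for (const emitted of emits) {
    total += emitted.count;
  }
  return {count: total};
}

const emits = [
  {count: 1, id: 1},
  {count: 1, id: 2},
  {count: 1, id: 3},
];

// One pass over all emitted documents...
const allAtOnce = reduce("x", emits);

// ...must agree with reducing partial results again.
const r1 = reduce("x", emits.slice(0, 2));
const r2 = reduce("x", emits.slice(2));
const inStages = reduce("x", [r1, r2]);

console.log(allAtOnce, inStages);
```

The two results match because reduce returns a document with the same count field it reads from its inputs.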

For example:

mr = db.runCommand({"mapreduce": "refactor", "map": map, "reduce": reduce, "out": {inline: 1}})

Or:

db.refactor.mapReduce(map, reduce, {out: {inline: 1}})

The result document includes:

"timeMillis": 5, // time spent on the operation
"counts": {
    "input": 10, // number of documents sent to the map function
    "emit": 40, // number of times emit was called in the map function
    "reduce": 4, // number of times reduce was called
    "output": 4 // number of documents created in the result collection
},

1. MapReduce groups according to the first argument of the emit call in the map function.

2. A key and its documents are passed to the reduce function only if that key matches multiple documents.

Note that in MongoDB 1.8 and later, the out parameter must be specified.

Otherwise, the following error is reported:

"assertion": "'out' has to be a string or an object",
"assertionCode": 13606,

Other keys in MapReduce

The mapreduce, map, and reduce keys are required. The mapreduce command also accepts these optional keys:

finalize: function

The result of reduce is passed to this function as the last step in the process.

keeptemp: boolean

Whether the temporary result collection is kept when the connection is closed.

out: string

The name of the output collection. Setting this option implies keeptemp: true.

query: document

Filters the documents with the specified criteria before they are sent to the map function.

sort: document

Sorts the documents before they are sent to the map function.

limit: integer

The maximum number of documents sent to the map function.

scope: document

Variables that can be used in the JavaScript code.

verbose: boolean

Whether to write more detailed output to the server log.
