MongoDB Aggregation (Repost)

Source: Internet
Author: User

Aggregation refers to operations that process a batch of records and return computed results. MongoDB provides a rich set of aggregation operations for running calculations over datasets. Running these operations on the mongod instance itself can greatly simplify application code and reduce resource consumption.

The simpler aggregation operations are count, distinct, and group. There is also the more complex aggregation pipeline. Each is described below.

The appuser collection contains the following documents:

    {name: "Human April", age: 20, locate: "Beijing"}
    {name: "Dolphin", age: 22, locate: "Beijing"}
    {name: "Yunsheng", age: …, locate: "Tianjin"}
    {name: "Shark", age: 23, locate: "Tianjin"}
    {name: "Babywang", age: 25, locate: "Sichuan"}

(The ages are garbled in the source. The values shown are the ones reported by the query results below; Yunsheng's age does not appear in any result and is left elided.)

count returns the total number of documents that match the query criteria. To query the number of registered users in the Beijing area:

    db.appuser.count({locate: "Beijing"})

The result is 2.

distinct removes duplicates: it returns the distinct values of the specified field across the matching documents. To query which regions the users come from:

    db.appuser.distinct("locate")

The result is ["Beijing", "Tianjin", "Sichuan"].

count and distinct can be combined. To query the number of regions the users come from, run the distinct command and take the length of the values array it returns:

    db.runCommand({"distinct": "appuser", "key": "locate"}).values.length

The result is 3.
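To make the semantics of count and distinct concrete, here is a minimal plain-JavaScript sketch that emulates them on an in-memory array. The helper names `count` and `distinct` are hypothetical stand-ins written for this illustration, not MongoDB APIs, and the first-seen ordering of the distinct values is a property of this sketch only.

```javascript
// In-memory stand-in for the appuser collection (ages are not needed here).
const appuser = [
  { name: "Human April", locate: "Beijing" },
  { name: "Dolphin", locate: "Beijing" },
  { name: "Yunsheng", locate: "Tianjin" },
  { name: "Shark", locate: "Tianjin" },
  { name: "Babywang", locate: "Sichuan" },
];

// count: number of documents whose fields equal every value in the query.
function count(docs, query) {
  return docs.filter((doc) =>
    Object.keys(query).every((key) => doc[key] === query[key])
  ).length;
}

// distinct: unique values of one field, in first-seen order.
function distinct(docs, field) {
  return [...new Set(docs.map((doc) => doc[field]))];
}

const beijingUsers = count(appuser, { locate: "Beijing" });
const regions = distinct(appuser, "locate");
const regionCount = regions.length; // count combined with distinct

console.log(beijingUsers, regions, regionCount);
```

The three values correspond to the three shell queries above: 2 users in Beijing, three distinct regions, and a region count of 3.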
The group operation groups the queried documents by the given field values. It returns an array of documents, each containing the computed result for one group of documents.

The group command cannot be run on a sharded collection. Note also that the result set of a group operation cannot exceed 16 MB. db.collection.group() was deprecated in MongoDB 3.4 and removed in 4.2; on newer versions, use the $group pipeline stage described later in this article.

To query the oldest user in each region:

    db.appuser.group({
      key: {locate: 1},
      initial: {age: 0},
      reduce: function (cur, result) {
        if (cur.age > result.age) {
          result.age = cur.age;
          result.name = cur.name;
        }
      }
    })

The query returns:

    [{waitedMS: NumberLong(0),
      retval: [{locate: "Beijing", age: 22.0, name: "Dolphin"},
               {locate: "Tianjin", age: 23.0, name: "Shark"},
               {locate: "Sichuan", age: 25.0, name: "Babywang"}],
      count: NumberLong(5), keys: NumberLong(3), ok: 1.0}]

group with $keyf: sometimes the grouping key needs some processing first. To group users by the length of their name and find the oldest user in each group:

    db.appuser.group({
      $keyf: function (doc) { return {namelength: doc.name.length}; },
      initial: {age: 0},
      reduce: function (cur, result) {
        if (cur.age > result.age) {
          result.age = cur.age;
          result.name = cur.name;
        }
      }
    })

The query returns:

    [{waitedMS: NumberLong(0),
      retval: [{namelength: 4.0, age: 20.0, name: "Human April"},
               {namelength: 7.0, age: 22.0, name: "Dolphin"},
               {namelength: 8.0, age: 25.0, name: "Babywang"},
               {namelength: 5.0, age: 23.0, name: "Shark"}],
      count: NumberLong(5), keys: NumberLong(4), ok: 1.0}]

group with finalize: finalize runs once on each group's result after grouping is complete. To group users by the length of their name, find the oldest user in each group, and finally add 10 to each resulting age:

    db.appuser.group({
      $keyf: function (doc) { return {namelength: doc.name.length}; },
      initial: {age: 0},
      reduce: function (cur, result) {
        if (cur.age > result.age) {
          result.age = cur.age;
          result.name = cur.name;
        }
      },
      finalize: function (doc) { doc.age = doc.age + 10; }
    })
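The interplay of the key function, initial, reduce and finalize can be sketched in plain JavaScript. This is an emulation under stated assumptions, not MongoDB's implementation: the `group` helper is hypothetical, Yunsheng's age of 21 is an assumed value (it is not recoverable from the source), and name lengths are computed on the English names used here.

```javascript
// Hypothetical emulation of group(): keyFn picks the group key, reduce folds
// each document into that group's accumulator, finalize post-processes it.
function group(docs, { keyFn, initial, reduce, finalize }) {
  const groups = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    const id = JSON.stringify(key);
    if (!groups.has(id)) groups.set(id, { ...key, ...initial });
    reduce(doc, groups.get(id));
  }
  const results = [...groups.values()];
  if (finalize) results.forEach(finalize);
  return results;
}

const appuser = [
  { name: "Human April", age: 20, locate: "Beijing" },
  { name: "Dolphin", age: 22, locate: "Beijing" },
  { name: "Yunsheng", age: 21, locate: "Tianjin" }, // assumed age, illustration only
  { name: "Shark", age: 23, locate: "Tianjin" },
  { name: "Babywang", age: 25, locate: "Sichuan" },
];

// Keep the oldest user seen so far; the braces make both assignments conditional.
const keepOldest = (cur, result) => {
  if (cur.age > result.age) {
    result.age = cur.age;
    result.name = cur.name;
  }
};

const oldestPerRegion = group(appuser, {
  keyFn: (doc) => ({ locate: doc.locate }),
  initial: { age: 0 },
  reduce: keepOldest,
});

const oldestPerNameLengthPlusTen = group(appuser, {
  keyFn: (doc) => ({ namelength: doc.name.length }),
  initial: { age: 0 },
  reduce: keepOldest,
  finalize: (doc) => { doc.age = doc.age + 10; },
});

console.log(oldestPerRegion);
console.log(oldestPerNameLengthPlusTen);
```

Note the braces around the two assignments in the reduce function: without them only the age assignment is conditional, and the name would be overwritten by every document in the group.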
Author: Uncle Dolphin
Link: http://www.jianshu.com/p/5b32d7612d08
Source: Jianshu

Put simply, the aggregation pipeline does two things:

    • "Filter" documents, that is, select the documents that meet the conditions;
    • "Transform" documents, that is, change the form in which the documents are output.

### $project

1. We have this data

    {
      "_id" : 1,
      title: "abc123",
      isbn: "0001122223334",
      author: { last: "zzz", first: "aaa" },
      copies: 5
    }

Now use $project to transform the output

    db.books.aggregate(
      [
        { $project : { title : 1 , author : 1 } }
      ]
    )

This returns

    {
      "_id" : 1,
      "title" : "abc123",
      "author" : { "last" : "zzz", "first" : "aaa" }
    }

In $project we specify which fields to display, here title and author. The _id field is included by default and can be excluded with _id: 0.

2. We have the same basic data

    {
      "_id" : 1,
      title: "abc123",
      isbn: "0001122223334",
      author: { last: "zzz", first: "aaa" },
      copies: 5
    }

This time we want to change the shape of the output, which can be done like this:

    db.books.aggregate( [
      {
        $project: {
          title: 1,
          isbn: {
            prefix: { $substr: [ "$isbn", 0, 3 ] },
            group: { $substr: [ "$isbn", 3, 2 ] },
            publisher: { $substr: [ "$isbn", 5, 4 ] },
            title: { $substr: [ "$isbn", 9, 3 ] },
            checkDigit: { $substr: [ "$isbn", 12, 1 ] }
          },
          lastName: "$author.last",
          copiesSold: "$copies"
        }
      }
    ] )

The keys prefix, group, publisher, title and checkDigit inside isbn, as well as the outer lastName and copiesSold, are all names we define ourselves. $substr extracts a substring: "$isbn" refers to the isbn field of the document, the second argument is the start position, and the third is the number of characters to take.

Final result

    {
      "_id" : 1,
      "title" : "abc123",
      "isbn" : {
        "prefix" : "000",
        "group" : "11",
        "publisher" : "2222",
        "title" : "333",
        "checkDigit" : "4"
      },
      "lastName" : "zzz",
      "copiesSold" : 5
    }

The source data is unchanged; only the way it is presented has changed.
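The reshaping done by this $project stage can be mirrored in plain JavaScript. The sketch below is only an emulation: `substr` and `projectBook` are hypothetical helpers, and it assumes an ASCII isbn so that character offsets and the byte offsets used by $substr coincide.

```javascript
const book = {
  _id: 1,
  title: "abc123",
  isbn: "0001122223334",
  author: { last: "zzz", first: "aaa" },
  copies: 5,
};

// Mirrors { $substr: [ string, start, length ] }: take `length` characters from `start`.
function substr(str, start, length) {
  return str.slice(start, start + length);
}

// Builds the output document field by field; the input document is not modified.
function projectBook(doc) {
  return {
    _id: doc._id,
    title: doc.title,
    isbn: {
      prefix: substr(doc.isbn, 0, 3),
      group: substr(doc.isbn, 3, 2),
      publisher: substr(doc.isbn, 5, 4),
      title: substr(doc.isbn, 9, 3),
      checkDigit: substr(doc.isbn, 12, 1),
    },
    lastName: doc.author.last, // "$author.last"
    copiesSold: doc.copies,    // "$copies"
  };
}

const reshaped = projectBook(book);
console.log(reshaped);
```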

### $match

$match filters the data so that later stages can work on the filtered set. (Like a filter on a tap: the filtered water can then be used to wash rice or to cook. But that is a digression; back to the topic!)

1. Example

    db.articles.aggregate(
      [
        { $match : { author : "dave" } }
      ]
    );

The filter condition is that the author key has the value dave.

The result is

    {
      "result" : [
        {
          "_id" : ObjectId("512bc95fe835e68f199c8686"),
          "author" : "dave",
          "score" : 80
        },
        {
          "_id" : ObjectId("512bc962e835e68f199c8687"),
          "author" : "dave",
          "score" : 85
        }
      ],
      "ok" : 1
    }

2. Another example

    db.articles.aggregate(
      [
        { $match : { score : { $gt : 70, $lte : 90 } } },
        { $group: { _id: null, count: { $sum: 1 } } }
      ]
    );

This time there are two steps:

    • First, select the documents whose score is greater than 70 and less than or equal to 90,
    • Then use $group to count those documents; the $sum accumulator adds 1 for each document.

Because $group requires an _id, it is set to null.

The result is

    {
      "result" : [
        {
          "_id" : null,
          "count" : 3
        }
      ],
      "ok" : 1
    }
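The two steps can be pictured as a filter followed by a counting reduction. The plain-JavaScript sketch below is only an analogy: the source does not list the full articles collection, so every document other than the two by dave is hypothetical.

```javascript
const articles = [
  { author: "dave", score: 80 },
  { author: "dave", score: 85 },
  { author: "ahn", score: 60 },  // hypothetical: excluded, not greater than 70
  { author: "li", score: 90 },   // hypothetical: included, $lte allows 90
  { author: "annT", score: 70 }, // hypothetical: excluded, $gt excludes 70
];

// Step 1, like { $match: { score: { $gt: 70, $lte: 90 } } }.
const matched = articles.filter((doc) => doc.score > 70 && doc.score <= 90);

// Step 2, like { $group: { _id: null, count: { $sum: 1 } } }:
// a single group that adds 1 per document.
const grouped = [{ _id: null, count: matched.reduce((sum) => sum + 1, 0) }];

console.log(grouped);
```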

### $cond

1. Example

    { "_id" : 1, "item" : "abc1", qty: 300 }
    { "_id" : 2, "item" : "abc2", qty: 200 }
    { "_id" : 3, "item" : "xyz1", qty: 250 }

Now we want to generate a new value based on the value of qty

    db.inventory.aggregate( [
      {
        $project: {
          item: 1,
          discount: {
            $cond: { if: { $gte: [ "$qty", 250 ] }, then: 30, else: 20 }
          }
        }
      }
    ] )

The result is

    { "_id" : 1, "item" : "abc1", "discount" : 30 }
    { "_id" : 2, "item" : "abc2", "discount" : 20 }
    { "_id" : 3, "item" : "xyz1", "discount" : 30 }

discount is a new key, and its value is assigned according to the if condition of $cond. The if, then and else keywords can be omitted by using the array form, $cond: [ <condition>, <then value>, <else value> ], but all three arguments are still required.
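$cond behaves like a ternary expression evaluated once per document. A minimal plain-JavaScript sketch of the same projection, using the three inventory documents from the example (the variable names are chosen for this illustration):

```javascript
const inventory = [
  { _id: 1, item: "abc1", qty: 300 },
  { _id: 2, item: "abc2", qty: 200 },
  { _id: 3, item: "xyz1", qty: 250 },
];

// if qty >= 250 then 30 else 20, computed for each document.
const discounted = inventory.map((doc) => ({
  _id: doc._id,
  item: doc.item,
  discount: doc.qty >= 250 ? 30 : 20,
}));

console.log(discounted);
```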

### $limit

1. Example

    db.article.aggregate( [
      { $limit : 5 }
    ] );

Note that when a $sort stage comes immediately before a $limit stage in the pipeline, the sort only keeps the top n results in memory as it runs, where n is the limit, instead of holding the entire sorted set.
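The memory argument can be illustrated with a small top-n routine in plain JavaScript. This is a sketch of the general idea only, not MongoDB's actual implementation; the `topN` helper and the score documents are hypothetical.

```javascript
// Keeps at most n documents while scanning, instead of sorting everything first.
function topN(docs, n, compare) {
  const kept = []; // always sorted according to `compare`
  for (const doc of docs) {
    // Walk left until the slot where `doc` belongs.
    let i = kept.length;
    while (i > 0 && compare(doc, kept[i - 1]) < 0) i--;
    kept.splice(i, 0, doc);
    if (kept.length > n) kept.pop(); // discard the current last-ranked document
  }
  return kept;
}

// Hypothetical documents.
const scores = [{ score: 70 }, { score: 95 }, { score: 80 }, { score: 60 }, { score: 90 }];

// Like { $sort: { score: -1 } } followed by { $limit: 2 }.
const top2 = topN(scores, 2, (a, b) => b.score - a.score);

console.log(top2);
```

At no point does `kept` hold more than n + 1 documents, which is why a sort followed directly by a limit needs far less memory than a full sort.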

### $skip

Skips the first n documents.

1. Example

    db.article.aggregate( [
      { $skip : 5 }
    ] );

### $unwind

Splits an array field into separate documents.

1. Example

    {
      "_id" : 1,
      "item" : "ABC1",
      sizes: [ "S", "M", "L" ]
    }

Now unwind sizes.

    db.inventory.aggregate(
      [
        { $unwind : "$sizes" }
      ]
    )

Results

    { "_id" : 1, "item" : "ABC1", "sizes" : "S" }
    { "_id" : 1, "item" : "ABC1", "sizes" : "M" }
    { "_id" : 1, "item" : "ABC1", "sizes" : "L" }

Each element of sizes is split out into its own document; the documents differ only in the value of sizes.

Combining $unwind with $group gives the same effect as distinct on the array's elements.
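How unwinding followed by grouping yields distinct values can be sketched in plain JavaScript. The helpers `unwind` and `groupByField` are hypothetical emulations, and the second inventory document is added here purely for illustration.

```javascript
const inventory = [
  { _id: 1, item: "ABC1", sizes: ["S", "M", "L"] },
  { _id: 2, item: "ABC2", sizes: ["M", "L", "XL"] }, // hypothetical second document
];

// Like { $unwind: "$sizes" }: one output document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((value) => ({ ...doc, [field]: value }))
  );
}

// Like { $group: { _id: "$sizes" } }: one output document per distinct value.
function groupByField(docs, field) {
  return [...new Set(docs.map((doc) => doc[field]))].map((value) => ({ _id: value }));
}

const unwound = unwind(inventory, "sizes");
const distinctSizes = groupByField(unwound, "sizes");

console.log(unwound.length, distinctSizes);
```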

### $group

Groups first, then merges.

1. Example

Grouping the sales documents (listed in example 2 below) by the month, day and year of date gives output like this:

    { "_id" : { "month" : 3, "day" : 15, "year" : 2014 }, "totalPrice" : 50, "averageQuantity" : 10, "count" : 1 }
    { "_id" : { "month" : 4, "day" : 4, "year" : 2014 }, "totalPrice" : 200, "averageQuantity" : 15, "count" : 2 }
    { "_id" : { "month" : 3, "day" : 1, "year" : 2014 }, "totalPrice" : 40, "averageQuantity" : 1.5, "count" : 2 }

_id is the grouping key. If _id is null, the documents are not split into groups and are all merged into a single result.

Basis for merging:

    • The key totalPrice holds the sum of the products of price and quantity

    • The key averageQuantity holds the average of quantity

    • The key count holds the number of documents

    db.sales.aggregate( [
      {
        $group: {
          _id: null,
          totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },
          averageQuantity: { $avg: "$quantity" },
          count: { $sum: 1 }
        }
      }
    ] )

Results

    { "_id" : null, "totalPrice" : 290, "averageQuantity" : 8.6, "count" : 5 }
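The three accumulators can be checked by hand with a plain-JavaScript emulation over the five sales documents listed in the next example. `groupSales` is a hypothetical helper written for this sketch, and the per-day key is computed in UTC, which is an assumption of this illustration.

```javascript
const sales = [
  { _id: 1, item: "abc", price: 10, quantity: 2, date: new Date("2014-03-01T08:00:00Z") },
  { _id: 2, item: "jkl", price: 20, quantity: 1, date: new Date("2014-03-01T09:00:00Z") },
  { _id: 3, item: "xyz", price: 5, quantity: 10, date: new Date("2014-03-15T09:00:00Z") },
  { _id: 4, item: "xyz", price: 5, quantity: 20, date: new Date("2014-04-04T11:21:39.736Z") },
  { _id: 5, item: "abc", price: 10, quantity: 10, date: new Date("2014-04-04T21:23:13.331Z") },
];

// Emulates $group with totalPrice ($sum of price * quantity),
// averageQuantity ($avg of quantity) and count ($sum: 1).
function groupSales(docs, keyFn) {
  const groups = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    const id = JSON.stringify(key);
    if (!groups.has(id)) groups.set(id, { _id: key, totalPrice: 0, quantitySum: 0, count: 0 });
    const g = groups.get(id);
    g.totalPrice += doc.price * doc.quantity;
    g.quantitySum += doc.quantity;
    g.count += 1;
  }
  // Replace the running quantity sum with the average.
  return [...groups.values()].map(({ quantitySum, ...g }) => ({
    ...g,
    averageQuantity: quantitySum / g.count,
  }));
}

// _id: null puts every document into one group.
const overall = groupSales(sales, () => null);

// Grouping by month, day and year of the date.
const byDay = groupSales(sales, (doc) => ({
  month: doc.date.getUTCMonth() + 1,
  day: doc.date.getUTCDate(),
  year: doc.date.getUTCFullYear(),
}));

console.log(overall, byDay);
```

With a null key, the single group sums 20 + 20 + 50 + 100 + 100 = 290 and averages 43 / 5 = 8.6 over 5 documents, matching the result above.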

2. Another example

    { "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-03-01T08:00:00Z") }
    { "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-03-01T09:00:00Z") }
    { "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-03-15T09:00:00Z") }
    { "_id" : 4, "item" : "xyz", "price" : 5, "quantity" : 20, "date" : ISODate("2014-04-04T11:21:39.736Z") }
    { "_id" : 5, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-04-04T21:23:13.331Z") }

We group by the key item

    db.sales.aggregate( [ { $group : { _id : "$item" } } ] )

Results

    { "_id" : "xyz" }
    { "_id" : "jkl" }
    { "_id" : "abc" }

### $sort

1. Example

    db.users.aggregate(
      [
        { $sort : { age : -1, posts: 1 } }
      ]
    )

This sorts by the key age in descending order (-1), and then by the key posts in ascending order (1) for documents with the same age.
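A compound sort specification corresponds to a comparator that falls through to the next key on ties. A minimal plain-JavaScript sketch, with hypothetical user documents since the source does not list any:

```javascript
// Hypothetical documents.
const users = [
  { name: "a", age: 30, posts: 5 },
  { name: "b", age: 25, posts: 1 },
  { name: "c", age: 30, posts: 2 },
];

// Like { $sort: { age: -1, posts: 1 } }:
// age descending first; on equal age, posts ascending.
const sorted = [...users].sort(
  (x, y) => (y.age - x.age) || (x.posts - y.posts)
);

console.log(sorted.map((u) => u.name));
```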

### $out

Writes the aggregation results to a specified collection.

1. Example

    { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
    { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
    { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
    { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
    { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }

Group the documents by author, then write the output to a new collection named authors

    db.books.aggregate( [
      { $group : { _id : "$author", books: { $push: "$title" } } },
      { $out : "authors" }
    ] )

The authors collection then contains

    { "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
    { "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }

The results an aggregation normally returns are only a computed view of the data, not stored documents.

In authors, however, those results are stored as actual documents.

In other words, $out creates a new collection that stores the documents produced by the aggregation.
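The difference between returning results and storing them can be sketched in plain JavaScript with an object standing in for the database. The `collections` object and the `groupTitlesByAuthor` helper are hypothetical; the grouping order in this sketch follows first appearance, which MongoDB does not guarantee.

```javascript
const collections = {
  books: [
    { _id: 8751, title: "The Banquet", author: "Dante", copies: 2 },
    { _id: 8752, title: "Divine Comedy", author: "Dante", copies: 1 },
    { _id: 8645, title: "Eclogues", author: "Dante", copies: 2 },
    { _id: 7000, title: "The Odyssey", author: "Homer", copies: 10 },
    { _id: 7020, title: "Iliad", author: "Homer", copies: 10 },
  ],
};

// Like { $group: { _id: "$author", books: { $push: "$title" } } }.
function groupTitlesByAuthor(docs) {
  const byAuthor = new Map();
  for (const doc of docs) {
    if (!byAuthor.has(doc.author)) byAuthor.set(doc.author, { _id: doc.author, books: [] });
    byAuthor.get(doc.author).books.push(doc.title);
  }
  return [...byAuthor.values()];
}

// Like { $out: "authors" }: the pipeline output becomes the stored
// contents of the target collection; the source collection is untouched.
collections.authors = groupTitlesByAuthor(collections.books);

console.log(collections.authors);
```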

Reposted from https://github.com/qianjiahao/MongoDB/wiki/MongoDB%E4%B9%8B%E8%81%9A%E5%90%88%E7%AE%A1%E9%81%93%E4%B8%8A
