MongoDB Aggregation Operations: Reading Notes

Source: Internet
Author: User
Tags: arithmetic operators, emit, shuffle

MongoDB offers two ways to compute aggregates: the aggregation pipeline and MapReduce.
Pipeline queries are faster than MapReduce, but MapReduce can run complex aggregation logic in parallel across multiple servers.
MongoDB does not let a pipeline consume too much system memory for a single aggregation operation: if an aggregation operation uses more than 20% of memory, MongoDB stops it and returns an error to the client. (The exact limit depends on the server version; from MongoDB 2.6 onward each pipeline stage is limited to 100 MB of RAM unless allowDiskUse is enabled.)

The pipeline method uses the db.collection.aggregate() function to perform aggregation; it is faster and simpler to use.
It has two limits: a single aggregation operation must use less than 20% of memory, and the result set it returns must not exceed 16 MB (the maximum BSON document size).

$match filters documents.
$project selects fields, renames fields, and derives new fields.
Select fields
db.foo.aggregate([
    {$match: {age: {$lte: 25}}},
    {$project: {age: 1, idx: 1, _id: 0}}
])

Rename a field
db.foo.aggregate([
    {$match: {age: {$lte: 25}}},
    // preidx is a copy of idx under a new name
    {$project: {age: 1, preidx: "$idx", idx: 1, _id: 0}}
])

Derived fields
db.foo.aggregate([
    {$match: {age: {$lte: 25}}},
    {$project:
        {
            age: 1,
            // preidx is computed as idx - 1
            preidx: {$subtract: ["$idx", 1]},
            idx: 1,
            _id: 0
        }
    }
])
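
To make the three uses of $project concrete, the following plain-JavaScript sketch mimics what the derived-field pipeline above does to a handful of documents. It only illustrates the semantics and is not how MongoDB implements the stages; the sample documents and the helper names matchStage and projectStage are made up for this example.

```javascript
// Hypothetical sample documents shaped like those in the foo collection.
const docs = [
  { _id: 1, age: 20, idx: 5 },
  { _id: 2, age: 30, idx: 6 },
  { _id: 3, age: 25, idx: 7 },
];

// Rough equivalent of {$match: {age: {$lte: 25}}}: keep matching documents.
function matchStage(input) {
  return input.filter((doc) => doc.age <= 25);
}

// Rough equivalent of the $project stage: keep age and idx, drop _id,
// and derive preidx as idx - 1 (what $subtract: ["$idx", 1] computes).
function projectStage(input) {
  return input.map((doc) => ({ age: doc.age, preidx: doc.idx - 1, idx: doc.idx }));
}

const projected = projectStage(matchStage(docs));
console.log(projected);
```

The document with age 30 is dropped by the match step, and the remaining two documents come out without _id and with the extra preidx field.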

Arithmetic operators
$add
$multiply
$divide
$mod
$subtract

String operators
$substr: [expr, start, length] returns a substring
$concat: [expr1, expr2, ..., exprN] concatenates the expressions
$toLower: expr converts to lowercase
$toUpper: expr converts to uppercase
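
A short sketch of how these operators are written inside a $project stage. The pipeline is built as an ordinary JavaScript value so it can be inspected before it is passed to db.foo.aggregate(). The name field and all output field names are assumptions made for the example, since the notes only mention age and idx.

```javascript
// Pipeline built as plain data. The field "name" is hypothetical.
const pipeline = [
  { $match: { age: { $lte: 25 } } },
  {
    $project: {
      _id: 0,
      idxPlusAge: { $add: ["$idx", "$age"] },   // idx + age
      idxDoubled: { $multiply: ["$idx", 2] },   // idx * 2
      idxHalved: { $divide: ["$idx", 2] },      // idx / 2
      idxParity: { $mod: ["$idx", 2] },         // idx modulo 2
      initial: { $substr: ["$name", 0, 1] },    // one byte starting at offset 0
      label: { $concat: [{ $toUpper: "$name" }, "-", { $toLower: "$name" }] },
    },
  },
];

// In the mongo shell this would be run as: db.foo.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

On newer servers $substr is deprecated in favor of $substrBytes and $substrCP, so it is worth checking which one the target version expects.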

Streaming operators can process each document as soon as it enters the stage.
Non-streaming operators must wait until they have received all documents before they can produce any output.

Both $group and $sort are non-streaming operators.
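
The difference can be sketched with JavaScript generators. This is only a model of the idea, not MongoDB's implementation; the function names and the log array are invented for the illustration. The matching stage forwards each document the moment it arrives, while the sorting stage has to drain its whole input first.

```javascript
const log = [];

// A source that yields documents one at a time and records each read.
function* source(docs) {
  for (const doc of docs) {
    log.push("read " + doc.idx);
    yield doc;
  }
}

// Streaming stage: emits each matching document as soon as it arrives.
function* streamingMatch(input) {
  for (const doc of input) {
    if (doc.age <= 25) {
      log.push("match " + doc.idx);
      yield doc;
    }
  }
}

// Non-streaming stage: must consume its whole input before emitting anything.
function* blockingSort(input) {
  const all = Array.from(input);
  all.sort((a, b) => a.idx - b.idx);
  log.push("sorted " + all.length + " docs");
  yield* all;
}

const sampleDocs = [
  { age: 20, idx: 3 },
  { age: 40, idx: 1 },
  { age: 22, idx: 2 },
];
const sortedDocs = Array.from(blockingSort(streamingMatch(source(sampleDocs))));
console.log(log);
console.log(sortedDocs);
```

In the log, reads and matches are interleaved document by document, and the sort entry appears only after the last document has been read.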

Grouping operations
db.foo.aggregate([
    {$match: {age: {$lte: 25}}},
    {$project: {age: 1, preidx: {$subtract: ["$idx", 1]}, idx: 1, _id: 0}},
    {$group: {_id: "$age", count: {$sum: 1}}}
])
This groups the documents by age and counts the number of documents in each group.
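
As a rough illustration of what grouping by age with a count does, here is a plain-JavaScript equivalent. The input documents and the helper name groupByAge are assumptions made for the example; MongoDB itself does not guarantee any particular order for the groups it returns.

```javascript
// Hypothetical documents as they might look after the $match and $project stages.
const inputDocs = [
  { age: 20, idx: 1 },
  { age: 25, idx: 2 },
  { age: 20, idx: 3 },
];

// Rough equivalent of grouping on "$age" with count: {$sum: 1}.
function groupByAge(input) {
  const counts = new Map();
  for (const doc of input) {
    // $sum: 1 adds one to the group's counter for every document seen.
    counts.set(doc.age, (counts.get(doc.age) || 0) + 1);
  }
  return Array.from(counts, ([age, count]) => ({ _id: age, count: count }));
}

const groups = groupByAge(inputDocs);
console.log(groups);
```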

Multi-field grouping
db.foo.aggregate([
    {$match: {age: {$lte: 25}}},
    {$project: {age: 1, preidx: {$subtract: ["$idx", 1]}, idx: 1, _id: 0}},
    // A document-valued _id groups on several keys at once
    {$group: {_id: {age: "$age", age2: "$age"}, count: {$sum: 1}}}
])

Grouping aggregation calculations
db.foo.aggregate([
    {$match: {age: {$lte: 25}}},
    {$project: {age: 1, preidx: {$subtract: ["$idx", 1]}, idx: 1, _id: 0}},
    {$group:
        {
            _id: {age: "$age", age2: "$age"},
            count: {$sum: 1},
            idxtotal: {$sum: "$idx"},
            idxmax: {$max: "$idx"},
            idxfirst: {$first: "$idx"}
        }
    }
])
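
The accumulators in the pipeline above can be pictured as per-group running values. The sketch below mimics $sum, $max and $first in plain JavaScript; it is an illustration under assumptions (the sample documents, the helper name groupWithAccumulators, and a single-key _id instead of the two-key one) rather than MongoDB's actual algorithm.

```javascript
// Hypothetical input documents.
const groupInput = [
  { age: 20, idx: 4 },
  { age: 20, idx: 9 },
  { age: 25, idx: 7 },
];

// Rough equivalent of $group with $sum, $max and $first accumulators.
function groupWithAccumulators(input) {
  const byAge = new Map();
  for (const doc of input) {
    const group = byAge.get(doc.age);
    if (group === undefined) {
      // The first document of a group initialises every accumulator.
      byAge.set(doc.age, {
        _id: { age: doc.age },
        count: 1,
        idxtotal: doc.idx,
        idxmax: doc.idx,
        idxfirst: doc.idx,
      });
    } else {
      group.count += 1;                               // $sum: 1
      group.idxtotal += doc.idx;                      // $sum: "$idx"
      group.idxmax = Math.max(group.idxmax, doc.idx); // $max: "$idx"
      // idxfirst keeps the value from the group's first document ($first).
    }
  }
  return Array.from(byAge.values());
}

const stats = groupWithAccumulators(groupInput);
console.log(stats);
```

Because $first depends on arrival order, its result is only meaningful when the documents reach $group in a defined order, for example after a $sort stage.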

Sort the results, then skip the first 10 and return the next 10.
db.foo.aggregate([
    {$match: {age: {$lte: 25}}},
    {$project: {age: 1, preidx: {$subtract: ["$idx", 1]}, idx: 1, _id: 0}},
    {$group:
        {
            _id: {age: "$age", age2: "$age"},
            count: {$sum: 1},
            idxtotal: {$sum: "$idx"},
            idxmax: {$max: "$idx"},
            idxfirst: {$first: "$idx"}
        }
    },
    // After $group the age lives inside _id, so sort on "_id.age"
    {$sort: {"_id.age": -1}},
    {$skip: 10},
    {$limit: 10}
])
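
The effect of sorting, skipping and limiting is the same as paging through a sorted array. A minimal sketch, assuming 25 hypothetical group results whose ages run from 1 to 25:

```javascript
// Twenty-five hypothetical group results with _id.age from 1 to 25.
const groupResults = [];
for (let age = 1; age <= 25; age++) {
  groupResults.push({ _id: { age: age }, count: 1 });
}

// Sort descending by the grouped age, which sits in _id.age after $group.
// slice() copies the array first because sort() works in place.
const sortedGroups = groupResults.slice().sort((a, b) => b._id.age - a._id.age);

// Skip the first 10 results, then keep the following 10.
const page = sortedGroups.slice(10, 20);

console.log(page.map((g) => g._id.age));
```

With this input the page holds the groups for ages 15 down to 6.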

MapReduce can express very complex computations but is very slow, so it is not suitable for real-time data analysis. It can run in parallel on multiple servers, with the master server merging the partial results and returning the final result set.
MapReduce consists of three main steps: map, shuffle, and reduce.
map and reduce must be defined explicitly; shuffle is implemented by MongoDB.

MapReduce works best when the final result can be obtained by combining partial results, for example by adding them together.
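
The three steps can be sketched in plain JavaScript to show what each one contributes. This is a single-process model written for illustration only: the helper names mapPhase, shufflePhase and reducePhase are invented, and emit is passed in as an argument here, whereas in MongoDB it is a function available inside map.

```javascript
const people = [{ age: 20 }, { age: 25 }, { age: 20 }];

// Map: each document emits zero or more (key, value) pairs.
function mapPhase(docs, mapFn) {
  const pairs = [];
  for (const doc of docs) {
    // Call mapFn with the document as "this", as MongoDB does for map.
    mapFn.call(doc, (key, value) => pairs.push({ key: key, value: value }));
  }
  return pairs;
}

// Shuffle: collect the values emitted under the same key.
function shufflePhase(pairs) {
  const buckets = new Map();
  for (const { key, value } of pairs) {
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(value);
  }
  return buckets;
}

// Reduce: collapse each key's list of values into a single value.
function reducePhase(buckets, reduceFn) {
  const out = {};
  for (const [key, values] of buckets) {
    out[key] = reduceFn(key, values);
  }
  return out;
}

const emitted = mapPhase(people, function (emit) {
  emit(this.age, { count: 1 });
});
const totals = reducePhase(shufflePhase(emitted), (key, values) => ({
  count: values.reduce((sum, v) => sum + v.count, 0),
}));
console.log(totals);
```

Only the functions passed to mapPhase and reducePhase correspond to code a user writes; the grouping in shufflePhase is the part MongoDB takes care of.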

Example: count the number of documents that contain each key
1. Define map and reduce
map = function () {
    // Emit one count for every key present in the current document
    for (var key in this) {
        emit(key, {count: 1});
    }
};

reduce = function (key, emits) {
    var total = 0;
    for (var i in emits) {
        total += emits[i].count;
    }
    // Return the same shape that map emits so the result can be reduced again
    return {count: total};
};

2. Run MapReduce
mr = db.runCommand({
    "mapreduce": "foo",
    "map": map,
    "reduce": reduce,
    out: "Count Doc"
})

3. View the results
db[mr.result].find()

Example: count the number of documents for each age
1. Define map and reduce
map = function () {
    // Use the age as the key so documents with the same age are grouped together
    emit(this.age, {count: 1});
};

reduce = function (key, emits) {
    var total = 0;
    for (var i in emits) {
        total += emits[i].count;
    }

    return {age: key, count: total};
};

2. Run MapReduce
mr = db.runCommand({
    "mapreduce": "foo",
    "map": map,
    "reduce": reduce,
    out: "Count Doc"
})

3. View the results of the aggregation operation
db[mr.result].find()

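Example: examining how reduce behaves
reduce = function (key, emits) {
    var total = 0;
    for (var i in emits) {
        total += emits[i].count;
    }

    return {key: key, count: total};
};

r1 = reduce("x", [{count: 1}, {count: 2}])   // {key: "x", count: 3}
r2 = reduce("x", [{count: 3}, {count: 5}])   // {key: "x", count: 8}
// The outputs of reduce can themselves be fed back into reduce
r3 = reduce("x", [r1, r2])                   // {key: "x", count: 11}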
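
The point of this experiment is that MongoDB may call reduce more than once for the same key and pass earlier outputs back in as inputs, so the value reduce returns has to carry the same fields that it reads. The standalone sketch below checks that property for the counting reduce; the variable name direct is introduced only for the comparison.

```javascript
function reduce(key, emits) {
  let total = 0;
  for (const value of emits) {
    total += value.count;
  }
  return { key: key, count: total };
}

const r1 = reduce("x", [{ count: 1 }, { count: 2 }]);
const r2 = reduce("x", [{ count: 3 }, { count: 5 }]);

// Reducing the partial results again...
const r3 = reduce("x", [r1, r2]);
// ...must give the same answer as reducing every value in a single call.
const direct = reduce("x", [{ count: 1 }, { count: 2 }, { count: 3 }, { count: 5 }]);

console.log(r3, direct);
```

Both calls yield a count of 11. If reduce returned the total under a different field name, the second pass would read undefined values and the sum would break.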
