MongoDB Aggregation Operations
Reading notes
MongoDB offers two ways to compute aggregates: the aggregation pipeline and MapReduce.
Pipeline queries are faster than MapReduce, but MapReduce can execute complex aggregation logic on multiple servers in parallel.
MongoDB does not allow a single pipeline aggregation to consume too much system memory: if an aggregation operation consumes more than 20% of memory, MongoDB stops the operation and returns an error message to the client. (In MongoDB 2.6 and later the limit is 100 MB of RAM per pipeline stage unless the allowDiskUse option is set.)
The pipeline method uses the db.collection.aggregate() function to perform aggregation; it is faster and simpler to use than MapReduce.
The pipeline therefore has two limits: a single aggregation operation must stay within the memory limit above, and the result set it returns must not exceed 16 MB (the maximum BSON document size).
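For pipelines that might run into the memory limit, MongoDB 2.6 and later accept an allowDiskUse option that lets memory-hungry stages write temporary files instead of failing. A minimal sketch, assuming a collection handle is passed in; the helper name aggregateWithDiskUse and the sample pipeline are illustrative and do not come from the notes:

```javascript
// Hypothetical helper: runs a pipeline with allowDiskUse enabled.
// `collection` is assumed to expose aggregate(pipeline, options),
// as db.foo does in the mongo shell.
function aggregateWithDiskUse(collection, pipeline) {
  return collection.aggregate(pipeline, { allowDiskUse: true });
}

// Illustrative pipeline: grouping has to keep every group in memory.
const countByAge = [
  { $match: { age: { $lte: 25 } } },
  { $group: { _id: "$age", count: { $sum: 1 } } }
];

// In the shell: aggregateWithDiskUse(db.foo, countByAge)
```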
$match filters documents.
$project selects fields, renames fields, and derives new fields.
Select fields
db.foo.aggregate([
  {$match: {age: {$lte: 25}}},
  {$project: {age: 1, idx: 1, "_id": 0}}
])
Rename a field
db.foo.aggregate([
  {$match: {age: {$lte: 25}}},
  // "preidx" is a copy of idx under a new name
  {$project: {age: 1, "preidx": "$idx", idx: 1, "_id": 0}}
])
Derive a field
db.foo.aggregate([
  {$match: {age: {$lte: 25}}},
  {$project:
    {
      age: 1,
      // "preidx" is computed as idx - 1
      "preidx": {$subtract: ["$idx", 1]},
      idx: 1,
      "_id": 0
    }
  }
])
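To see what the derived-field projection produces, here is a plain JavaScript emulation of the two stages on a few in-memory documents. The sample data and the function name projectDerived are assumptions made for illustration; on a real deployment the server does this work.

```javascript
// Sample documents (made up for illustration).
const docs = [
  { _id: 1, age: 20, idx: 5, name: "a" },
  { _id: 2, age: 30, idx: 6, name: "b" },
  { _id: 3, age: 25, idx: 7, name: "c" }
];

// Emulates {$match: {age: {$lte: 25}}} followed by a $project that keeps
// age and idx, drops _id, and derives preidx = idx - 1.
function projectDerived(input) {
  return input
    .filter(d => d.age <= 25)
    .map(d => ({ age: d.age, preidx: d.idx - 1, idx: d.idx }));
}

console.log(projectDerived(docs));
// [ { age: 20, preidx: 4, idx: 5 }, { age: 25, preidx: 6, idx: 7 } ]
```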
Arithmetic operators
$add
$multiply
$divide
$mod
$subtract
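Each arithmetic operator takes an array of operand expressions and is typically used inside a $project stage. A small sketch using the age and idx fields from the examples above; the output field names are made up:

```javascript
// A $project stage exercising each arithmetic operator.
// Output field names (sum, product, ...) are illustrative.
const arithmeticStage = {
  $project: {
    _id: 0,
    sum: { $add: ["$age", "$idx"] },       // age + idx
    product: { $multiply: ["$idx", 2] },   // idx * 2
    half: { $divide: ["$idx", 2] },        // idx / 2
    parity: { $mod: ["$idx", 2] },         // idx % 2
    preidx: { $subtract: ["$idx", 1] }     // idx - 1
  }
};

// In the shell: db.foo.aggregate([arithmeticStage])
```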
String operators
$substr: [expr, start, length] returns a substring
$concat: [expr1, expr2, ..., exprN] concatenates the expressions
$toLower: expr converts to lowercase
$toUpper: expr converts to uppercase
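The string operators are used in the same position as the arithmetic ones. The sketch below assumes documents carry a string field called name, which does not appear in the notes. MongoDB 3.4 and later deprecate $substr in favor of $substrBytes and $substrCP, so treat the first field as the older form.

```javascript
// A $project stage exercising each string operator.
// The "name" field and the output field names are assumptions.
const stringStage = {
  $project: {
    _id: 0,
    prefix: { $substr: ["$name", 0, 2] },        // 2 bytes starting at offset 0
    label: { $concat: ["$name", "-", "user"] },  // join three strings
    lower: { $toLower: "$name" },
    upper: { $toUpper: "$name" }
  }
};

// In the shell: db.foo.aggregate([stringStage])
```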
Streaming operators can process each document as soon as it enters the stage.
Non-streaming operators must wait until all documents have been received before they can produce any output.
Both $group and $sort are non-streaming operators.
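The difference can be sketched with JavaScript generators. This is only an analogy for how stages consume input, not MongoDB's implementation; all names here are invented for the example.

```javascript
// Streaming stage: yields each matching document as soon as it arrives.
function* matchStage(source, predicate) {
  for (const doc of source) {
    if (predicate(doc)) yield doc;
  }
}

// Non-streaming stage: must consume the whole input before emitting anything.
function* sortStage(source, compare) {
  const buffered = Array.from(source); // waits for every document
  buffered.sort(compare);
  yield* buffered;
}

// A source that records each read so we can see how much input was pulled.
const events = [];
function* source() {
  for (const age of [30, 18, 22]) {
    events.push("read " + age);
    yield { age };
  }
}

// Ask each stage for just its first output document.
const firstMatch = matchStage(source(), d => d.age <= 25).next().value;
const readsForMatch = events.length; // 2: stopped at the first match (18)
events.length = 0;
const firstSorted = sortStage(source(), (a, b) => a.age - b.age).next().value;
const readsForSort = events.length;  // 3: the whole input was consumed
```

The match stage needed only two reads to produce its first result, while the sort stage had to read everything first, which is why non-streaming stages are the ones that run into memory limits.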
Grouping operations
db.foo.aggregate([
  {$match: {age: {$lte: 25}}},
  {$project: {age: 1, "preidx": {$subtract: ["$idx", 1]}, idx: 1, "_id": 0}},
  {$group: {"_id": "$age", count: {$sum: 1}}}
])
This groups by age and counts the number of documents in each group.
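What the $group stage with {$sum: 1} computes can be emulated in plain JavaScript. This is a sketch on made-up in-memory data; the function name is hypothetical, and MongoDB itself makes no promise about the order of the groups it returns.

```javascript
// Emulates {$group: {_id: "$age", count: {$sum: 1}}} on in-memory documents.
function groupCountByAge(input) {
  const counts = new Map();
  for (const d of input) {
    counts.set(d.age, (counts.get(d.age) || 0) + 1); // $sum: 1 adds one per doc
  }
  return Array.from(counts, ([age, count]) => ({ _id: age, count }));
}

const sample = [{ age: 20 }, { age: 25 }, { age: 20 }];
console.log(groupCountByAge(sample));
// [ { _id: 20, count: 2 }, { _id: 25, count: 1 } ]
```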
Multi-field grouping
db.foo.aggregate([
  {$match: {age: {$lte: 25}}},
  {$project: {age: 1, "preidx": {$subtract: ["$idx", 1]}, idx: 1, "_id": 0}},
  // A compound _id groups on several fields at once
  {$group: {"_id": {age: "$age", age2: "$age"}, count: {$sum: 1}}}
])
Aggregate calculations within groups
db.foo.aggregate([
  {$match: {age: {$lte: 25}}},
  {$project: {age: 1, "preidx": {$subtract: ["$idx", 1]}, idx: 1, "_id": 0}},
  {$group:
    {
      "_id": {age: "$age", age2: "$age"},
      count: {$sum: 1},
      idxtotal: {$sum: "$idx"},
      idxmax: {$max: "$idx"},
      idxfirst: {$first: "$idx"}
    }
  }
])
Sort the results, then skip the first 10 and take the next 10.
db.foo.aggregate([
  {$match: {age: {$lte: 25}}},
  {$project: {age: 1, "preidx": {$subtract: ["$idx", 1]}, idx: 1, "_id": 0}},
  {$group:
    {
      "_id": {age: "$age", age2: "$age"},
      count: {$sum: 1},
      idxtotal: {$sum: "$idx"},
      idxmax: {$max: "$idx"},
      idxfirst: {$first: "$idx"}
    }
  },
  // After $group the age lives inside _id, so sort on "_id.age"
  {$sort: {"_id.age": -1}},
  {$skip: 10},
  {$limit: 10}
])
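Stage order matters here: skipping 10 and then limiting to 10 returns results 11 through 20, whereas limiting to 10 first and then skipping 10 would return nothing. A plain JavaScript sketch of the same paging logic, using made-up group results and a hypothetical helper named page:

```javascript
// Emulates {$sort: {key: -1}}, {$skip: skip}, {$limit: limit} on an array.
function page(results, key, skip, limit) {
  return results
    .slice()                          // do not mutate the input
    .sort((a, b) => b[key] - a[key])  // descending, like {key: -1}
    .slice(skip, skip + limit);       // skip first, then limit
}

// 25 made-up group results with counts 1..25.
const groups = Array.from({ length: 25 }, (_, i) => ({ _id: i, count: i + 1 }));
const secondPage = page(groups, "count", 10, 10);
// secondPage holds the 11th through 20th largest counts: 15 down to 6.
```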
MapReduce can express very complex calculations, but it is slow and not suitable for real-time data analysis. It can run in parallel on multiple servers, after which the master server returns the combined result set.
A MapReduce job has three steps: map, shuffle, and reduce.
The map and reduce functions must be defined explicitly; the shuffle step is implemented by MongoDB.
MapReduce fits best when partial results can be combined, for example added together, to give the final result.
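The three steps can be sketched as a small in-memory function. This is an illustration of the idea, not MongoDB's implementation: here emit is passed to the map function as an argument, whereas in MongoDB it is a global, and the helper name mapReduce is made up.

```javascript
// Minimal in-memory MapReduce.
function mapReduce(docs, mapFn, reduceFn) {
  // Map: call mapFn with `this` bound to each document, collecting emits.
  const emitted = [];
  const emit = (key, value) => emitted.push([key, value]);
  for (const doc of docs) mapFn.call(doc, emit);

  // Shuffle: group the emitted values by key (MongoDB does this for you).
  const byKey = new Map();
  for (const [key, value] of emitted) {
    if (!byKey.has(key)) byKey.set(key, []);
    byKey.get(key).push(value);
  }

  // Reduce: fold each key's values into a single result.
  const out = [];
  for (const [key, values] of byKey) {
    out.push({ _id: key, value: reduceFn(key, values) });
  }
  return out;
}

// Count documents per age; the partial counts simply add up.
const countsByAge = mapReduce(
  [{ age: 20 }, { age: 25 }, { age: 20 }],
  function (emit) { emit(this.age, { count: 1 }); },
  (key, values) => ({ count: values.reduce((n, v) => n + v.count, 0) })
);
// [ { _id: 20, value: { count: 2 } }, { _id: 25, value: { count: 1 } } ]
```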
Example: count how many documents contain each key
1. Define map and reduce
map = function () {
  // Emit one count for every key present in the document.
  for (var key in this) {
    emit(key, {count: 1});
  }
}
reduce = function (key, emits) {
  var total = 0;
  for (var i in emits) {
    total += emits[i].count;
  }
  // Return the same shape that map emits, so reduce can be re-applied.
  return {count: total};
}
2. Run MapReduce
mr = db.runCommand({
  "mapReduce": "foo",
  "map": map,
  "reduce": reduce,
  "out": "Count Doc"
})
3. View the results
db[mr.result].find()
Example: count the number of documents for each age
1. Define map and reduce
map = function () {
  // Emit one count keyed by the document's age.
  emit(this.age, {count: 1});
}
reduce = function (key, emits) {
  var total = 0;
  for (var i in emits) {
    total += emits[i].count;
  }
  return {"age": key, count: total};
}
2. Run MapReduce
mr = db.runCommand({
  "mapReduce": "foo",
  "map": map,
  "reduce": reduce,
  "out": "Count Doc"
})
3. View the results of the aggregation
db[mr.result].find()
Example: studying the reduce function
reduce = function (key, emits) {
  var total = 0;
  for (var i in emits) {
    total += emits[i].count;
  }
  return {"key": key, count: total};
}
// Reduce two batches separately, then reduce their results together.
r1 = reduce("x", [{count: 1}, {count: 2}])
r2 = reduce("x", [{count: 3}, {count: 5}])
r3 = reduce("x", [r1, r2])
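The point of this experiment is that MongoDB may call reduce more than once for the same key, feeding earlier outputs back in as inputs, so the function must accept its own return values. The sketch below reruns the experiment in plain JavaScript and adds one extra check, not in the notes, that reducing in stages gives the same total as reducing everything at once.

```javascript
function reduce(key, emits) {
  let total = 0;
  for (const e of emits) total += e.count;
  return { key: key, count: total };
}

const r1 = reduce("x", [{ count: 1 }, { count: 2 }]); // { key: "x", count: 3 }
const r2 = reduce("x", [{ count: 3 }, { count: 5 }]); // { key: "x", count: 8 }
const r3 = reduce("x", [r1, r2]);                     // { key: "x", count: 11 }

// Reducing all four values in one call yields the same count as r3.
const all = reduce("x", [{ count: 1 }, { count: 2 }, { count: 3 }, { count: 5 }]);
```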