1. MongoDB MapReduce is comparable to GROUP BY in MySQL, which makes it a convenient way to compute aggregate statistics in parallel on MongoDB. Using MapReduce means implementing two functions: a Map function and a Reduce function. The Map function is run against every record in the collection and calls emit(key, value) to pass keys and values on to the Reduce function for processing. Both functions are written in JavaScript, and a MapReduce operation can be executed with db.runCommand or with the mapReduce helper.
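The flow described above can be sketched in plain JavaScript, without a MongoDB server. This is only an in-memory illustration: simulateMapReduce and the orders data are hypothetical, and emit is passed to the map function as an argument here, whereas in MongoDB it is provided by the server.

```javascript
// In-memory sketch of the map -> group -> reduce flow (not a MongoDB API).
function simulateMapReduce(docs, map, reduce) {
  const groups = new Map();
  // emit() files each value under its key, as the server would do.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Run map once per document, with `this` bound to the document.
  for (const doc of docs) map.call(doc, emit);
  // Run reduce once per key over the collected values.
  const out = [];
  for (const [key, values] of groups) {
    out.push({ _id: key, value: reduce(key, values) });
  }
  return out;
}

// Hypothetical sample data, used only for this sketch.
const orders = [
  { item: "pen", qty: 2 },
  { item: "ink", qty: 1 },
  { item: "pen", qty: 5 },
];

const totals = simulateMapReduce(
  orders,
  function (emit) { emit(this.item, this.qty); },
  function (key, values) { return values.reduce((a, b) => a + b, 0); }
);
console.log(totals); // [ { _id: 'pen', value: 7 }, { _id: 'ink', value: 1 } ]
```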
2. Run the MapReduce Program (runCommand)
db.runCommand({
    mapreduce: <collection>,
    map: <map function>,
    reduce: <reduce function>
    [, query: <query filter>]
    [, sort: <sort specification>]
    [, limit: <number of records>]
    [, out: <output collection name>]
    [, keeptemp: <true|false>]
    [, finalize: <finalize function>]
    [, scope: <object of external variables>]
    [, verbose: true]
});

Parameter description:
mapreduce: the target collection to operate on.
map: the mapping function (generates the sequence of key-value pairs passed to the reduce function).
reduce: the statistical function.
query: filter for the target records.
sort: sort order for the target records.
limit: limit on the number of target records.
out: the collection in which to store the statistical results. If this parameter is not specified, a temporary collection is used, which is deleted automatically after the client disconnects.
keeptemp: whether to retain the temporary collection.
finalize: final processing function (post-processes the results returned by reduce before they are saved to the result collection).
scope: external variables to make available to map, reduce, and finalize.
verbose: display detailed timing statistics.
3. Map
Test data:
> db.students.insert({classid:1, age:14, name:'Tom'})
> db.students.insert({classid:1, age:12, name:'Jacky'})
> db.students.insert({classid:2, age:16, name:'Lily'})
> db.students.insert({classid:2, age:9, name:'Tony'})
> db.students.insert({classid:2, age:19, name:'Harry'})
> db.students.insert({classid:2, age:13, name:'Vincent'})
> db.students.insert({classid:1, age:14, name:'Bill'})
> db.students.insert({classid:2, age:17, name:'Bruce'})
Map function: You must call emit(key, value) to return a key-value pair, and use this to access the document being processed. Grouping is performed on the key you emit; the following example groups the data by classid. The value can also be a JSON object, which allows several attributes to be passed at once, for example emit(this.classid, {count: 1}).
m = function() { emit(this.classid, 1) }
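As a sketch of the JSON-object variant mentioned above, the snippet below emits an object per student and reduces it to an object with the same fields. The names m2 and r2 and the ageSum attribute are hypothetical additions for illustration, and the emit function defined here is only a local stand-in for the one MongoDB supplies, so the code can run outside the shell.

```javascript
// Local stand-in for the server-provided emit(); it just records the pairs.
const emitted = [];
function emit(key, value) { emitted.push({ key, value }); }

// Map: emit an object carrying two attributes per student (ageSum is hypothetical).
const m2 = function () { emit(this.classid, { count: 1, ageSum: this.age }); };

// Reduce: return an object with the same fields as the emitted values.
const r2 = function (key, values) {
  const total = { count: 0, ageSum: 0 };
  values.forEach(function (v) {
    total.count += v.count;
    total.ageSum += v.ageSum;
  });
  return total;
};

// The three classid 1 students from the test data.
const classOne = [
  { classid: 1, age: 14, name: "Tom" },
  { classid: 1, age: 12, name: "Jacky" },
  { classid: 1, age: 14, name: "Bill" },
];
classOne.forEach(function (doc) { m2.call(doc); });

const result = r2(1, emitted.map(function (e) { return e.value; }));
console.log(result); // { count: 3, ageSum: 40 }
```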
4. Reduce
The arguments received by the Reduce function resemble the result of a group operation: the key-value pairs emitted by Map are combined into {key, [value1, value2, value3, ...]} and passed to the Reduce function. The Reduce function computes a statistic over these values, and the result it returns can also be a JSON object.
> r = function(key, values) {
... var x = 0;
... values.forEach(function(v) { x += v });
... return x;
... }
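One property worth checking in a reduce function: MongoDB may call reduce more than once for the same key and feed an earlier result back in as one of the values, so the return value should have the same shape as the emitted values. The sketch below checks this for the summing function in plain JavaScript; the variable names oneShot, partial, and reReduced are illustrative only.

```javascript
// The summing reduce function from this section.
const r = function (key, values) {
  var x = 0;
  values.forEach(function (v) { x += v; });
  return x;
};

// Reducing all five values for classid 2 in a single call...
const oneShot = r(2, [1, 1, 1, 1, 1]);

// ...should match reducing a partial batch first and then re-reducing.
const partial = r(2, [1, 1, 1]);
const reReduced = r(2, [partial, 1, 1]);

console.log(oneShot, reReduced); // 5 5
```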
5. Run
> res = db.runCommand({
... mapreduce:"students",
... map:m,
... reduce:r,
... out:"students_res"
... });
{
    "result" : "students_res",
    "timeMillis" : 1587,
    "counts" : {
        "input" : 8,
        "emit" : 8,
        "output" : 2
    },
    "ok" : 1
}
> db.students_res.find()
{ "_id" : 1, "value" : 3 }
{ "_id" : 2, "value" : 5 }
6. Further processing of results
With finalize(), we can process the result of reduce further. The function takes the grouping key and the reduced value as input.
f = function(key, value) { return {classid:key, count:value}; }
> res = db.runCommand({
... mapreduce:"students",
... map:m,
... reduce:r,
... out:"students_res",
... finalize:f
... });
{
    "result" : "students_res",
    "timeMillis" : 804,
    "counts" : {
        "input" : 8,
        "emit" : 8,
        "output" : 2
    },
    "ok" : 1
}
> db.students_res.find()
{ "_id" : 1, "value" : { "classid" : 1, "count" : 3 } }
{ "_id" : 2, "value" : { "classid" : 2, "count" : 5 } }
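The effect of finalize can be reproduced outside MongoDB by applying f once to each reduced row. The sketch below starts from the reduce output shown earlier; the reduced and finalized arrays are plain JavaScript stand-ins for the result collection, not anything MongoDB returns in this form.

```javascript
// The finalize function from this section.
const f = function (key, value) { return { classid: key, count: value }; };

// Rows as produced by reduce before finalize (taken from the earlier run).
const reduced = [
  { _id: 1, value: 3 },
  { _id: 2, value: 5 },
];

// finalize receives the key and the reduced value, once per key.
const finalized = reduced.map(function (row) {
  return { _id: row._id, value: f(row._id, row.value) };
});

console.log(JSON.stringify(finalized));
// [{"_id":1,"value":{"classid":1,"count":3}},{"_id":2,"value":{"classid":2,"count":5}}]
```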
7. Filtering and sorting options
The query, sort, and limit options were described in the parameter list above. For example, to filter by age:
> res = db.runCommand({
... mapreduce:"students",
... map:m,
... reduce:r,
... out:"students_res",
... finalize:f,
... query:{age:{$lt:10}}
... });
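To see what this filter should leave in students_res, the same steps can be traced in plain JavaScript over the test data. This is a sketch under the assumption that query behaves like an ordinary find filter applied before map; Array.prototype.filter stands in for {age: {$lt: 10}}, and the counts object replaces the map and reduce steps.

```javascript
// The test data inserted in section 3.
const students = [
  { classid: 1, age: 14, name: "Tom" },
  { classid: 1, age: 12, name: "Jacky" },
  { classid: 2, age: 16, name: "Lily" },
  { classid: 2, age: 9, name: "Tony" },
  { classid: 2, age: 19, name: "Harry" },
  { classid: 2, age: 13, name: "Vincent" },
  { classid: 1, age: 14, name: "Bill" },
  { classid: 2, age: 17, name: "Bruce" },
];

// Plain-JS stand-in for query: {age: {$lt: 10}}.
const matched = students.filter(function (s) { return s.age < 10; });

// Count matched students per classid (what map + reduce compute).
const counts = {};
matched.forEach(function (s) {
  counts[s.classid] = (counts[s.classid] || 0) + 1;
});

// Apply the finalize function from section 6 to each group.
const f = function (key, value) { return { classid: key, count: value }; };
const rows = Object.keys(counts).map(function (k) {
  const id = Number(k);
  return { _id: id, value: f(id, counts[k]) };
});

console.log(JSON.stringify(rows));
// [{"_id":2,"value":{"classid":2,"count":1}}]
```

Only Tony (age 9, classid 2) passes the filter, so a single group with a count of 1 remains.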