Grouping statistics using MapReduce in MongoDB
Recently, while counting the distinct URLs visited during a certain period, the distinct call failed because the data set was too large. The error was:
distinct failed: {
    "errmsg" : "exception: distinct too big, 16mb cap",
    "code" : 17217,
    "ok" : 0
} at src/mongo/shell/collection.js:1108
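distinct returns all of its values in a single result document, and a document is capped at 16 MB. After some reading, I solved the problem with MapReduce, which writes its results to a collection instead: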
// Define the map function: emit one {count: 1} per visit, keyed by URL
map = function () {
    emit(this.url, {"count": 1});
};
// Define the reduce function: sum the counts emitted for one URL
reduce = function (key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i].count;
    }
    return {count: total};
};
// Run the mapreduce command. "out" names the collection that stores the results.
db.runCommand({
    "mapreduce": "visit",
    "map": map,
    "reduce": reduce,
    "query": {"vtime": {"$gte": 1412611200, "$lte": 1413907119}},
    "out": "test.tmp"
});
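To see why this gives a deduplicated count, the map and reduce steps can be simulated locally in plain JavaScript without a MongoDB server. This is only a rough sketch: the simulateMapReduce helper and the sample documents are made up for illustration, and emit is passed to the map function as an argument here, whereas in MongoDB it is a global. The last lines also check that the reduce function can be fed its own output, which matters because MongoDB may call reduce more than once for the same key.

```javascript
// Hypothetical local simulation of the map and reduce steps (not MongoDB's implementation).

// Same logic as the map function above, but emit is passed in as a parameter.
function mapFn(emit) {
  emit(this.url, { count: 1 });
}

// Same logic as the reduce function above: sum the counts for one key.
function reduceFn(key, values) {
  let total = 0;
  for (const v of values) {
    total += v.count;
  }
  return { count: total };
}

// Group emitted values by key, then reduce each group to one output document.
function simulateMapReduce(docs, map, reduce) {
  const groups = new Map(); // key -> array of emitted values
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => map.call(doc, emit));

  const out = [];
  for (const [key, values] of groups) {
    // Like MongoDB, skip reduce when a key has only a single value.
    const value = values.length === 1 ? values[0] : reduce(key, values);
    out.push({ _id: key, value: value });
  }
  return out;
}

// Made-up sample visits; only the url field matters for the grouping.
const sampleDocs = [
  { url: "/a", vtime: 1412611300 },
  { url: "/b", vtime: 1412611400 },
  { url: "/a", vtime: 1412611500 },
];

const result = simulateMapReduce(sampleDocs, mapFn, reduceFn);
console.log(result);                         // one output document per distinct url
console.log("distinct urls:", result.length);

// Re-reduce check: reducing a partial result together with a new value still works,
// because reduce returns the same shape ({count: n}) that map emits.
const partial = reduceFn("/a", [{ count: 1 }, { count: 1 }]);
const full = reduceFn("/a", [partial, { count: 1 }]);
console.log("re-reduced count:", full.count);
```

Each output document corresponds to one distinct URL, so the deduplicated count is simply the number of documents in the output collection. In the shell that would be something like db.getCollection("test.tmp").count(), assuming the out name used above.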