From my personal blog: MongoDB MapReduce Usage Summary
As we all know, MongoDB is a non-relational database: each collection (table) is independent, and there are no dependencies between collections. Besides the usual CRUD operations, MongoDB also provides aggregation and MapReduce for statistics. This article focuses on MongoDB's MapReduce operation.
I will not repeat the concept of MapReduce here; look it up if you are not familiar with it.
In MongoDB, the syntax for MapReduce is as follows:
db.table.mapReduce(
    map,
    reduce,
    {
        query: query,
        out: out,          // Specifies how the result set is stored. Options:
                           //   replace: if the output collection (table) exists, replace it
                           //   merge:   if a record with the same key exists, overwrite it
                           //   reduce:  if a record with the same key exists, reduce the old and
                           //            new records together, then overwrite the old record
                           //   {inline: 1}: keep the results in memory, do not write to disk
                           //            (suitable for small data volumes)
        sort: sort,
        limit: limit,
        finalize: function (key, reducedValue) {
            // This function is mainly used to modify the final data
            // return modifiedValue;
        },
        scope: document,   // Global variables that map, reduce and finalize can access
        jsMode: boolean,   // Whether to keep intermediate data as JavaScript objects instead of
                           // converting it to BSON between map and reduce; false by default.
                           // If you set it to true, remember the official caution:
                           // you can only use jsMode for result sets with fewer than
                           // 500,000 distinct key arguments to the mapper's emit() function.
        verbose: boolean   // Whether to include timing information in the result set;
                           // included by default
    }
)
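To make the difference between the three collection-writing out modes concrete, here is a small in-memory sketch in plain JavaScript. It is only a simulation of the behavior described above, not MongoDB's implementation; applyOut, sumReduce and the sample data are hypothetical names invented for this illustration.

```javascript
// Simulates how replace / merge / reduce combine new results with an
// existing output collection. Collections are modelled as plain objects
// mapping _id -> value. This is NOT how MongoDB implements it internally.
function applyOut(existing, results, mode, reduce) {
    if (mode === "replace") {
        return { ...results }; // old contents are discarded entirely
    }
    if (mode !== "merge" && mode !== "reduce") {
        throw new Error("unknown out mode: " + mode);
    }
    const target = { ...existing };
    for (const [key, value] of Object.entries(results)) {
        const clash = Object.prototype.hasOwnProperty.call(existing, key);
        // merge: the new value always wins.
        // reduce: an existing value is combined with the new one via reduce.
        target[key] = (mode === "reduce" && clash)
            ? reduce(key, [existing[key], value])
            : value;
    }
    return target;
}

// Hypothetical reduce function that adds up a "sum" field.
const sumReduce = (key, values) => ({
    sum: values.reduce((acc, v) => acc + v.sum, 0)
});

const existing = { a1: { sum: 5 }, a2: { sum: 1 } };
const results = { a1: { sum: 2 }, a3: { sum: 7 } };

const replaced = applyOut(existing, results, "replace", sumReduce);
const merged = applyOut(existing, results, "merge", sumReduce);
const reduced = applyOut(existing, results, "reduce", sumReduce);

console.log(JSON.stringify(replaced)); // {"a1":{"sum":2},"a3":{"sum":7}}
console.log(JSON.stringify(merged));   // {"a1":{"sum":2},"a2":{"sum":1},"a3":{"sum":7}}
console.log(JSON.stringify(reduced));  // {"a1":{"sum":7},"a2":{"sum":1},"a3":{"sum":7}}
```

Only the reduce mode accumulates across runs, which is why it suits incremental statistics, while merge simply keeps the most recent value for each key.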
When running a MapReduce in MongoDB, make sure your query can use an index; otherwise, on a large data set, the job will scan the entire collection. If there is no way to build an index, drop the query and filter out the non-conforming data when processing the result set instead.
The syntax of MapReduce is actually very simple, but there are a few things to note:
1. During the map phase, MongoDB runs reduce once for every 1000 emitted documents.
2. In map, if you want to compute the sum of some value, you need to emit it like this:
emit(this.key, {sum: 0})
Then, in reduce, you must add the sum carried over from the previous iteration to the running total and return {sum: sum}. If you don't, the result only reflects the last batch of fewer than 1000 documents, and the earlier counts are lost.
3. If you can avoid MapReduce, avoid it. Do the counting in your application code rather than running statistics in MongoDB frequently.
4. The result set of a MapReduce has the format {_id: key, value: {}}. If you want to use the output collection directly, it is best to reorganize the data once so that the fields sit at the top level, instead of querying with value.xxx.
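Notes 1 and 2 can be checked without a database. The following is a minimal in-memory sketch in plain JavaScript, assuming a batch size of 1000 as stated above. simulateMapReduce, goodReduce and badReduce are hypothetical names, emit is passed to map as an argument (in the mongo shell it is a global), and each document emits {sum: 1} so the total is easy to verify.

```javascript
// In-memory stand-in for the map/reduce cycle; NOT MongoDB's implementation.
// batchSize mimics "reduce once every N emitted values".
function simulateMapReduce(docs, map, reduce, batchSize) {
    const groups = new Map(); // key -> values waiting to be reduced
    const emit = (key, value) => {
        if (!groups.has(key)) groups.set(key, []);
        const pending = groups.get(key);
        pending.push(value);
        // When the batch is full, collapse it into one partial result,
        // which becomes the first value of the next batch.
        if (pending.length >= batchSize) {
            groups.set(key, [reduce(key, pending)]);
        }
    };
    docs.forEach((doc) => map.call(doc, emit));

    const out = {};
    for (const [key, pending] of groups) {
        // A key with a single value is passed through without calling reduce.
        out[key] = pending.length > 1 ? reduce(key, pending) : pending[0];
    }
    return out;
}

// 2500 sample documents that all belong to the same account.
const docs = [];
for (let i = 0; i < 2500; i++) docs.push({ accountid: "a1" });

const map = function (emit) {
    emit(this.accountid, { sum: 1 });
};

// Correct: adds the sum carried by every value, including partial results.
const goodReduce = (key, values) => ({
    sum: values.reduce((acc, v) => acc + v.sum, 0)
});

// Incorrect: counts the values it was handed and ignores what they carry.
const badReduce = (key, values) => ({ sum: values.length });

const good = simulateMapReduce(docs, map, goodReduce, 1000);
const bad = simulateMapReduce(docs, map, badReduce, 1000);

console.log(good.a1.sum); // 2500
console.log(bad.a1.sum);  // 502
```

The incorrect version ends up with a number below 1000 because every partial result is counted as a single value, which matches the data loss described in note 2.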
Here is a MapReduce that counts the number of posts each user has published on our website, as a reference for the code format:
var db = connect('127.0.0.1:27017/test');
db.aAccounttemp.drop();

// map: emit one zeroed counter record per document, carrying the flags that reduce inspects
var map = function () {
    emit(this.accountid, {
        sum: 0,
        reblogflag: this.reblogflag,
        dashboardflag: this.dashboardflag,
        dashboardtype: this.dashboardtype,
        photonum: 0, postnum: 0, reblognum: 0, videonum: 0, videoshortnum: 0,
        musicnum: 0, questionnum: 0, appnum: 0, dialognum: 0
    });
};

var reduce = function (key, values) {
    var sum = 0;
    var photonum = 0;
    var postnum = 0;
    var reblognum = 0;
    var videonum = 0;
    var videoshortnum = 0;
    var musicnum = 0;
    var questionnum = 0;
    var appnum = 0;
    var dialognum = 0;
    for (var i = 0; i < values.length; i++) {
        var data = values[i];
        var reblogflag = data.reblogflag;
        var dashboardflag = data.dashboardflag;
        var dashboardtype = data.dashboardtype;
        // Carry over the counts from earlier reduce passes
        sum += data.sum;
        photonum += data.photonum;
        reblognum += data.reblognum;
        postnum += data.postnum;
        videonum += data.videonum;
        musicnum += data.musicnum;
        videoshortnum += data.videoshortnum;
        questionnum += data.questionnum;
        appnum += data.appnum;
        dialognum += data.dialognum;
        // The dashboardtype codes compared below are not given in this listing
        if (!reblogflag) {
            if (dashboardflag) {
                sum += 1;
                if (dashboardtype == /* code */) {
                    postnum += 1;
                } else if (dashboardtype == /* code */) {
                    photonum += 1;
                } else if (dashboardtype == /* code */) {
                    videonum += 1;
                } else if (dashboardtype == /* code */) {
                    videoshortnum += 1;
                } else if (dashboardtype == /* code */) {
                    musicnum += 1;
                } else if (dashboardtype == /* code */) {
                    questionnum += 1;
                } else if (dashboardtype == /* code */) {
                    appnum += 1;
                } else if (dashboardtype == /* code */) {
                    dialognum += 1;
                }
            } else {
                if (dashboardtype == /* code */) {
                    photonum += 1;
                }
            }
        } else if (reblogflag && dashboardflag) {
            reblognum += 1;
        }
    }
    return {
        sum: NumberInt(sum),
        reblognum: NumberInt(reblognum),
        postnum: NumberInt(postnum),
        photonum: NumberInt(photonum),
        videonum: NumberInt(videonum),
        videoshortnum: NumberInt(videoshortnum),
        musicnum: NumberInt(musicnum),
        questionnum: NumberInt(questionnum),
        appnum: NumberInt(appnum),
        dialognum: NumberInt(dialognum)
    };
};
db.getMongo().setSlaveOk();
db.dashboard_basic.mapReduce(map, reduce, {out: {merge: 'aAccounttemp'}});
var results = db.aAccounttemp.find();
// Reorganize the data format into the regular table
while (results.hasNext()) {
    var obj = results.next();
    var value = obj.value;
    var sum = NumberInt(value.sum);
    var reblognum = NumberInt(value.reblognum);
    var postnum = NumberInt(value.postnum);
    var photonum = NumberInt(value.photonum);
    var videonum = NumberInt(value.videonum);
    var videoshortnum = NumberInt(value.videoshortnum);
    var musicnum = NumberInt(value.musicnum);
    var questionnum = NumberInt(value.questionnum);
    var appnum = NumberInt(value.appnum);
    var dialognum = NumberInt(value.dialognum);
    var accountId = obj._id;
    db.dashboard_account_num.insert({
        accountid: accountId, sum: sum, reblognum: reblognum, postnum: postnum,
        photonum: photonum, videoshortnum: videoshortnum, videonum: videonum,
        musicnum: musicnum, questionnum: questionnum, appnum: appnum,
        dialognum: dialognum
    });
}
print('Success Insert total ' + results.count() + ' datas');
db.aAccounttemp.drop();
quit();
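The reshaping step at the end of the script (turning {_id: key, value: {}} into a flat record) can also be sketched in isolation. This is a plain JavaScript illustration with made-up sample data; flattenResult is a hypothetical helper, not part of the mongo shell.

```javascript
// Turns one MapReduce result {_id: key, value: {...}} into a flat record
// {<keyName>: key, ...fields}, so fields can be queried without "value.xxx".
function flattenResult(doc, keyName) {
    return Object.assign({ [keyName]: doc._id }, doc.value);
}

// Made-up sample results in the MapReduce output format.
const raw = [
    { _id: "a1", value: { sum: 3, postnum: 2, photonum: 1 } },
    { _id: "a2", value: { sum: 1, postnum: 0, photonum: 1 } }
];

const flat = raw.map((doc) => flattenResult(doc, "accountid"));
console.log(JSON.stringify(flat[0])); // {"accountid":"a1","sum":3,"postnum":2,"photonum":1}
```

Copying every field generically keeps the reshaping code short, at the cost of not forcing each counter to an integer type the way the NumberInt calls in the script do.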