MongoDB map reduce, mongoDBmapreduce
MongoDB map reduce usage
Example:
Res = db. runCommand ({
Mapreduce: 'liveepguservisits ',
Map: function (){
Emit ({provice: this. provice}, {"data": [{"mac": this. mac}], visit: this. visitNum, userCount: 0 });
},
Reduce: function (key, value ){
Var ret = {data: []}, visit = 0
Var userCount = 0;
Var macs = {};
Var sum = 0;
For (var I in value ){
Var ia = value [I];
For (var j in ia. data ){
If (! Macs [ia. data [j]. mac]) {
Macs [ia. data [j]. mac] = true;
Ret. data. push (ia. data [j]);
UserCount + = 1;
}
}
Sum + = Number (ia. visit );
}
Ret. visit = sum;
Ret. userCount = userCount;
Return ret;
},
Query: {"inputTime": {$ gte: ISODate ("2014-09-17T14: 20: 00Z"), $ lte: ISODate ("2014-09-17T14: 30: 00Z ")}},
Finalize: function (key, values ){
Return [{count: values. data. length}, {visit: values. visit}, {userCount: values. userCount}];
},
Out: 'tmp _ mo_spcode_consignid_1 ',
Verbose: true
}) In the preceding example, map: key is the province, and value is the mac value, access number, and user number of the province. Reduce: Get the value of each province, and then perform the operation. The result is returned. Use the finalize parameter to specify the output format. If no output format is specified, the key and value formats of map are installed for output.
Db. runCommand contains the following parameters:
db.runCommand( { mapreduce : <collection>, map : <mapfunction>, reduce : <reducefunction> [, query : <query filter object>] [, sort : <sort the query. useful for optimization>] [, limit : <number of objects to return from collection>] [, out : <output-collection name>] [, keeptemp: <true|false>] [, finalize : <finalizefunction>] [, scope : <object where fields go into javascript global scope >] [, verbose : true] });
-Mapreduce: Specifies the collection for mapreduce processing.
-Map: map function-reduce: reduce function-out: name of the collection in the output result. If this parameter is not specified, a collection with a random name will be created by default. (If the out option is used, you do not need to specify keeptemp: true because it is already implicitly included.)-query: A filtering condition. Only documents that meet the conditions can call the map function. (Query. Limit and sort can be combined at Will)-sort: the sort sorting parameter that is combined with limit (also sorted by the document before being sent to the map function). The grouping mechanism can be optimized-limit: the maximum number of documents sent to the map function (if there is no limit, sort alone is of little use)-keytemp: true or false, indicating whether the collection output result is temporary, if you want to keep this set after the connection is closed, you must specify keeptemp to true. If you are using the MongoDB mongo client to connect, it will be deleted only after exit. If the script is executed, exit the script or call close will automatically delete the result collection-finalize: Yes function, it calculates the key and value and returns the final result after executing map and reduce. This is the last step in the processing process. Therefore, finalize is a calculation average and an array is cropped, the right time to clear unnecessary information-scope: variables to be used in javascript code. The variables defined here are visible in the map, reduce, finalize functions-verbose: the detailed output option for debugging. To view the running process of MpaReduce, you can set it to true. You can also print the information in the map, reduce, and finalize processes to the server log.
Who knows about mongodb's mapreduce?
Map: it can be understood as the data to be filled. In SQL, it is like the portion of the where condition to be filtered;
Reduce: it can be understood as the field to be displayed;
Because mapreduce is very difficult for beginners to understand. We recommend that you start with the simple group method;
In addition, the performance of MapReduce is very low. Unless the background statistics are performed, do not use MapReduce or query it as the front-end data access method.
Which of the following processing methods is optimal for mongodb multi-Table Association?
This is the requirement. A game background system needs to analyze the log files generated every day. Game logs include user registration and user logon. Use mapreduce to collect user registration information to a collection of user_register, deduplicate the user login information and place it in another collection of user_login. Now you need to associate the two sets with the user name to collect some data. However, I found a lot of information and did not find a good solution for mongodb in this regard. I also thought about using mapreduce to solve this problem. However, according to my experience in using mapreduce during this time, it seems that mapreduce can only process one set and cannot process two sets at the same time. One solution I have come up with is to read all the data in these two sets and then use the program code for processing. Although this method can solve the problem temporarily, it is certainly not the best. So I took the liberty to send you this message to see if you can give some reasonable suggestions or methods. Thank you !!