MongoDB map reduce, mongoDBmapreduce

Source: Internet
Author: User

MongoDB map reduce usage

Example:

res = db.runCommand({
    mapreduce: 'liveepguservisits',
    map: function () {
        // userCount starts at 1 because MongoDB does not call reduce
        // for a key that has only a single value.
        emit({provice: this.provice}, {data: [{mac: this.mac}], visit: this.visitNum, userCount: 1});
    },
    reduce: function (key, values) {
        var ret = {data: []};
        var userCount = 0;
        var macs = {};  // MAC addresses already seen for this key
        var sum = 0;
        for (var i in values) {
            var ia = values[i];
            for (var j in ia.data) {
                // Count each MAC address only once per province.
                if (!macs[ia.data[j].mac]) {
                    macs[ia.data[j].mac] = true;
                    ret.data.push(ia.data[j]);
                    userCount += 1;
                }
            }
            sum += Number(ia.visit);
        }
        ret.visit = sum;
        ret.userCount = userCount;
        return ret;
    },
    query: {inputTime: {$gte: ISODate("2014-09-17T14:20:00Z"), $lte: ISODate("2014-09-17T14:30:00Z")}},
    finalize: function (key, value) {
        return [{count: value.data.length}, {visit: value.visit}, {userCount: value.userCount}];
    },
    out: 'tmp_mo_spcode_consignid_1',
    verbose: true
})

In the preceding example, map emits the province as the key and, as the value, the MAC address, visit count, and user count for that province. reduce receives all the values for each province, removes duplicate MAC addresses, sums the visits, and returns the result. The finalize parameter specifies the output format. If finalize is omitted, the output keeps the key and value format produced by map and reduce.
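The flow described above can be tried without a server. The following is a minimal plain-JavaScript sketch adapted from the example; the sample documents and the helpers `mapDoc` and `groupByKey` are invented for illustration, and a plain string stands in for the `{provice: ...}` key. It also checks a property MongoDB requires of reduce functions: feeding a partial result back into reduce must give the same answer as reducing everything in one pass.

```javascript
// Plain-JavaScript simulation; no MongoDB server is involved.
// Sample documents are invented for illustration.
var docs = [
  { provice: "A", mac: "m1", visitNum: 2 },
  { provice: "A", mac: "m1", visitNum: 3 },
  { provice: "A", mac: "m2", visitNum: 1 },
  { provice: "B", mac: "m3", visitNum: 4 }
];

// Hypothetical stand-in for map + emit: returns one key/value pair per document.
// userCount starts at 1 so a key with a single value, which never reaches
// reduce, is still counted.
function mapDoc(doc) {
  return {
    key: doc.provice,
    value: { data: [{ mac: doc.mac }], visit: doc.visitNum, userCount: 1 }
  };
}

// Deduplicate MAC addresses and sum visits, as in the example's reduce.
function reduce(key, values) {
  var ret = { data: [], visit: 0, userCount: 0 };
  var macs = {};
  values.forEach(function (v) {
    v.data.forEach(function (entry) {
      if (!macs[entry.mac]) {
        macs[entry.mac] = true;
        ret.data.push(entry);
        ret.userCount += 1;
      }
    });
    ret.visit += Number(v.visit);
  });
  return ret;
}

function finalize(key, value) {
  return [{ count: value.data.length }, { visit: value.visit }, { userCount: value.userCount }];
}

// Hypothetical helper standing in for the server's grouping step.
function groupByKey(documents) {
  var groups = {};
  documents.forEach(function (doc) {
    var e = mapDoc(doc);
    (groups[e.key] = groups[e.key] || []).push(e.value);
  });
  return groups;
}

var groups = groupByKey(docs);

// Reduce province "A" in one pass...
var onePass = reduce("A", groups["A"]);
// ...and in two passes, feeding a partial result back into reduce.
var partial = reduce("A", groups["A"].slice(0, 2));
var twoPass = reduce("A", [partial, groups["A"][2]]);

console.log(JSON.stringify(finalize("A", onePass))); // [{"count":2},{"visit":6},{"userCount":2}]
console.log(JSON.stringify(finalize("A", twoPass))); // [{"count":2},{"visit":6},{"userCount":2}]
```

Both passes agree because reduce returns a value with the same shape as the values it receives, which is what lets the server call it repeatedly on partial results.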

db.runCommand takes the following parameters:

db.runCommand({
    mapreduce: <collection>,
    map: <mapfunction>,
    reduce: <reducefunction>
    [, query: <query filter object>]
    [, sort: <sort the query; useful for optimization>]
    [, limit: <number of objects to return from collection>]
    [, out: <output-collection name>]
    [, keeptemp: <true|false>]
    [, finalize: <finalizefunction>]
    [, scope: <object where fields go into javascript global scope>]
    [, verbose: true]
});

- mapreduce: the collection that mapreduce processes.
- map: the map function.
- reduce: the reduce function.
- out: the name of the collection that receives the results. If it is not specified, a collection with a random name is created by default. (If the out option is used, you do not need to specify keeptemp: true, because it is implied.)
- query: a filter condition. Only documents that match it are passed to the map function. (query, limit, and sort can be combined freely.)
- sort: a sort specification used together with limit; documents are sorted before they are sent to the map function. It can optimize the grouping mechanism.
- limit: the maximum number of documents sent to the map function (without limit, sort alone is of little use).
- keeptemp: true or false, indicating whether the output collection is temporary. To keep the collection after the connection is closed, set keeptemp to true. If you are connected through the mongo shell, the collection is deleted only when the shell exits. If you are running a script, the collection is deleted automatically when the script exits or calls close.
- finalize: a function that receives the key and value after map and reduce have run and returns the final result. It is the last step of the process, so it is the right place to compute averages, trim arrays, and remove unneeded information.
- scope: variables to be used in the JavaScript code. Variables defined here are visible in the map, reduce, and finalize functions.
- verbose: a detailed-output option for debugging. Set it to true to see how the MapReduce run proceeds. You can also use print inside the map, reduce, and finalize functions to write information to the server log.
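To make the order in which these options take effect concrete, here is a small in-memory sketch in plain JavaScript. `runMapReduce` is a hypothetical helper written for this illustration, not MongoDB's API: query is a predicate instead of a filter object, sort is a comparator, and emit and scope are passed as arguments rather than being globals as they are on the server. The sample documents and field names (`city`, `ms`) are also assumptions.

```javascript
// Hypothetical in-memory stand-in for the mapreduce command (not MongoDB's API).
// It only mirrors the order in which the options take effect.
function runMapReduce(docs, opts) {
  // 1. query: keep only matching documents.
  var selected = docs.filter(opts.query || function () { return true; });
  // 2. sort, then 3. limit: both apply before map is called.
  if (opts.sort) { selected = selected.slice().sort(opts.sort); }
  if (opts.limit) { selected = selected.slice(0, opts.limit); }

  // 4. map: collect the emitted values per key.
  var groups = {};
  var keys = [];
  function emit(key, value) {
    if (!Object.prototype.hasOwnProperty.call(groups, key)) {
      groups[key] = [];
      keys.push(key);
    }
    groups[key].push(value);
  }
  selected.forEach(function (doc) { opts.map.call(doc, emit, opts.scope); });

  // 5. reduce (only for keys with more than one value), then 6. finalize.
  return keys.map(function (key) {
    var values = groups[key];
    var value = values.length > 1 ? opts.reduce(key, values, opts.scope) : values[0];
    if (opts.finalize) { value = opts.finalize(key, value, opts.scope); }
    return { _id: key, value: value };
  });
}

// Invented sample data: response times per city.
var sampleDocs = [
  { city: "X", ms: 10 },
  { city: "Y", ms: 30 },
  { city: "X", ms: 20 },
  { city: "Y", ms: 50 },
  { city: "X", ms: -1 }, // removed by query
  { city: "Z", ms: 40 }  // removed by sort + limit
];

var results = runMapReduce(sampleDocs, {
  query: function (doc) { return doc.ms >= 0; },
  sort: function (a, b) { return a.city < b.city ? -1 : a.city > b.city ? 1 : 0; },
  limit: 4,
  scope: { unit: "ms" },
  map: function (emit) { emit(this.city, { sum: this.ms, count: 1 }); },
  reduce: function (key, values) {
    var out = { sum: 0, count: 0 };
    values.forEach(function (v) { out.sum += v.sum; out.count += v.count; });
    return out;
  },
  // finalize is where the average is computed, using a variable from scope.
  finalize: function (key, value, scope) {
    return { avg: value.sum / value.count, unit: scope.unit };
  }
});

console.log(JSON.stringify(results));
// [{"_id":"X","value":{"avg":15,"unit":"ms"}},{"_id":"Y","value":{"avg":40,"unit":"ms"}}]
```

The average is deliberately left to finalize: reduce only carries a running sum and count, so it stays safe to apply to partial results.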
 

Can anyone explain MongoDB's mapreduce?

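Map: think of it as selecting the data to feed in. In SQL terms, it is loosely comparable to choosing the grouping key and the columns to aggregate (row filtering itself is handled by the query option);
Reduce: think of it as computing the fields to be displayed for each key, much like an aggregate function in SQL.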

Because mapreduce is hard for beginners to understand, we recommend starting with the simpler group method.

In addition, MapReduce performs poorly. Use it only for background statistics jobs; do not use it, or query through it, as a front-end data access path.

What is the best way to join multiple collections in MongoDB?

Here is the requirement. A game back-end system needs to analyze the log files generated every day. The game logs include user registrations and user logins. I use mapreduce to collect the user registration information into a collection named user_register, and to deduplicate the user login information and store it in another collection named user_login. Now I need to join the two collections on the user name to compute some statistics. I have searched a lot of material but have not found a good way to do this in MongoDB. I also considered using mapreduce for the join, but from my experience with it so far, mapreduce seems to process only one collection at a time and cannot process two collections together. One solution I have come up with is to read all the data from both collections and process it in program code. That works as a stopgap, but it is certainly not the best approach, so I am writing to ask whether you can suggest a more reasonable method. Thank you!
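As a point of reference, the stopgap the question describes (reading both collections and joining them in program code) can be sketched as a hash join on the user name. This is only a sketch under assumptions: the arrays stand in for documents already read from user_register and user_login, and the field names `username`, `registeredAt`, and `loginCount` as well as the helper `joinByUsername` are hypothetical.

```javascript
// Invented sample data standing in for documents read from the two collections.
var userRegister = [
  { username: "alice", registeredAt: "2014-09-01" },
  { username: "bob", registeredAt: "2014-09-02" }
];
var userLogin = [
  { username: "alice", loginCount: 3 }
];

// Hash join: index the (already deduplicated) logins by user name,
// then look each registration up in that index.
function joinByUsername(registrations, logins) {
  var loginsByName = new Map();
  logins.forEach(function (login) { loginsByName.set(login.username, login); });

  return registrations.map(function (reg) {
    var login = loginsByName.get(reg.username);
    return {
      username: reg.username,
      registeredAt: reg.registeredAt,
      // Users who registered but never logged in get a count of 0.
      loginCount: login ? login.loginCount : 0
    };
  });
}

var joined = joinByUsername(userRegister, userLogin);
console.log(JSON.stringify(joined));
// [{"username":"alice","registeredAt":"2014-09-01","loginCount":3},{"username":"bob","registeredAt":"2014-09-02","loginCount":0}]
```

Building the index first keeps the join linear in the size of the two inputs, but everything must fit in the application's memory. On the server side, the mapReduce out parameter also accepts a reduce mode (out: {reduce: <collection>}), which merges the results of separate runs into one collection by key; whether that suits this case depends on the data and the server version.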
