Data aggregation with MapReduce in MongoDB

Source: Internet
Author: User
Tags: emit, mongodb

MongoDB is a non-relational database that grew out of big-data environments and is designed to store large volumes of data. With that much data, statistical operations matter a great deal, so how do we compute statistics over data stored in MongoDB?

MongoDB provides three ways of aggregating data:

(1) simple aggregation functions;

(2) using aggregate for statistics;

(3) using MapReduce for statistics;

Today we look at how to compute statistics with MapReduce; the other two approaches will be covered in separate follow-up articles.

What is MapReduce? As I understand it, it first preprocesses every document in the collection that matches the query conditions, extracting and grouping the data of interest, and then runs statistics over the grouped data to obtain the final result. The map function preprocesses each matching document and emits the desired data as key/value pairs. The reduce function processes the values collected for each key to produce the statistical result. Both map and reduce are JavaScript functions.
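The flow described above can be sketched in plain JavaScript without a database. This is only an in-memory illustration, not MongoDB's implementation: `runMapReduce` is a hypothetical helper, and `emit` is passed to map as an argument here, whereas in MongoDB it is available inside the map function as a global.

```javascript
// In-memory sketch of the map -> group -> reduce flow (not MongoDB's implementation).
function runMapReduce(docs, map, reduce) {
  const groups = new Map(); // key -> list of emitted values
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Map phase: call map once per document, with the document bound to `this`.
  docs.forEach((doc) => map.call(doc, emit));
  // Reduce phase: collapse each key's values into one result.
  // As in MongoDB, reduce is skipped for keys that have a single value.
  const results = {};
  groups.forEach((values, key) => {
    results[key] = values.length === 1 ? values[0] : reduce(key, values);
  });
  return results;
}

const purchases = [
  { user: "majing", sku: 1, price: 2.5 },
  { user: "majing", sku: 2, price: 4.0 },
  { user: "Liyonghu", sku: 9, price: 1.5 },
];

const countPerUser = runMapReduce(
  purchases,
  function (emit) { emit(this.user, 1); },
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);
console.log(countPerUser); // { majing: 2, Liyonghu: 1 }
```

The sketch groups by simple string keys only, which is enough to show how emitted values end up in the list that reduce receives.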

First, we build a test collection, test, and use a JavaScript script to insert 100 random documents into it. Each document records which user bought which product (sku) and at what price. The script, test1.js, is as follows:

    for (var i = 0; i < 100; i++) {
        var rid = Math.floor(Math.random() * 10);                 // product id (sku), 0..9
        var price = parseFloat((Math.random() * 10).toFixed(2));  // price with two decimals
        if (rid < 3) {
            db.test.insert({"user": "majing", "sku": rid, "price": price});
        } else if (rid >= 3 && rid < 5) {
            db.test.insert({"user": "Wufenglei", "sku": rid, "price": price});
        } else if (rid >= 5 && rid < 8) {
            db.test.insert({"user": "Wufenglei", "sku": rid, "price": price});
        } else {
            db.test.insert({"user": "Liyonghu", "sku": rid, "price": price});
        }
    }
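To experiment without a running MongoDB instance, documents of the same shape can be built in a plain array. The following is a sketch under that assumption; `makePurchases` is a hypothetical helper that is not part of the original script, and it collects the documents instead of calling `db.test.insert`.

```javascript
// Hypothetical helper: builds the test documents in memory instead of inserting them.
function makePurchases(n) {
  const docs = [];
  for (let i = 0; i < n; i++) {
    const sku = Math.floor(Math.random() * 10);                 // product id, 0..9
    const price = parseFloat((Math.random() * 10).toFixed(2));  // two decimal places
    let user;
    if (sku < 3) user = "majing";          // skus 0..2
    else if (sku < 8) user = "Wufenglei";  // skus 3..7
    else user = "Liyonghu";                // skus 8..9
    docs.push({ user, sku, price });
  }
  return docs;
}

const purchases = makePurchases(100);
console.log(purchases.length); // 100
console.log(purchases[0]);     // one random document with user, sku, and price fields
```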

Next, we insert the data into the database by running the script from the console, for example by passing test1.js to the mongo shell.

After the script has run, the data can be inspected in MongoVUE to confirm that the documents were inserted into the collection.

Next, we can do a few simple statistical operations.

(1) Count how many items each user has bought. Write the script test2.js and save the results to the statis1 collection.

    map = function () {
        emit(this.user, 1);   // one purchase per document
    };
    reduce = function (key, values) {
        var count = 0;
        values.forEach(function (val) { count += val; });
        return count;
    };
    // The original listing stops before the call; run the job and write the result to statis1.
    db.test.mapReduce(map, reduce, {out: "statis1"});

Run test2.js the same way as test1.js, then view the data in the output collection.

The statistics can be read directly from the database. To see how many items a particular user, such as majing, has purchased, query the output collection directly:

    db.statis1.find({"_id": "majing"});

(2) Count the quantity of each product purchased by each user

The script test3.js looks like this:

    map = function () {
        emit({user: this.user, sku: this.sku}, 1);   // compound key: user plus product
    };
    reduce = function (key, values) {
        var count = 0;
        values.forEach(function (val) { count += val; });
        return count;
    };
    // Assumed invocation (not in the original listing); "statis2" is a placeholder name.
    db.test.mapReduce(map, reduce, {out: "statis2"});

Run test3.js the same way as the previous scripts, then view the data in the output collection.

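A total of 10 records are returned, one per (user, sku) pair. Because the emitted key is an object, the _id of each result is a subdocument with user and sku fields. To find everything a given user bought, filter the output collection on the _id.user field with find().

To look up one particular product bought by a user, filter on both _id.user and _id.sku.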
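The shape of the results and the two lookups can be illustrated in memory. This is a sketch with made-up sample records, not output from the test collection; it assumes the usual map-reduce output shape, `{ _id: <key>, value: <result> }`, and uses array filters as stand-ins for queries on `_id.user` and `_id.sku`.

```javascript
// A few purchase records shaped like the documents in the test collection.
const purchases = [
  { user: "majing", sku: 1, price: 2.5 },
  { user: "majing", sku: 1, price: 3.0 },
  { user: "majing", sku: 2, price: 4.0 },
  { user: "Liyonghu", sku: 9, price: 1.5 },
];

// Group by the compound key. Plain objects compare by identity in JavaScript,
// so this sketch serializes the key to a string in order to group by value.
const counts = new Map();
for (const doc of purchases) {
  const key = JSON.stringify({ user: doc.user, sku: doc.sku });
  counts.set(key, (counts.get(key) || 0) + 1);
}

// Shape the groups like map-reduce output documents.
const results = [...counts].map(([key, value]) => ({ _id: JSON.parse(key), value }));

// In-memory stand-ins for filtering on "_id.user", then on "_id.user" plus "_id.sku".
const byUser = results.filter((r) => r._id.user === "majing");
const byUserAndSku = byUser.filter((r) => r._id.sku === 1);

console.log(byUser.length);         // 2
console.log(byUserAndSku[0].value); // 2
```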

(3) Compute the total number of items purchased by each user and the total amount spent

The script test4.js looks like this:

    map = function () {
        emit({user: this.user}, {totalprice: this.price, count: 1});
    };
    reduce = function (key, values) {
        // start count at 0 so that every purchase is counted exactly once
        var res = {totalprice: 0.00, count: 0};
        values.forEach(function (val) {
            res.totalprice += val.totalprice;
            res.count += val.count;
        });
        return res;
    };
    // Assumed invocation (not in the original listing); "statis3" is a placeholder name.
    db.test.mapReduce(map, reduce, {out: "statis3"});
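One detail worth checking in a reduce function like this is the starting value of the accumulator. MongoDB may call reduce more than once for the same key and feed earlier results back in as values, so reducing in one step or in several should give the same answer. The sketch below tests that property in plain JavaScript; `reduceCountFromOne` is a hypothetical variant included only to show what happens when the counter starts at 1.

```javascript
// Accumulator starts at zero for every field.
function reduce(key, values) {
  const res = { totalprice: 0, count: 0 };
  values.forEach((val) => {
    res.totalprice += val.totalprice;
    res.count += val.count;
  });
  return res;
}

// Hypothetical variant whose counter starts at 1 (for comparison only).
function reduceCountFromOne(key, values) {
  const res = { totalprice: 0, count: 1 };
  values.forEach((val) => {
    res.totalprice += val.totalprice;
    res.count += val.count;
  });
  return res;
}

const a = { totalprice: 2.5, count: 1 };
const b = { totalprice: 4, count: 1 };
const c = { totalprice: 1.5, count: 1 };

const allAtOnce = reduce("majing", [a, b, c]);
const inTwoSteps = reduce("majing", [reduce("majing", [a, b]), c]);
console.log(allAtOnce);  // { totalprice: 8, count: 3 }
console.log(inTwoSteps); // { totalprice: 8, count: 3 }

// Starting at 1 over-counts, and the error grows with each extra reduce call.
console.log(reduceCountFromOne("majing", [a, b, c]).count); // 4
```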

Run test4.js the same way as the previous scripts, then view the data in the output collection.

(4) Compute the average price each user paid per item

This case needs another mapReduce option, finalize: a JavaScript function that post-processes the result for each key after reduce has run.

The script test5.js looks like this:

    map = function () {
        emit({user: this.user}, {totalprice: this.price, count: 1});
    };
    reduce = function (key, values) {
        // start count at 0 so that every purchase is counted exactly once
        var res = {totalprice: 0.00, count: 0, average: 0};
        values.forEach(function (val) {
            res.totalprice += val.totalprice;
            res.count += val.count;
        });
        return res;
    };
    finalizeFunc = function (key, reduceResult) {
        // compute the average before rounding; toFixed returns strings
        reduceResult.average = (reduceResult.totalprice / reduceResult.count).toFixed(2);
        reduceResult.totalprice = reduceResult.totalprice.toFixed(2);
        return reduceResult;
    };
    // Assumed invocation (not in the original listing); "statis4" is a placeholder name.
    db.test.mapReduce(map, reduce, {out: "statis4", finalize: finalizeFunc});
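To see what the finalize step does to concrete values, here is a small in-memory sketch. The reduced totals are made-up sample values, not results from the test collection, and this version of `finalizeFunc` returns a new object rather than modifying its argument.

```javascript
// Post-process one reduced value: round the total and derive the average.
function finalizeFunc(key, reduced) {
  return {
    totalprice: reduced.totalprice.toFixed(2), // toFixed returns a string
    count: reduced.count,
    average: (reduced.totalprice / reduced.count).toFixed(2),
  };
}

// Sample reduce output, keyed by user for brevity.
const reducedByUser = {
  majing: { totalprice: 8, count: 3 },
  Liyonghu: { totalprice: 1.5, count: 1 },
};

// Apply finalize to every key, including keys that had only one purchase.
const report = {};
for (const [user, value] of Object.entries(reducedByUser)) {
  report[user] = finalizeFunc(user, value);
}

console.log(report.majing);   // { totalprice: '8.00', count: 3, average: '2.67' }
console.log(report.Liyonghu); // { totalprice: '1.50', count: 1, average: '1.50' }
```

Because `toFixed` produces strings, the rounded fields are stored as text. If numeric values are preferred, something like `Math.round(x * 100) / 100` could be used instead.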

After the script has run, the output lists the total price, the number of items, and the average price per item for each user.

To look up a single user, query the output collection with find(), as in the earlier queries. Because the key is emitted as {user: ...}, the filter goes on the _id.user field.

These four simple examples give a brief introduction to MapReduce in MongoDB. MapReduce can do much more; for more advanced usage, see the official MongoDB documentation:

https://docs.mongodb.com/manual/reference/method/db.collection.mapReduce/

Thank you.

That concludes this introduction to data aggregation with MapReduce in MongoDB. I hope it helps; if you have any questions, please leave a message and I will reply promptly.
