In fact, MapReduce in MongoDB is closest to GROUP BY in relational databases.
The experiment below suggests that grouping with MapReduce still performs reasonably well on a large data volume. First, generate one million random 3-digit strings:
- for (var i = 0; i < 1000000; i++)
- {
-     var x = "0123456789";
-     var tmp = "";
-     for (var j = 0; j < 3; j++)
-     {
-         tmp += x.charAt(Math.ceil(Math.random() * 100000000) % x.length);
-     }
-     var u = { _id: i, v1: tmp };
-     db.RandomNum.insert(u);
- }
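The string-building step can be tried on its own, without a database. The following is a minimal sketch in plain JavaScript; `randomDigits` is a hypothetical helper name, and it picks each digit with `Math.floor` instead of the `Math.ceil` and modulo form used above.

```javascript
// Hypothetical helper: build a random string of `len` decimal digits.
function randomDigits(len) {
  var digits = "0123456789";
  var out = "";
  for (var j = 0; j < len; j++) {
    // Math.floor(Math.random() * 10) yields an integer from 0 to 9.
    out += digits.charAt(Math.floor(Math.random() * digits.length));
  }
  return out;
}

// Build a few sample documents shaped like the ones inserted above.
var sample = [];
for (var i = 0; i < 5; i++) {
  sample.push({ _id: i, v1: randomDigits(3) });
}
```

Each element of `sample` has the same `{ _id, v1 }` shape as the documents written to `RandomNum`, so it can be used to check the map and reduce logic on a small scale.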
Then count how many times each random string occurs, which is exactly what GROUP BY does:
- var m = function() { emit(this.v1, { count: 1 }); }; // Map: the key plays the role of GROUP BY in a relational database; the second value is the field to aggregate (sum ...)
- var r = function(key, values) { var total = 0; for (var i = 0; i < values.length; i++) { total += values[i].count; } return { count: total }; }; // Reduce
- var res = db.RandomNum.mapReduce(m, r, { out: { replace: 'Result' } });
- db[res.result].find()
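To see why the map key behaves like a GROUP BY column, the map and reduce steps above can be simulated in memory. This is only a sketch under stated assumptions: `simulateMapReduce` is a hypothetical driver, not MongoDB's engine, the sample documents are made up, and `emit` is passed in as an argument here whereas in the mongo shell it is a global.

```javascript
// In-memory stand-in for the collection (hypothetical sample data).
var docs = [
  { _id: 0, v1: "007" },
  { _id: 1, v1: "042" },
  { _id: 2, v1: "007" },
  { _id: 3, v1: "007" }
];

// Same reduce logic as above: sum the partial counts for one key.
function reduceCounts(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i].count;
  }
  return { count: total };
}

// Simplified driver (an assumption, not MongoDB's actual implementation):
// collect the emitted values per key, then reduce each group once.
function simulateMapReduce(items, mapFn, reduceFn) {
  var groups = {};
  function emit(key, value) {
    (groups[key] = groups[key] || []).push(value);
  }
  items.forEach(function (doc) {
    mapFn.call(doc, emit); // `this` is the current document, as in the shell
  });
  return Object.keys(groups).sort().map(function (key) {
    return { _id: key, value: reduceFn(key, groups[key]) };
  });
}

var result = simulateMapReduce(docs, function (emit) {
  emit(this.v1, { count: 1 });
}, reduceCounts);
```

The reduce function returns an object of the same shape as the emitted value, `{ count: ... }`. That matters because MongoDB may call reduce more than once for the same key, feeding earlier reduce results back in as inputs.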
Timing the same run:
- var startTime = new Date();
- var m = function() { emit(this.v1, { count: 1 }); };
- var r = function(key, values) { var total = 0; for (var i = 0; i < values.length; i++) { total += values[i].count; } return { count: total }; };
- var res = db.RandomNum.mapReduce(m, r, { out: { replace: 'Result' } });
- db[res.result].find()
- (new Date().getTime() - startTime.getTime()) / 1000
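The timing pattern can be wrapped in a small function so the parentheses are not easy to get wrong: the subtraction has to happen before the division by 1000. This is a sketch in plain JavaScript; `timeSeconds` is a hypothetical helper, and the loop is only a stand-in workload for the `mapReduce` call.

```javascript
// Hypothetical helper: run fn and return the elapsed wall-clock time in seconds.
function timeSeconds(fn) {
  var start = new Date();
  fn();
  // Subtract first, then convert milliseconds to seconds.
  return (new Date().getTime() - start.getTime()) / 1000;
}

// Stand-in workload; in the shell this would be the mapReduce call.
var elapsed = timeSeconds(function () {
  var s = 0;
  for (var i = 0; i < 100000; i++) {
    s += i;
  }
});
```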
The result is as follows:
- > db[res.result].find()
- { "_id" : "000", "value" : { "count" : 1075 } }
- { "_id" : "001", "value" : { "count" : 1045 } }
- { "_id" : "002", "value" : { "count" : 1022 } }
- { "_id" : "003", "value" : { "count" : 968 } }
- { "_id" : "004", "value" : { "count" : 994 } }
- { "_id" : "005", "value" : { "count" : 1009 } }
- { "_id" : "006", "value" : { "count" : 948 } }
- { "_id" : "007", "value" : { "count" : 1003 } }
- { "_id" : "008", "value" : { "count" : 983 } }
- { "_id" : "009", "value" : { "count" : 993 } }
- { "_id" : "010", "value" : { "count" : 987 } }
- { "_id" : "011", "value" : { "count" : 982 } }
- { "_id" : "012", "value" : { "count" : 957 } }
- { "_id" : "013", "value" : { "count" : 1031 } }
- { "_id" : "014", "value" : { "count" : 971 } }
- { "_id" : "015", "value" : { "count" : 1053 } }
- { "_id" : "016", "value" : { "count" : 974 } }
- { "_id" : "017", "value" : { "count" : 975 } }
- { "_id" : "018", "value" : { "count" : 978 } }
- { "_id" : "019", "value" : { "count" : 1010 } }
- has more
- >
- > (new Date().getTime() - startTime.getTime()) / 1000
- 63.335 s
- > bye
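As a sanity check on these counts: one million strings spread over the 1,000 possible keys ("000" through "999") should give about 1,000 per key. The sketch below works through that arithmetic for the twenty counts shown, assuming the digits are drawn uniformly, so that each count is roughly binomial; that uniformity is an assumption, not something the test measured.

```javascript
var totalDocs = 1000000;  // documents inserted
var distinctKeys = 1000;  // "000" through "999"
var expected = totalDocs / distinctKeys; // expected count per key

// Standard deviation of a binomial count with p = 1/1000 (uniformity assumed).
var p = 1 / distinctKeys;
var sd = Math.sqrt(totalDocs * p * (1 - p));

// The twenty counts listed in the output above.
var shown = [1075, 1045, 1022, 968, 994, 1009, 948, 1003, 983, 993,
             987, 982, 957, 1031, 971, 1053, 974, 975, 978, 1010];

// Largest distance of any listed count from the expected value.
var maxDeviation = 0;
shown.forEach(function (c) {
  maxDeviation = Math.max(maxDeviation, Math.abs(c - expected));
});
```

Here `expected` is 1000, `sd` is about 31.6, and `maxDeviation` is 75 (from the count for "000"), so the listed counts all lie within about 2.4 standard deviations of the expected value.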
Test machine performance: