Alternatively, we can add a random key (random) to each document: generate a number with Math.random() and store it in the document. At query time, generate another number with Math.random() and return a document whose stored key is less than it. There may be no document with a smaller key, but in that case there must be one whose key is greater than or equal to the generated number, unless the collection is empty.
Query a random data entry
The code is as follows:
    var random = Math.random();
    // First try a document whose stored key is below the generated number.
    var result = db.user.findOne({"random": {"$lt": random}});
    if (result == null) {
        // None below: fall back to keys greater than or equal to it.
        result = db.user.findOne({"random": {"$gte": random}});
    }

Or:

    PRIMARY> db.phoneMessage.count()
    8704224
    PRIMARY> db.phoneMessage.find().limit(-1).skip(Math.floor(Math.random() * 8704221)).next()
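To make the fallback argument concrete, here is a minimal sketch in plain JavaScript that models the lookup on an in-memory array instead of a real collection. The helper names addRandomKey and findOneByRandomKey are hypothetical, and the array only stands in for db.user; the sketch does not call MongoDB.

```javascript
// Plain-JavaScript model; the array stands in for a collection.
// addRandomKey and findOneByRandomKey are hypothetical helper names.

// Store a random key on a copy of each document, as would be done at insert time.
function addRandomKey(docs) {
  return docs.map((doc) => Object.assign({}, doc, { random: Math.random() }));
}

// Mirror the two-step lookup: keys below r first, then keys at or above r.
function findOneByRandomKey(docs, r) {
  const below = docs.find((doc) => doc.random < r);
  if (below !== undefined) return below;
  const atOrAbove = docs.find((doc) => doc.random >= r);
  // Only an empty array reaches this point without a match.
  return atOrAbove === undefined ? null : atOrAbove;
}

const users = addRandomKey([{ _id: 1 }, { _id: 2 }, { _id: 3 }]);
const picked = findOneByRandomKey(users, Math.random());
console.log(picked._id);
```

Because every key is either below the generated number or at or above it, the second step can only come back empty when there are no documents at all.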
Query multiple random data entries
1. Repeat the single-document query in a loop and insert each result into the collection t.
The code is as follows:
    for (var i = 0; i < 1000; i++) {
        // Skip to a random offset and fetch the document found there.
        var c = db.phoneMessage.find().limit(-1).skip(Math.floor(Math.random() * 8704221)).next();
        db.t.insert(c);
    }
This method is simple, but when the collection is large, skip is very inefficient and the loop takes a long time.
2. Query by map/reduce
The following method works on MongoDB 2.2 or later; the format under 2.0 is somewhat problematic.
The code is as follows:
    function mapf() {
        // Stop emitting once the requested number of documents is reached.
        if (countSubset == 0) return;
        // Chance of taking this document: still needed / still remaining.
        var prob = countSubset / countTotal;
        if (Math.random() <= prob) {
            emit(1, this);
            countSubset--;
        }
        countTotal--;
    }
    function reducef(key, values) {
        return {"documents": values};
    }
    res = db.phoneMessage.mapReduce(mapf, reducef, {"out": {"inline": 1}, "scope": {"countTotal": 87042, "countSubset": 1000}})
    db.t.insert(res.results[0].value.documents)
This method cannot cope with large data volumes; it fails with the error InternalError: too much recursion.
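The map function uses a one-pass selection scheme: each document is taken with probability (documents still needed) / (documents still remaining). The following minimal sketch runs the same logic over a plain JavaScript array so it can be tried without MongoDB. The function name sampleSubset and the injectable rand parameter are assumptions made for illustration.

```javascript
// Plain-JavaScript model of the sampling logic used in the map function.
// sampleSubset is a hypothetical name; `items` stands in for the collection.
function sampleSubset(items, k, rand = Math.random) {
  let countTotal = items.length;              // documents not yet visited
  let countSubset = Math.min(k, countTotal);  // documents still to pick
  const picked = [];
  for (const item of items) {
    if (countSubset === 0) break;
    const prob = countSubset / countTotal;
    if (rand() <= prob) {
      picked.push(item);
      countSubset--;
    }
    countTotal--;
  }
  return picked;
}

const ids = Array.from({ length: 100 }, (_, i) => i);
const sample = sampleSubset(ids, 10);
console.log(sample.length); // 10
```

When the number still needed equals the number still remaining, prob reaches 1 and every remaining item is taken, so the result always has exactly k items as long as countTotal matches the real number of items.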
3. Add a random-number field to each document and then query on it.
This method performs better, but it changes the document structure of the collection.
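As a rough illustration of the third method for several documents at once, the sketch below models it on an in-memory array: the documents already carry a random field, and the query takes the k documents whose field is nearest above a freshly generated number. The name sampleByRandomField and the wrap-around to the smallest keys are assumptions added for illustration, not part of the original method.

```javascript
// Plain-JavaScript model; `docs` stands in for a collection whose
// documents already carry a `random` field.
// sampleByRandomField is a hypothetical name.
function sampleByRandomField(docs, k, r) {
  // Sorting emulates reading the documents through an index on `random`.
  const sorted = docs.slice().sort((a, b) => a.random - b.random);
  const start = sorted.findIndex((doc) => doc.random >= r);
  const from = start === -1 ? sorted.length : start;
  const picked = sorted.slice(from, from + k);
  // Assumption: if too few keys are >= r, wrap around to the smallest keys.
  if (picked.length < k) {
    picked.push(...sorted.slice(0, Math.min(k - picked.length, from)));
  }
  return picked;
}

const messages = [
  { _id: 1, random: 0.12 },
  { _id: 2, random: 0.57 },
  { _id: 3, random: 0.33 },
  { _id: 4, random: 0.91 },
  { _id: 5, random: 0.74 },
];
const batch = sampleByRandomField(messages, 3, 0.7);
console.log(batch.map((doc) => doc._id)); // [5, 4, 1]
```

In the shell, the first step would roughly correspond to a range query on the random field with "$gte", sorted by that field and limited to k results. Note that the documents in one batch are neighbours in key order, so repeated batches are less independent than repeated single lookups.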