How to count distinct values in a MongoDB collection

Source: Internet
Author: User
Let's say we have a MongoDB collection.

Taking this simple collection as an example, we need to count how many distinct mobile phone numbers it contains. The first thought is to use distinct:
db.tokencaller.distinct('caller').length
To see the distinct phone numbers themselves, omit the length property, since db.tokencaller.distinct('caller') returns an array of all the distinct mobile phone numbers.
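To make the semantics concrete, here is a minimal sketch in plain JavaScript that mimics what distinct('caller').length computes, using an in-memory array in place of the collection. The sample documents and the helper name distinctValues are made up for illustration, and the sketch ignores details of the real command such as how array-valued fields are handled.

```javascript
// Hypothetical sample documents standing in for the tokencaller collection.
const sampleDocs = [
  { caller: "13800000001" },
  { caller: "13800000002" },
  { caller: "13800000001" },
];

// Illustrative helper (not a MongoDB API): collect the distinct values of one field.
function distinctValues(docs, field) {
  return [...new Set(docs.map((doc) => doc[field]))];
}

const numbers = distinctValues(sampleDocs, "caller");
console.log(numbers);        // the distinct phone numbers
console.log(numbers.length); // how many there are
```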


However, is this approach sufficient for all situations? It is not. If the collection holds a very large number of records, this kind of query will often fail with error 10044 and the message "exception: distinct too big, 16mb cap". We will try to work around this in other ways below.
Another way is to use runCommand together with distinct:
db.runCommand({"distinct": "tokencaller", "key": "caller"})


The values field of the result holds the deduplicated phone numbers, and the result is in JSON format. Next, I tried to get just the size of values, because for a large collection, printing every distinct number directly is obviously impractical. I tried the following:
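The snippet that originally followed is not preserved here. One plausible way to read only the size of values, assuming the command result carries the ok and values fields that the distinct command returns, is the hypothetical helper below. It takes the shell's db object as a parameter so that it can also be exercised against a stub.

```javascript
// Hypothetical helper: run the distinct command and return only the number
// of distinct values instead of printing them all.
function countDistinctViaCommand(db, collection, key) {
  const result = db.runCommand({ distinct: collection, key: key });
  if (!result.ok) {
    throw new Error(result.errmsg || "distinct command failed");
  }
  return result.values.length;
}

// In the mongo shell this would be called as:
//   countDistinctViaCommand(db, "tokencaller", "caller")
```

Because the server still has to return the full values array in one document, this presumably runs into the same 16 MB cap on large collections.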


This worked, so I tried the same approach on the large collection to see whether it could return a result, and found that there was no length attribute. I suspect this is related to the MongoDB client version, but that still needs to be verified ...
Neither approach worked, so next I tried MapReduce, as follows:
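The original map-reduce code is not preserved here. A minimal sketch of what such a job could look like is given below: the map function emits each caller as a key, the reduce function sums the counts, and the output goes to a collection named callerstatis. The function names and the small runLocally emulator are assumptions added so that the logic can be checked outside the shell; the actual shell call is shown in a comment.

```javascript
// Map: runs once per document with `this` bound to the document.
// `emit` is provided as a global by MongoDB during a map-reduce job.
function mapCaller() {
  emit(this.caller, 1);
}

// Reduce: combine all counts emitted for the same caller.
function reduceCount(key, values) {
  return values.reduce((sum, n) => sum + n, 0);
}

// In the mongo shell (assumed invocation):
//   db.tokencaller.mapReduce(mapCaller, reduceCount, { out: "callerstatis" })
//   db.callerstatis.count()

// Tiny in-memory emulation, for illustration only (not a MongoDB API).
// Unlike the server, it calls reduce even for keys with a single value,
// which gives the same result for a simple sum.
function runLocally(docs, map, reduce) {
  const groups = new Map();
  const previousEmit = globalThis.emit;
  globalThis.emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  try {
    docs.forEach((doc) => map.call(doc));
  } finally {
    globalThis.emit = previousEmit;
  }
  return [...groups].map(([key, values]) => ({ _id: key, value: reduce(key, values) }));
}

const output = runLocally(
  [{ caller: "13800000001" }, { caller: "13800000002" }, { caller: "13800000001" }],
  mapCaller,
  reduceCount
);
console.log(output.length); // number of distinct callers in the sample
```

Note that map-reduce has been deprecated since MongoDB 5.0 in favour of the aggregation pipeline, so this form mainly applies to older deployments.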


You will then find that it writes the results of the query to a collection called "callerstatis", as follows:


Then use db.callerstatis.count() to find out how many distinct mobile numbers there are.
We also tried this method on the large collection, but it failed as well ... (T_T). If anyone knows a good way, please tell me; I would be very grateful ^_^
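One approach commonly suggested for this situation, which is not among those tried above, is the aggregation pipeline: group by the field on the server and count the groups, so that only a single small document is returned rather than the full list of values. The sketch below assumes MongoDB 3.4 or later for the $count stage; the helper name and the distinctCallers output field are made up for illustration, and whether it succeeds on a given data set has not been verified here.

```javascript
// Hypothetical helper: build a pipeline that counts distinct values of `field`.
function buildDistinctCountPipeline(field) {
  return [
    { $group: { _id: "$" + field } }, // one group per distinct value
    { $count: "distinctCallers" },    // count the groups
  ];
}

const pipeline = buildDistinctCountPipeline("caller");
console.log(JSON.stringify(pipeline));

// In the mongo shell (assumed invocation; allowDiskUse lets a large
// $group stage spill to disk instead of failing on memory limits):
//   db.tokencaller.aggregate(pipeline, { allowDiskUse: true })
```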
If you are interested in my technical column and would like to support me in continuing to write in depth, you can scan the QR code to tip me. Whatever the amount, I sincerely thank you, since it is a recognition of my work. Thank you (^_^) ...

