MongoDB group query statistics: eliminating duplicate records

Source: Internet
Author: User
Tags: mongodb, version

MongoDB version: MongoDB shell version 2.4.4.
The operating environment (a shell window) is as follows:

[mongo_user@mongodb_dbs ~]# mongo --port 30000
MongoDB shell version: 2.4.4
connecting to: 127.0.0.1:30000/test
mongos> use pos
switched to db pos
1. First, group the records: group on the paymentOrder field, compute per-group counts over all the data, and then look for the groups whose count is greater than 1.
// The grouping emits the grouping field paymentOrder's value as _id, the
// largest ObjectId in each group as max_id, and the group's document count as count.
var group = [
    { $group: { _id: "$paymentOrder", max_id: { $max: "$_id" }, count: { $sum: 1 } } },
    { $sort: { count: -1 } }
];
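As a sanity check, what the $group and $sort stages compute can be simulated in plain JavaScript (runnable with Node, on hypothetical sample data; string _ids stand in for ObjectIds, which also compare consistently):

```javascript
// Hypothetical sample documents: paymentOrder "P001" is duplicated.
var docs = [
    { _id: "a1", paymentOrder: "P001" },
    { _id: "a2", paymentOrder: "P001" },
    { _id: "b1", paymentOrder: "P002" }
];

// Simulates: {$group: {_id: "$paymentOrder", max_id: {$max: "$_id"}, count: {$sum: 1}}}
// followed by {$sort: {count: -1}}.
function groupByPaymentOrder(docs) {
    var groups = {};
    docs.forEach(function (d) {
        var g = groups[d.paymentOrder];
        if (!g) {
            g = groups[d.paymentOrder] = { _id: d.paymentOrder, max_id: d._id, count: 0 };
        }
        g.count += 1;                             // the {$sum: 1} accumulator
        if (d._id > g.max_id) g.max_id = d._id;   // the {$max: "$_id"} accumulator
    });
    var result = Object.keys(groups).map(function (k) { return groups[k]; });
    result.sort(function (x, y) { return y.count - x.count; });  // the $sort stage
    return result;
}

// "P001" groups to count 2 with max_id "a2"; "P002" to count 1 with max_id "b1".
console.log(groupByPaymentOrder(docs));
```

This is only a sketch of the semantics, not MongoDB itself, but it shows exactly which fields (_id, max_id, count) the later steps can rely on.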
2. Next, define the stage that finds the groups containing duplicates, using the $match pipeline operator. Note that $match here filters on the output format of the $group stage (i.e. on the computed count field), not on the original documents:
var match = { "$match": { "count": { "$gt": 2 } } };
3. Finally, run the aggregation framework function db.paymentinfo.aggregate(group, match) to get the groups that contain duplicate data. This process may look complicated, but it is really just the GROUP BY ... HAVING ... syntax of T-SQL.
var ds = db.paymentinfo.aggregate(group, match);

PS: The match here had no effect; many groups with a count of 1 still came out, i.e. {"$match": {"count": {"$gt": 2}}} did not filter anything. Why? The most likely cause (an educated guess, since the original session is not shown): group is already an array, and when the 2.4 shell's aggregate() receives an array as its first argument it treats that array as the entire pipeline and silently ignores the remaining arguments, so the match stage is never sent to the server. Appending the stage to the pipeline instead, e.g. db.paymentinfo.aggregate(group.concat([match])), applies it. Separately, note that count > 2 only catches groups with three or more copies; {"$gt": 1} catches every duplicated group.
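The threshold question can be checked in isolation with a plain-JavaScript sketch (Node-runnable, hypothetical counts): count > 2 only catches groups with three or more copies, while count > 1 catches every duplicated group.

```javascript
// Hypothetical $group output: one triple, one pair, one unique value.
var groups = [
    { _id: "P001", count: 3 },
    { _id: "P002", count: 2 },
    { _id: "P003", count: 1 }
];

// Same idea as {$match: {count: {$gt: threshold}}}, or SQL's HAVING COUNT(*) > threshold.
function havingCountGt(groups, threshold) {
    return groups.filter(function (g) { return g.count > threshold; });
}

console.log(havingCountGt(groups, 2).length); // 1 -> misses the duplicated pair P002
console.log(havingCountGt(groups, 1).length); // 2 -> all duplicated groups
```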

4. Back up the collection before deleting anything.

Backup:

/usr/local/mongodb/mongodb-linux-x86_64-2.4.4/bin/mongodump --port 30000 -d pos -c paymentinfo -o /home/backup/mongodb_pos_paymentinfo_3.txt
5. Loop over the duplicates and delete them.

Here ds is a large result document: in 2.4 the shell's aggregate() returns a document whose result field holds the array of grouped rows, so ds.result gives the grouped query data directly:

// Start the loop traversal below. The aggregation result is an array,
// so it can be processed directly with a for loop.
var ds = db.paymentinfo.aggregate(group, match);
for (var i = 0; i < ds.result.length; i++) {
    var child = ds.result[i];
    var count = child.count;
    // Because the {"$match": {"count": {"$gt": 2}}} filter from step 2 above
    // did not take effect, add a count > 1 check here so that non-duplicated
    // data is skipped and only groups with duplicates are processed.
    if (count > 1) {
        // Get the ObjectId of the record with the largest ObjectId in the group.
        var oid = child.max_id;
        print(count);
        // Get the duplicated paymentOrder value and fetch all of its records.
        var payorder = child._id;
        var ps = db.paymentinfo.find({ "paymentOrder": payorder });
        // find() returns a cursor; toArray() turns it into an array for the loop.
        var psc = ps.toArray();
        for (var j = 0; j < psc.length; j++) {
            var pchild = psc[j];
            // Traverse the ObjectIds: keep the record with the largest
            // ObjectId and remove all the others.
            if (oid.toString() == pchild._id.toString()) {
                print("the same one");
                print(pchild._id.toString());
                print(oid.toString());
            } else {
                print("the other one -----");
                print(pchild._id.toString());
                print(oid.toString());
                db.paymentinfo.remove({ "_id": pchild._id });
            }
        }
    }
}
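The core decision in that delete loop, keeping the record with the largest _id in each duplicated group and dropping the rest, can be tested outside MongoDB with a plain-JavaScript sketch (Node-runnable, hypothetical data; string _ids stand in for ObjectIds):

```javascript
// Keep the document with the largest _id per paymentOrder; return the _ids
// that should be removed. String _ids stand in for ObjectIds here.
function idsToRemove(docs) {
    var maxById = {};
    docs.forEach(function (d) {
        var cur = maxById[d.paymentOrder];
        if (cur === undefined || d._id > cur) maxById[d.paymentOrder] = d._id;
    });
    return docs
        .filter(function (d) { return d._id !== maxById[d.paymentOrder]; })
        .map(function (d) { return d._id; });
}

var docs = [
    { _id: "a1", paymentOrder: "P001" },
    { _id: "a2", paymentOrder: "P001" },
    { _id: "b1", paymentOrder: "P002" }
];
console.log(idsToRemove(docs)); // only "a1" is removed; "a2" and "b1" survive
```

Dry-running the selection logic like this before issuing remove() calls is a cheap safety net on top of the mongodump backup.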

A final note: if you copy this script into the mongos shell and it throws an error, the formatting was probably mangled in copying. Remove all the newline characters, or retype it by hand, and it will run without error.

