How to query and delete duplicate records in MySQL

Last Update:2018-05-29 Source: Internet

Author: User

(1)
1. Find redundant duplicate records in the table, where duplicates are determined by a single field (peopleId).
select * from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)

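2. Delete redundant duplicate records from the table, where duplicates are determined by a single field (peopleId), keeping only the record with the smallest rowid. Note that rowid is an Oracle-style row identifier: MySQL has no such pseudo-column, so use the table's primary key or another unique column in its place. MySQL also rejects a delete whose subquery reads the table being deleted from (error 1093); wrapping the subquery in a derived table is the usual workaround.
delete from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
and rowid not in (select min(rowid) from people group by peopleId having count(peopleId) > 1)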
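
Steps 1 and 2 can be tried out end to end with a small script. The following is only a sketch under stated assumptions: it runs the two statements against an in-memory SQLite database through Python's sqlite3 module, because SQLite (unlike MySQL) does expose a rowid, and the sample rows and the helper name delete_duplicates_keep_first are made up for illustration.

```python
import sqlite3

# Subqueries taken from steps 1 and 2 above.
DUP_IDS = "select peopleId from people group by peopleId having count(peopleId) > 1"
MIN_ROWIDS = "select min(rowid) from people group by peopleId having count(peopleId) > 1"


def delete_duplicates_keep_first(conn):
    """Hypothetical helper: delete repeated peopleId rows, keeping the smallest rowid."""
    conn.execute(
        f"delete from people where peopleId in ({DUP_IDS}) "
        f"and rowid not in ({MIN_ROWIDS})"
    )


# Assumption: SQLite stands in for MySQL here, and the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table people (peopleId integer, name text)")
conn.executemany(
    "insert into people values (?, ?)",
    [(1, "Ann"), (1, "Ann B"), (2, "Bob"), (3, "Cy"), (3, "Cy D"), (3, "Cy E")],
)

# Step 1: every row whose peopleId occurs more than once.
duplicates = conn.execute(
    f"select peopleId, name from people where peopleId in ({DUP_IDS}) order by rowid"
).fetchall()

# Step 2: remove the extras, then inspect what is left.
delete_duplicates_keep_first(conn)
remaining = conn.execute(
    "select peopleId, name from people order by peopleId"
).fetchall()
```

After the delete, each peopleId appears once and the surviving row is the one inserted first.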

3. Find redundant duplicate records in the table (multiple fields).
select * from vitae
where (peopleId, seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)

4. Delete redundant duplicate records from the table (multiple fields), keeping only the record with the smallest rowid.
delete from vitae
where (peopleId, seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId, seq having count(*) > 1)

5. Find redundant duplicate records in the table (multiple fields), excluding the record with the smallest rowid.
select * from vitae
where (peopleId, seq) in (select peopleId, seq from vitae group by peopleId, seq having count(*) > 1)
and rowid not in (select min(rowid) from vitae group by peopleId, seq having count(*) > 1)
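
The multi-field variants can be checked the same way. This is a minimal sketch, assuming an in-memory SQLite database (version 3.15 or later, for row-value comparisons) driven from Python's sqlite3 module as a stand-in for MySQL; the note column and the sample rows are invented for the example.

```python
import sqlite3

# Subqueries shared by steps 4 and 5 above.
DUP_KEYS = "select peopleId, seq from vitae group by peopleId, seq having count(*) > 1"
MIN_ROWIDS = "select min(rowid) from vitae group by peopleId, seq having count(*) > 1"
EXTRAS_FILTER = f"(peopleId, seq) in ({DUP_KEYS}) and rowid not in ({MIN_ROWIDS})"

# Assumption: SQLite stands in for MySQL; the "note" column only labels rows.
conn = sqlite3.connect(":memory:")
conn.execute("create table vitae (peopleId integer, seq integer, note text)")
conn.executemany(
    "insert into vitae values (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"), (2, 1, "d"), (2, 1, "e"), (2, 1, "f")],
)

# Step 5: the redundant copies, excluding the first row of each group.
extras = [
    row[0]
    for row in conn.execute(
        f"select note from vitae where {EXTRAS_FILTER} order by rowid"
    )
]

# Step 4: delete exactly those rows.
conn.execute(f"delete from vitae where {EXTRAS_FILTER}")
remaining = [row[0] for row in conn.execute("select note from vitae order by rowid")]
```

Running the select from step 5 before the delete from step 4 is a cheap way to preview which rows will be removed.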

(2)
For example, table A has a field "name", and the "name" value may be the same in different records.
To list the "name" values that occur in more than one record:
select Name, count(*) from A group by Name having count(*) > 1

If the gender must match as well, the statement is as follows:
select Name, sex, count(*) from A group by Name, sex having count(*) > 1
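
To see how the two groupings differ, here is a small sketch that runs both statements on invented sample data. It assumes an in-memory SQLite database accessed through Python's sqlite3 module rather than a MySQL server; an order by clause is added only to make the output order predictable.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("create table A (Name text, sex text)")
conn.executemany(
    "insert into A values (?, ?)",
    [("Li", "F"), ("Li", "M"), ("Li", "F"), ("Wang", "M"), ("Zhao", "F"), ("Zhao", "F")],
)

# Names that occur in more than one record.
by_name = conn.execute(
    "select Name, count(*) from A group by Name having count(*) > 1 order by Name"
).fetchall()

# Name and sex combinations that occur in more than one record.
by_name_sex = conn.execute(
    "select Name, sex, count(*) from A group by Name, sex "
    "having count(*) > 1 order by Name, sex"
).fetchall()
```

Grouping by Name alone counts all three "Li" rows together, while grouping by Name and sex counts only the two rows that agree on both fields.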

(3)
Method 1

The following batch uses Transact-SQL (SQL Server) syntax. Replace table_name and main_field with the actual table and the field that defines a duplicate.

declare @max integer, @id integer
declare cur_rows cursor local for select main_field, count(*) from table_name group by main_field having count(*) > 1
open cur_rows
fetch cur_rows into @id, @max
while @@fetch_status = 0
begin
    select @max = @max - 1
    set rowcount @max
    delete from table_name where main_field = @id
    fetch cur_rows into @id, @max
end
close cur_rows
deallocate cur_rows
set rowcount 0
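
The idea behind the cursor loop is: for each value that occurs n times, delete n - 1 rows. As a rough sketch of that same idea outside SQL Server, the loop can be written in Python against an in-memory SQLite database. The table name items, the column id, and the function trim_duplicates are hypothetical, and a limit inside a rowid subquery is swapped in for set rowcount, which SQLite does not have.

```python
import sqlite3


def trim_duplicates(conn):
    """Hypothetical helper: for each id seen n > 1 times, delete n - 1 rows."""
    groups = conn.execute(
        "select id, count(*) from items group by id having count(*) > 1"
    ).fetchall()
    for key, count in groups:
        # The limited subquery plays the role of "set rowcount @max":
        # it caps the delete at count - 1 rows for this key.
        conn.execute(
            "delete from items where rowid in "
            "(select rowid from items where id = ? limit ?)",
            (key, count - 1),
        )


# Assumption: SQLite stands in for the server, and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("create table items (id integer, label text)")
conn.executemany(
    "insert into items values (?, ?)",
    [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e"), (3, "f")],
)

trim_duplicates(conn)
remaining_ids = [row[0] for row in conn.execute("select id from items order by id")]
```

As with set rowcount, nothing here controls which copy of a duplicated row survives; only the number of copies is fixed.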

Method 2

There are two kinds of duplicate records. The first is a fully duplicated record, in which every field repeats. The second is a record whose key fields repeat, for example the Name field, while the other fields may differ or can be ignored.

1. The first kind is the easier one to handle.

select distinct * from tableName

This returns a result set without duplicate records.

To remove the duplicates from the table itself (keeping one copy of each record), rebuild it through a temporary table:

select distinct * into #Tmp from tableName
drop table tableName
select * into tableName from #Tmp
drop table #Tmp

This kind of duplication comes from an incomplete table design; adding a unique index on the relevant columns prevents it.
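
The rebuild through a temporary table can be sketched as follows. This is an illustration only, assuming an in-memory SQLite database driven from Python's sqlite3 module: SQLite has no select ... into, so create table ... as select is swapped in, and the columns and rows are invented. Keep in mind that a table rebuilt this way does not carry over indexes or constraints from the original.

```python
import sqlite3

# Assumption: SQLite stands in for the server; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("create table tableName (Name text, Address text)")
conn.executemany(
    "insert into tableName values (?, ?)",
    [
        ("Ann", "1 Main St"),
        ("Ann", "1 Main St"),
        ("Bob", "2 Oak Ave"),
        ("Bob", "2 Oak Ave"),
        ("Bob", "2 Oak Ave"),
    ],
)

# Same four steps as above, with "create table ... as select"
# used in place of "select ... into".
conn.executescript(
    """
    create temp table Tmp as select distinct * from tableName;
    drop table tableName;
    create table tableName as select * from Tmp;
    drop table Tmp;
    """
)

remaining = conn.execute(
    "select Name, Address from tableName order by Name"
).fetchall()
```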

2. The second kind usually requires keeping the first record of each duplicate group. The procedure is as follows.

Assume that the duplicated fields are Name and Address, and that you need a result set in which each combination of the two is unique.

select identity(int, 1, 1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name, Address
select * from #Tmp where autoID in (select autoID from #Tmp2)

The last select returns a result set with no duplicate Name and Address pairs. It also contains the added autoID column, which you can leave out by listing the columns explicitly in the select clause.
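
A minimal sketch of this procedure, assuming an in-memory SQLite database accessed through Python's sqlite3 module: an integer primary key column takes the place of identity(int, 1, 1), and the Phone column and sample rows are invented to show that non-key fields may differ within a duplicate group.

```python
import sqlite3

# Assumption: SQLite stands in for the server; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("create table tableName (Name text, Address text, Phone text)")
conn.executemany(
    "insert into tableName values (?, ?, ?)",
    [
        ("Ann", "1 Main St", "111"),
        ("Ann", "1 Main St", "222"),  # same Name and Address, different Phone
        ("Bob", "2 Oak Ave", "333"),
        ("Ann", "9 Elm Rd", "444"),   # same Name, different Address
    ],
)

conn.executescript(
    """
    -- Number the rows; autoID is filled in automatically in insert order.
    create temp table Tmp (autoID integer primary key, Name text, Address text, Phone text);
    insert into Tmp (Name, Address, Phone)
        select Name, Address, Phone from tableName order by rowid;
    -- Smallest autoID for each (Name, Address) pair.
    create temp table Tmp2 as
        select min(autoID) as autoID from Tmp group by Name, Address;
    """
)

# Listing the columns explicitly leaves autoID out of the result.
unique_rows = conn.execute(
    "select Name, Address, Phone from Tmp "
    "where autoID in (select autoID from Tmp2) order by autoID"
).fetchall()
```

The second "Ann, 1 Main St" row is dropped, while "Ann, 9 Elm Rd" stays because its Address differs.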

(4)
Duplicate Query

select * from tablename where id in (
    select id from tablename
    group by id
    having count(id) > 1
)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
