How to delete duplicate records using SQL statements

Source: Internet
Author: User
Tags: repetition

Original post address: http://www.cnblogs.com/phpliu/archive/2010/06/21/1761726.html

For example, take the following table, where ID is the primary key:

ID  name  value
1   A     PP
2   A     PP
3   B     III
4   B     PP
5   B     PP
6   C     PP
7   C     PP
8   C     III

The required result keeps one row for each distinct (name, value) pair:

ID  name  value
1   A     PP
3   B     III
4   B     PP
6   C     PP
8   C     III

Method 1
DELETE FROM yourtable
WHERE [ID] NOT IN (
SELECT MIN([ID]) FROM yourtable
GROUP BY name, value)

Method 2
DELETE a
FROM yourtable a LEFT JOIN (
SELECT MIN(ID) AS ID FROM yourtable GROUP BY name, value
) b ON a.ID = b.ID
WHERE b.ID IS NULL
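
The idea behind both methods can be tried on the sample table. The following is a minimal sketch, assuming Python's standard-library sqlite3 module and an in-memory SQLite database rather than the SQL Server dialect used above. The table name yourtable comes from the text; the function name dedupe_keep_min_id and the consistent upper-case sample values are assumptions made for illustration. It keeps the row with the smallest ID in each (name, value) group, which is what the required result shows.

```python
import sqlite3

# Sample rows from the example table (letter case normalized for illustration).
ROWS = [
    (1, "A", "PP"), (2, "A", "PP"), (3, "B", "III"), (4, "B", "PP"),
    (5, "B", "PP"), (6, "C", "PP"), (7, "C", "PP"), (8, "C", "III"),
]


def dedupe_keep_min_id(rows):
    """Load rows into an in-memory table, delete duplicates, return the survivors."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourtable (id INTEGER PRIMARY KEY, name TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)
    # Keep the smallest id of each (name, value) group and delete every other row.
    conn.execute(
        "DELETE FROM yourtable WHERE id NOT IN ("
        "  SELECT MIN(id) FROM yourtable GROUP BY name, value)"
    )
    remaining = conn.execute(
        "SELECT id, name, value FROM yourtable ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining


if __name__ == "__main__":
    for row in dedupe_keep_min_id(ROWS):
        print(row)
```

Grouping on the two columns separately, rather than on a concatenation of them, avoids treating pairs such as ('ab', 'c') and ('a', 'bc') as the same group.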

SQL statement for querying and deleting duplicate records
1. Find the duplicate records in a table, where duplicates are determined by a single field (peopleid):
SELECT * FROM people
WHERE peopleid IN (SELECT peopleid FROM people GROUP BY peopleid HAVING COUNT(peopleid) > 1)
2. Delete the redundant duplicate records in a table, where duplicates are determined by a single field (peopleid), keeping only the record with the smallest rowid:
DELETE FROM people
WHERE peopleid IN (SELECT peopleid FROM people GROUP BY peopleid HAVING COUNT(peopleid) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM people GROUP BY peopleid HAVING COUNT(peopleid) > 1)
3. Find the duplicate records in a table, where duplicates are determined by multiple fields:
SELECT * FROM vitae a
WHERE (a.peopleid, a.seq) IN (SELECT peopleid, seq FROM vitae GROUP BY peopleid, seq HAVING COUNT(*) > 1)
4. Delete the redundant duplicate records (multiple fields) in a table, keeping only the record with the smallest rowid:
DELETE FROM vitae a
WHERE (a.peopleid, a.seq) IN (SELECT peopleid, seq FROM vitae GROUP BY peopleid, seq HAVING COUNT(*) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM vitae GROUP BY peopleid, seq HAVING COUNT(*) > 1)

5. Find the redundant duplicate records (multiple fields) in a table, excluding the record with the smallest rowid in each group:
SELECT * FROM vitae a
WHERE (a.peopleid, a.seq) IN (SELECT peopleid, seq FROM vitae GROUP BY peopleid, seq HAVING COUNT(*) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM vitae GROUP BY peopleid, seq HAVING COUNT(*) > 1)
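
Statements 4 and 5 share the same WHERE clause, so the rows that statement 5 lists are exactly the rows that statement 4 deletes. The following is a minimal sketch of that pairing, assuming Python's sqlite3 module and SQLite 3.15 or later (for the row-value IN comparison) in place of Oracle. The vitae table and its peopleid and seq columns come from the text; the note column, the sample rows, and the function name remove_redundant are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (peopleid, seq, note).
VITAE = [
    (1, 1, "first"), (1, 1, "second"), (1, 2, "only"),
    (2, 1, "first"), (2, 1, "second"), (2, 1, "third"),
]


def remove_redundant(rows):
    """Return (redundant_rows, kept_rows) for duplicates on (peopleid, seq)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vitae (peopleid INTEGER, seq INTEGER, note TEXT)")
    conn.executemany("INSERT INTO vitae VALUES (?, ?, ?)", rows)
    # Keys that occur more than once, and the first rowid of each such key.
    dup_keys = (
        "SELECT peopleid, seq FROM vitae "
        "GROUP BY peopleid, seq HAVING COUNT(*) > 1"
    )
    first_rows = (
        "SELECT MIN(rowid) FROM vitae "
        "GROUP BY peopleid, seq HAVING COUNT(*) > 1"
    )
    where = f"(peopleid, seq) IN ({dup_keys}) AND rowid NOT IN ({first_rows})"
    # Statement 5: list the redundant rows before removing them.
    redundant = conn.execute(
        f"SELECT peopleid, seq, note FROM vitae WHERE {where} ORDER BY rowid"
    ).fetchall()
    # Statement 4: delete exactly those rows.
    conn.execute(f"DELETE FROM vitae WHERE {where}")
    kept = conn.execute(
        "SELECT peopleid, seq, note FROM vitae ORDER BY rowid"
    ).fetchall()
    conn.close()
    return redundant, kept


if __name__ == "__main__":
    redundant, kept = remove_redundant(VITAE)
    print("redundant:", redundant)
    print("kept:", kept)
```

Running the SELECT first is a cheap way to review what a DELETE with the same condition would remove.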
(2)
For example:
Table "a" has a field "name",
and different records may share the same "name" value.
To find the "name" values that are duplicated across records in the table:
SELECT name, COUNT(*) FROM a GROUP BY name HAVING COUNT(*) > 1
If the gender must also be the same, the statement is as follows:
SELECT name, sex, COUNT(*) FROM a GROUP BY name, sex HAVING COUNT(*) > 1
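
The two queries differ only in the list of grouping columns. The following is a small sketch of that pattern, assuming Python's sqlite3 module with an in-memory database. The table a and its name and sex columns come from the text; the sample rows and the function name find_duplicates are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (name, sex).
PEOPLE = [("Ann", "F"), ("Ann", "F"), ("Lee", "M"), ("Lee", "F"), ("Kim", "M")]


def find_duplicates(rows, columns):
    """Return (column values..., count) for each combination seen more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (name TEXT, sex TEXT)")
    conn.executemany("INSERT INTO a VALUES (?, ?)", rows)
    # Column names are fixed by the caller's code here, never taken from user input.
    cols = ", ".join(columns)
    query = (
        f"SELECT {cols}, COUNT(*) FROM a "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(find_duplicates(PEOPLE, ["name"]))
    print(find_duplicates(PEOPLE, ["name", "sex"]))
```

With this sample data, "Lee" counts as a duplicate when grouping on name alone but not when sex must match as well.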

(3)
Method 1
-- main_field and table_name are placeholders for the duplicated column and the table
DECLARE @max INTEGER, @id INTEGER
DECLARE cur_rows CURSOR LOCAL FOR SELECT main_field, COUNT(*) FROM table_name GROUP BY main_field HAVING COUNT(*) > 1
OPEN cur_rows
FETCH cur_rows INTO @id, @max
WHILE @@FETCH_STATUS = 0
BEGIN
-- Limit the next DELETE to all but one row of the current group
SELECT @max = @max - 1
SET ROWCOUNT @max
DELETE FROM table_name WHERE main_field = @id
FETCH cur_rows INTO @id, @max
END
CLOSE cur_rows
SET ROWCOUNT 0

Method 2
There are two kinds of duplicate records. The first is a fully duplicated record, where every field matches another record. The second is a record whose key fields are duplicated, for example where the name field repeats while the other fields may differ or can be ignored.
1. The first kind of duplication is the easier one to resolve. Use
SELECT DISTINCT * FROM tablename
to get a result set without duplicate records.
To delete the duplicates from the table itself (keeping one copy of each record), proceed as follows:
SELECT DISTINCT * INTO #Tmp FROM tablename
DROP TABLE tablename
SELECT * INTO tablename FROM #Tmp
DROP TABLE #Tmp
This kind of duplication results from an incomplete table design. It can be prevented by adding a unique index column.
2. For the second kind of duplication, the usual requirement is to keep the first record of each group of duplicates. Proceed as follows.
Assume the duplicated fields are name and address, and you need a result set that is unique on those two fields:
SELECT IDENTITY(INT, 1, 1) AS autoid, * INTO #Tmp FROM tablename
SELECT MIN(autoid) AS autoid INTO #Tmp2 FROM #Tmp GROUP BY name, address
SELECT * FROM #Tmp WHERE autoid IN (SELECT autoid FROM #Tmp2)
The last SELECT returns a result set with no duplicate name and address pairs (it also contains the added autoid column, which you can leave out by listing the other columns explicitly in the SELECT clause).
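
The three steps can be sketched outside SQL Server as well. The following is a minimal illustration, assuming Python's sqlite3 module, where an INTEGER PRIMARY KEY AUTOINCREMENT column in a temporary table stands in for IDENTITY(INT, 1, 1). The names tablename, autoid, name, and address come from the text; the phone column, the sample rows, and the function name first_per_key are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (name, address, phone).
CONTACTS = [
    ("Ann", "1 Elm St", "555-0100"),
    ("Ann", "1 Elm St", "555-0199"),   # same name and address as the row above
    ("Bob", "2 Oak Ave", "555-0111"),
    ("Ann", "9 Pine Rd", "555-0122"),  # same name, different address
]


def first_per_key(rows):
    """Return the first row of each (name, address) group, without autoid."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (name TEXT, address TEXT, phone TEXT)")
    conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", rows)
    # Step 1: copy the rows into a temp table that numbers them.
    conn.execute(
        "CREATE TEMP TABLE tmp ("
        "  autoid INTEGER PRIMARY KEY AUTOINCREMENT,"
        "  name TEXT, address TEXT, phone TEXT)"
    )
    conn.execute(
        "INSERT INTO tmp (name, address, phone) "
        "SELECT name, address, phone FROM tablename ORDER BY rowid"
    )
    # Step 2: record the smallest autoid of each (name, address) group.
    conn.execute(
        "CREATE TEMP TABLE tmp2 AS "
        "SELECT MIN(autoid) AS autoid FROM tmp GROUP BY name, address"
    )
    # Step 3: select only those rows, listing columns to leave autoid out.
    result = conn.execute(
        "SELECT name, address, phone FROM tmp "
        "WHERE autoid IN (SELECT autoid FROM tmp2) ORDER BY autoid"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_per_key(CONTACTS):
        print(row)
```

Grouping on both name and address is what makes the result unique on the pair; grouping on autoid itself would place every row in its own group and remove nothing.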
(4)
Duplicate query
SELECT * FROM tablename WHERE ID IN (
SELECT ID FROM tablename
GROUP BY ID
HAVING COUNT(ID) > 1
)

 

After learning SQL for a while, I found that many duplicate records had appeared in a table I created for testing (it had no indexes). I later collected some methods for deleting duplicate records. In Oracle, you can delete duplicate records by using the unique rowid, or you can create a temporary table. This article covers only a few simple and practical methods, which I hope are worth sharing, using the table "employee" as an example.

SQL> DESC employee

Name       Null?   Type
---------  ------  ------------
EMP_ID             NUMBER(10)
EMP_NAME           VARCHAR2(20)
SALARY             NUMBER(10,2)

 

You can use the following statement to query duplicate records:

 

SQL> SELECT * FROM employee;

EMP_ID  EMP_NAME  SALARY
------  --------  ------
1       sunshine  10000
1       sunshine  10000
2       semon     20000
2       semon     20000
3       xyz       30000
2       semon     20000
 

SQL> SELECT DISTINCT * FROM employee;

EMP_ID  EMP_NAME  SALARY
------  --------  ------
1       sunshine  10000
2       semon     20000
3       xyz       30000


SQL> SELECT emp_id, emp_name, salary FROM employee GROUP BY emp_id, emp_name, salary HAVING COUNT(*) > 1;

EMP_ID  EMP_NAME  SALARY
------  --------  ------
1       sunshine  10000
2       semon     20000


SQL> SELECT * FROM employee e1
WHERE rowid IN (SELECT MAX(rowid) FROM employee e2
WHERE e1.emp_id = e2.emp_id AND
e1.emp_name = e2.emp_name AND e1.salary = e2.salary);

EMP_ID  EMP_NAME  SALARY
------  --------  ------
1       sunshine  10000
3       xyz       30000
2       semon     20000

 

2. Several methods for deleting duplicate records:

 

(1) Create a temporary table.

SQL> CREATE TABLE temp_emp AS (SELECT DISTINCT * FROM employee);

SQL> TRUNCATE TABLE employee; -- clear the data in the employee table

SQL> INSERT INTO employee SELECT * FROM temp_emp; -- insert the contents of the temporary table back
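
The same three steps can be sketched with Python's sqlite3 module. This is an illustration under assumptions, not the Oracle session itself: SQLite has no TRUNCATE TABLE, so DELETE FROM is used instead, and the function name rebuild_distinct is hypothetical. The employee and temp_emp tables and the sample rows come from the text.

```python
import sqlite3

# The six employee rows shown in the query output above.
EMPLOYEES = [
    (1, "sunshine", 10000), (1, "sunshine", 10000), (2, "semon", 20000),
    (2, "semon", 20000), (3, "xyz", 30000), (2, "semon", 20000),
]


def rebuild_distinct(rows):
    """Deduplicate the employee table through a helper table of distinct rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (emp_id INTEGER, emp_name TEXT, salary NUMERIC)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    # Step 1: copy the distinct rows into a helper table.
    conn.execute("CREATE TABLE temp_emp AS SELECT DISTINCT * FROM employee")
    # Step 2: empty the original table (DELETE stands in for TRUNCATE here).
    conn.execute("DELETE FROM employee")
    # Step 3: copy the distinct rows back, then drop the helper table.
    conn.execute("INSERT INTO employee SELECT * FROM temp_emp")
    conn.execute("DROP TABLE temp_emp")
    result = sorted(
        conn.execute("SELECT emp_id, emp_name, salary FROM employee").fetchall()
    )
    conn.close()
    return result


if __name__ == "__main__":
    for row in rebuild_distinct(EMPLOYEES):
        print(row)
```

This approach only removes fully identical rows, because DISTINCT compares every column.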

 

(2) Use the unique rowid to delete duplicate records. In Oracle, every record has a rowid that is unique throughout the database and identifies the data file, block, and row in which the record is stored. In duplicate records the contents of all columns may be identical, but the rowids differ. Therefore, you only need to identify the row with the largest (or smallest) rowid in each set of duplicates and delete all the others.

 

SQL> DELETE FROM employee e2 WHERE rowid NOT IN (
SELECT MAX(e1.rowid) FROM employee e1 WHERE
e1.emp_id = e2.emp_id AND e1.emp_name = e2.emp_name AND e1.salary = e2.salary); -- MIN(rowid) can be used here instead.

SQL> DELETE FROM employee e2 WHERE rowid < (
SELECT MAX(e1.rowid) FROM employee e1 WHERE
e1.emp_id = e2.emp_id AND e1.emp_name = e2.emp_name AND
e1.salary = e2.salary);

(3) This method also uses rowid, but it is typically more efficient, because the grouped subquery does not refer to the outer row.

 

SQL> DELETE FROM employee WHERE rowid NOT IN (
SELECT MAX(t1.rowid) FROM employee t1 GROUP BY
t1.emp_id, t1.emp_name, t1.salary); -- MIN(rowid) can be used here instead.

 

 

EMP_ID  EMP_NAME  SALARY
------  --------  ------
1       sunshine  10000
3       xyz       30000
2       semon     20000
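
The grouped rowid delete can be tried in SQLite, which also gives ordinary tables an implicit rowid. The following is a minimal sketch using Python's sqlite3 module. SQLite's rowid is a plain integer and not an Oracle rowid, so this only illustrates the shape of the statement. The function name delete_by_rowid and its keep parameter are hypothetical; the table and rows come from the text.

```python
import sqlite3

# The six employee rows from the example; SQLite assigns rowids 1 to 6 in order.
EMPLOYEES = [
    (1, "sunshine", 10000), (1, "sunshine", 10000), (2, "semon", 20000),
    (2, "semon", 20000), (3, "xyz", 30000), (2, "semon", 20000),
]


def delete_by_rowid(rows, keep="max"):
    """Delete duplicates, keeping the row with the largest or smallest rowid."""
    if keep not in ("min", "max"):
        raise ValueError("keep must be 'min' or 'max'")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (emp_id INTEGER, emp_name TEXT, salary NUMERIC)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    # One rowid survives per group of identical rows; all others are deleted.
    conn.execute(
        f"DELETE FROM employee WHERE rowid NOT IN ("
        f"  SELECT {keep.upper()}(rowid) FROM employee"
        f"  GROUP BY emp_id, emp_name, salary)"
    )
    remaining = conn.execute(
        "SELECT rowid, emp_id, emp_name, salary FROM employee ORDER BY rowid"
    ).fetchall()
    conn.close()
    return remaining


if __name__ == "__main__":
    print(delete_by_rowid(EMPLOYEES, keep="max"))
    print(delete_by_rowid(EMPLOYEES, keep="min"))
```

Keeping the largest rowid leaves the surviving rows in the order sunshine, xyz, semon, the same order as the result listed above; keeping the smallest leaves sunshine, semon, xyz.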

 
