SQL statements used to query and delete duplicate records in an Oracle table

Source: Internet
Author: User

No matter what database I use, duplicate data exists. Next I will introduce in detail various methods and Performance Analysis for deleting duplicate records in Oracle tables, for more information, see.

Method:

The Code is as follows: Copy code
Group by XX having count (*)> 1, rowid, distinct, temporary table, procedure

The following statement can query the data that is duplicated:

The Code is as follows: Copy code
Select Field 1, Field 2, count (*) from table name group by field 1, Field 2 having count (*)> 1

Change the ">" number above to "=" to query the non-duplicated data.
To delete the duplicate data, use the following statement.

The Code is as follows: Copy code

Delete from table name a where field 1, Field 2 in
(Select Field 1, Field 2, count (*) from table name group by field 1, Field 2 having count (*)> 1)

Example

1. query duplicate data in the table

A. Repeat a field

B. Repeat Multiple Fields

C. Repeat the entire line

Create a test table:

The Code is as follows: Copy code

Create table cfa (businessid number, customer varchar2 (50), branchcode varchar2 (10), data_date varchar2 (10 ));
Insert into cfa values (1, 'Albert ', 'scb', '2017-11-11 ');
Insert into cfa values (2, 'andy ', 'db', '2017-11-12 ');
Insert into cfa values (3, 'allen ', 'HSBC', '2017-11-13 ');

--------------- The following are duplicate data ----------------------------------------------
Insert into cfa values (1, 'Alex ', 'icbc', '2017-11-14 ');
Insert into cfa values (1, 'Albert ', 'ctbk', '2017-11-15 ');
Insert into cfa values (1, 'Albert ', 'scb', '2017-11-11 ');

In case of a, only businessid is repeated.

The Code is as follows: Copy code

Select * from cfa where businessid in (select businessid from cfa group by businessid having count (businessid)> 1 );

In the case of B, both businessid and name are repeated.

The Code is as follows: Copy code

Select * from cfa where (businessid, customer) in (select businessid, customer from cfa group by businessid, customer having count (*)> 1 );

For c, repeat the entire line.

Refer to Method B:

The Code is as follows: Copy code
Select * from cfa where (businessid, customer, branchcode, data_date) in (select * from cfa group by businessid, customer, branchcode, data_date having count (*)> 1 );

2. Delete duplicate data in the table

In case a, the redundant duplicate records in the table are deleted. The duplicate records are determined based on a single field (businessid), leaving only the records with the smallest rowid

You can also keep the rowid instead of the minimum record. You need to change the value of min in the code to max.

The Code is as follows: Copy code

Delete from cfa
Where businessid in (select businessid
From cfa
Group by businessid
Having count (businessid)> 1)
And rowid not in (select min (rowid)
From cfa
Group by businessid
Having count (businessid)> 1 );

Alternatively, use the following simple and efficient statements:

The Code is as follows: Copy code

Delete from cfa t
WHERE t. ROWID>
(Select min (X. ROWID) FROM cfa x where x. businessid = t. businessid );

B. Delete the redundant record (multiple fields) in the table and only keep the records with the smallest rowid.

The Code is as follows: Copy code

Delete from cfa
Where (businessid, customer) in (select businessid, customer
From cfa
Group by businessid, customer
Having count (*)> 1)
And rowid not in (select min (rowid)
From cfa
Group by businessid, customer
Having count (*)> 1 );

Alternatively, use the following simple and efficient statements:

The Code is as follows: Copy code

Delete from cfa t
WHERE t. ROWID> (select min (X. ROWID)
FROM cfa X
Where x. businessid = t. businessid
And x. customer = t. customer );

The preceding statement is very simple, that is, to delete the queried data. However, the deletion execution efficiency is very low, and the database may be suspended for a large amount of data. Therefore, we recommend that you first Insert the queried duplicate data into a temporary table and then delete it. In this way, you do not need to perform another query When deleting the data. As follows:

Create table temporary TABLE

The Code is as follows: Copy code
(Select Field 1, Field 2, count (*) from table name group by field 1, Field 2 having count (*)> 1)

The above statement creates a temporary table and inserts the queried data into it.
You can perform the following deletion operations:

The Code is as follows: Copy code
Delete from table name a where field 1, Field 2 in (select Field 1, Field 2 from temporary table );

It is much more efficient to create a temporary table and then delete it.


C. This is simple. Use the temporary table method.

The Code is as follows: Copy code

Create table cfabak as select distinct * from cfa;

Truncate table cfa; -- it is best to back up the table for Production

Insert into cfa select * from cfabak;

Commit;


We want to delete the duplicate data and keep only the latest one. We can write it like this:

The Code is as follows: Copy code

Delete from table name
Where a. rowid! =
(
Select max (B. rowid) from table name B
Where a. Field 1 = B. Field 1 and
A. Field 2 = B. Field 2
)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.