How to delete duplicate data in Oracle (sunlen)

Source: Internet
Author: User
A table whose design is incomplete can end up containing duplicate data. How can the duplicates be deleted?
Duplicate data comes in two forms. In the first, rows match only on some fields; in the second, two rows are identical in every field.
1. Deleting data duplicated in some fields
First, how to query the duplicate data.
The following statement returns the field combinations that occur more than once:
select field1, field2, count(*) from table_name group by field1, field2 having count(*) > 1;
Change the ">" sign above to "=" to query the data that is not duplicated.
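The query above can be tried out with a small runnable sketch. It assumes Python's built-in sqlite3 module as a stand-in for Oracle (this particular statement is written the same way in both), and the table emp with columns name, dept and note is hypothetical.

```python
import sqlite3

# Hypothetical table and data; SQLite is only a stand-in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (name text, dept text, note text)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [("ann", "hr", "x"), ("ann", "hr", "y"), ("bob", "it", "z")],
)

# (name, dept) combinations that occur more than once.
dup_keys = conn.execute(
    "select name, dept, count(*) from emp "
    "group by name, dept having count(*) > 1"
).fetchall()

# (name, dept) combinations that occur exactly once.
unique_keys = conn.execute(
    "select name, dept, count(*) from emp "
    "group by name, dept having count(*) = 1"
).fetchall()

print(dup_keys, unique_keys)
```

Note that the query reports one row per duplicated combination, not the individual duplicate rows.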
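To delete the duplicate data, use the following statement. The field list must be wrapped in parentheses, and the subquery must return exactly the same fields (without the count):
delete from table_name a where (field1, field2) in
(select field1, field2 from table_name group by field1, field2 having count(*) > 1);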
The preceding statement is simple: it deletes the rows found by the query. However, it executes very slowly, and with a large amount of data the database may appear to hang. We therefore recommend first inserting the queried duplicate data into a temporary table and then deleting, so that the duplicate query does not have to be run again during the delete. As follows:
create table temp_table as
(select field1, field2, count(*) cnt from table_name group by field1, field2 having count(*) > 1);
The above statement creates a temporary table and inserts the queried data into it.
You can then perform the deletion:
delete from table_name a where (field1, field2) in (select field1, field2 from temp_table);
Creating the temporary table first and then deleting from the original table is much more efficient.
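The temporary-table approach described above can be sketched end to end as follows. This is a minimal illustration, assuming Python's built-in sqlite3 module as a stand-in for Oracle; the names emp and emp_dup_keys are hypothetical.

```python
import sqlite3

# Hypothetical table and data; SQLite is only a stand-in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (name text, dept text, note text)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [("ann", "hr", "x"), ("ann", "hr", "y"), ("bob", "it", "z")],
)

# Step 1: stage the duplicated key combinations once.
conn.execute(
    "create table emp_dup_keys as "
    "select name, dept, count(*) as cnt from emp "
    "group by name, dept having count(*) > 1"
)

# Step 2: delete every row whose key appears in the staging table.
conn.execute(
    "delete from emp where (name, dept) in "
    "(select name, dept from emp_dup_keys)"
)

remaining = conn.execute("select name, dept, note from emp").fetchall()
print(remaining)
```

As the sketch makes visible, this removes every copy of a duplicated key, including the one you might have wanted to keep.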

At this point you may object: doesn't that statement delete all of the duplicates? We want to keep the latest record among the duplicates! Don't worry; here is how to do that.
In Oracle, every row has a hidden rowid pseudocolumn that uniquely identifies the record. To keep the latest record, we can use this column and keep the record with the largest rowid among the duplicates. (This treats the largest rowid as the most recently inserted row, which Oracle does not guarantee, because a rowid reflects where the row is stored.)
The following example queries the duplicate rows other than the one to keep:
select a.rowid, a.* from table_name a
where a.rowid !=
(
select max(b.rowid) from table_name b
where a.field1 = b.field1 and
a.field2 = b.field2
)
The subquery in parentheses finds the largest rowid within each group of duplicates.
The outer query returns every duplicate row except the one with that largest rowid.
So, to delete the duplicate data and keep only the latest record, we can write:
delete from table_name a
where a.rowid !=
(
select max(b.rowid) from table_name b
where a.field1 = b.field1 and
a.field2 = b.field2
)
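The keep-the-largest-rowid delete can be checked with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for Oracle; SQLite also exposes a rowid, and in this freshly created hypothetical table emp it grows with insertion order, which is an assumption of the sketch rather than a guarantee in Oracle.

```python
import sqlite3

# Hypothetical table and data; SQLite is only a stand-in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (name text, dept text, note text)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [
        ("ann", "hr", "old"),
        ("ann", "hr", "mid"),
        ("bob", "it", "only"),
        ("ann", "hr", "new"),
    ],
)

# Delete every row that is not the largest rowid of its (name, dept) group.
# The delete target is referenced by its table name inside the subquery.
conn.execute(
    "delete from emp where rowid != ("
    " select max(b.rowid) from emp b"
    " where b.name = emp.name and b.dept = emp.dept"
    ")"
)

remaining = conn.execute(
    "select name, dept, note from emp order by name"
).fetchall()
print(remaining)
```

Rows that were never duplicated survive too, because each of them is the largest rowid of its own group.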
 
Incidentally, the preceding statement also executes slowly. Consider creating a temporary table: insert the duplicated fields together with the rowid to keep into the temporary table, and then compare against it when deleting.
create table temp_table as
select a.field1, a.field2, max(a.rowid) dataid from table_name a group by a.field1, a.field2;
delete from table_name a
where a.rowid !=
(
select b.dataid from temp_table b
where a.field1 = b.field1 and
a.field2 = b.field2
);
commit;
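The same idea with a precomputed table of rowids to keep can be sketched as below. It assumes Python's built-in sqlite3 module as a stand-in for Oracle; the names emp and emp_keep are hypothetical, and dataid follows the column alias used above.

```python
import sqlite3

# Hypothetical table and data; SQLite is only a stand-in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (name text, dept text, note text)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [
        ("ann", "hr", "old"),
        ("ann", "hr", "mid"),
        ("bob", "it", "only"),
        ("ann", "hr", "new"),
    ],
)

# Step 1: record, per (name, dept) group, the rowid that should survive.
conn.execute(
    "create table emp_keep as "
    "select name, dept, max(rowid) as dataid from emp group by name, dept"
)

# Step 2: delete rows whose rowid differs from the recorded one.
conn.execute(
    "delete from emp where rowid != ("
    " select k.dataid from emp_keep k"
    " where k.name = emp.name and k.dept = emp.dept"
    ")"
)
conn.commit()

remaining = conn.execute(
    "select name, dept, note from emp order by name"
).fetchall()
print(remaining)
```

The aggregation runs once when the keep table is built, so the delete only performs a lookup per row instead of recomputing max(rowid) each time.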

2. Deleting completely duplicated records

If two rows in the table are identical in every field, the following statement returns the records after deduplication:
select distinct * from table_name;
You can place the queried records in a temporary table, delete the records from the original table, and then load the data from the temporary table back into the original table. As follows:
create table temp_table as (select distinct * from table_name);
truncate table table_name; -- Note: this was originally written as "drop table" by mistake; it has been corrected
insert into table_name (select * from temp_table);
drop table temp_table;
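The copy-out, empty, copy-back sequence above can be sketched as follows. It assumes Python's built-in sqlite3 module as a stand-in for Oracle; the names emp and emp_tmp are hypothetical. SQLite has no truncate statement, so the sketch uses a plain delete where Oracle would use truncate table.

```python
import sqlite3

# Hypothetical table and data; SQLite is only a stand-in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (name text, dept text)")
conn.executemany(
    "insert into emp values (?, ?)",
    [("ann", "hr"), ("ann", "hr"), ("bob", "it")],
)

# Step 1: copy the deduplicated rows into a scratch table.
conn.execute("create table emp_tmp as select distinct * from emp")

# Step 2: empty the original table (Oracle: truncate table).
conn.execute("delete from emp")

# Step 3: load the deduplicated rows back and drop the scratch table.
conn.execute("insert into emp select * from emp_tmp")
conn.execute("drop table emp_tmp")
conn.commit()

remaining = conn.execute(
    "select name, dept from emp order by name"
).fetchall()
print(remaining)
```

Because the original table is emptied rather than dropped, its definition stays in place for the reload.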

If you want to remove the duplicate data from a table, you can first create a temporary table, import the deduplicated data into it, and then import the data from the temporary table back into the formal table. The first step looks like this:
insert into t_table_bak
select distinct * from t_table;

 

Removing duplicate data by grouping with Oracle's analytic function row_number
[Group]
delete from t where rowid in (
select row_id from (
select rowid row_id,
row_number() over (partition by id order by rowid) rn
from t
)
where rn <> 1
)
tzxd@webora9> /

1710795 rows deleted.

Elapsed: 00:06:47.00

The article mentioned that the delete using the analytic function took 00:06:47.00, while the delete using group by took 00:07:59.85. In that test, therefore, the analytic function was indeed more efficient.

In addition, analytic functions are available in Oracle 8.1.7 and later. In versions that lack them, only ordinary SQL statements can be used.
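The row_number delete can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for Oracle (the bundled SQLite must be version 3.25 or later for window functions); the table t and its id column follow the statement above, while the val column and the data are hypothetical.

```python
import sqlite3

# Hypothetical data; SQLite is only a stand-in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, val text)")
conn.executemany(
    "insert into t values (?, ?)",
    [(1, "a"), (1, "b"), (2, "c"), (1, "d")],
)

# Number the rows inside each id group by rowid and delete all but the first.
conn.execute(
    "delete from t where rowid in ("
    " select row_id from ("
    "  select rowid as row_id,"
    "         row_number() over (partition by id order by rowid) as rn"
    "  from t"
    " ) where rn <> 1"
    ")"
)

remaining = conn.execute("select id, val from t order by id").fetchall()
print(remaining)
```

With order by rowid the row numbered 1 is the smallest rowid in each group, so this variant keeps the first row per id; ordering by rowid desc would keep the largest instead, matching the earlier max(rowid) approach.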
