An efficient method for removing duplicate data from Oracle databases while retaining the latest record

Last Update:2018-07-23 Source: Internet

Author: User

Tags oracle database

When working with a database we may find that the data in a table is repeated, which makes the table inconvenient to work with. How do we delete this useless duplicate data?
Data de-duplication can free up backup capacity, allow longer data retention, support continuous verification of backup data, improve the level of data recovery service, and make disaster tolerance easier. Duplicate data takes two forms: rows that repeat only in some of the table's fields, and rows that are identical in every field.
I. Deleting rows duplicated on some fields
First, let's look at how to query for duplicate data.
The following statement finds the field combinations that are duplicated:
SELECT field1, field2, COUNT(*) FROM table_name GROUP BY field1, field2 HAVING COUNT(*) > 1;
Changing the > sign to an = sign returns the combinations that are not duplicated.
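As a concrete illustration, the query can be tried on a small throwaway table. The table dup_demo, its columns, and its rows are hypothetical, invented only for this sketch:

```sql
-- Hypothetical demo table; the names and data are illustrative only.
CREATE TABLE dup_demo (
  field1  VARCHAR2(20),
  field2  VARCHAR2(20),
  note    VARCHAR2(20)
);

INSERT INTO dup_demo VALUES ('A', 'x', 'first');
INSERT INTO dup_demo VALUES ('A', 'x', 'second');
INSERT INTO dup_demo VALUES ('B', 'y', 'only');
COMMIT;

-- Lists each (field1, field2) combination that occurs more than once,
-- together with the number of rows that share it.
SELECT field1, field2, COUNT(*) AS cnt
FROM dup_demo
GROUP BY field1, field2
HAVING COUNT(*) > 1;
```

With this data the query returns one row, (A, x, 2): the two rows differ in note but count as duplicates because only field1 and field2 are compared.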
To delete this duplicate data, you can use the following statement:
DELETE FROM table_name a WHERE (a.field1, a.field2) IN
(SELECT field1, field2 FROM table_name GROUP BY field1, field2 HAVING COUNT(*) > 1);
The statement above is simple: it deletes whatever the query finds. However, this kind of delete is very inefficient and may hang the database on large amounts of data. So I suggest inserting the duplicates found by the query into a temporary table first and then deleting against it, so that the grouping query does not have to run again during the delete. As follows:
-- COUNT(*) needs an alias: CREATE TABLE ... AS SELECT requires a name for every expression column
CREATE TABLE temp_table AS
(SELECT field1, field2, COUNT(*) AS cnt FROM table_name GROUP BY field1, field2 HAVING COUNT(*) > 1);
The statement above creates the temporary table and fills it with the rows returned by the query.
The delete can then be done like this:
DELETE FROM table_name a WHERE (a.field1, a.field2) IN (SELECT field1, field2 FROM temp_table);
Building the temporary table first and then deleting is much more efficient than doing everything in a single statement.
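Before running a delete of this kind, it can help to preview how many rows it will touch and to tidy up afterwards. The following is only a sketch; table_name, temp_table, field1 and field2 stand for the article's placeholder names, and rows_to_delete is an alias chosen here for illustration:

```sql
-- Read-only preview: how many rows the DELETE would remove.
SELECT COUNT(*) AS rows_to_delete
FROM table_name a
WHERE (a.field1, a.field2) IN (SELECT field1, field2 FROM temp_table);

-- ... run the DELETE, compare the reported row count with the preview,
-- then COMMIT or ROLLBACK ...

-- The work table is an ordinary table, so drop it when finished.
-- DROP TABLE is DDL and commits implicitly, so run it only after
-- the COMMIT or ROLLBACK decision has been made.
DROP TABLE temp_table;
```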
At this point some readers will object: doesn't that statement delete every copy of the duplicated rows? We want to keep the latest record of each! Don't worry, here is how to do that.
Oracle gives every row a hidden ROWID pseudocolumn that uniquely identifies it. To keep the latest record, we can keep the row with the largest ROWID in each group of duplicates. This treats the largest ROWID as the most recently inserted row, which is usually the case for a simple table that has only received inserts but is not guaranteed, because ROWID reflects where a row is stored rather than when it was written.
Here is an example that queries the duplicate rows to be removed:
SELECT a.rowid, a.* FROM table_name a
WHERE a.rowid !=
(
  SELECT MAX(b.rowid) FROM table_name b
  WHERE a.field1 = b.field1 AND
        a.field2 = b.field2
);
To explain: the subquery in parentheses finds the largest ROWID among the rows that share the same field values.
The outer query then returns every duplicate row except the one with the largest ROWID.
So, to delete the duplicate data and keep only the latest row, we can write:
DELETE FROM table_name a
WHERE a.rowid !=
(
  SELECT MAX(b.rowid) FROM table_name b
  WHERE a.field1 = b.field1 AND
        a.field2 = b.field2
);
Incidentally, this correlated DELETE is also very slow. Consider building a temporary table that holds the fields used to detect duplicates together with the ROWID to keep, and then comparing against it during the delete:
CREATE TABLE temp_table AS
SELECT a.field1, a.field2, MAX(a.rowid) AS dataid
FROM table_name a
GROUP BY a.field1, a.field2;
DELETE FROM table_name a
WHERE a.rowid !=
(
  SELECT b.dataid FROM temp_table b
  WHERE a.field1 = b.field1 AND
        a.field2 = b.field2
);
COMMIT;
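Where the table carries a column that records when each row was written, that column is a more direct guide to "latest" than ROWID. The sketch below swaps in a different technique from the one the article uses, the ROW_NUMBER analytic function, and assumes a hypothetical timestamp column updated_at that the article does not mention; table_name, field1 and field2 stand for the article's placeholders.

```sql
-- Assumption: table_name has a column updated_at (hypothetical) that
-- holds the time each row was last written.
DELETE FROM table_name
WHERE rowid IN (
  SELECT rid
  FROM (
    SELECT rowid AS rid,
           ROW_NUMBER() OVER (
             PARTITION BY field1, field2
             ORDER BY updated_at DESC, rowid DESC  -- rowid only breaks ties
           ) AS rn
    FROM table_name
  )
  WHERE rn > 1   -- rn = 1 is the newest row of each group and is kept
);
COMMIT;
```

One difference in behavior: PARTITION BY places rows whose field1 or field2 is NULL in the same group, whereas the equality comparisons in the correlated statements never match NULL values, so those statements leave such rows untouched.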
II. Deleting fully duplicated records
When two rows in a table are identical in every field, the following statement returns the records with the duplicates removed:
SELECT DISTINCT * FROM table_name;
You can place the result of this query in a temporary table, delete the records of the original table, and finally load the data from the temporary table back into the original table. As follows:
CREATE TABLE temp_table AS (SELECT DISTINCT * FROM table_name);
TRUNCATE TABLE table_name;  -- Note: originally written as DROP TABLE table_name;
INSERT INTO table_name (SELECT * FROM temp_table);
DROP TABLE temp_table;
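Because TRUNCATE is DDL and cannot be rolled back, it is worth confirming that the temporary table really holds the de-duplicated rows before the original table is emptied. A minimal check, run after the CREATE TABLE step and before the TRUNCATE, might look like this (original_rows and distinct_rows are aliases chosen here; table_name and temp_table stand for the article's placeholders):

```sql
-- Compare row counts before emptying the original table.
-- distinct_rows should be greater than zero and no larger than original_rows.
SELECT (SELECT COUNT(*) FROM table_name) AS original_rows,
       (SELECT COUNT(*) FROM temp_table) AS distinct_rows
FROM dual;
```

The difference between the two numbers is the count of redundant copies that the reload will drop.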
If you want to remove duplicate data from a table, you can also first create a temporary table, import the de-duplicated data into it, and then import the data from the temporary table back into the formal table. The first import looks like this:
INSERT INTO t_table_bak
SELECT DISTINCT * FROM t_table;
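The INSERT assumes that t_table_bak already exists with the same columns as t_table. One common way to create such an empty copy, shown here as a sketch rather than as the author's own step, is CREATE TABLE ... AS SELECT with a condition that is never true:

```sql
-- Creates t_table_bak with the same columns as t_table but no rows,
-- because the condition 1 = 0 is never true.
-- Indexes, triggers and most constraints are not copied.
CREATE TABLE t_table_bak AS
SELECT * FROM t_table WHERE 1 = 0;

-- Fill the copy with one row per distinct record.
INSERT INTO t_table_bak
SELECT DISTINCT * FROM t_table;
COMMIT;
```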
III. How to quickly remove an Oracle installation
The quickest way to open the registry is through the Run dialog: enter regedit.
Expand HKEY_LOCAL_MACHINE and then SOFTWARE.
Locate the Oracle node and delete it.
Then delete the Oracle data files under the path that was chosen during installation.
Finally, delete the Oracle boot file and the Oracle folder under Program Files on the system disk.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
