Fast methods for deleting large volumes of data in Oracle (delete all, conditional delete, delete a large number of duplicate records)

Source: Internet
Author: User

Delete all

If you are deleting all the data in a table and do not need to be able to roll back, use TRUNCATE. For more about TRUNCATE, see http://blog.csdn.net/gnolhh168/archive/2011/05/24/6442561.aspx

SQL> TRUNCATE TABLE table_name;

Conditional Deletion

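If the deletion is conditional, for example delete from tablename where col1 = 'Lucy'; then besides adding an index on the filter column, it is sometimes suggested to add a NOLOGGING option to the delete to skip logging and speed it up. In Oracle, however, NOLOGGING only affects direct-path operations such as CREATE TABLE ... AS SELECT and direct-path inserts; a conventional DELETE always generates redo.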
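A minimal sketch of a conditional delete supported by an index on the filter column. The index name tablename_col1_idx is hypothetical, and whether the index helps depends on how selective the condition is: when a large share of the table matches, the optimizer may prefer a full scan anyway.

```sql
-- Hypothetical index name; tablename and col1 come from the example above.
CREATE INDEX tablename_col1_idx ON tablename (col1);

-- Delete only the matching rows. Unlike TRUNCATE, this is ordinary DML
-- and can still be rolled back until it is committed.
DELETE FROM tablename WHERE col1 = 'Lucy';

COMMIT;
```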

To quote someone: "A table with tens of millions of records that is not partitioned obviously has a problem. Oracle's technical support engineers suggest that any table with more than 2,000,000 records should be considered for partitioning. You can build the table along the time dimension, with each month's data stored in its own partition; to delete a given month's data, you simply truncate it, which avoids the logging and is very fast."
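The monthly partitioning idea in the quote can be sketched as follows. The table name, column names, and partition names are hypothetical and chosen only for illustration; a real table would need one partition per month it holds.

```sql
-- Hypothetical table, range-partitioned by month on order_date.
CREATE TABLE orders_by_month (
    order_id   NUMBER        NOT NULL,
    order_date DATE          NOT NULL,
    customer   VARCHAR2(50)
)
PARTITION BY RANGE (order_date) (
    PARTITION p2011_04 VALUES LESS THAN (DATE '2011-05-01'),
    PARTITION p2011_05 VALUES LESS THAN (DATE '2011-06-01'),
    PARTITION p2011_06 VALUES LESS THAN (DATE '2011-07-01')
);

-- Remove all of April 2011 in one DDL operation instead of a row-by-row DELETE.
ALTER TABLE orders_by_month TRUNCATE PARTITION p2011_04;
```

Like TRUNCATE TABLE, truncating a partition is DDL and cannot be rolled back. If the table has global indexes, they may need to be rebuilt or maintained as part of the statement.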

Delete a large number of duplicate records

(Reposted) During a project, a colleague who was importing data accidentally duplicated everything in a table; that is, every record in the table ended up with one duplicate. The table holds tens of millions of rows, and it belongs to a production system. In other words, we could not simply delete all the records, and the duplicates had to be removed quickly.

Below we summarize the methods for deleting duplicate records, along with the pros and cons of each.

For ease of presentation, assume the table is named tbl and has three columns, col1, col2 and col3, where col1 and col2 together act as the primary key and are indexed.
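To make the later statements easier to try out, here is a small test setup. It is a sketch under assumptions: the column types, the index name, and the sample rows are invented, and the key is deliberately not enforced by a constraint, since an enforced primary key would have rejected the duplicate rows.

```sql
-- Assumed definition: no enforced primary-key constraint, otherwise the
-- duplicate rows could not have been loaded in the first place.
CREATE TABLE tbl (
    col1 NUMBER        NOT NULL,
    col2 NUMBER        NOT NULL,
    col3 VARCHAR2(30)
);

-- Non-unique index on the key columns (hypothetical name).
CREATE INDEX tbl_col1_col2_idx ON tbl (col1, col2);

-- Each logical record loaded twice, as in the incident described above.
INSERT INTO tbl (col1, col2, col3) VALUES (1, 1, 'a');
INSERT INTO tbl (col1, col2, col3) VALUES (1, 1, 'a');
INSERT INTO tbl (col1, col2, col3) VALUES (1, 2, 'b');
INSERT INTO tbl (col1, col2, col3) VALUES (1, 2, 'b');
COMMIT;

-- List every key that occurs more than once and how many copies exist.
SELECT col1, col2, COUNT(*) AS copies
FROM tbl
GROUP BY col1, col2
HAVING COUNT(*) > 1;
```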

1. Creating a temporary table

You can copy the distinct rows into a temporary table, empty the original table, and then load the data back into it. The SQL statements are as follows:

CREATE TABLE tbl_tmp AS SELECT DISTINCT * FROM tbl;
TRUNCATE TABLE tbl; -- empty the original table
INSERT INTO tbl SELECT * FROM tbl_tmp; -- load the data back from the temporary table

This approach meets the requirement, but it is clearly slow for a table with tens of millions of rows, and on a production system the overhead is large enough to make it impractical.

2. Using ROWID

In Oracle, every record has a ROWID that is unique across the database; the ROWID identifies the data file, block, and row in which the record is stored. In duplicate records all columns may have the same content, but the ROWIDs are never the same. The SQL statement is as follows:

DELETE FROM tbl WHERE rowid IN (SELECT a.rowid
FROM tbl a, tbl b
WHERE a.rowid > b.rowid AND a.col1 = b.col1 AND a.col2 = b.col2);

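This SQL statement is a good fit when you already know that each record has exactly one duplicate. It still removes every extra copy when a record has N duplicates, but the self-join then compares every pair of copies of the same key, so when N is unknown or large, consider the following methods.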
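Before running the delete on a production table, it may help to look at the ROWIDs and preview which rows would be removed. The queries below are a sketch using the tbl, col1, col2 and col3 names assumed earlier; the EXISTS form is an equivalent rewrite of the self-join condition, not the statement from the text.

```sql
-- Show that otherwise identical rows still have different ROWIDs.
SELECT rowid, col1, col2, col3
FROM tbl
ORDER BY col1, col2, rowid;

-- Preview the rows the DELETE above would remove: every row for which
-- another row with the same key and a smaller ROWID exists.
SELECT a.rowid, a.col1, a.col2, a.col3
FROM tbl a
WHERE EXISTS (
    SELECT 1
    FROM tbl b
    WHERE b.col1 = a.col1
      AND b.col2 = a.col2
      AND b.rowid < a.rowid
);
```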

3. Using the MAX or MIN function

This method also uses ROWID; the difference from the method above is that it combines ROWID with the MAX or MIN function. The SQL statement is as follows:

DELETE FROM tbl a
WHERE rowid NOT IN (
SELECT MAX(b.rowid)
FROM tbl b
WHERE a.col1 = b.col1 AND a.col2 = b.col2); -- MIN can be used here instead of MAX

Or use the following statement:

DELETE FROM tbl a WHERE rowid < (
SELECT MAX(b.rowid)
FROM tbl b
WHERE a.col1 = b.col1 AND a.col2 = b.col2); -- if you change MAX to MIN, change "<" to ">" in the outer WHERE clause
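Because DELETE is ordinary DML, the cleanup can be checked before it is made permanent. The following is a sketch of one way to wrap the MAX-based delete in a verify-then-commit sequence; the check query and the column aliases are additions for illustration, not part of the original method.

```sql
-- Row count before the cleanup, for comparison.
SELECT COUNT(*) AS rows_before FROM tbl;

-- Keep the row with the largest ROWID for each (col1, col2) key.
DELETE FROM tbl a
WHERE a.rowid < (
    SELECT MAX(b.rowid)
    FROM tbl b
    WHERE b.col1 = a.col1
      AND b.col2 = a.col2
);

-- Should return no rows if every duplicate is gone.
SELECT col1, col2, COUNT(*) AS copies
FROM tbl
GROUP BY col1, col2
HAVING COUNT(*) > 1;

-- Make the change permanent; issue ROLLBACK instead if the check fails.
COMMIT;
```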

4. Using GROUP BY to improve efficiency

The idea is basically the same as in the method above, but using GROUP BY removes the explicit comparison conditions and improves efficiency. The SQL statement is as follows:

DELETE FROM tbl WHERE rowid NOT IN (
SELECT MAX(rowid)
FROM tbl t GROUP BY t.col1, t.col2);

There is also a method suited to the case where only a few records in the table have duplicates and an index exists. Assuming there is an index on col1, col2 and that few records in tbl are duplicated, the SQL statement is as follows:

DELETE FROM tbl WHERE (col1, col2) IN (
SELECT col1, col2
FROM tbl GROUP BY col1, col2 HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM tbl GROUP BY col1, col2 HAVING COUNT(*) > 1);
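Whether the index on col1, col2 is actually used by the statement that restricts itself to duplicated keys depends on the optimizer, so it can be worth inspecting the execution plan first. This is a sketch that assumes the standard PLAN_TABLE is available and reads the inner aggregate as MIN(rowid); EXPLAIN PLAN only records the plan and does not execute the delete.

```sql
-- Ask the optimizer for its plan without executing the DELETE.
EXPLAIN PLAN FOR
DELETE FROM tbl
WHERE (col1, col2) IN (
          SELECT col1, col2
          FROM tbl
          GROUP BY col1, col2
          HAVING COUNT(*) > 1)
  AND rowid NOT IN (
          SELECT MIN(rowid)
          FROM tbl
          GROUP BY col1, col2
          HAVING COUNT(*) > 1);

-- Display the plan that was just stored in PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```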
