Quickly delete duplicate records (Oracle & amp; SqlServer)

Source: Internet
Author: User

Deleting duplicate records in the database has always been annoying. I have collected methods for quickly deleting duplicate records from Oracle and SqlServer for your reference. I hope this will help you.

SQL SERVER
Presumably, every SQL SERVER developer has had similar experiences, when querying or counting databases, you may encounter inaccurate query and statistical results due to repeated records in the table. The solution to this problem is to delete these duplicate records and keep only one of them.

In SQL Server, apart from manual deletion of tables with more than a dozen records, deleting duplicate records usually involves writing a piece of code and performing a row-by-row check using the cursor method, delete duplicate records. Because this method needs to traverse the entire table, it is feasible if the number of records in the table is not large. If the number of data records in a table reaches millions, it is a nightmare to use the cursor method to delete it, because it will be executed for a long period of time.

Four axes-easily eliminate repeated records

I do not know that there is a simpler method in SQL Server. It does not need to use a cursor. You only need to write a simple insert statement to delete duplicate records. To make it clear, we first assume that there is a product information table Products. Its table structure is as follows:


Create table Products (
ProductID int,
ProductName nvarchar (40 ),
Unit char (2 ),
UnitPrice money
)

 

Table data 1:

 

Chart

As shown in figure 1, the records of product Chang and Tofu are repeated in the product information table. To delete these duplicate records, only one of them is retained. The procedure is as follows:

The first axe -- create a temporary table with the same structure


Create table Products_temp (
ProductID int,
ProductName nvarchar (40 ),
Unit char (2 ),
UnitPrice money
)


Second Board ax-add an index to the table and make it ignore duplicate values

In Enterprise Manager, find the created temporary table Products _ temp, right-click, select all tasks, select Manage index, and choose create. 2.

Set the index option according to the area circled in Figure 2.

 

Figure 2

The third axe -- copy product information to the temporary table


Insert into Products_temp Select * from Products


At this time, SQL Server Returns the following prompt:

Server: Message 3604, level 16, status 1, Row 1

Duplicate keys are ignored.

It indicates that no duplicate rows exist in the temporary Products_temp table of product information.

Fourth board ax-import new data into the original table

Clear the original product information table Products, import data from the temporary table Products_temp, and delete the temporary table Products_temp.

Delete Products
Insert into Products select * from Products_temp
Drop table Products_temp


This completes the deletion of Repeated Records in the table. Regardless of the size of a table, the execution speed of the table is quite fast, and it is safe because there is almost no need to write statements.

TIPS: deleting duplicate records in the above method depends on the field selected when a unique index is created. In actual operation, the reader must first confirm that the unique index field is correct, to avoid deleting useful data.


ORACLE
In Oracle, the unique rowid can be used to delete duplicate records, and temporary tables can be created to implement... this article only mentions several simple and practical methods. I hope you can share them with us (using the table "employee" as an example ).

SQL> desc employee

Name Null? Type

Emp_id NUMBER (10)
Emp_name VARCHAR2 (20)

Salary NUMBER (10, 2)


You can use the following statement to query duplicate records:
SQL> select * from employee;

EMP_ID EMP_NAME SALARY

1 sunshine 10000

1 sunshine 10000

2 semon 20000

2 semon 20000

3. xyz 30000

2 semon 20000

SQL> select distinct * from employee;

EMP_ID EMP_NAME SALARY

1 sunshine 10000

2 semon 20000

3. xyz 30000

SQL> select * from employee group by emp_id, emp_name, salary having count (*)> 1

EMP_ID EMP_NAME SALARY

1 sunshine 10000

2 semon 20000

SQL> select * from employee e1

Where rowid in (select max (rowid) from employe e2
Where e1.emp _ id = e2.emp _ id and

E1.emp _ name = e2.emp _ name and e1.salary = e2.salary );


EMP_ID EMP_NAME SALARY

1 sunshine 10000

3. xyz 30000

2 semon 20000


2. delete several methods:

(1) create a temporary table.

SQL> create table temp_emp as (select distinct * from employee)

SQL> truncate table employee; (clear the data in the employee table)

SQL> insert into employee select * from temp_emp; (insert the content in the temporary table back)


(2) Unique rowid is used to delete duplicate records. in Oracle, each record has a rowid, which is unique throughout the database, rowid determines which data file, block, and row of each record in Oracle. In a duplicate record, the content of all columns may be the same, but the rowid may not be the same. Therefore, you only need to determine the rows with the largest or least rowid in the record, delete all others.

SQL> delete from employee e2 where rowid not in (
Select max (e1.rowid) from employee e1 where

E1.emp _ id = e2.emp _ id and e1.emp _ name = e2.emp _ name and e1.salary = e2.salary); -- min (rowid) can be used here.

SQL> delete from employee e2 where rowid <(
Select max (e1.rowid) from employee e1 where
E1.emp _ id = e2.emp _ id and e1.emp _ name = e2.emp _ name and e1.salary = e2.salary );

(3) It also uses rowid, but it is more efficient.

SQL> delete from employee where rowid not in (
Select max (t1.rowid) from employee t1 group by t1.emp _ id, t1.emp _ name, t1.salary); -- min (rowid) can be used here.


EMP_ID EMP_NAME SALARY

1 sunshine 10000

3. xyz 30000

2 semon 20000

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.