Deleting duplicate records in the database has always been annoying. I have collected methods for quickly deleting duplicate records from Oracle and SqlServer for your reference. I hope this will help you.
SQL SERVER
Presumably, every SQL SERVER developer has had similar experiences, when querying or counting databases, you may encounter inaccurate query and statistical results due to repeated records in the table. The solution to this problem is to delete these duplicate records and keep only one of them.
In SQL Server, apart from manual deletion of tables with more than a dozen records, deleting duplicate records usually involves writing a piece of code and performing a row-by-row check using the cursor method, delete duplicate records. Because this method needs to traverse the entire table, it is feasible if the number of records in the table is not large. If the number of data records in a table reaches millions, it is a nightmare to use the cursor method to delete it, because it will be executed for a long period of time.
Four axes-easily eliminate repeated records
I do not know that there is a simpler method in SQL Server. It does not need to use a cursor. You only need to write a simple insert statement to delete duplicate records. To make it clear, we first assume that there is a product information table Products. Its table structure is as follows:
Create table Products (
ProductID int,
ProductName nvarchar (40 ),
Unit char (2 ),
UnitPrice money
)
Table data 1:
Chart
As shown in figure 1, the records of product Chang and Tofu are repeated in the product information table. To delete these duplicate records, only one of them is retained. The procedure is as follows:
The first axe -- create a temporary table with the same structure
Create table Products_temp (
ProductID int,
ProductName nvarchar (40 ),
Unit char (2 ),
UnitPrice money
)
Second Board ax-add an index to the table and make it ignore duplicate values
In Enterprise Manager, find the created temporary table Products _ temp, right-click, select all tasks, select Manage index, and choose create. 2.
Set the index option according to the area circled in Figure 2.
Figure 2
The third axe -- copy product information to the temporary table
Insert into Products_temp Select * from Products
At this time, SQL Server Returns the following prompt:
Server: Message 3604, level 16, status 1, Row 1
Duplicate keys are ignored.
It indicates that no duplicate rows exist in the temporary Products_temp table of product information.
Fourth board ax-import new data into the original table
Clear the original product information table Products, import data from the temporary table Products_temp, and delete the temporary table Products_temp.
Delete Products
Insert into Products select * from Products_temp
Drop table Products_temp
This completes the deletion of Repeated Records in the table. Regardless of the size of a table, the execution speed of the table is quite fast, and it is safe because there is almost no need to write statements.
TIPS: deleting duplicate records in the above method depends on the field selected when a unique index is created. In actual operation, the reader must first confirm that the unique index field is correct, to avoid deleting useful data.
ORACLE
In Oracle, the unique rowid can be used to delete duplicate records, and temporary tables can be created to implement... this article only mentions several simple and practical methods. I hope you can share them with us (using the table "employee" as an example ).
SQL> desc employee
Name Null? Type
Emp_id NUMBER (10)
Emp_name VARCHAR2 (20)
Salary NUMBER (10, 2)
You can use the following statement to query duplicate records:
SQL> select * from employee;
EMP_ID EMP_NAME SALARY
1 sunshine 10000
1 sunshine 10000
2 semon 20000
2 semon 20000
3. xyz 30000
2 semon 20000
SQL> select distinct * from employee;
EMP_ID EMP_NAME SALARY
1 sunshine 10000
2 semon 20000
3. xyz 30000
SQL> select * from employee group by emp_id, emp_name, salary having count (*)> 1
EMP_ID EMP_NAME SALARY
1 sunshine 10000
2 semon 20000
SQL> select * from employee e1
Where rowid in (select max (rowid) from employe e2
Where e1.emp _ id = e2.emp _ id and
E1.emp _ name = e2.emp _ name and e1.salary = e2.salary );
EMP_ID EMP_NAME SALARY
1 sunshine 10000
3. xyz 30000
2 semon 20000
2. delete several methods:
(1) create a temporary table.
SQL> create table temp_emp as (select distinct * from employee)
SQL> truncate table employee; (clear the data in the employee table)
SQL> insert into employee select * from temp_emp; (insert the content in the temporary table back)
(2) Unique rowid is used to delete duplicate records. in Oracle, each record has a rowid, which is unique throughout the database, rowid determines which data file, block, and row of each record in Oracle. In a duplicate record, the content of all columns may be the same, but the rowid may not be the same. Therefore, you only need to determine the rows with the largest or least rowid in the record, delete all others.
SQL> delete from employee e2 where rowid not in (
Select max (e1.rowid) from employee e1 where
E1.emp _ id = e2.emp _ id and e1.emp _ name = e2.emp _ name and e1.salary = e2.salary); -- min (rowid) can be used here.
SQL> delete from employee e2 where rowid <(
Select max (e1.rowid) from employee e1 where
E1.emp _ id = e2.emp _ id and e1.emp _ name = e2.emp _ name and e1.salary = e2.salary );
(3) It also uses rowid, but it is more efficient.
SQL> delete from employee where rowid not in (
Select max (t1.rowid) from employee t1 group by t1.emp _ id, t1.emp _ name, t1.salary); -- min (rowid) can be used here.
EMP_ID EMP_NAME SALARY
1 sunshine 10000
3. xyz 30000
2 semon 20000