Method:
The code is as follows |
Copy Code |
GROUP BY XX has count (*) >1,rowid,distinct,temporary table,procedure |
The following statement can query that the data is duplicated:
The code is as follows |
Copy Code |
Select field 1, Field 2,count (*) from table name Group By field 1, Field 2 having count (*) > 1 |
Change the above > number to = number to query for no duplicate data.
To delete these duplicate data, you can use the following statement to delete
The code is as follows |
Copy Code |
Delete from table name a where field 1, Field 2 in (select field 1, Field 2,count (*) from table name Group By field 1, Field 2 having count (*) > 1) |
Cases
1. Duplicate data in the query table
A. Repeat a field
B. Repeat multiple fields
C. Repeat a whole line
To create a test table:
The code is as follows |
Copy Code |
CREATE TABLE CFA (Businessid number,customer varchar2), Branchcode varchar2 (a), data_date varchar2 (10)); INSERT into CFA values (1, ' Albert ', ' SCB ', ' 2011-11-11 '); INSERT into CFA values (2, ' Andy ', ' DB ', ' 2011-11-12 '); INSERT into CFA values (3, ' Allen ', ' HSBC ', ' 2011-11-13 '); ---------------The following are duplicate data---------------------------------------------- INSERT into CFA values (1, ' Alex ', ' ICBC ', ' 2011-11-14 '); INSERT into CFA values (1, ' Albert ', ' ctbk ', ' 2011-11-15 '); INSERT into CFA values (1, ' Albert ', ' SCB ', ' 2011-11-11 '); |
For a, only businessid repeats.
The code is as follows |
Copy Code |
SELECT * FROM CFA where Businessid to (select Businessid from CFA GROUP by Businessid has count (businessid) >1); |
In the case of B, Businessid and name are duplicated at the same time
The code is as follows |
Copy Code |
SELECT * FROM CFA where (Businessid,customer) in (select Businessid,customer to CFA group by Businessid,customer has COUNT (*) >1); |
For the case of C, repeat a whole line
Refer to B's method:
The code is as follows |
Copy Code |
SELECT * FROM CFA where (businessid,customer,branchcode,data_date) in (SELECT * to CFA GROUP by Businessid,customer,bran Chcode,data_date having Count (*) >1); |
2, delete the duplicate data in the table
A situation, delete the redundant records in the table, duplicate records are based on a single field (BUSINESSID) to judge, leaving only the ROWID minimal records
You can also only keep rowID is not a minimum record, you need to change the code in the Min to max here no longer repeat.
The code is as follows |
Copy Code |
Delete from CFA where Businessid in (select Businessid From CFA GROUP BY Businessid Having count (Businessid) > 1) and rowID not in (select min (rowid) From CFA GROUP BY Businessid Having count (Businessid) > 1); |
Alternatively, use the following simpler and more efficient statements
The code is as follows |
Copy Code |
DELETE from CFA t WHERE T.rowid > (SELECT MIN (X.rowid) from CFA X WHERE x.businessid = t.businessid); |
b case, delete redundant records (multiple fields) in the table, leaving only the smallest ROWID records
The code is as follows |
Copy Code |
Delete from CFA where (Businessid,customer) in (select Businessid,customer From CFA GROUP BY Businessid,customer Having count (*) > 1) and rowID not in (select min (rowid) From CFA GROUP BY Businessid,customer Having count (*) > 1); |
Alternatively, use the following simpler and more efficient statements
code is as follows |
copy code |
DELETE from CFA t WHERE t.rowid > (SELECT MIN (x.rowid) & nbsp; from CFA X WHERE x.businessid = T.businessid and x.customer = T.customer); |
The above statement is very simple, that is, the query to delete the data. However, this deletion is very inefficient and may hang the database for large amounts of data. So I suggest that you insert the duplicate data from the query into a temporary table, and then delete it, so that you don't have to do a query again when you perform the deletion. As follows:
CREATE Table temporary table as
The code is as follows |
Copy Code |
(select field 1, Field 2,count (*) from table name Group By field 1, Field 2 having count (*) > 1)
|
The above sentence is to create a temporary table, and the query to insert the data.
The following can be done with this delete operation:
The code is as follows |
Copy Code |
Delete from table name a where field 1, Field 2 in (Select field 1, Field 2 from temporary table);
|
This is a much more efficient way to delete the first temporary table than to remove it directly with one statement.
C, this situation is relatively simple, using the temporary table method
The code is as follows |
Copy Code |
CREATE TABLE Cfabak as SELECT DISTINCT * from CFA; TRUNCATE TABLE cfa;--If the production is best on the table backup Insert into the CFA select * from Cfabak; Commit |
We're going to delete the duplicate data and keep only the latest one, so we can write this:
code is as follows |
copy code |
Delete from Table name a where A.rowid!= ( Select Max (b.rowid) from table name B where a. Field 1 = B. Field 1 and A. Field 2 = b . Field 2 ) |