SQL statements for querying and deleting duplicate records

In the statements below, table_name stands for your own table.

1. Find redundant records in the table, where duplicates are determined by a single field (id):
SELECT * FROM table_name WHERE id IN (SELECT id FROM table_name GROUP BY id HAVING COUNT(id) > 1)

2. Delete redundant records in the table, where duplicates are determined by a single field (id), keeping only the record with the smallest rowid:
DELETE FROM table_name WHERE id IN (SELECT id FROM table_name GROUP BY id HAVING COUNT(id) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table_name GROUP BY id HAVING COUNT(*) > 1);

3. Find redundant records in the table (multiple fields):
SELECT * FROM table_name a WHERE (a.id, a.seq) IN (SELECT id, seq FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1)

4. Delete redundant duplicate records (multiple fields) in the table, keeping only the record with the smallest rowid:
DELETE FROM table_name a WHERE (a.id, a.seq) IN (SELECT id, seq FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1)

5. Find redundant duplicate records (multiple fields) in the table, excluding the record with the smallest rowid:
SELECT * FROM table_name a WHERE (a.id, a.seq) IN (SELECT id, seq FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1)

One: Duplicate data is judged by a single field
1. First, query the redundant data in the table by the key field (name):
SELECT * FROM oa_address_book WHERE name IN (SELECT name FROM oa_address_book GROUP BY name HAVING COUNT(name) > 1)
2. Delete the duplicate data in the table, where duplicates are determined by a single field (name), keeping only the record with the smallest rowid:
DELETE FROM oa_address_book WHERE name IN
(SELECT name FROM oa_address_book GROUP BY name HAVING COUNT(name) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM oa_address_book GROUP BY name HAVING COUNT(name) > 1)
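The two single-field statements above can be tried out without an Oracle instance. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in: SQLite also exposes a rowid, although it is not the same thing as Oracle's ROWID. The sample rows and the unit_id values are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oa_address_book (name TEXT, unit_id INTEGER)")
conn.executemany(
    "INSERT INTO oa_address_book (name, unit_id) VALUES (?, ?)",
    [("alice", 1), ("bob", 1), ("alice", 2), ("alice", 3), ("carol", 2)],
)

# Step 1: every row whose name occurs more than once.
dupes = conn.execute(
    """
    SELECT rowid, name FROM oa_address_book
    WHERE name IN (SELECT name FROM oa_address_book
                   GROUP BY name HAVING COUNT(name) > 1)
    ORDER BY rowid
    """
).fetchall()

# Step 2: delete those rows, except the one with the smallest rowid per name.
deleted = conn.execute(
    """
    DELETE FROM oa_address_book
    WHERE name IN (SELECT name FROM oa_address_book
                   GROUP BY name HAVING COUNT(name) > 1)
      AND rowid NOT IN (SELECT MIN(rowid) FROM oa_address_book
                        GROUP BY name HAVING COUNT(name) > 1)
    """
).rowcount
conn.commit()

remaining = conn.execute(
    "SELECT rowid, name FROM oa_address_book ORDER BY rowid"
).fetchall()
print(dupes)      # the three 'alice' rows
print(deleted)    # 2
print(remaining)  # one row each for alice, bob, carol
```

Only the first "alice" row survives, and the rows for "bob" and "carol" are never touched because they do not match the first condition.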
Two: Duplicate data is judged by multiple fields
1. First, query the duplicate data in the table by the key fields (name, unit_id):
SELECT * FROM oa_address_book book1 WHERE (book1.name, book1.unit_id) IN
(SELECT book2.name, book2.unit_id FROM oa_address_book book2 GROUP BY book2.name, book2.unit_id HAVING COUNT(*) > 1)
2. Delete the duplicate data in the table, where duplicates are determined by multiple fields (name, unit_id), keeping only the record with the smallest rowid:
DELETE FROM oa_address_book a WHERE (a.name, a.unit_id) IN
(SELECT name, unit_id FROM oa_address_book GROUP BY name, unit_id HAVING COUNT(*) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM oa_address_book GROUP BY name, unit_id HAVING COUNT(*) > 1)
3. Query the duplicate data in the table, where duplicates are determined by multiple fields (name, unit_id), excluding the record with the smallest rowid in each group:
SELECT name, unit_id FROM oa_address_book a WHERE (a.name, a.unit_id) IN
(SELECT name, unit_id FROM oa_address_book GROUP BY name, unit_id HAVING COUNT(*) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM oa_address_book GROUP BY name, unit_id HAVING COUNT(*) > 1)

1. Problem description
The bbscomment table is a child table of bbsdetail and records merchant review information. Because the data was moved back and forth many times, it now contains a lot of duplicates. The table structure is as follows:
comment_id    NOT NULL NUMBER          -- primary key
detail_id     NOT NULL NUMBER          -- foreign key, references the bbsdetail table
comment_body  NOT NULL VARCHAR2(500)   -- review content
-- other fields omitted
The primary key itself is never duplicated. The duplication is in detail_id + comment_body + ..., that is, some merchant review information is stored more than once.
2. Troubleshooting steps
2.1 Find the redundant duplicate records in the table
-- Query all duplicated data
SELECT detail_id, comment_body, COUNT(*)
FROM bbscomment
GROUP BY detail_id, comment_body
HAVING COUNT(*) > 1
ORDER BY detail_id, comment_body; -- 1955 rows
2.2 Show all non-redundant data
-- This statement shows all non-redundant data
SELECT MIN(comment_id) AS comment_id, detail_id, comment_body
FROM bbscomment
GROUP BY detail_id, comment_body; -- 21,453 rows. This is not equal to the table's total row count minus 1955, because some of those 1955 records are repeated more than once (they appear three or more times).
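The counting remark above is easy to check on a tiny table. This is a sketch that assumes Python's built-in sqlite3 module as a stand-in for Oracle; the six sample rows are invented, with one review stored three times and another stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE bbscomment (
           comment_id   INTEGER PRIMARY KEY,
           detail_id    INTEGER NOT NULL,
           comment_body TEXT    NOT NULL)"""
)
conn.executemany(
    "INSERT INTO bbscomment VALUES (?, ?, ?)",
    [
        (1, 10, "good"), (2, 10, "good"), (3, 10, "good"),  # stored three times
        (4, 20, "slow"), (5, 20, "slow"),                   # stored twice
        (6, 30, "fine"),                                    # unique
    ],
)

total = conn.execute("SELECT COUNT(*) FROM bbscomment").fetchone()[0]

# Query 2.1: one result row per duplicated (detail_id, comment_body) group.
dup_groups = len(conn.execute(
    """SELECT detail_id, comment_body, COUNT(*) FROM bbscomment
       GROUP BY detail_id, comment_body HAVING COUNT(*) > 1"""
).fetchall())

# Query 2.2: one result row per distinct (detail_id, comment_body) pair.
distinct_rows = len(conn.execute(
    """SELECT MIN(comment_id) AS comment_id, detail_id, comment_body
       FROM bbscomment GROUP BY detail_id, comment_body"""
).fetchall())

surplus = total - distinct_rows  # rows that a cleanup would delete
print(total, dup_groups, distinct_rows, surplus)  # 6 2 3 3
```

Here total minus dup_groups is 4, while distinct_rows is 3. The difference comes from the group stored three times, which contributes two surplus rows but only one row to the result of query 2.1.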
2.3 If the number of records is small (around a thousand), turn the statement above into a subquery and delete directly
-- If the table is not very large (1,000 rows or fewer), you can turn the statement above into a subquery and delete directly
DELETE FROM bbscomment WHERE comment_id NOT IN (
  SELECT MIN(comment_id) FROM bbscomment GROUP BY detail_id, comment_body
); -- 782 seconds. My table had about 20,000 records with over 2,000 duplicates (too slow!)
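As a small illustration of the subquery delete above, here is a sketch that assumes Python's built-in sqlite3 module as a stand-in for Oracle, with invented sample rows. It only shows which rows survive and says nothing about the timing reported above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE bbscomment (
           comment_id   INTEGER PRIMARY KEY,
           detail_id    INTEGER NOT NULL,
           comment_body TEXT    NOT NULL)"""
)
conn.executemany(
    "INSERT INTO bbscomment VALUES (?, ?, ?)",
    [
        (1, 10, "good"), (2, 10, "good"), (3, 10, "good"),
        (4, 20, "slow"), (5, 20, "slow"),
        (6, 30, "fine"),
    ],
)

# Keep the smallest comment_id of every (detail_id, comment_body) group
# and delete every other row.
deleted = conn.execute(
    """
    DELETE FROM bbscomment
    WHERE comment_id NOT IN (
        SELECT MIN(comment_id) FROM bbscomment
        GROUP BY detail_id, comment_body)
    """
).rowcount
conn.commit()

remaining_ids = [
    row[0]
    for row in conn.execute("SELECT comment_id FROM bbscomment ORDER BY comment_id")
]
print(deleted, remaining_ids)  # 3 [1, 4, 6]
```

Unlike the rowid-based statements, this form needs no second condition: the subquery returns one comment_id for every group, including groups that have no duplicates, so unique rows are kept as well.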
2.4 Another way to delete
-- This statement achieves the same result, but I could not test it properly because I had already deleted the data.
-- Delete condition one: the record has duplicates. Condition two: keep the record with the smallest rowid.
DELETE FROM bbscomment a
WHERE (a.detail_id, a.comment_body) IN (
  SELECT detail_id, comment_body FROM bbscomment GROUP BY detail_id, comment_body HAVING COUNT(*) > 1)
AND rowid NOT IN (
  SELECT MIN(rowid) FROM bbscomment GROUP BY detail_id, comment_body HAVING COUNT(*) > 1);
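Since the statement above was not tested on the original data, a quick check on sample rows may help. This sketch assumes Python's built-in sqlite3 module as a stand-in for Oracle, and an SQLite build recent enough to support row values in IN (3.15 or later). The rows are invented and deliberately inserted out of comment_id order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# comment_id is declared UNIQUE rather than INTEGER PRIMARY KEY so that, in this
# SQLite sketch, rowid stays a separate value assigned in insertion order.
conn.execute(
    """CREATE TABLE bbscomment (
           comment_id   INTEGER NOT NULL UNIQUE,
           detail_id    INTEGER NOT NULL,
           comment_body TEXT    NOT NULL)"""
)
conn.executemany(
    "INSERT INTO bbscomment VALUES (?, ?, ?)",
    [
        (103, 10, "good"), (101, 10, "good"), (102, 10, "good"),
        (202, 20, "slow"), (201, 20, "slow"),
        (301, 30, "fine"),
    ],
)

deleted = conn.execute(
    """
    DELETE FROM bbscomment
    WHERE (detail_id, comment_body) IN (
              SELECT detail_id, comment_body FROM bbscomment
              GROUP BY detail_id, comment_body HAVING COUNT(*) > 1)
      AND rowid NOT IN (
              SELECT MIN(rowid) FROM bbscomment
              GROUP BY detail_id, comment_body HAVING COUNT(*) > 1)
    """
).rowcount
conn.commit()

kept_ids = [
    row[0] for row in conn.execute("SELECT comment_id FROM bbscomment ORDER BY rowid")
]
print(deleted, kept_ids)  # 3 [103, 202, 301]
```

The survivor of each group is the row with the smallest rowid, which here is not the row with the smallest comment_id. If the lowest key must be the one kept, the MIN(comment_id) form from 2.3 is the one to use.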
2.5 For large data volumes, PL/SQL is quicker and more convenient
declare
  -- define the record structure
  type bbscomment_type is record (
    comment_id   bbscomment.comment_id%type,
    detail_id    bbscomment.detail_id%type,
    comment_body bbscomment.comment_body%type
  );
  bbscomment_record bbscomment_type;
  -- variables used for comparison
  v_comment_id   bbscomment.comment_id%type;
  v_detail_id    bbscomment.detail_id%type;
  v_comment_body bbscomment.comment_body%type;
  -- other variables
  v_batch_size integer := 5000;
  v_counter    integer := 0;
  cursor cur_dupl is
    -- fetch all duplicate records
    select comment_id, detail_id, comment_body
      from bbscomment
     where (detail_id, comment_body) in (
             -- these records have duplicates
             select detail_id, comment_body
               from bbscomment
              group by detail_id, comment_body
             having count(*) > 1)
     order by detail_id, comment_body;
begin
  for bbscomment_record in cur_dupl loop
    if v_detail_id is null
       or (bbscomment_record.detail_id != v_detail_id
           or nvl(bbscomment_record.comment_body, ' ') != nvl(v_comment_body, ' ')) then
      -- first row of a new group: remember its values and keep the row
      v_detail_id    := bbscomment_record.detail_id;
      v_comment_body := bbscomment_record.comment_body;
    else
      -- any further row of the same group: delete it
      delete from bbscomment where comment_id = bbscomment_record.comment_id;
      v_counter := v_counter + 1;
      if mod(v_counter, v_batch_size) = 0 then
        -- commit once per batch
        commit;
      end if;
    end if;
  end loop;
  if v_counter > 0 then
    -- final commit
    commit;
  end if;
  dbms_output.put_line(to_char(v_counter) || ' records deleted!');
exception
  when others then
    dbms_output.put_line('sqlerrm --> ' || sqlerrm);
    rollback;
end;
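The control flow of the PL/SQL block above can also be followed in a small runnable form. This is a sketch under assumptions, not a port: it uses Python's built-in sqlite3 module as a stand-in for Oracle, the function name delete_duplicates_in_batches is hypothetical, the duplicate keys are fetched up front instead of through an open cursor, and comment_id is added to the ORDER BY so that the smallest comment_id of each group is the row kept.

```python
import sqlite3


def delete_duplicates_in_batches(conn, batch_size=5000):
    """Delete all but the first row of each duplicate group, committing per batch."""
    rows = conn.execute(
        """
        SELECT comment_id, detail_id, comment_body FROM bbscomment
        WHERE (detail_id, comment_body) IN (
            SELECT detail_id, comment_body FROM bbscomment
            GROUP BY detail_id, comment_body HAVING COUNT(*) > 1)
        ORDER BY detail_id, comment_body, comment_id
        """
    ).fetchall()

    previous_key = None
    deleted = 0
    try:
        for comment_id, detail_id, comment_body in rows:
            key = (detail_id, comment_body)
            if key != previous_key:
                # First row of a new group: remember the key and keep the row.
                previous_key = key
                continue
            # Any further row of the same group is a surplus copy.
            conn.execute("DELETE FROM bbscomment WHERE comment_id = ?", (comment_id,))
            deleted += 1
            if deleted % batch_size == 0:
                conn.commit()  # commit once per batch
        conn.commit()  # final commit
    except sqlite3.Error:
        conn.rollback()
        raise
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE bbscomment (
           comment_id   INTEGER PRIMARY KEY,
           detail_id    INTEGER NOT NULL,
           comment_body TEXT    NOT NULL)"""
)
conn.executemany(
    "INSERT INTO bbscomment VALUES (?, ?, ?)",
    [
        (3, 10, "good"), (1, 10, "good"), (2, 10, "good"),
        (5, 20, "slow"), (4, 20, "slow"),
        (6, 30, "fine"),
    ],
)
conn.commit()

deleted_count = delete_duplicates_in_batches(conn, batch_size=2)
remaining_ids = [
    row[0]
    for row in conn.execute("SELECT comment_id FROM bbscomment ORDER BY comment_id")
]
print(deleted_count, remaining_ids)  # 3 [1, 4, 6]
```

A batch size of 2 is used here only so that the intermediate commit is exercised on six rows; the PL/SQL block uses 5000.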
Oracle: delete duplicate data, keeping only one row