Deleting duplicate data in Oracle and keeping only one record

Source: Internet
Author: User

SQL statements for finding and deleting duplicate records (table_name stands for the actual table)

1. Find redundant records in the table, where duplicates are determined by a single field (id):

SELECT * FROM table_name WHERE id IN (SELECT id FROM table_name GROUP BY id HAVING COUNT(id) > 1)

2. Delete redundant records in the table, where duplicates are determined by a single field (id), keeping only the record with the smallest rowid:

DELETE FROM table_name WHERE (id) IN (SELECT id FROM table_name GROUP BY id HAVING COUNT(id) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table_name GROUP BY id HAVING COUNT(*) > 1);

3. Find redundant records in the table (multiple fields):

SELECT * FROM table_name a WHERE (a.id, a.seq) IN (SELECT id, seq FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1)

4. Delete redundant duplicate records in the table (multiple fields), keeping only the record with the smallest rowid:

DELETE FROM table_name a WHERE (a.id, a.seq) IN (SELECT id, seq FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1)

5. Find redundant duplicate records in the table (multiple fields), excluding the record with the smallest rowid:

SELECT * FROM table_name a WHERE (a.id, a.seq) IN (SELECT id, seq FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1) AND rowid NOT IN (SELECT MIN(rowid) FROM table_name GROUP BY id, seq HAVING COUNT(*) > 1)

One: Duplicate data is judged by a single field

1. First, find the redundant rows in the table by querying on the key field (name).

SELECT * FROM oa_address_book WHERE name IN (SELECT name FROM oa_address_book GROUP BY name HAVING COUNT(name) > 1)

2. Delete the duplicate data in the table. Duplicates are determined by a single field (name), and only the record with the smallest rowid is kept.

DELETE FROM oa_address_book WHERE (name) IN
(SELECT name FROM oa_address_book GROUP BY name HAVING COUNT(name) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM oa_address_book GROUP BY name HAVING COUNT(name) > 1)
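
Before running the delete above, it can help to preview how many rows it would remove. The following is a minimal sketch and not part of the original article. The alias rows_to_delete is made up for illustration, and the name IS NOT NULL filter assumes that name may contain NULLs, which the IN-based delete never matches.

```sql
-- Each group of n identical names loses n - 1 rows (one row is kept).
-- NVL turns the NULL that SUM returns for "no duplicates at all" into 0.
SELECT NVL(SUM(cnt - 1), 0) AS rows_to_delete
FROM (
    SELECT COUNT(*) AS cnt
    FROM oa_address_book
    WHERE name IS NOT NULL          -- assumption: mirror the IN predicate, which skips NULL names
    GROUP BY name
    HAVING COUNT(*) > 1
);
```

If the number reported by the delete differs from this preview, the data probably changed in between, and it may be worth rolling back and checking again.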

Two: Duplicate data is judged by multiple fields

1. First, find the duplicate rows in the table by querying on the key fields (name, unit_id).

SELECT * FROM oa_address_book book1 WHERE (book1.name, book1.unit_id) IN
(SELECT book2.name, book2.unit_id FROM oa_address_book book2 GROUP BY book2.name, book2.unit_id HAVING COUNT(*) > 1)

2. Delete the duplicate data in the table. Duplicates are determined by multiple fields (name, unit_id), and only the record with the smallest rowid is kept.

DELETE FROM oa_address_book a WHERE (a.name, a.unit_id) IN
(SELECT name, unit_id FROM oa_address_book GROUP BY name, unit_id HAVING COUNT(*) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM oa_address_book GROUP BY name, unit_id HAVING COUNT(*) > 1)

3. Find the duplicate data in the table. Duplicates are determined by multiple fields (name, unit_id), and the record with the smallest rowid is excluded from the result.

SELECT name, unit_id FROM oa_address_book a WHERE (a.name, a.unit_id) IN
(SELECT name, unit_id FROM oa_address_book GROUP BY name, unit_id HAVING COUNT(*) > 1)
AND rowid NOT IN (SELECT MIN(rowid) FROM oa_address_book GROUP BY name, unit_id HAVING COUNT(*) > 1)

1. Problem description

The bbscomment table is a child table of bbsdetail and records merchant review information. Because the data was moved back and forth several times, it now contains a lot of duplicate data. The table structure is as follows:

COMMENT_ID    NOT NULL  NUMBER         -- primary key
DETAIL_ID     NOT NULL  NUMBER         -- foreign key, references the bbsdetail table
COMMENT_BODY  NOT NULL  VARCHAR2(500)  -- review content

-- other fields omitted

The primary key has no duplicates. What repeats is the combination detail_id + comment_body + ..., that is, some merchant review records are stored more than once.

2. Solution steps

2.1 Find the redundant duplicate records in the table

-- Query all duplicated data
SELECT detail_id, comment_body, COUNT(*)
FROM bbscomment
GROUP BY detail_id, comment_body
HAVING COUNT(*) > 1
ORDER BY detail_id, comment_body;
-- 1955 rows

2.2 Show all non-redundant data

-- This statement shows all non-redundant data
SELECT MIN(comment_id) AS comment_id, detail_id, comment_body
FROM bbscomment
GROUP BY detail_id, comment_body;
-- 21,453 rows. This value is not equal to (total number of records in the table - 1955),
-- because some of those 1955 records are repeated more than once.
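
The arithmetic in the comment above can be checked directly: the number of surplus rows is the total row count minus the number of distinct (detail_id, comment_body) groups. The query below is a rough sketch under that assumption. The aliases total_rows, distinct_groups, and surplus_rows are made up for illustration, and only the two columns used in the article's own queries are compared.

```sql
-- surplus_rows = rows that would have to be deleted to leave one row per group
SELECT t.total_rows,
       g.distinct_groups,
       t.total_rows - g.distinct_groups AS surplus_rows
FROM (SELECT COUNT(*) AS total_rows FROM bbscomment) t
CROSS JOIN (
    SELECT COUNT(*) AS distinct_groups
    FROM (
        SELECT detail_id, comment_body
        FROM bbscomment
        GROUP BY detail_id, comment_body
    )
) g;
```

Here distinct_groups should match the row count of the MIN(comment_id) query, and surplus_rows is at least as large as the number of duplicated groups, since a group repeated three times contributes two surplus rows.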

2.3 If the number of records is small (around a thousand), turn the above statement into a subquery and delete directly

-- If the table is not very large (1,000 rows or fewer), the above statement can be used as a subquery and the rows deleted directly
DELETE FROM bbscomment
WHERE comment_id NOT IN (
    SELECT MIN(comment_id)
    FROM bbscomment
    GROUP BY detail_id, comment_body
);
-- 782 seconds. My table has a little over 20,000 records, a little over 2,000 of them duplicates (far too slow!)

2.4 Another way to delete

-- This statement achieves the same result, but I could not test it because I had already deleted the data
-- Delete condition one: the record belongs to a group of duplicates; condition two: the record with the smallest rowid is kept
DELETE FROM bbscomment a
WHERE (a.detail_id, a.comment_body) IN (
        SELECT detail_id, comment_body
        FROM bbscomment
        GROUP BY detail_id, comment_body
        HAVING COUNT(*) > 1)
  AND rowid NOT IN (
        SELECT MIN(rowid)
        FROM bbscomment
        GROUP BY detail_id, comment_body
        HAVING COUNT(*) > 1);
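
A commonly used alternative to the two deletes above is the ROW_NUMBER() analytic function, which numbers the rows inside each duplicate group so that everything after the first can be removed in one pass. This is a sketch of a different technique, not the author's method, and it has not been timed against this data. The aliases rid and rn are made up for illustration, and keeping the smallest comment_id (rather than the smallest rowid) is an assumption.

```sql
-- Number the rows in each (detail_id, comment_body) group, oldest comment_id first,
-- then delete every row whose number is greater than 1.
DELETE FROM bbscomment
WHERE rowid IN (
    SELECT rid
    FROM (
        SELECT rowid AS rid,
               ROW_NUMBER() OVER (
                   PARTITION BY detail_id, comment_body
                   ORDER BY comment_id
               ) AS rn
        FROM bbscomment
    )
    WHERE rn > 1
);
```

Unlike the IN-based statements, PARTITION BY places NULL values in the same group, which makes no difference here because both columns are declared NOT NULL.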

2.5 For large data volumes, PL/SQL is quicker and more convenient

DECLARE
    -- Record type that defines the storage structure
    TYPE bbscomment_type IS RECORD (
        comment_id   bbscomment.comment_id%TYPE,
        detail_id    bbscomment.detail_id%TYPE,
        comment_body bbscomment.comment_body%TYPE
    );
    bbscomment_record bbscomment_type;

    -- Variables used for comparison
    v_comment_id   bbscomment.comment_id%TYPE;
    v_detail_id    bbscomment.detail_id%TYPE;
    v_comment_body bbscomment.comment_body%TYPE;

    -- Other variables
    v_batch_size INTEGER := 5000;
    v_counter    INTEGER := 0;

    CURSOR cur_dupl IS
        -- Fetch all duplicated records
        SELECT comment_id, detail_id, comment_body
        FROM bbscomment
        WHERE (detail_id, comment_body) IN (
            -- these records have duplicates
            SELECT detail_id, comment_body
            FROM bbscomment
            GROUP BY detail_id, comment_body
            HAVING COUNT(*) > 1)
        ORDER BY detail_id, comment_body;
BEGIN
    FOR bbscomment_record IN cur_dupl LOOP
        IF v_detail_id IS NULL
           OR (bbscomment_record.detail_id != v_detail_id
               OR NVL(bbscomment_record.comment_body, ' ') != NVL(v_comment_body, ' ')) THEN
            -- First row of a new group: remember its values and keep the row
            v_detail_id    := bbscomment_record.detail_id;
            v_comment_body := bbscomment_record.comment_body;
        ELSE
            -- Every other row in the group is deleted
            DELETE FROM bbscomment WHERE comment_id = bbscomment_record.comment_id;
            v_counter := v_counter + 1;
            IF MOD(v_counter, v_batch_size) = 0 THEN
                -- Commit once per batch
                COMMIT;
            END IF;
        END IF;
    END LOOP;

    IF v_counter > 0 THEN
        -- Final commit
        COMMIT;
    END IF;

    dbms_output.put_line(TO_CHAR(v_counter) || ' records deleted!');
EXCEPTION
    WHEN OTHERS THEN
        dbms_output.put_line('sqlerrm --> ' || SQLERRM);
        ROLLBACK;
END;
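
After any of the cleanup methods above, a quick check can confirm that no duplicate groups are left. This is a small sketch that reuses the grouping from step 2.1; the alias remaining_duplicate_groups is made up for illustration.

```sql
-- Counts the (detail_id, comment_body) groups that still have more than one row.
SELECT COUNT(*) AS remaining_duplicate_groups
FROM (
    SELECT detail_id, comment_body
    FROM bbscomment
    GROUP BY detail_id, comment_body
    HAVING COUNT(*) > 1
);
```

The query should return 0 if the cleanup succeeded. Once it does, a unique constraint on the two columns might be worth considering to stop duplicates from coming back, if the application allows it.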
