There are many ways to delete Repeated Records in DB2. Let's introduce several SQL statements that are more practical and have good performance.
-- 1. query duplicate data
| The Code is as follows: |
Copy code |
Select * from tableA where catentry_id in (select distinct catentry_id from tableA group by catentry_id, descitem_id having count (*)> 1 )@ |
-- 2. Create a temporary table and store all duplicate data in the table
| The Code is as follows: |
Copy code |
Create table detailtemp1 like tableA @ Insert into detailtemp1 Select * from tableA Where catentry_id in (select distinct catentry_id from tableA group by catentry_id, descitem_id having count (*)> 1 )@
Select * from detailtemp1 @ |
-- 3. Delete all records with duplicate conditions from the original table
| The Code is as follows: |
Copy code |
Delete from tableA where catentry_id in (select distinct catentry_id from tableA group by catentry_id, descitem_id having count (*)> 1 )@ |
-- 4. Use group by to insert non-Repeated Records in the temporary table into the original table.
| The Code is as follows: |
Copy code |
Insert into tableA Select catentry_id, descitem_id, max (content) From detailtemp1 Group by catentry_id, descitem_id @ Select * from tableA where catentry_id in (select distinct catentry_id from tableA group by catentry_id, descitem_id having count (*)> 1 )@
|
-- 5. Delete temporary table
| The Code is as follows: |
Copy code |
Drop table detailtemp1 @ -- End |
In addition, I would like to share several methods
2. DB2 deletes duplicate records. duplicate records are determined based on a single field (peopleId), leaving only the records with the smallest rowid
| The Code is as follows: |
Copy code |
| Delete from people where peopleId in (select peopleId from people group by peopleId having count (peopleId)> 1) and rowid not in (select min (rowid) from people group by peopleId having count (peopleId)> 1) |
3. Search for redundant duplicate records in the table (multiple fields)
| The Code is as follows: |
Copy code |
Select * from vitae a where (a. peopleId, a. seq) in (select peopleId, seq from vitae group by peopleId, seq having count (*)> 1) |
4. DB2 deletes duplicate records (multiple fields), leaving only the records with the smallest rowid
| The Code is as follows: |
Copy code |
Delete from vitae a where (. peopleId,. seq) in (select peopleId, seq from vitae group by peopleId, seq having count (*)> 1) and rowid not in (select min (rowid) from vitae group by peopleId, seq having count (*)> 1) |
5. Search for redundant duplicate records (multiple fields) in the table, excluding records with the smallest rowid
| The Code is as follows: |
Copy code |
Select * from vitae a where (. peopleId,. seq) in (select peopleId, seq from vitae group by peopleId, seq having count (*)> 1) and rowid not in (select min (rowid) from vitae group by peopleId, seq having count (*)> 1) |
6. Process de-duplication in bas_information.
| The Code is as follows: |
Copy code |
T. PISField001, t. PISField011, t. areaCode Export to e:/bas_information.txt of del select s. HOUSEHOLDS, s. PISFIELD000, s. PISFIELD001, s. PISFIELD002, S. PISFIELD003, s. PISFIELD004, s. PISFIELD005, s. PISFIELD006, s. PISFIELD011, S. PISFIELD012, s. PISFIELD013, s. PISFIELD014, s. PISFIELD015, s. PISFIELD016, S. DEADDATE, s. QIANRUDATE, s. ZHIYE, s. JIATING, s. AREACODE, s. REG_DATE, S. ISLOGOUT from (select t. *, rownumber () over (partition by t. PISField001, t. PISField011, t. areaCode) as rn from bas_information t) s where s. rn = 1; |