Using the new features of SQL Server 2005 Now_number and CTE can be implemented well.
Examples are as follows:
Set up test data:
Copy Code code as follows:
CREATE TABLE Dup1
(
Col1 int NULL,
Col2 varchar () null
)
INSERT INTO DUP1 values
(1, ' AAA '),
(2, ' AAA '),
(2, ' AAA '),
(2, ' AAA '),
(3, ' BBB '),
(3, ' BBB '),
(4, ' CCC '),
(4, ' ddd '),
(5, ' Eee ')
SELECT * FROM Dup1
You can see that duplicate data is:
Copy Code code as follows:
SELECT Col1, Col2, COUNT (*) as Dupcountfrom dup1group by Col1, col2having count (*) > 1
Next, we'll explain how to delete duplicate data:
1.now_number:sql Server 2005 adds a very useful ranking function (now_number,rank,dense_rank,ntile), using Now_number () over (PARTITION GY) The most direct, and most convenient, can not modify the table or produce redundant columns.
A column number is assigned first, and the sort is partitioned by a col1,col2 combination.
Copy Code code as follows:
SELECT Col1, Col2,row_number () over (PARTITION by Col1, Col2 Order by Col1) as Rnfrom Dup1
The serial numbers are as follows:
It is obvious that the repeating column is grouped into a sorted order, which only needs to delete the sorted ordinal number >1.
Copy Code code as follows:
--Use a CTE
With DUPSD
As (
SELECT Col1, Col2,
Row_number () over (PARTITION by Col1, Col2 Order by Col1) as RN
From Dup1
)
DELETE DUPSD
WHERE rn > 1;
--or
DELETE A from (
SELECT Col1, Col2,
Row_number () over (PARTITION by Col1, Col2 Order by Col1) as RN
From Dup1) A WHERE a.rn>1
2. Create a single list of identity keys.
Copy Code code as follows:
ALTER TABLE dbo. Dup1
ADD
PK INT IDENTITY
Not NULL
CONSTRAINT pk_dup1 PRIMARY KEY;
SELECT *
From Dup1;
Delete finds records that are the same as col1,col2 and larger than dup1.pk, which is the record that retains the lowest PK in duplicate values.
Copy Code code as follows:
DELETE Dup1
WHERE EXISTS (SELECT *
From Dup1 as D1
WHERE D1. Col1 = Dup1.col1
and D1. Col2 = Dup1.col2
and D1. PK > dup1.pk);
3.select distant into, this method moves the duplicate result set to the new table with the help of a new table.
Copy Code code as follows:
SELECT distinct Col1, Col2 into Nodupsfrom dup1;select * from nodups
The first and third methods are recommended, the first of which is more commonly seen in T-SQL programming, and the third is often used in ETL.