Business requirements
I recently built a small tool for my company that imports data from one database (the data source) into another (the target database). The target database must not contain duplicate data, but the data source itself contains duplicates, so the source data has to be cleaned up first.
This post summarizes how to query and remove the duplicate data. Only database-level solutions are covered; handling duplicates in application code is not considered.
Environment: SQL Server 2005
Database-based Solutions
Test table: dbo.Member
First, GROUP BY queries with a HAVING condition
(1) Querying the records that have duplicate values in a column
Statement:
SELECT T.* FROM dbo.Member T WHERE T.Name IN (SELECT Name FROM dbo.Member GROUP BY Name HAVING COUNT(Name) > 1) ORDER BY T.Name
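To see what the duplicate query returns, here is a minimal runnable sketch. It makes several assumptions: Python's standard-library sqlite3 module stands in for SQL Server, the table is called Member (SQLite has no dbo schema), and the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample rows (ID, Name); the names are invented for illustration.
ROWS = [(1, "Alice"), (2, "Bob"), (3, "Alice"), (4, "Carol"), (5, "Bob"), (6, "Alice")]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("CREATE TABLE Member (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Member (ID, Name) VALUES (?, ?)", ROWS)

# Every row whose Name occurs more than once in the table.
duplicates = conn.execute(
    """
    SELECT T.ID, T.Name
    FROM Member AS T
    WHERE T.Name IN (SELECT Name FROM Member GROUP BY Name HAVING COUNT(Name) > 1)
    ORDER BY T.Name, T.ID
    """
).fetchall()

print(duplicates)
# [(1, 'Alice'), (3, 'Alice'), (6, 'Alice'), (2, 'Bob'), (5, 'Bob')]
```

Carol appears only once, so her row is not part of the result.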
(2) Querying one record per distinct value of a column (duplicates filtered out)
Statement:
SELECT * FROM dbo.Member WHERE ID IN (SELECT MIN(ID) FROM dbo.Member GROUP BY Name)
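The MIN(ID) per Name idea can be tried out with the sketch below. It is only an illustration under assumptions: sqlite3 replaces SQL Server, the table is named Member without a schema, and the rows are hypothetical.

```python
import sqlite3

# Hypothetical sample rows (ID, Name).
ROWS = [(1, "Alice"), (2, "Bob"), (3, "Alice"), (4, "Carol"), (5, "Bob"), (6, "Alice")]

conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of SQL Server
conn.execute("CREATE TABLE Member (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Member (ID, Name) VALUES (?, ?)", ROWS)

# For each Name, keep only the row with the smallest ID.
kept = conn.execute(
    """
    SELECT ID, Name
    FROM Member
    WHERE ID IN (SELECT MIN(ID) FROM Member GROUP BY Name)
    ORDER BY ID
    """
).fetchall()

print(kept)
# [(1, 'Alice'), (2, 'Bob'), (4, 'Carol')]
```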
(3) Deleting the duplicate data for a column
Statement:
DELETE FROM dbo.Member WHERE ID NOT IN (SELECT MIN(ID) FROM dbo.Member GROUP BY Name)
Explanation: for each Name, the statement above keeps only the record with the smallest ID and deletes all the others.
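Because a DELETE is destructive, it can help to preview how many rows it will remove and to run it inside a transaction. The sketch below shows one possible way to do that; it is not part of the original tool. It assumes sqlite3 as a stand-in for SQL Server, a schema-less Member table, and invented sample rows.

```python
import sqlite3

# Hypothetical sample rows (ID, Name).
ROWS = [(1, "Alice"), (2, "Bob"), (3, "Alice"), (4, "Carol"), (5, "Bob"), (6, "Alice")]

conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of SQL Server
conn.execute("CREATE TABLE Member (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Member (ID, Name) VALUES (?, ?)", ROWS)
conn.commit()  # make the sample data permanent before experimenting

# Subquery selecting the one row per Name that should survive.
KEEP = "SELECT MIN(ID) FROM Member GROUP BY Name"

# Preview: how many rows would the DELETE remove?
expected = conn.execute(
    f"SELECT COUNT(*) FROM Member WHERE ID NOT IN ({KEEP})"
).fetchone()[0]

# sqlite3 opens a transaction implicitly before the DELETE, so it can be undone.
deleted = conn.execute(f"DELETE FROM Member WHERE ID NOT IN ({KEEP})").rowcount

if deleted == expected:
    conn.commit()    # the delete did what the preview predicted
else:
    conn.rollback()  # something changed in between; undo the delete

remaining = conn.execute("SELECT ID, Name FROM Member ORDER BY ID").fetchall()
print(deleted, remaining)
# 3 [(1, 'Alice'), (2, 'Bob'), (4, 'Carol')]
```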
Second, the use of DISTINCT
Note:
DISTINCT does not support multi-column statistics (for example, COUNT(DISTINCT ...) accepts only a single column).
DISTINCT is also available in Oracle and DB2.
The DISTINCT keyword returns only the unique (distinct) values.
(1) Querying a column for non-repeating data
Statement:
SELECT DISTINCT Name FROM dbo.Member
(2) Querying multiple columns for non-repeating data (a record counts as a duplicate only when every selected column matches; if any one of the columns differs, the record is kept)
Statement:
SELECT DISTINCT ID, Name FROM dbo.Member
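The difference between single-column and multi-column DISTINCT is easier to see with data in which some column pairs repeat. The sketch below is an illustration only: the City column and all rows are hypothetical, and sqlite3 is used in place of SQL Server.

```python
import sqlite3

# Hypothetical rows (ID, Name, City); the City column is not in the article's table.
ROWS = [(1, "Alice", "Paris"), (2, "Alice", "Paris"), (3, "Alice", "Rome"), (4, "Bob", "Rome")]

conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of SQL Server
conn.execute("CREATE TABLE Member (ID INTEGER PRIMARY KEY, Name TEXT, City TEXT)")
conn.executemany("INSERT INTO Member (ID, Name, City) VALUES (?, ?, ?)", ROWS)

# One column: each Name appears once.
names = conn.execute("SELECT DISTINCT Name FROM Member ORDER BY Name").fetchall()

# Two columns: rows collapse only when Name AND City are both equal.
pairs = conn.execute(
    "SELECT DISTINCT Name, City FROM Member ORDER BY Name, City"
).fetchall()

print(names)  # [('Alice',), ('Bob',)]
print(pairs)  # [('Alice', 'Paris'), ('Alice', 'Rome'), ('Bob', 'Rome')]
```

Rows 1 and 2 collapse into one pair because both columns match, while row 3 survives because its City differs.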
(3) Using DISTINCT in statistics
Statement:
SELECT COUNT(DISTINCT Name) FROM dbo.Member
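Since COUNT(DISTINCT ...) takes a single column, one common workaround for counting distinct combinations of several columns is to count the rows of a DISTINCT subquery. The sketch below shows this under assumptions: sqlite3 replaces SQL Server, and the City column and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical rows (ID, Name, City); the City column is not in the article's table.
ROWS = [(1, "Alice", "Paris"), (2, "Alice", "Paris"), (3, "Alice", "Rome"), (4, "Bob", "Rome")]

conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of SQL Server
conn.execute("CREATE TABLE Member (ID INTEGER PRIMARY KEY, Name TEXT, City TEXT)")
conn.executemany("INSERT INTO Member (ID, Name, City) VALUES (?, ?, ?)", ROWS)

# Single column: number of different names.
name_count = conn.execute("SELECT COUNT(DISTINCT Name) FROM Member").fetchone()[0]

# Several columns: count the rows of a DISTINCT derived table instead.
pair_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT Name, City FROM Member) AS d"
).fetchone()[0]

print(name_count, pair_count)
# 2 3
```

The derived table is given an alias (d) because SQL Server requires one; SQLite accepts it as well.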