Delete duplicate data in SQL Server

Source: Internet
Author: User

Recently, in a project, every table in the database turned out to contain a large amount of duplicate data, and the rows were complete duplicates: every field was identical, including the primary key. The task was to delete the duplicates and keep exactly one copy of each record. After asking around and searching online for half a day, I summarized the following methods for deleting duplicate data:
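To make the situation concrete, here is a minimal sketch of what such a table might look like and how the duplicated rows can be found. The names tableName and id follow the scripts in this article; the name column and the sample values are assumptions made only for illustration. No PRIMARY KEY constraint is declared, because an enforced constraint would reject the duplicate rows.

```sql
-- Hypothetical sample table (the "name" column and the values are assumptions)
CREATE TABLE tableName (
    id   INT         NOT NULL,
    name VARCHAR(50) NOT NULL
);

-- Fully duplicated rows: id 1 appears three times, id 2 twice, id 3 once
INSERT INTO tableName (id, name)
VALUES (1, 'alpha'), (1, 'alpha'), (1, 'alpha'),
       (2, 'beta'),  (2, 'beta'),
       (3, 'gamma');

-- List every id that occurs more than once, with its number of copies
SELECT id, COUNT(*) AS copies
FROM tableName
GROUP BY id
HAVING COUNT(*) > 1
ORDER BY id;
-- id  copies
-- 1   3
-- 2   2
```

The same GROUP BY ... HAVING COUNT(*) > 1 query is what the cursor in the first method iterates over.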

1. Use a cursor. This is a general-purpose method suited to small data volumes and low efficiency requirements, as shown in the following script:

Code

-- Define two variables
DECLARE @max integer, @id integer

-- Define a local cursor over every id that occurs more than once
DECLARE cusCursor CURSOR LOCAL FOR
    SELECT id, COUNT(*) FROM tableName GROUP BY id HAVING COUNT(*) > 1

-- Open the cursor
OPEN cusCursor

-- Read the first row of the cursor into the variables
FETCH NEXT FROM cusCursor INTO @id, @max

-- @@FETCH_STATUS is 0 as long as the last fetch succeeded
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Keep one copy: the number of rows to delete is the count minus 1
    SELECT @max = @max - 1

    -- Limit the following statement to @max rows
    SET ROWCOUNT @max

    -- Delete the duplicate data. Only one record is retained.
    DELETE FROM tableName WHERE id = @id

    -- Move the cursor to the next row and continue reading
    FETCH NEXT FROM cusCursor INTO @id, @max
END

-- Close the cursor
CLOSE cusCursor

-- Release the cursor
DEALLOCATE cusCursor

-- Remove the row limit again
SET ROWCOUNT 0

  

2. Use the temporary table method to delete duplicate records. This method is highly efficient but has some limitations, as shown in the following script:

    

Code

-- DISTINCT leaves one copy of each row; write the result to the temporary table #tempTable
SELECT DISTINCT * INTO #tempTable FROM tableName

-- Clear the data in the source table
TRUNCATE TABLE tableName

-- Write the data from the temporary table back to the source table
-- (INSERT ... SELECT, because SELECT ... INTO would try to create tableName again)
INSERT INTO tableName SELECT * FROM #tempTable

-- Delete the temporary table
DROP TABLE #tempTable

 

The limitation of this method is that it fails when the table contains an image or text column: SQL Server reports an error saying that the DISTINCT operation cannot be performed on columns of these types.

3. When the limitation of the second method applies, the data volume is large, and efficiency still matters, the third method can be used:

    

Code

-- Copy only the structure of the source table to the temporary table #tempTable
SELECT * INTO #tempTable FROM tableName WHERE 1 <> 1

-- Create a unique index on the temporary table that ignores duplicate keys
-- (duplicate_column_name is a placeholder for the column that identifies a duplicate)
CREATE UNIQUE INDEX temp ON #tempTable (duplicate_column_name) WITH (IGNORE_DUP_KEY = ON)

-- Copy the data from the source table to the temporary table. Rows with a duplicate key
-- are skipped with a warning, so only one copy of each record is retained.
INSERT INTO #tempTable SELECT * FROM tableName

-- Clear the source table
TRUNCATE TABLE tableName

-- Write the data from the temporary table back to the source table
INSERT INTO tableName SELECT * FROM #tempTable

-- Delete the temporary table
DROP TABLE #tempTable

 

These are the three methods for deleting duplicate database records that I summarized recently. The third method is efficient and versatile, and it performed well at a data volume of 10 million records.
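After running any of the three scripts, it may be worth confirming that no duplicates remain. The following is only a sketch of one possible check, assuming (as in the cursor script) that id is the column that identifies a record and that it is never NULL; the column aliases are made up for illustration.

```sql
-- Compare the total number of rows with the number of distinct ids.
-- COUNT(DISTINCT id) ignores NULLs, hence the assumption that id is never NULL.
SELECT COUNT(*)           AS total_rows,
       COUNT(DISTINCT id) AS distinct_ids
FROM tableName;
```

If the cleanup worked, the two counts should be equal; a larger total_rows means some id still occurs more than once.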

 

 

 

 

 
