How to delete duplicate (redundant) records with SQL statements so that only one copy of each record is retained.

Source: Internet
Author: User

Let's first review some relevant data structure knowledge.

When studying linear lists, we came across the following example.

Given a sequential list La that stores integers, construct a sequential list Lb that contains only the distinct values found in La.
Algorithm idea:
First, copy the first element of La into Lb. Then, starting from the second element of La, compare each element with every element already in Lb; if it differs from all of them, append it to the end of Lb.
The code is as follows:
public SeqList<int> Purge(SeqList<int> La)
{
    SeqList<int> Lb = new SeqList<int>(La.Maxsize);
    // Copy the first data element of La into Lb
    Lb.Append(La[0]);
    // Process the remaining data elements of La in sequence
    for (int i = 1; i <= La.GetLength() - 1; ++i)
    {
        int j = 0;
        // Check whether Lb already contains the same data element
        for (j = 0; j <= Lb.GetLength() - 1; ++j)
        {
            // Found the same data element
            if (La[i].CompareTo(Lb[j]) == 0)
            {
                break;
            }
        }
        // If it was not found, append the data element of La to the end of Lb
        if (j > Lb.GetLength() - 1)
        {
            Lb.Append(La[i]);
        }
    }
    return Lb;
}
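For readers who want to try the idea quickly, here is a minimal Python sketch of the same purge step. It is only an illustration: the function name `purge` and the use of a plain Python list in place of `SeqList<int>` are assumptions, not part of the original code.

```python
def purge(la):
    """Return a new list holding the first occurrence of each value in la."""
    lb = []
    for item in la:
        # Linear scan of lb, mirroring the inner comparison loop above.
        if item not in lb:
            lb.append(item)
    return lb


result = purge([3, 1, 3, 2, 1])
print(result)
```

Like the loop above, this compares each element against everything already kept, so the work grows quadratically with the list length. Unlike the version above, the sketch also accepts an empty list, because it does not read the first element unconditionally.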

Once you understand this idea, handling duplicates in the database becomes easy.

We can solve the problem with a temporary table.
The code is as follows:
select distinct * into #Tmp from tableName
drop table tableName
select * into tableName from #Tmp
drop table #Tmp
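The temporary-table idea can be tried out without a SQL Server instance. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so it is an approximation rather than the statements above: SQLite has no `SELECT ... INTO`, so the sketch empties and refills the table instead of dropping and recreating it. The helper name `dedupe_via_temp_table`, the temp table name `tmp_dedupe`, and the sample `customer` data are all hypothetical.

```python
import sqlite3


def dedupe_via_temp_table(conn, table):
    """Keep one copy of each fully identical row in `table`."""
    # NOTE: `table` is interpolated into the SQL text, so pass only trusted names.
    cur = conn.cursor()
    cur.execute(f"CREATE TEMP TABLE tmp_dedupe AS SELECT DISTINCT * FROM {table}")
    cur.execute(f"DELETE FROM {table}")
    cur.execute(f"INSERT INTO {table} SELECT * FROM tmp_dedupe")
    cur.execute("DROP TABLE tmp_dedupe")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER, company_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Acme"), (1, "Acme"), (2, "Globex"), (2, "Globex"), (3, "Initech")],
)

dedupe_via_temp_table(conn, "customer")
rows = conn.execute(
    "SELECT customer_id, company_name FROM customer ORDER BY customer_id"
).fetchall()
print(rows)
```

Emptying and refilling the table keeps its existing definition, whereas dropping and recreating a table generally means any indexes or constraints on it have to be created again.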

Duplicates like these appear because the table design was not thorough. Adding a column with a unique index prevents them.
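To see how a unique index stops duplicates at insert time, here is a small sketch using Python's `sqlite3` module as a stand-in database. The table, the index name `ux_customer_id`, and the sample row are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER, company_name TEXT)")
# The unique index makes the database itself reject a repeated customer_id.
conn.execute("CREATE UNIQUE INDEX ux_customer_id ON customer (customer_id)")
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")

try:
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    rejected = False
except sqlite3.IntegrityError:
    # The second insert violates the unique index and is refused.
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(rejected, count)
```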

But you might say: I don't want to add any columns, and the table has no explicit ID column. How can I obtain an ID column? (It could be a sequence number column, a GUID, and so on.)

Let's set that last question aside for the moment.

First, let's look at how three databases handle this: SQL Server 2000, SQL Server 2005, and Oracle 10g.

1. Constructing a sequence number column in SQL Server 2000

Method 1:
SELECT seq_no =
(SELECT COUNT(customer_id) FROM customer AS a WHERE a.customer_id <= b.customer_id),
customer_id, company_name FROM customer AS b ORDER BY 1;

Method 2:
SELECT seq_no = COUNT(*),
a.customer_id, a.company_name FROM customer AS a, customer AS b
WHERE a.customer_id >= b.customer_id GROUP BY a.customer_id, a.company_name ORDER BY seq_no;
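The counting trick behind these queries can be checked with a small runnable sketch. It uses Python's `sqlite3` module as a stand-in for SQL Server 2000, writes the alias in standard `AS seq_no` form, and assumes a hypothetical `customer` table with columns `customer_id` and `company_name`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER, company_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(30, "Initech"), (10, "Acme"), (20, "Globex")],
)

# For each row, count how many rows have a customer_id less than or equal to
# its own; with unique ids that count is the row's position in sorted order.
numbered = conn.execute(
    """
    SELECT (SELECT COUNT(*) FROM customer AS a
            WHERE a.customer_id <= b.customer_id) AS seq_no,
           b.customer_id, b.company_name
    FROM customer AS b
    ORDER BY seq_no
    """
).fetchall()
print(numbered)
```

The numbers are distinct only while `customer_id` is itself unique. Rows that share the same `customer_id` receive the same count, so this construction alone cannot tell exact duplicates apart.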
2. Constructing a sequence number column in SQL Server 2005

Method 1:
SELECT RANK() OVER (ORDER BY customer_id DESC) AS seq_no, customer_id, company_name FROM customer;

Method 2:
WITH numbered AS
(SELECT ROW_NUMBER() OVER (ORDER BY customer_id DESC) AS seq_no, customer_id, company_name FROM customer)
SELECT * FROM numbered
WHERE seq_no BETWEEN 1 AND 3;
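The two methods differ when the ordering column contains repeated values, which matters for duplicate rows. The sketch below shows the difference using Python's `sqlite3` module as a stand-in for SQL Server 2005. It assumes the bundled SQLite is version 3.25 or later (the first release with window functions) and uses a hypothetical one-column `customer` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?)", [(10,), (20,), (20,), (30,)]
)

# RANK() gives tied rows the same number and then skips ahead;
# ROW_NUMBER() always gives consecutive, distinct numbers.
rows = conn.execute(
    """
    SELECT customer_id,
           RANK()       OVER (ORDER BY customer_id DESC) AS rank_no,
           ROW_NUMBER() OVER (ORDER BY customer_id DESC) AS row_no
    FROM customer
    ORDER BY row_no
    """
).fetchall()
print(rows)
```

In this sample the two rows with `customer_id` 20 share one rank but get different row numbers, so only `ROW_NUMBER()` yields a value that distinguishes otherwise identical rows.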
3. In Oracle, rowid can also serve as a default identifier column
In Oracle, every record has a rowid that is unique across the whole database. The rowid identifies the data file, block, and row in which Oracle stores the record.
Among duplicate records the contents of all columns may be identical, but the rowids differ. So it is enough to keep the record with the largest rowid in each group of duplicates and delete all the others.
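The keep-the-largest-rowid rule can be tried without an Oracle instance. SQLite also exposes a `rowid` for ordinary tables, so the sketch below uses Python's `sqlite3` module as a stand-in. SQLite's rowid is a plain integer rather than an Oracle-style physical address, and the `test` table contents are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "b"), (2, "b"), (2, "b"), (3, "c")],
)

# Delete every row whose rowid is not the largest one among rows with the same id.
conn.execute(
    """
    DELETE FROM test
    WHERE rowid <> (SELECT MAX(b.rowid) FROM test AS b WHERE b.id = test.id)
    """
)
conn.commit()

remaining = conn.execute("SELECT rowid, id, name FROM test ORDER BY id").fetchall()
print(remaining)
```

Here duplicates are grouped by `id` alone. If rows count as duplicates only when several columns match, each of those columns would need to appear in the subquery's comparison.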
The code is as follows:
select * from test;
select id from test group by id having count(*) > 1;
select id from test group by id;
select distinct * from test;
delete from test a where a.rowid != (select max(b.rowid) from test b where a.id = b.id);

Back to the original problem. Besides the data-structure approach, the way a database processes transactions lets it hold data in a cache, which plays much the same role as the temporary table. Therefore, we can also use a cursor to delete duplicate records.
declare @max int,
        @id int
declare cur_rows cursor local for select id, count(*) from test group by id having count(*) > 1
open cur_rows
fetch cur_rows into @id, @max
while @@fetch_status = 0
begin
    select @max = @max - 1
    set rowcount @max -- limit the next delete to (number of copies - 1) rows
    delete from test where id = @id
    fetch cur_rows into @id, @max
end
close cur_rows
deallocate cur_rows
set rowcount 0

The above are ideas gathered from a quick read through some materials. If you spot any problems, please point them out.
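For experimenting with the cursor approach outside SQL Server, here is a minimal sketch in Python with the `sqlite3` module as a stand-in. SQLite has no `SET ROWCOUNT`, so the sketch limits each delete with a `LIMIT` subquery over `rowid` instead; that substitution and the sample `test` data are assumptions, not part of the T-SQL above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "b"), (2, "b"), (2, "b"), (3, "c")],
)

# Collect each duplicated id with its number of copies, like the cursor query.
dupes = conn.execute(
    "SELECT id, COUNT(*) FROM test GROUP BY id HAVING COUNT(*) > 1"
).fetchall()

for dup_id, copies in dupes:
    # Remove (copies - 1) rows for this id, leaving exactly one behind.
    conn.execute(
        "DELETE FROM test WHERE rowid IN "
        "(SELECT rowid FROM test WHERE id = ? LIMIT ?)",
        (dup_id, copies - 1),
    )
conn.commit()

remaining = conn.execute("SELECT id, name FROM test ORDER BY id").fetchall()
print(remaining)
```

The list of duplicated ids is fetched in full before the loop starts, so the deletes never run against a result set that is still being read.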
