Use SQL statements to delete duplicate data in a data table

Last Update:2018-12-03 Source: Internet

Author: User

During my internship I collected a large amount of data with a "thief" (scraper) program, but many of the records were duplicated.

The task was to delete the duplicate data (keeping one record of each) and to merge several data tables into a single data table. Based on my own experience, here are a few points.

1. Merge data tables

SQL has a SELECT INTO statement (often used to copy or archive records), for example:

Select column_name(s) into newtable [in externaldatabase] from source

It copies data from one table into another table. Note that "newtable" must not exist yet.

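I tried it, but for some reason I could not create the new data table this way. The error "#1327 - Undeclared variable: newtable" was reported, so the tables could not be merged into one data table. (This error number and message match MySQL, which does not support creating a table with SELECT ... INTO and reads the name after INTO as a variable.)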

The other approach has two steps:
(1) Create a table with the same structure as the source table.
(2) Insert into newtable (column_name(s)) select distinct column_name(s) from source

I had added a UNIQUE option to one field of newtable, and it got in the way during the insert. After I dropped it, the insert worked. The fields in the two tables must be identical.
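The two steps above can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database rather than the author's database, and the table names source_a and source_b, the columns and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source tables with identical columns.
    CREATE TABLE source_a (name TEXT, address TEXT);
    CREATE TABLE source_b (name TEXT, address TEXT);
    INSERT INTO source_a VALUES ('Ann', 'Oak St'), ('Ann', 'Oak St'), ('Bob', 'Elm St');
    INSERT INTO source_b VALUES ('Bob', 'Elm St'), ('Cid', 'Pine St');

    -- Step (1): create the target table with the same structure as the sources.
    CREATE TABLE newtable (name TEXT, address TEXT);

    -- Step (2): copy the distinct rows of each source table into it.
    INSERT INTO newtable (name, address) SELECT DISTINCT name, address FROM source_a;
    INSERT INTO newtable (name, address) SELECT DISTINCT name, address FROM source_b;
""")

merged = conn.execute(
    "SELECT name, address FROM newtable ORDER BY name, address"
).fetchall()
print(merged)
```

Note that DISTINCT only removes duplicates within each source table. A row that appears in both sources (here the 'Bob' row) still ends up in newtable twice, so the merged table needs the clean-up described in the next section.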

2. Delete duplicate records

Many of the methods on the Internet are complicated. A simpler way is the DISTINCT keyword in SQL:

Select distinct * from source

The duplicated rows are removed from the result. Note that this only works when the duplicate rows are exactly the same in every column. If each entry has an ID (auto_increment), every entry has its own ID, so such rows count as different.
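The caveat about the ID column can be seen in a small experiment. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, and the table layout and rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table; id plays the role of the auto_increment column.
    CREATE TABLE source (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, address TEXT);
    INSERT INTO source (name, address) VALUES
        ('Ann', 'Oak St'), ('Ann', 'Oak St'), ('Bob', 'Elm St');
""")

# Every row has its own id, so DISTINCT * still sees three different rows.
all_columns = conn.execute("SELECT DISTINCT * FROM source").fetchall()

# Leaving the id out of the select list lets the two 'Ann' rows collapse into one.
without_id = conn.execute(
    "SELECT DISTINCT name, address FROM source ORDER BY name"
).fetchall()

print(len(all_columns), without_id)
```

So when the table has such an ID, list the data columns explicitly instead of using DISTINCT *.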

Some online materials:

There are two kinds of repeated records. The first is a completely repeated record, that is, a record in which all fields are repeated. The second is a record in which only some key fields are repeated, such as the Name field, while the other fields are not necessarily repeated, or their repetition can be ignored.

1. The first type of repetition is the easier one to solve. Use:

Select distinct * From tablename

This returns a result set without repeated records.

If the duplicate records must be deleted from the table itself (keeping one record of each), you can do it as follows:

Select distinct * into #TMP from tablename
Drop table tablename
Select * into tablename from #TMP
Drop table #TMP

This kind of repetition happens because the table design was not thorough. Adding a unique index column prevents it.
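The four statements above use a temporary table named with a leading # and SELECT ... INTO, which not every database accepts. The following is a rough equivalent on SQLite through Python's sqlite3 module, with CREATE TABLE ... AS SELECT standing in for SELECT ... INTO. The columns, the sample rows and the choice of indexing (name, address) together are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table containing one exact duplicate row.
    CREATE TABLE tablename (name TEXT, address TEXT);
    INSERT INTO tablename VALUES ('Ann', 'Oak St'), ('Ann', 'Oak St'), ('Bob', 'Elm St');

    -- The same four steps: copy distinct rows out, drop, copy back, drop the copy.
    CREATE TEMP TABLE tmp AS SELECT DISTINCT * FROM tablename;
    DROP TABLE tablename;
    CREATE TABLE tablename AS SELECT * FROM tmp;
    DROP TABLE tmp;

    -- A unique index keeps exact duplicates from coming back.
    CREATE UNIQUE INDEX idx_tablename_unique ON tablename (name, address);
""")

rows = conn.execute("SELECT name, address FROM tablename ORDER BY name").fetchall()

# Inserting a row that already exists is now rejected by the unique index.
try:
    conn.execute("INSERT INTO tablename VALUES ('Ann', 'Oak St')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(rows, duplicate_rejected)
```

Keep in mind that a table rebuilt this way loses the indexes and constraints of the original table, so they have to be created again afterwards, as the unique index is here.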

2. The second type of repetition usually requires keeping the first record of each group of repeated records. The procedure is as follows:

Assume that the repeated fields are Name and Address, and that you need a result set in which the combination of these two fields is unique.

Select identity(int, 1, 1) as autoid, * into #TMP from tablename
Select min(autoid) as autoid into #tmp2 from #TMP group by Name, Address
Select * from #TMP where autoid in (select autoid from #tmp2)

The last select returns the result set in which Name and Address are unique (with an extra autoid field added; in practice this column can be left out of the select list).
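The same idea can be tried out on SQLite through Python's sqlite3 module. This is a sketch under assumptions, not the original statements: an INTEGER PRIMARY KEY column takes the place of identity(int, 1, 1), the grouping is on name and address on the assumption that one row per (Name, Address) pair is wanted, and the phone column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table; phone differs between rows that share name and address.
    CREATE TABLE tablename (name TEXT, address TEXT, phone TEXT);
    INSERT INTO tablename VALUES
        ('Ann', 'Oak St', '111'),
        ('Ann', 'Oak St', '222'),
        ('Bob', 'Elm St', '333'),
        ('Ann', 'Pine St', '444');

    -- Number the rows; autoid is assigned automatically in insertion order.
    CREATE TEMP TABLE tmp (autoid INTEGER PRIMARY KEY, name TEXT, address TEXT, phone TEXT);
    INSERT INTO tmp (name, address, phone)
        SELECT name, address, phone FROM tablename ORDER BY rowid;

    -- The smallest autoid of each (name, address) group marks its first record.
    CREATE TEMP TABLE tmp2 AS
        SELECT MIN(autoid) AS autoid FROM tmp GROUP BY name, address;
""")

# Select only the data columns, leaving autoid out of the result.
first_records = conn.execute("""
    SELECT name, address, phone FROM tmp
    WHERE autoid IN (SELECT autoid FROM tmp2)
    ORDER BY autoid
""").fetchall()

print(first_records)
```

Only the first 'Ann', 'Oak St' row survives. The row with the same name but a different address is kept, because the grouping is on both fields.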

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
