Sometimes a table or result set contains duplicate records. Sometimes this is acceptable, but sometimes duplicates must be prevented. At other times you need to identify duplicate records and remove them from the table. This chapter describes how to prevent duplicate records from occurring in a table and how to delete duplicates that already exist.
Preventing duplicates from occurring in a table:
You can prevent duplicate records by defining a PRIMARY KEY or a UNIQUE index on the appropriate fields. As an example, the following table has no such index or primary key, so it allows duplicate first_name and last_name records.
CREATE TABLE Person_tbl (
    first_name CHAR(20),
    last_name CHAR(20),
    sex CHAR(10)
);
To prevent multiple records with the same first and last name from being created in this table, add a PRIMARY KEY to its definition. The key columns must also be declared NOT NULL, because a PRIMARY KEY does not allow NULL values.
CREATE TABLE Person_tbl (
    first_name CHAR(20) NOT NULL,
    last_name CHAR(20) NOT NULL,
    sex CHAR(10),
    PRIMARY KEY (last_name, first_name)
);
The presence of a unique index in a table normally causes an error if you insert a record that duplicates an existing record in the column or columns that define the index.
Use INSERT IGNORE instead of INSERT. If the record does not duplicate an existing record, MySQL inserts it as usual. If the record is a duplicate, the IGNORE keyword tells MySQL to discard it silently without generating an error.
The following example produces no error, and the duplicate record is not inserted.
mysql> INSERT IGNORE INTO Person_tbl (last_name, first_name)
    -> VALUES ('Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT IGNORE INTO Person_tbl (last_name, first_name)
    -> VALUES ('Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)
Use REPLACE instead of INSERT. If the record is new, it is inserted just as with INSERT. If it is a duplicate, the new record replaces the old one:
mysql> REPLACE INTO Person_tbl (last_name, first_name)
    -> VALUES ('Ajay', 'Kumar');
Query OK, 1 row affected (0.00 sec)
mysql> REPLACE INTO Person_tbl (last_name, first_name)
    -> VALUES ('Ajay', 'Kumar');
Query OK, 2 rows affected (0.00 sec)
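The difference between the two statements above is easiest to see when the duplicate row carries a different non-key value. The following is a minimal sketch, not MySQL itself: it assumes Python's built-in sqlite3 module as a stand-in so it can run without a server. SQLite spells MySQL's INSERT IGNORE as INSERT OR IGNORE, and the sex values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Person_tbl (
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL,
        sex        TEXT,
        PRIMARY KEY (last_name, first_name)
    )
    """
)

# SQLite's INSERT OR IGNORE plays the role of MySQL's INSERT IGNORE.
insert_ignore = (
    "INSERT OR IGNORE INTO Person_tbl (last_name, first_name, sex) "
    "VALUES (?, ?, ?)"
)
conn.execute(insert_ignore, ("Jay", "Thomas", "M"))
conn.execute(insert_ignore, ("Jay", "Thomas", "F"))  # duplicate key: discarded

kept_sex = conn.execute(
    "SELECT sex FROM Person_tbl WHERE last_name = 'Jay' AND first_name = 'Thomas'"
).fetchone()[0]

# REPLACE removes the conflicting row and inserts the new one in its place.
replace = "REPLACE INTO Person_tbl (last_name, first_name, sex) VALUES (?, ?, ?)"
conn.execute(replace, ("Ajay", "Kumar", "M"))
conn.execute(replace, ("Ajay", "Kumar", "F"))  # duplicate key: overwrites

replaced_sex = conn.execute(
    "SELECT sex FROM Person_tbl WHERE last_name = 'Ajay' AND first_name = 'Kumar'"
).fetchone()[0]

total_rows = conn.execute("SELECT COUNT(*) FROM Person_tbl").fetchone()[0]
print(kept_sex, replaced_sex, total_rows)
```

In this sketch the ignored insert leaves the first row's value in place, while the replaced row ends up with the value from the second statement; either way the table holds one row per key.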
Another way to enforce uniqueness is to add a UNIQUE index to the table instead of a PRIMARY KEY.
CREATE TABLE Person_tbl (
    first_name CHAR(20) NOT NULL,
    last_name CHAR(20) NOT NULL,
    sex CHAR(10),
    UNIQUE (last_name, first_name)
);
Counting and identifying duplicates:
The following query counts the records in the table that share the same first_name and last_name.
mysql> SELECT COUNT(*) AS repetitions, last_name, first_name
    -> FROM Person_tbl
    -> GROUP BY last_name, first_name
    -> HAVING repetitions > 1;
This query returns a list of all the duplicate records in the Person_tbl table. In general, to identify sets of values that are duplicated, do the following:
Determine which columns contain the values that may be duplicated.
List those columns in the column selection list, along with COUNT(*).
List the same columns in the GROUP BY clause.
Add a HAVING clause that eliminates unique values by requiring the group count to be greater than 1.
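The steps above can be sketched as a small runnable script. This is an illustration only, assuming Python's built-in sqlite3 module in place of a MySQL server and a handful of made-up rows; the query itself follows the GROUP BY and HAVING pattern just described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person_tbl (first_name TEXT, last_name TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO Person_tbl (last_name, first_name, sex) VALUES (?, ?, ?)",
    [
        ("Jay", "Thomas", "M"),
        ("Jay", "Thomas", "M"),
        ("Kumar", "Ajay", "M"),
        ("Jay", "Thomas", "M"),
    ],
)

# Group on the columns that may repeat and keep only groups larger than 1.
duplicates = conn.execute(
    """
    SELECT COUNT(*) AS repetitions, last_name, first_name
    FROM Person_tbl
    GROUP BY last_name, first_name
    HAVING COUNT(*) > 1
    """
).fetchall()
print(duplicates)
```

Only the name that occurs more than once is reported, together with its repetition count; the single Kumar row is filtered out by the HAVING clause.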
Eliminating duplicates from a query result:
You can use DISTINCT with the SELECT statement to retrieve only the unique records in a table.
mysql> SELECT DISTINCT last_name, first_name
    -> FROM Person_tbl
    -> ORDER BY last_name;
An alternative to DISTINCT is to add a GROUP BY clause that names the columns you are selecting. This removes duplicates and selects only the unique combinations of values in the specified columns:
mysql> SELECT last_name, first_name
    -> FROM Person_tbl
    -> GROUP BY last_name, first_name;
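As a quick check that the two forms return the same rows, here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL and uses made-up sample rows. An explicit ORDER BY is added to both queries so the results can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person_tbl (first_name TEXT, last_name TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO Person_tbl (last_name, first_name, sex) VALUES (?, ?, ?)",
    [
        ("Jay", "Thomas", "M"),
        ("Kumar", "Ajay", "M"),
        ("Jay", "Thomas", "M"),
    ],
)

# DISTINCT collapses identical (last_name, first_name) pairs.
distinct_rows = conn.execute(
    """
    SELECT DISTINCT last_name, first_name
    FROM Person_tbl
    ORDER BY last_name, first_name
    """
).fetchall()

# GROUP BY on the same columns yields one row per unique pair.
grouped_rows = conn.execute(
    """
    SELECT last_name, first_name
    FROM Person_tbl
    GROUP BY last_name, first_name
    ORDER BY last_name, first_name
    """
).fetchall()
print(distinct_rows, grouped_rows)
```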
Removing duplicates using table replacement:
If a table contains duplicate records and you want to remove all of them, use the following procedure.
mysql> CREATE TABLE tmp SELECT last_name, first_name, sex
    -> FROM Person_tbl
    -> GROUP BY last_name, first_name;
mysql> DROP TABLE Person_tbl;
mysql> ALTER TABLE tmp RENAME TO Person_tbl;
On servers with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5), the sex column must be wrapped in an aggregate or ANY_VALUE() for the first statement to be accepted.
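The same copy, drop, and rename sequence can be tried out in a throwaway database. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than MySQL, so the copy is written as CREATE TABLE ... AS SELECT, and MIN(sex) is an illustrative choice for picking one value per group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person_tbl (first_name TEXT, last_name TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO Person_tbl (last_name, first_name, sex) VALUES (?, ?, ?)",
    [
        ("Jay", "Thomas", "M"),
        ("Jay", "Thomas", "M"),
        ("Kumar", "Ajay", "M"),
    ],
)

# 1. Copy one row per (last_name, first_name) group into a new table.
conn.execute(
    """
    CREATE TABLE tmp AS
    SELECT last_name, first_name, MIN(sex) AS sex
    FROM Person_tbl
    GROUP BY last_name, first_name
    """
)
# 2. Drop the original table, then 3. give the copy the original name.
conn.execute("DROP TABLE Person_tbl")
conn.execute("ALTER TABLE tmp RENAME TO Person_tbl")

rows = conn.execute(
    "SELECT last_name, first_name, sex FROM Person_tbl "
    "ORDER BY last_name, first_name"
).fetchall()
print(rows)
```

Note that a table created this way does not inherit indexes or keys from the original, so any PRIMARY KEY or UNIQUE index has to be added again afterwards.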
An easy way to remove duplicate records from a table is to add an INDEX or a PRIMARY KEY to it. Even if the table already exists, you can use this technique to remove the duplicate records. Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so this statement works only on older servers.
mysql> ALTER IGNORE TABLE Person_tbl
    -> ADD PRIMARY KEY (last_name, first_name);
Here is a summary of some of the ways to delete duplicate records in MySQL.
The method I use most often is:
-- Delete rows whose id is duplicated (suitable when id is a manually maintained key).
DELETE a FROM person AS a,
    (SELECT id FROM person GROUP BY id HAVING COUNT(1) > 1) AS b
WHERE a.id = b.id;
-- Find duplicate names and delete every row except the one with the smallest id.
DELETE a FROM tb_person AS a,
    (SELECT name, MIN(id) AS id FROM tb_person GROUP BY name HAVING COUNT(1) > 1) AS b
WHERE a.name = b.name AND a.id > b.id;
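To see the "keep the smallest id" rule in action, here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in and a made-up tb_person table. It uses a NOT IN subquery, which SQLite accepts; MySQL rejects a DELETE whose subquery reads the target table directly (error 1093), which is why the statements above join against a derived table instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO tb_person (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "ann"), (4, "ann"), (5, "bob"), (6, "cy")],
)

# Keep the row with the smallest id for each name and delete the rest.
conn.execute(
    """
    DELETE FROM tb_person
    WHERE id NOT IN (SELECT MIN(id) FROM tb_person GROUP BY name)
    """
)

remaining = conn.execute("SELECT id, name FROM tb_person ORDER BY id").fetchall()
print(remaining)
```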
To summarize:
1. Query the records that need to be deleted, keeping one record per group.
SELECT a.id, a.subject, a.receiver
FROM test1 a
LEFT JOIN (SELECT c.subject, c.receiver, MAX(c.id) AS bid
           FROM test1 c
           WHERE status = 0
           GROUP BY receiver, subject
           HAVING COUNT(1) > 1) b ON a.id < b.bid
WHERE a.subject = b.subject AND a.receiver = b.receiver AND a.id < b.bid;
2. Delete the duplicate records, keeping only one record per group. Note that subject and receiver should be indexed, otherwise the statement will be slow.
DELETE a FROM test1 a,
    (SELECT c.subject, c.receiver, MAX(c.id) AS bid
     FROM test1 c
     WHERE status = 0
     GROUP BY receiver, subject
     HAVING COUNT(1) > 1) b
WHERE a.subject = b.subject AND a.receiver = b.receiver AND a.id < b.bid;
3. Find redundant records in the table, where duplicates are determined by a single field (peopleId).
SELECT * FROM people
WHERE peopleId IN (SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1);
4. Delete redundant records in the table, where duplicates are determined by a single field (peopleId), keeping only the record with the smallest rowid. (rowid here is an Oracle-style pseudo-column; in MySQL, substitute a unique key column such as an auto-increment id.)
DELETE FROM people
WHERE peopleId IN (SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1)
  AND rowid NOT IN (SELECT MIN(rowid) FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1);
5. Delete redundant records in the table, where duplicates are determined by multiple fields, keeping only the record with the smallest rowid.
DELETE FROM vitae a
WHERE (a.peopleId, a.seq) IN (SELECT peopleId, seq FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1)
  AND rowid NOT IN (SELECT MIN(rowid) FROM vitae GROUP BY peopleId, seq HAVING COUNT(*) > 1);
For more details please see: http://www.111cn.net/database/mysql/47531.htm