Some MySQL tables may contain duplicate records. In some cases duplicates are acceptable, but in others they need to be removed.
In this section we show how to prevent duplicate data from appearing in a table and how to delete duplicate data from a table.
Preventing Duplicate Data in Tables
You can define the relevant columns of a MySQL table as a PRIMARY KEY or as a UNIQUE index to guarantee that the data is unique.
Let's try an example: the following table has no indexes and no primary key, so it allows multiple duplicate records.
CREATE TABLE person_tbl
(
    first_name CHAR(20),  -- length of 20 is an assumed value
    last_name CHAR(20),   -- length of 20 is an assumed value
    sex CHAR(10)
);
If you want the combination of first_name and last_name to be unique, you can define a composite primary key on the two columns. Primary key columns cannot contain NULL, so declare them NOT NULL, as shown below:
CREATE TABLE person_tbl
(
    first_name CHAR(20) NOT NULL,  -- length of 20 is an assumed value
    last_name CHAR(20) NOT NULL,   -- length of 20 is an assumed value
    sex CHAR(10),
    PRIMARY KEY (last_name, first_name)
);
Once a primary key or unique index is in place, an INSERT INTO statement that would create a duplicate record fails with an error.
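As a minimal sketch of that failure, assuming person_tbl was created with PRIMARY KEY (last_name, first_name) as above and the statements are typed interactively (the sample values are illustrative only):

```sql
-- First insert succeeds: no row with this key exists yet.
INSERT INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- Same (last_name, first_name) again: MySQL rejects the statement
-- with error 1062 (duplicate entry) and adds no row.
INSERT INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- Still exactly one matching row.
SELECT COUNT(*) FROM person_tbl
WHERE last_name = 'Jay' AND first_name = 'Thomas';
```

When run as a batch script rather than interactively, the mysql client normally stops at the failing statement.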
The difference between INSERT IGNORE INTO and INSERT INTO is that INSERT IGNORE INTO skips rows that would duplicate an existing primary or unique key: if no matching row exists, the new row is inserted; if one exists, the new row is discarded. The data already in the table is preserved, and only the missing rows are added.
The following example uses INSERT IGNORE INTO; it raises no error and does not insert duplicate data into the table:
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
    -> VALUES ('Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
    -> VALUES ('Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)
When uniqueness is enforced on the table, INSERT IGNORE INTO does not return an error for a duplicate row; it only raises a warning. REPLACE INTO, in contrast, deletes the existing row that has the same primary or unique key and then inserts the new row.
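The contrast between the two statements can be sketched as follows, assuming the row ('Jay', 'Thomas') from the example above is already stored with sex left as NULL (the value 'M' is illustrative only):

```sql
-- Duplicate key: the row is skipped and a warning is recorded.
INSERT IGNORE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- Must be run right after the statement it refers to;
-- it lists the duplicate-entry warning.
SHOW WARNINGS;

-- Duplicate key: the old row is deleted and the new one inserted,
-- so the affected-rows count covers both the delete and the insert.
REPLACE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');

-- sex is now 'M'; after the INSERT IGNORE alone it was still NULL.
SELECT last_name, first_name, sex FROM person_tbl;
```

Because REPLACE INTO deletes the old row, any column not listed in the statement falls back to its default value instead of keeping its previous contents.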
Another way to enforce uniqueness is to add a UNIQUE index, as follows:
CREATE TABLE person_tbl
(
    first_name CHAR(20) NOT NULL,  -- length of 20 is an assumed value
    last_name CHAR(20) NOT NULL,   -- length of 20 is an assumed value
    sex CHAR(10),
    UNIQUE (last_name, first_name)
);
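If the table already exists, the same constraint can be added afterwards. This is a sketch only; the index name uq_person_name is a hypothetical name chosen for illustration:

```sql
-- Add a UNIQUE index to an existing table.
-- Fails with a duplicate-entry error if the table already
-- contains rows that share (last_name, first_name).
ALTER TABLE person_tbl
  ADD UNIQUE INDEX uq_person_name (last_name, first_name);

-- Inspect the indexes now defined on the table.
SHOW INDEX FROM person_tbl;
```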
Counting Duplicate Data
Here we count how many times each combination of first_name and last_name is repeated in the table:
mysql> SELECT COUNT(*) AS repetitions, last_name, first_name
    -> FROM person_tbl
    -> GROUP BY last_name, first_name
    -> HAVING repetitions > 1;
The above query returns every (last_name, first_name) combination that occurs more than once in person_tbl, together with the number of times it occurs. In general, to query for duplicate values, do the following:
Determine which columns contain values that may be duplicated.
List those columns in the column selection list, along with COUNT(*).
List the same columns in the GROUP BY clause.
Add a HAVING clause that keeps only the groups whose count is greater than 1.
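The steps above can be sketched on a different table. Here contact_tbl and its email column are hypothetical names used only to show the pattern:

```sql
-- Hypothetical table with one column that may hold duplicates.
CREATE TABLE contact_tbl (email VARCHAR(100));

INSERT INTO contact_tbl (email) VALUES
  ('a@example.com'),
  ('b@example.com'),
  ('a@example.com');

-- Step 1: the candidate column is email.
-- Steps 2-4: select it with COUNT(*), group by it,
-- and keep only the groups that occur more than once.
SELECT COUNT(*) AS repetitions, email
FROM contact_tbl
GROUP BY email
HAVING repetitions > 1;
```

With the three sample rows, only a@example.com is returned, with a repetitions value of 2.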
Filtering Duplicate Data
If you need to read the data without duplication, you can use the DISTINCT keyword in the SELECT statement to filter the duplicate data.
mysql> SELECT DISTINCT last_name, first_name
    -> FROM person_tbl
    -> ORDER BY last_name;
You can also use GROUP BY to read only the distinct rows of a table:
mysql> SELECT last_name, first_name
    -> FROM person_tbl
    -> GROUP BY last_name, first_name;
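Deleting Duplicate Data
If you want to delete duplicate data from a table, you can use the following SQL statements. Because sex is not part of the GROUP BY clause, it is wrapped in an aggregate (MIN here) so that the statement also runs with ONLY_FULL_GROUP_BY enabled, which is the default since MySQL 5.7:
mysql> CREATE TABLE tmp SELECT last_name, first_name, MIN(sex) AS sex
    -> FROM person_tbl
    -> GROUP BY last_name, first_name;
mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;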
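Of course, you can also add an index or a PRIMARY KEY to the table to remove duplicate records. Note that the IGNORE clause of ALTER TABLE was deprecated in MySQL 5.6 and removed in MySQL 5.7, so the statement below only works on older versions. The method is as follows: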
mysql> ALTER IGNORE TABLE person_tbl
-> ADD PRIMARY KEY (last_name, first_name);
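On servers where ALTER IGNORE TABLE is not available, a similar result can be reached by combining the primary key with INSERT IGNORE INTO. This is a sketch under assumptions: person_tmp and person_old are hypothetical table names, and the key columns are assumed to contain no NULL values.

```sql
-- Empty copy of the table with the same column definitions.
CREATE TABLE person_tmp LIKE person_tbl;

-- Enforce uniqueness on the empty copy.
ALTER TABLE person_tmp
  ADD PRIMARY KEY (last_name, first_name);

-- Copy the rows; rows with a key that is already present are skipped.
INSERT IGNORE INTO person_tmp
SELECT * FROM person_tbl;

-- Swap the tables in one statement and keep the old data as a backup.
RENAME TABLE person_tbl TO person_old,
             person_tmp TO person_tbl;

-- After checking the result, the backup can be dropped:
-- DROP TABLE person_old;
```

Which of several duplicate rows survives depends on the order in which the rows are read, so this approach suits cases where the duplicates are interchangeable.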
That covers how to handle duplicate data in MySQL; we hope it is helpful for your study.