MySQL bulk insert: ways to handle unique index conflicts

I. Background

When I partitioned tables in SQL Server, I ran into many problems with unique indexes (see "Step 8: SQL Server: when table partitioning meets a unique constraint"). I did not expect to hit the same problems when partitioning tables in MySQL (see "MySQL table partitioning in practice").

Today we look at MySQL unique indexes: how to create them, how to bulk insert into a table that has one, and some related SQL techniques.

What is the root cause of these problems, and what do they have in common? Does MySQL have a concept like partition alignment? Unique indexes are a requirement in many systems, so how can these problems be avoided, and how much do the workarounds affect performance?

II. Process

(i) Import only the differing rows and skip duplicates: INSERT IGNORE INTO

When we create a table in MySQL, we usually give it an Id column as the primary key, and InnoDB then uses the primary key as the clustered index key. A primary key is unique by definition, so running the INSERT statement below a second time reports error 1062, as shown in Figure 1:

-- Create a test table
CREATE TABLE `TestTable` (
`Id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`UserId` INT(11) DEFAULT NULL,
`UserName` VARCHAR(10) DEFAULT NULL, -- length is missing in the source; 10 is an assumption
`Usertype` INT(11) DEFAULT NULL,
PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Insert test data
INSERT INTO TestTable (Id, UserId, UserName, Usertype)
VALUES (1, 101, 'AA', 1), (2, 102, 'BBB', 2), (3, 103, 'CCC', 3);

(Figure 1: Duplicate entry '1' for key 'PRIMARY')

In a real production environment, however, the requirement is often a unique index on the UserId column. The rest of this article uses that case to test unique indexes:

-- Create test table 1
CREATE TABLE `Testtable1` (
`Id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`UserId` INT(11) DEFAULT NULL,
`UserName` VARCHAR(10) DEFAULT NULL, -- length is missing in the source; 10 is an assumption
`Usertype` INT(11) DEFAULT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `Ix_userid` (`UserId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Create test table 2
CREATE TABLE `Testtable2` (
`Id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`UserId` INT(11) DEFAULT NULL,
`UserName` VARCHAR(10) DEFAULT NULL, -- length is missing in the source; 10 is an assumption
`Usertype` INT(11) DEFAULT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `Ix_userid` (`UserId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Insert test data 1
INSERT INTO Testtable1 (Id, UserId, UserName, Usertype)
VALUES (1, 101, 'AA', 1), (2, 102, 'BBB', 2), (3, 103, 'CCC', 3);

-- Insert test data 2
INSERT INTO Testtable2 (Id, UserId, UserName, Usertype)
VALUES (1, 201, 'AAA', 1), (2, 202, 'BBB', 2), (3, 203, 'CCC', 3), (4, 101, 'xxxx', 5);

(Figure 2: Testtable1 records)

(Figure 3: Testtable2 records)

The script above creates the same unique index in both Testtable1 and Testtable2: UNIQUE KEY Ix_userid (UserId). UserId is therefore unique within each table. If we now import the Testtable2 data into Testtable1 with the "Import 1" statement below, error 1062 occurs and the whole statement is rolled back, so no rows are imported and the goal of importing only the differing data is not met.

-- Import 1
INSERT INTO Testtable1 (UserId, UserName, Usertype)
SELECT UserId, UserName, Usertype FROM Testtable2;

(Figure 4: Duplicate entry '101' for key 'Ix_userid')
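
Before importing, it can help to see which rows would collide. The query below is a minimal sketch, assuming the Testtable1 and Testtable2 definitions and sample data above; the aliases t1 and t2 are only illustrative.

```sql
-- List the Testtable2 rows whose UserId already exists in Testtable1.
-- These are the rows that make "Import 1" fail with error 1062.
SELECT t2.UserId, t2.UserName, t2.Usertype
FROM Testtable2 AS t2
INNER JOIN Testtable1 AS t1
        ON t1.UserId = t2.UserId;
```

With the sample data, the only match is the Testtable2 row with UserId 101.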

MySQL provides the IGNORE keyword for this case. With IGNORE, MySQL checks each row against the unique indexes of the target table: a row that would violate one of them is skipped, and the remaining rows are inserted.

-- Import 2
INSERT IGNORE INTO Testtable1 (UserId, UserName, Usertype)
SELECT UserId, UserName, Usertype FROM Testtable2;

After running "Import 2" you get the result shown in Figure 5, which is what we wanted. Notice, however, that the auto-increment Id has skipped some values. This is because "Import 1" failed earlier: although that statement was rolled back, the Id values it had already reserved were not released, which leaves a gap. The same thing happens in SQL Server. Further reading: "Simple and practical SQL scripts: finding gaps in SQL Server identity values".

(Figure 5: Effect of IGNORE)
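
INSERT IGNORE does not discard the conflict silently: MySQL downgrades the duplicate-key error to a warning. The following is a minimal sketch, assuming the two test tables above, that repeats the "Import 2" statement and then makes the skipped rows visible.

```sql
INSERT IGNORE INTO Testtable1 (UserId, UserName, Usertype)
SELECT UserId, UserName, Usertype FROM Testtable2;

-- Run immediately after the INSERT IGNORE, in the same session:
-- each skipped row is reported as a warning with code 1062.
SHOW WARNINGS;
```

Checking the warnings is a cheap way to confirm that only the expected duplicates were skipped, since IGNORE also downgrades some other errors to warnings.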

(ii) Import and overwrite duplicate data: REPLACE INTO

1. Restore Testtable1 and Testtable2 to the states shown in Figure 2 and Figure 3 (run TRUNCATE TABLE on each table, then rerun its INSERT statement), then execute the following SQL and observe the result:

-- Import 3
REPLACE INTO Testtable1 (UserId, UserName)
SELECT UserId, UserName FROM Testtable2;

(Figure 6: Effect of REPLACE)

Figure 6 shows that the record with UserId 101 has changed: not only was UserName modified, but Usertype has also become NULL.

In other words, when REPLACE finds a duplicate during the import, it deletes the existing row and then inserts the new one. If the table has more columns than the statement assigns, the unassigned columns of the new row are left empty (here, the Usertype of the newly inserted row is NULL).

Note that REPLACE fills any column omitted from the column list with its default (NULL here), not with the value the old row held. If the column list covers every column of the table, no such NULLs appear.
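
The delete-then-insert behavior can be observed on a single row. This is a sketch, assuming Testtable1 is in its Figure 2 state; the inserted values are only illustrative.

```sql
-- UserId 101 already exists, so REPLACE removes the old row and inserts a new one.
REPLACE INTO Testtable1 (UserId, UserName) VALUES (101, 'xxxx');

-- Must run directly after the REPLACE: the count is deleted rows plus inserted rows,
-- so a value greater than 1 means an existing row was replaced.
SELECT ROW_COUNT();

-- The replacement row has a newly generated Id and a NULL Usertype.
SELECT Id, UserId, UserName, Usertype FROM Testtable1 WHERE UserId = 101;
```

Because the Id changes, REPLACE is a poor fit when other tables refer to the row by its Id.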

2. What happens when we run "Import 3" if the Usertype column is declared NOT NULL and has no default value?

(Figure 7: Warning message returned)

(Figure 8: Usertype is set to 0)

Figure 7 and Figure 8 show that the rows are still inserted. MySQL only returns the warning "Field 'Usertype' doesn't have a default value", and the Usertype column of the inserted rows is set to 0 (Usertype is an INT column).

3. If we also want to update the Usertype column during the import, that is easy: include it in the column list, as in the following SQL:
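
The warning-and-zero outcome above corresponds to a server running without strict SQL mode. With STRICT_TRANS_TABLES or STRICT_ALL_TABLES enabled (part of the default sql_mode since MySQL 5.7), the same statement is expected to fail with error 1364 instead of inserting 0. The sketch below shows one possible way to reproduce the case and to check which mode the session uses; it assumes Testtable1 is in its Figure 2 state.

```sql
-- Make Usertype NOT NULL without a default (works here because no existing row is NULL).
ALTER TABLE Testtable1 MODIFY Usertype INT NOT NULL;

-- Show the SQL modes in effect for this session;
-- look for STRICT_TRANS_TABLES or STRICT_ALL_TABLES in the result.
SELECT @@SESSION.sql_mode;
```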

-- Import 4
REPLACE INTO Testtable1 (UserId, UserName, Usertype)
SELECT UserId, UserName, Usertype FROM Testtable2;

(Figure 9: Usertype updated as well)

(iii) Import and keep the unspecified columns of duplicate rows: INSERT INTO ... ON DUPLICATE KEY UPDATE

Restore Testtable1 and Testtable2 to the states shown in Figure 2 and Figure 3 (run TRUNCATE TABLE on each table, then rerun its INSERT statement), then execute the following SQL and observe the result:

-- Import 5
INSERT INTO Testtable1 (UserId, UserName)
SELECT UserId, UserName FROM Testtable2
ON DUPLICATE KEY UPDATE
Testtable1.UserName = Testtable2.UserName;

(Figure 10: Usertype value preserved)

Comparing Figure 2, Figure 3, and Figure 10: for the record with UserId 101, UserName is updated and the existing Usertype value is preserved. Because "Import 5" does not include Usertype in its column list, the Usertype of the newly inserted records is NULL.

-- Import 6
INSERT INTO Testtable1 (UserId, UserName, Usertype)
SELECT UserId, UserName, Usertype FROM Testtable2
ON DUPLICATE KEY UPDATE
Testtable1.UserName = Testtable2.UserName;

(Figure 11: Usertype value preserved)

Comparing Figure 2, Figure 3, and Figure 11: for the duplicate UserId 101, only UserName is taken from Testtable2, and the Usertype already stored in Testtable1 is preserved, while the newly inserted rows receive all three columns from Testtable2. In other words, when a duplicate is found, MySQL updates the existing row: the columns named in the UPDATE clause change, and all other columns keep their values.
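
The same kind of update can also be written with the VALUES() function, which refers to the value that would have been inserted. The following is a minimal single-row sketch, assuming Testtable1 is in its Figure 2 state and using illustrative values; note that recent MySQL 8.0 releases deprecate VALUES() in this position in favor of row aliases.

```sql
INSERT INTO Testtable1 (UserId, UserName, Usertype)
VALUES (101, 'xxxx', 5)
ON DUPLICATE KEY UPDATE
  UserName = VALUES(UserName);

-- Directly after the statement: 1 = new row inserted,
-- 2 = existing row updated, 0 = existing row already had these values.
SELECT ROW_COUNT();
```

Unlike REPLACE, this keeps the original Id of the row, because the existing row is updated in place.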

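(iv) Summary

When you insert a row that duplicates a value in a unique key, a plain INSERT reports error 1062. MySQL offers three ways to handle this, to be chosen according to your business logic: INSERT IGNORE skips the duplicates, REPLACE INTO deletes the old row and inserts the new one, and INSERT ... ON DUPLICATE KEY UPDATE updates the specified columns of the existing row.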
