INSERT INTO ... ON DUPLICATE KEY UPDATE and REPLACE INTO can both handle duplicate key values. What is the difference between them in practice?
The precondition is that the table must have a unique index or a primary key.
1. REPLACE, on finding a duplicate, first deletes the old row and then inserts the new one. If the row has more than one column and the insert does not assign a value to some of them, those columns in the newly inserted row fall back to their default values (NULL if there is no default).
2. INSERT ... ON DUPLICATE KEY UPDATE, on finding a duplicate, performs an update. It works on the original row: the specified columns are updated and the contents of all other columns are preserved.
REPLACE costs more than INSERT ... ON DUPLICATE KEY UPDATE, so in principle INSERT ... ON DUPLICATE KEY UPDATE should be the one to choose.
Some tests are as follows
Affected rows: 2
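A test of this kind could look like the sketch below. It is not the original test: the table `demo` and its columns are hypothetical, chosen only to show the delete-then-insert behavior of REPLACE next to the in-place update of INSERT ... ON DUPLICATE KEY UPDATE.

```sql
-- Hypothetical demo table; id is the primary key that triggers the conflict.
CREATE TABLE demo (
  id   INT PRIMARY KEY,
  name VARCHAR(20),
  hits INT DEFAULT 0
);
INSERT INTO demo (id, name, hits) VALUES (1, 'a', 10);

-- REPLACE deletes the old row and inserts a new one.
-- name is not assigned, so it falls back to its default (NULL here).
-- MySQL reports 2 rows affected (1 deleted + 1 inserted).
REPLACE INTO demo (id, hits) VALUES (1, 11);
SELECT * FROM demo;

-- Restore the original row for the second test.
REPLACE INTO demo (id, name, hits) VALUES (1, 'a', 10);

-- ON DUPLICATE KEY UPDATE changes only the listed column; name keeps 'a'.
-- MySQL also reports 2 rows affected when an existing row is updated.
INSERT INTO demo (id, hits) VALUES (1, 11)
  ON DUPLICATE KEY UPDATE hits = VALUES(hits);
SELECT * FROM demo;
```

Both statements report the same affected-row count, so the count alone does not tell you which columns survived; compare the SELECT output instead.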
INSERT syntax
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    VALUES ({expr | DEFAULT},...),(...),...
    [ON DUPLICATE KEY UPDATE col_name=expr, ...]
Or:
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    SET col_name={expr | DEFAULT}, ...
    [ON DUPLICATE KEY UPDATE col_name=expr, ...]
Or:
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    SELECT ...
    [ON DUPLICATE KEY UPDATE col_name=expr, ...]
First, using DELAYED
Deferred (delayed) inserts
The DELAYED modifier applies to INSERT and REPLACE statements. When a delayed insert arrives, the server puts the row in a queue and immediately returns a status message to the client, so the client can carry on before the row is actually inserted into the table. If readers are reading from the table, the queued rows are held back until there are no readers left. The server then starts inserting the rows from the delayed-row queue. While inserting, the server also checks whether new read requests have arrived and are waiting. If so, processing of the delayed-row queue is suspended so that the readers can proceed. When there are no readers left, the server resumes inserting delayed rows. This continues until the queue is empty. Note that DELAYED is supported only by some storage engines such as MyISAM (not InnoDB), and it was deprecated in MySQL 5.6; later versions accept the keyword but ignore it.
A few things to note:
· INSERT DELAYED should be used only for INSERT statements that specify a value list. The server ignores DELAYED for INSERT DELAYED ... SELECT statements.
· The server ignores DELAYED for INSERT DELAYED ... ON DUPLICATE KEY UPDATE statements.
· Because the statement returns immediately, before the row is inserted, you cannot use LAST_INSERT_ID() to get the AUTO_INCREMENT value that the statement may generate.
· DELAYED rows are not visible to SELECT statements until they have actually been inserted.
· DELAYED is ignored on replication slave servers, because DELAYED could cause the slave to end up with different data from the master.
Note that rows currently in the queue are held only in memory until they are inserted into the table. This means that if you forcibly terminate mysqld (for example, with kill -9) or if mysqld stops unexpectedly, all queued rows that have not been written to disk are lost.
Second, using IGNORE
IGNORE is a MySQL extension to standard SQL. With ALTER TABLE, it controls what happens if the new table contains duplicates on unique keys, or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back when a duplicate-key error occurs. If IGNORE is specified, then for rows with duplicates on a unique key only the first row is used and the other conflicting rows are deleted. Incorrect values are adjusted to the closest acceptable value.
With INSERT IGNORE INTO tb (...) VALUES (...), there is no need to check first whether the row exists: if it exists, the insert is ignored; if not, the row is added.
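As a minimal sketch of the INSERT IGNORE behavior just described, the statements below reuse the table name `tb` from the text; its columns `id` and `val` are assumptions made for the illustration.

```sql
-- Hypothetical definition of tb; only the primary key matters here.
CREATE TABLE tb (
  id  INT PRIMARY KEY,
  val VARCHAR(20)
);

-- No row with id = 1 yet: the row is added (1 row affected).
INSERT IGNORE INTO tb (id, val) VALUES (1, 'first');

-- id = 1 already exists: the insert is skipped (0 rows affected)
-- and the duplicate-key error is downgraded to a warning.
INSERT IGNORE INTO tb (id, val) VALUES (1, 'second');
SHOW WARNINGS;

-- The original row is untouched: val is still 'first'.
SELECT * FROM tb;
```

Unlike ON DUPLICATE KEY UPDATE, nothing in the existing row changes, which is the difference the summary further down points out.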
Third, using ON DUPLICATE KEY UPDATE
If you specify ON DUPLICATE KEY UPDATE and inserting the row would cause a duplicate value in a unique index or primary key, an UPDATE of the old row is performed instead. For example, if column a is declared UNIQUE and already contains the value 1, the following two statements have the same effect:
mysql> INSERT INTO table (a,b,c) VALUES (1,2,3)
    -> ON DUPLICATE KEY UPDATE c=c+1;
mysql> UPDATE table SET c=c+1 WHERE a=1;
If the row is inserted as a new record, the affected-rows value is 1; if an existing record is updated, the affected-rows value is 2.
NOTE: If column b is also unique, the INSERT is instead equivalent to this UPDATE statement:
mysql> UPDATE table SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;
If a=1 OR b=2 matches several rows, only one row is updated. In general, you should try to avoid using the ON DUPLICATE KEY UPDATE clause on tables with multiple unique indexes.
In the UPDATE clause you can use the VALUES(col_name) function to refer to column values from the INSERT portion of the INSERT ... ON DUPLICATE KEY UPDATE statement. In other words, VALUES(col_name) in the UPDATE clause refers to the value of col_name that would have been inserted had no duplicate-key conflict occurred. This function is especially useful for multi-row inserts. The VALUES() function is meaningful only in INSERT ... UPDATE statements and returns NULL anywhere else.
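To see why multiple unique keys are risky, here is a small sketch. The table `t2` and its index names are hypothetical; the inserted row collides with two different existing rows at once.

```sql
-- Hypothetical table with two separate unique keys.
CREATE TABLE t2 (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL DEFAULT 0,
  UNIQUE KEY uk_a (a),
  UNIQUE KEY uk_b (b)
);
INSERT INTO t2 (a, b, c) VALUES (1, 10, 0), (5, 2, 0);

-- (1, 2, 3) conflicts with row (1, 10) on a AND with row (5, 2) on b.
-- Only one of the two existing rows gets c = c + 1; nothing is inserted.
INSERT INTO t2 (a, b, c) VALUES (1, 2, 3)
  ON DUPLICATE KEY UPDATE c = c + 1;

-- Inspect which row was changed.
SELECT * FROM t2;
```

Which row is picked is an implementation detail you should not rely on, which is why the advice above is to avoid this clause on such tables.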
Example:
mysql> INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
    -> ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
This statement works the same as the following two statements:
mysql> INSERT INTO table (a,b,c) VALUES (1,2,3)
    -> ON DUPLICATE KEY UPDATE c=3;
mysql> INSERT INTO table (a,b,c) VALUES (4,5,6)
    -> ON DUPLICATE KEY UPDATE c=9;
When you use ON DUPLICATE KEY UPDATE, the DELAYED option is ignored.
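On newer servers (MySQL 8.0.19 and later) the same multi-row statement can also be written with a row alias instead of VALUES(). The sketch below assumes a hypothetical table `t1` with a unique column `a`; the alias name `new` is arbitrary.

```sql
-- Hypothetical table; a is the unique column that triggers the update.
CREATE TABLE t1 (
  a INT NOT NULL UNIQUE,
  b INT,
  c INT
);

-- "AS new" names the row being inserted, so new.a and new.b
-- play the role of VALUES(a) and VALUES(b).
INSERT INTO t1 (a, b, c) VALUES (1, 2, 3), (4, 5, 6) AS new
  ON DUPLICATE KEY UPDATE c = new.a + new.b;
```

On older versions, stay with the VALUES(col_name) form used in the examples above.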
Summary: DELAYED is for fast inserts where you are not too concerned about failures; it improves insert performance.
IGNORE only cares whether a row with the same primary key already exists: if not, the row is added; if so, the insert is ignored.
ON DUPLICATE KEY UPDATE performs an update while adding and focuses on the non-primary-key columns; note the difference from IGNORE. If the row exists, the specified columns are updated; if not, the row is added.
INSERT INTO table VALUES () ON DUPLICATE KEY UPDATE field1=?, field2=? (assignments are separated by commas)
See http://www.itpub.net/forum.php?mod=viewthread&tid=1770206 for examples
In practice, the results I ended up with were as follows:
When the table holds very little data, both methods are fast, whether for a plain insert or for an update on conflict. But when the table is fairly large (millions of rows, say), the two no longer behave the same.
First, plain inserts. Both methods are somewhat slow here: inserting 1000 rows into a table with millions of rows (InnoDB engine) took 5 or 6 seconds with either one, sometimes more than 10. My host's performance is one reason, but the other is that when bulk-inserting into a large table the indexes have to be maintained on every insert. Indexes speed up queries, but when modifying a table, especially a large one, index maintenance becomes a cost that has to be considered.
Next, updates, where the update is keyed on the primary key value (I was fetching the data from another table and then inserting it, so the primary key could not change). Again with 1000 rows, REPLACE was far slower than INSERT ... ON DUPLICATE KEY UPDATE: the INSERT finished almost instantly (or so it felt), while REPLACE took 7 to 8 seconds. I know why REPLACE is slow: to update a row it has to delete the old one and insert the new one, and the indexes have to be maintained again in the process. But why was the update through INSERT ... ON DUPLICATE KEY UPDATE so fast? After asking my boss, I finally learned that INSERT ... ON DUPLICATE KEY UPDATE does update the data, but the primary key index entry does not change; that is, the update has no effect on the primary key index, so the index maintenance cost is lower (provided the updated columns do not include the primary key; that case needs separate discussion).
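The two variants compared above can be sketched as follows. The tables `target` and `source` and their columns are hypothetical stand-ins for the author's tables, and the sketch makes no claim about timings, which depend on data volume, indexes and hardware.

```sql
-- Hypothetical tables: rows are copied from source into target by primary key.
CREATE TABLE target (
  id      INT PRIMARY KEY,
  payload VARCHAR(100),
  KEY idx_payload (payload)
);
CREATE TABLE source LIKE target;

-- Variant 1: REPLACE deletes each conflicting row and inserts a new one,
-- so the primary key entry and every secondary index entry are rewritten.
REPLACE INTO target (id, payload)
  SELECT id, payload FROM source;

-- Variant 2: conflicting rows are updated in place. The primary key entry
-- stays put; only indexes on the changed columns need maintenance.
INSERT INTO target (id, payload)
  SELECT id, payload FROM source
  ON DUPLICATE KEY UPDATE payload = VALUES(payload);
```

To reproduce the comparison, run each variant against a copy of the same large table and compare the elapsed times reported by the client.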
MySQL: update if the row exists, insert if it does not