Prior to the wormhole (we developed a large volume data transmission tool, similar to Apache Sqoop, Taobao datax) MySQL writer supports insert into and replace into two ways into the MySQL table.
Now there is a need to support the update part of the field operations, such as a calculated pile of merchant ID (shopid) and corresponding PageView total (viewtotal), need to update to a multiple-field shop table corresponding field (Shopid is the primary key), if still with the insert into throws duplicated key exception, and if you use replace into, the corresponding field is updated successfully, but the other fields are automatically set to null or default values.
This is because replace into once found that the same primary key or unique key needs to be updated, delete the previous record first and reinsert one (equivalent to two operations). This results in the previous record in the other do not need to update the field data is also gone, and this situation, if this table has a auto-increment key, delete the middle of a record, will be inserted at the end of a row, the self added key is the last value, not the previous, and delete, the index will be invalidated, and then insert to rebuild the index, the efficiency is relatively low, this is not the effect we expected.
Later, with the DBA, you can use the "INSERT on DUPLICATE KEY UPDATE" syntax, and MySQL version 4.1 introduces this new feature.
When we insert a field that has a primary key or a unique key conflict, it will only do the specified update operation, and the rest of the insert action.
The syntax is as follows:
INSERT into
shops
SET
shopid = 10000,
viewtotal = 555555 on
DUPLICATE KEY UPDATE
viewtotal = 555555
Insert more than one record at a time:
Insert into shops (shopid, viewtotal)
values (1,123456),
(2, 234567),
(3, 345678)
on Duplicate key update
shopid = VALUES (shopid),
viewtotal = values (viewtotal)
So I add "INSERT on DUPLICATE KEY Update" operation in wormhole, through the SQL template stitching out update multiple records of the statement (such as update a batch size 1000), submit statements, batch update.
This article link http://blog.csdn.net/lalaguozhe/article/details/9150049, reprint please specify