Schemes for syncing MySQL data to Elasticsearch

Source: Internet
Author: User
Tags: mysql client

MySQL Binlog

To sync MySQL data to ES through the MySQL binlog, the binlog must use the row format. With the statement format, and with the mixed format for most statements, the binlog records only the SQL statement that was executed and gives no indication of which data the statement changed. The actual data can therefore only be recovered from a row-format binlog.
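To make the difference concrete, here is a minimal sketch that models decoded binlog events as plain Python dictionaries and maps them to index or delete operations for ES. The event shape, the `row_event_to_actions` name, and the `id` primary key are all assumptions made for illustration; they are not the wire format or the API of any library.

```python
# Hypothetical, simplified event shapes for illustration; not a wire format.

def row_event_to_actions(event, pk="id"):
    """Map one decoded binlog event to (operation, document id, source) tuples."""
    if event["type"] == "statement":
        # Only the SQL text is known, so the changed rows cannot be recovered.
        raise ValueError("no row data in statement event: " + event["query"])
    actions = []
    for row in event["rows"]:
        if event["type"] == "delete":
            actions.append(("delete", row["before"][pk], None))
            continue
        if event["type"] == "update" and row["before"][pk] != row["after"][pk]:
            # The primary key changed, so the old document has to be removed.
            actions.append(("delete", row["before"][pk], None))
        # Inserts and updates both index the after image of the row.
        actions.append(("index", row["after"][pk], row["after"]))
    return actions


update = {
    "type": "update",
    "rows": [{"before": {"id": 1, "name": "a"}, "after": {"id": 1, "name": "b"}}],
}
print(row_event_to_actions(update))
```

A statement event such as `{"type": "statement", "query": "UPDATE t SET name='b'"}` raises `ValueError` here, which mirrors the point above: without row images there is nothing to index.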

The row format also supports three row image modes, set through binlog_row_image: full, noblob, and minimal. The latter two exist mainly to reduce space consumption; the default is full. I personally prefer full, because it provides the most complete data, and the extra space is not much of a problem on current disks, especially since binlogs are purged regularly anyway.
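The choice of image mode affects what the sync side can do with an update. The sketch below assumes a decoded update arrives as `before` and `after` dictionaries (a hypothetical shape) and that the table has a primary key named `id` that does not change: with a full image the whole document can be re-indexed, while a partial image only supports a partial update of the existing document.

```python
def update_action(before, after, all_columns, pk="id"):
    """Choose between a full re-index and a partial update for one changed row."""
    # With minimal images the key is only guaranteed to be in the before image.
    doc_id = after.get(pk, before[pk])
    if set(all_columns) <= set(after):
        # Every column is present (full image): replace the whole document.
        return ("index", doc_id, dict(after))
    # Only some columns are present: send them as a partial document update.
    return ("update", doc_id, {"doc": dict(after)})


columns = ["id", "name", "bio"]
full = update_action({"id": 1, "name": "a", "bio": "x"},
                     {"id": 1, "name": "b", "bio": "x"}, columns)
minimal = update_action({"id": 1}, {"name": "b"}, columns)
print(full)
print(minimal)
```

This is one reason to prefer full: every update is self-contained, so a document that is missing from the index can still be rebuilt from the event alone.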

Syncing from the MySQL binlog is straightforward: following the MySQL replication protocol, write a client that poses as a MySQL slave and registers with the MySQL master. The master then sends data updates to the slave in real time as binlog events, and parsing those events ourselves yields the actual data.
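As a small taste of the parsing step, the sketch below decodes the fixed 19-byte header that starts every version 4 binlog event. It assumes the caller already holds the raw event bytes, as they appear in a binlog file; connecting, registering as a slave, and decoding the event body are left out. The `EventHeader` and `parse_event_header` names are illustrative only.

```python
import struct
from collections import namedtuple

EventHeader = namedtuple(
    "EventHeader", "timestamp event_type server_id event_size log_pos flags"
)

HEADER_LEN = 19  # 4 + 1 + 4 + 4 + 4 + 2 bytes, all little-endian

# Type codes of the version 2 row events; other event types are omitted here.
ROW_EVENT_TYPES = {30: "write_rows", 31: "update_rows", 32: "delete_rows"}


def parse_event_header(data):
    """Decode the common header at the start of a raw binlog event."""
    if len(data) < HEADER_LEN:
        raise ValueError("a binlog event header needs at least 19 bytes")
    return EventHeader(*struct.unpack("<IBIIIH", data[:HEADER_LEN]))


# A hand-built header followed by a dummy body, for demonstration only.
raw = struct.pack("<IBIIIH", 1700000000, 31, 1, 60, 4242, 0) + b"\x00" * 41
header = parse_event_header(raw)
print(header, ROW_EVENT_TYPES.get(header.event_type, "other"))
```

Here `event_size` counts the header plus the body, and `log_pos` is the position of the next event, which is the value a client would store as its resume point.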

The implementation details are not covered here; see the MySQL client/server protocol documentation for a detailed understanding of the MySQL protocol, binlog events, and related topics. The replication functionality itself is implemented in the go-mysql project.

MySQL Dump

For a new MySQL instance, syncing through the binlog is of course convenient. But syncing a MySQL instance that has been running for a while can be a problem: earlier binlog files have already been deleted, so starting directly from the binlog may miss data that was written earlier.

The solution is also fairly easy and follows the usual MySQL backup approach: first use mysqldump to take a snapshot of the entire current MySQL instance, then parse the generated dump file directly to obtain all the current data. After that, start syncing from the binlog position that corresponds to this snapshot.
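One way to find the binlog position that matches the snapshot is to read it from the dump itself. The sketch below assumes the dump was taken with mysqldump's `--master-data` option, which writes a `CHANGE MASTER TO` statement containing the binlog file name and position (commented out when the option is set to 2). Newer mysqldump versions may word this statement differently, and the `find_binlog_position` helper is hypothetical.

```python
import re

# Matches the statement written by mysqldump --master-data, commented or not.
_POSITION_RE = re.compile(
    r"CHANGE MASTER TO MASTER_LOG_FILE='([^']+)',\s*MASTER_LOG_POS=(\d+)"
)


def find_binlog_position(lines):
    """Return (binlog file, position) from the first matching dump line."""
    for line in lines:
        match = _POSITION_RE.search(line)
        if match:
            return match.group(1), int(match.group(2))
    raise ValueError("no binlog position found in dump")


sample_dump = [
    "-- MySQL dump",
    "-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=154;",
    "CREATE TABLE `t` (`id` int NOT NULL);",
]
print(find_binlog_position(sample_dump))
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, so a large dump does not have to be loaded into memory first.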

This entire process is also implemented in canal, which is part of go-mysql.

Alternatively, you can skip the binlog and rely on an mtime timestamp column plus an auxiliary flag column that marks whether a row has been deleted. The indexing system periodically selects rows by mtime, and the flag tells it whether each row has been deleted. There are two ways to handle deleted rows:

1. When the flag marks a row as deleted, delete the corresponding document from Elasticsearch directly at indexing time.
2. Ignore the flag at indexing time and index every row into Elasticsearch, then filter out the documents flagged as deleted at search time.
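The mtime-based approach can be sketched as a single planning step. The code below assumes each row is a dictionary with `id`, `mtime`, and `flag` keys (`id` is an assumed primary key name, and a truthy `flag` means deleted), and that the caller stores the returned checkpoint between runs. The `hard_delete` switch selects between the two ways of handling deleted rows.

```python
def plan_sync(rows, last_mtime, hard_delete=True):
    """Plan operations for rows modified after last_mtime.

    Returns the list of (operation, document id, source) tuples and the
    newest mtime seen, which becomes the checkpoint for the next run.
    """
    actions = []
    newest = last_mtime
    for row in rows:
        if row["mtime"] <= last_mtime:
            continue  # already synced in an earlier run
        newest = max(newest, row["mtime"])
        if row["flag"] and hard_delete:
            # Remove the document from the index.
            actions.append(("delete", row["id"], None))
        else:
            # Index the row as is; searches must filter on flag
            # when hard_delete is off.
            actions.append(("index", row["id"], row))
    return actions, newest


rows = [
    {"id": 1, "mtime": 100, "flag": 0, "title": "old"},
    {"id": 2, "mtime": 205, "flag": 0, "title": "new"},
    {"id": 3, "mtime": 210, "flag": 1, "title": "gone"},
]
print(plan_sync(rows, last_mtime=200))
print(plan_sync(rows, last_mtime=200, hard_delete=False))
```

One caveat of the strict `mtime > last_mtime` comparison: a row written later with the same mtime as the checkpoint is skipped, so a real poller may want to overlap its time windows slightly and rely on indexing being idempotent.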
