Back-end distributed series: distributed storage-two-way synchronous replication of MySQL Databases
The last article on MySQL replication focuses on the design points and constraints of the two-way synchronous replication architecture. The dual-master, dual-write, and bidirectional synchronization scenarios of databases are restricted, mainly considering data integrity, consistency, and conflict avoidance. For the same database and table, the change of the same field in the same record will lead to data consistency judgment conflicts and avoid it through business scenario design as much as possible. Primary Key conflicts may occur due to dual-master/Dual-write and synchronous replication. Avoid using the database auto-incrementing primary key scheme. In addition, bidirectional synchronization may cause loop synchronization problems and requires loop control. As shown in, the replication program also generates binlogs when writing data. It is critical to avoid loop replication to identify and filter the binlogs generated by the replication program. The native Dual Master solution MySQL itself supports Dual-Master configuration, but it does not solve the potential data consistency conflicts caused by primary keys and Dual writes. MySQL records the current MySQL server-id in binlog for potential cyclic replication problems of Bidirectional synchronization. Once the value of server-id is available, MySQL can easily determine from which Server a change was originally generated, so it is easy to avoid loop replication. In addition, you can also configure the binlog option (-- log-slave-update) for logging slave without enabling it. MySQL will not record the changes in the replication process to the binlog, you don't have to worry about the possibility of loop replication. From the MySQL solution, you can find the starting point. If you can mark the binlog, you can determine which binlogs are generated by replication and filter them out. The MySQL solution is too coupled with the MySQL configuration. It is easy to cause problems in the large-scale deployment of online production systems due to incorrect MySQL configuration. The custom markup SQL solution allows you to consider a common markup SQL solution for decoupling from the MySQL configuration. To put it simply, a special mark SQL statement is inserted when the database is synchronized for replication to mark the change from the replication program. The Mark SQL will enter the binlog. When the replication program reads data, it filters and judges by identifying the Mark SQL. Binlog stores SQL statements that affect data changes. These SQL statements constitute a transaction, as shown in figure: the green zone is a normal transaction generated by business operations, the red area is the transaction written by the replication program, where the blue block marks the SQL. Mark SQL: Mark SQL to update a predefined tag table that is different from the business table after the transaction starts and before the transaction ends. Each time the replication program reads binlog content in batches, the following five conditions may exist, as shown in: the batch read range falls into the Green Zone. The start point of the batch read range falls in the green area, and the end point falls in the red area. The start point of the batch read range falls in the red area and the end point falls in the green area. The start and end points of the batch read range are in the green area, but the red area is covered in the middle. The batch read range falls in the red area. In the preceding 5th cases, a transaction is split into three segments for synchronization. Because there is no transaction header or tail mark in the middle segment, the replica program cannot judge it during reading, resulting in cyclic synchronization, which needs to be avoided. By setting the replication program's batch read range to at least or equal to the write transaction length range, 5th cases are avoided. When the replication program reads binlog events in batches, it filters by marking SQL to avoid loop replication and implement loop control. This article summarizes some design points and restrictions for Bidirectional synchronous replication in the MySQL dual-master write scenario. Taking native implementation as a reference, the design points of a custom implementation method are presented. Changes in the same database, table, and record of the same field at the same time will inevitably lead to a data consistency conflict. At the replication and synchronization level, the updating of the edge cannot be distinguished. It is usually considered to restore to the consistent state with the last timestamp, but the timestamp actually produces errors. If this type of scenario is rare, it is best to avoid it in the business scenario design as much as possible.