Back-end Distributed Series: Distributed Storage - MySQL Database Transactions and Replication

I haven't written any technical articles for a while, because I have been thinking about how to structure the "backend distributed" series, looking for answers based on the problems I encountered in my work.

Recently, I have basically worked it out: "backend distributed" covers "distributed storage" and "distributed computing".

The series explores each technology by starting from problems encountered in real work and looking for the answers; most of the time we are not creating new technologies, but applying existing ones.

To apply them efficiently and effectively, we need to understand how these technologies work and the principles behind them.

The principles are analyzed from the user's perspective, and open-source products and frameworks are used as reference implementations of each class of technology.

The main purpose is to clarify the principles; specific implementation details are only touched on when there is something special about them.

Transactions and replication

Recently, I participated in a project involving data distribution across MySQL databases.

In short, we need to support writes at multiple points across remote data centers while ensuring that the distributed data eventually reaches consistency.

In the past, data distribution for MySQL meant only read/write splitting: writes go to the master database, and the read slaves are kept up to date through the database's own master-slave replication.

Now we need dual-write master databases that converge to a consistent state after a short delay. The problem looks complicated at first glance, but at its core it is still about eventual consistency of data.

First, let's go back to the simplest case: how do we ensure data consistency when there is only one MySQL database?

Anyone who has worked with databases knows that this is guaranteed by the database's transaction support. A transaction has the following four properties:

Atomicity

Consistency

Isolation

Durability

The four ACID properties are not the focus of this article, so we will not go into them in depth here; if you are not familiar with ACID, refer to the relevant articles in the references.
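As a quick refresher, here is a minimal sketch of atomicity in practice; the accounts table and the amounts are made up for illustration. Either both updates take effect, or neither does.

    START TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    COMMIT;
    -- If ROLLBACK were issued instead of COMMIT, or the server crashed
    -- before COMMIT, neither UPDATE would be visible afterwards.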

Here I just want to raise a question: a transaction can ensure data consistency within a single database, but how do we ensure consistency between the master and slave databases when MySQL is deployed in a master-slave architecture?

To provide master-slave replication, MySQL introduces an additional log file, the binlog, which records the stream of events that change data.
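The binlog has to be enabled in the server configuration (the log_bin and server_id settings); assuming it is, the standard statements below let you inspect it. The file name in the last statement is only an illustrative example.

    SHOW VARIABLES LIKE 'log_bin';        -- ON when the binlog is enabled
    SHOW VARIABLES LIKE 'binlog_format';  -- STATEMENT, ROW, or MIXED
    SHOW BINARY LOGS;                     -- binlog files currently on the master
    SHOW BINLOG EVENTS IN 'mysql-bin.000001' LIMIT 10;  -- a peek at the event stream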

The slave requests the binlog from the master and replays the log events to write the data locally; the binlog is therefore the slave's data source.
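As a rough sketch of that flow, this is how a slave is classically pointed at the master's binlog. The host, account, and binlog coordinates below are placeholders (the real coordinates come from SHOW MASTER STATUS on the master), and newer MySQL versions offer the equivalent CHANGE REPLICATION SOURCE TO syntax.

    CHANGE MASTER TO
      MASTER_HOST = '192.0.2.10',
      MASTER_USER = 'repl',
      MASTER_PASSWORD = 'repl_password',
      MASTER_LOG_FILE = 'mysql-bin.000001',
      MASTER_LOG_POS = 4;
    START SLAVE;
    SHOW SLAVE STATUS\G  -- Slave_IO_Running and Slave_SQL_Running should both be Yes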

So, as long as the binlog on the master is kept consistent with the master's local data, the master and slave data can be kept consistent (apart from temporary lag caused by network transmission).

We know that local data consistency is guaranteed by the database's transaction support. So how are transactions themselves implemented?

The MySQL server itself does not implement transactions; it exposes a storage engine interface, and transactions are implemented by the specific storage engine. In practice, the storage engine that supports transactions in MySQL is InnoDB.
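A quick way to confirm this on a running server; the table definition is only an example.

    SHOW ENGINES;  -- InnoDB is listed with Transactions = YES; MyISAM with NO
    CREATE TABLE accounts (
      id      INT PRIMARY KEY,
      balance DECIMAL(10, 2) NOT NULL
    ) ENGINE = InnoDB;  -- InnoDB has been the default engine since MySQL 5.5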

The storage engine generally implements transactions using a redo log and an undo log.

In short, the redo log records the data as modified by the transaction, while the undo log records the original data as it was before the transaction.
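The relevant InnoDB settings can be inspected as follows; in particular, innodb_flush_log_at_trx_commit controls whether the redo log is flushed to disk at every commit.

    SHOW VARIABLES LIKE 'innodb_log_file_size';            -- size of each redo log file
    SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';  -- 1 = flush and sync the redo log on every commit
    SHOW VARIABLES LIKE 'innodb_undo%';                    -- undo log / undo tablespace settings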

Therefore, when a transaction is executed, the actual process is described as follows:

1. Write the undo/redo log and make sure the log is flushed to disk for persistent storage.

2. Update the data records; this happens in the buffer cache and is flushed to disk asynchronously.

3. Commit the transaction by writing a commit record into the redo log.

If MySQL is interrupted by a failure while executing transactions, the redo log can be used to redo committed transactions and the undo log to roll back uncommitted ones, keeping the data consistent.

All of this is handled by the transactional storage engine. The binlog, however, is not written by the storage engine but by the MySQL server layer.

Therefore, consistency between the binlog and the redo log must also be ensured. Once the binlog is enabled, transaction execution gains one more step:

1. Write the undo/redo log and make sure the log is flushed to disk for persistent storage.

2. Update the data records; this happens in the buffer cache and is flushed to disk asynchronously.

3. Persist the transaction's events to the binlog.

4. Commit the transaction by writing a commit record into the redo log.

With this ordering, if the binlog write does not succeed, the whole transaction must be rolled back; once the binlog has been written successfully, the transaction can be recovered and committed even if MySQL crashes immediately afterwards.

To achieve this, the binlog write has to be coordinated with the transaction; only by keeping the binlog consistent with the transaction's data can master-slave consistency be guaranteed.

So the binlog write is embedded into the storage engine's transaction execution, and the two are coordinated by a two-phase commit in the form of an internal distributed transaction (an XA transaction).
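In operational terms, the setting pair below is what is usually recommended so that neither side of this internal two-phase commit loses a committed transaction on a crash; both are standard MySQL/InnoDB variables, shown here being set at runtime purely for illustration.

    SET GLOBAL sync_binlog = 1;                     -- fsync the binlog at every transaction commit
    SET GLOBAL innodb_flush_log_at_trx_commit = 1;  -- fsync the InnoDB redo log at every commit
    -- With both set to 1, a committed transaction survives a crash on both
    -- the binlog side and the storage engine side of the internal XA commit.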

For more information, see references.

Summary

We started from a question and then reasoned about it from the perspective of data consistency, using MySQL's implementation as a reference.

We clarified how a standalone MySQL instance ensures the data consistency that the replication mechanism depends on, namely the consistency between the binlog and the transaction data.

On top of the binlog mechanism, replication can then be implemented and master-slave consistency ensured.

Master-slave replication introduces the network as a factor, which further complicates keeping master and slave data consistent; that will be analyzed in a later article.
