The previous article described the rough flow of a transaction commit with binlog disabled. Binlog was disabled there because, once it is enabled, the commit process becomes a two-phase commit. This two-phase commit does not involve a distributed transaction in the usual sense; MySQL calls it an internal XA transaction. Its counterpart is the external XA transaction.
A two-phase commit consists of a prepare phase and a commit phase.
Internal XA transactions exist inside MySQL mainly to keep the binlog and the redo log consistent with each other. This need follows from the architecture: the binlog lives in the MySQL server layer, while the redo log lives in the storage engine layer.
External XA transactions support distributed transactions across multiple instances, which is distributed transaction processing in the true sense.
Both kinds are XA transactions, so both involve a two-phase commit; an internal XA transaction therefore also commits in two phases.
The rest of this article walks through the two-phase commit of internal XA with reference to the source code, and explains how MySQL recovers after a mysqld crash at various points so that transactions stay consistent.
Test environment
OS: win7
Engine:
DB:
Configuration file Parameters:
log-bin=d:\mysql\log\5-6-21\mysql-bin
binlog_format=ROW
set autocommit=0;
innodb_support_xa=1
sync_binlog=1
innodb_flush_log_at_trx_commit=1
[innodb_flush_log_at_trx_commit=1 together with sync_binlog=1 is the "double 1" mode.
The modes differ in how often write() and fsync() are called when the log files are written. With a non-strict setting, a mysqld or OS crash may lose transaction updates.
Double 1 is the strictest mode: on a single machine, transaction updates are not lost after a mysqld or OS crash.]
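To make the difference between write() and fsync() concrete, here is a minimal Python sketch. It is not MySQL code: the helper `append_log` and its `durable` flag are hypothetical names, and they only loosely stand in for the strict and non-strict settings described above.

```python
import os
import tempfile


def append_log(path, record, durable):
    """Append one record; call fsync only when durable is True (hypothetical helper)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # write() hands the bytes to the OS page cache. They survive a crash
        # of this process, but not necessarily an OS crash or power loss.
        written = os.write(fd, record)
        if durable:
            # fsync() asks the OS to push the file's data to the storage device.
            os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "demo.log")
        append_log(log, b"trx 1;", durable=True)   # strict: write + fsync
        append_log(log, b"trx 2;", durable=False)  # relaxed: write only
        with open(log, "rb") as f:
            print(f.read())
```

Both calls leave the same bytes in the file while the machine keeps running. The difference only shows up after an OS crash or power loss, which is why the relaxed settings are faster but can lose the most recent transactions.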
Test conditions
Set autocommit=0;
-- Table structure for `user`
CREATE TABLE `user` (
  `id` int(...) NOT NULL,
  `account` varchar(...) NOT NULL,
  `name` varchar(...) ... NULL,
  PRIMARY KEY (`id`),
  KEY `id` (`id`) USING BTREE,
  KEY `name` (`name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Test statement
insert into user values (1, 'Sanzhang', 'Zhang San');
commit;
Prepare phase:
1. Set the undo state to TRX_UNDO_PREPARED; // done by trx_undo_set_state_at_prepare
2. Flush the redo log generated by the transaction's updates to disk. [The redo log generated by step 1 is flushed as well.]
MYSQL_BIN_LOG::prepare
ha_prepare_low
{
  engine:
    binlog_prepare
    innobase_xa_prepare
  mysql:
    trx_prepare_for_mysql
    {
      1. trx_undo_set_state_at_prepare  // mark the undo segment as TRX_UNDO_PREPARED
      2. set the transaction state to TRX_STATE_PREPARED
      3. trx_flush_log_if_needed  // flush the generated redo log to disk
    }
}
Commit phase:
1. Write the binlog file and flush it to disk;
2. Set the undo state to TRX_UNDO_TO_FREE or TRX_UNDO_TO_PURGE; // done by trx_undo_set_state_at_finish
3. Record the transaction's binlog offset in the system tablespace; // done by trx_sys_update_mysql_binlog_offset
MYSQL_BIN_LOG::commit
ordered_commit
{
  1. FLUSH_STAGE
     flush_cache_to_file  // flush the binlog
  2. SYNC_STAGE
     sync_binlog_file  // call fsync() to sync the file to disk
  3. COMMIT_STAGE
     ha_commit_low
     {
       binlog_commit
       innobase_commit
       trx_commit(trx)
       {
         trx_write_serialisation_history(trx, mtr);  // update the binlog position, set the undo state
         trx_commit_in_memory(trx, lsn);  // release locks, clean up the savepoint list, clean up the rollback segment
       }
     }
}
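The order of the steps is what matters: redo log first, then binlog, then the engine commit. The following toy model in Python illustrates that ordering and what is durable if the process stops partway. It is a sketch, not MySQL code; `run_transaction`, the step names, and the flag names are all made up for illustration.

```python
def run_transaction(crash_after=None):
    """Run the three steps in order and stop right after `crash_after`.

    Returns a dict describing what is durable on "disk" at that moment.
    crash_after=None means the transaction runs to completion.
    """
    disk = {
        "redo_prepared": False,     # prepare: undo marked prepared, redo flushed
        "binlog_synced": False,     # commit: binlog written and fsynced
        "engine_committed": False,  # commit: undo state updated in the engine
    }
    steps = [
        ("prepare", "redo_prepared"),
        ("binlog_sync", "binlog_synced"),
        ("engine_commit", "engine_committed"),
    ]
    for name, flag in steps:
        disk[flag] = True
        if name == crash_after:
            break  # simulated crash: later steps never happen
    return disk


if __name__ == "__main__":
    for point in ("prepare", "binlog_sync", None):
        print(point, run_transaction(point))
```

A crash after `prepare` leaves a prepared transaction with no binlog entry, while a crash after `binlog_sync` leaves a prepared transaction whose binlog entry is already on disk. Recovery has to tell these two states apart.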
Whether mysqld crashes or the OS crashes (for example, the machine loses power), MySQL can still guarantee the consistency of the database. How is this consistency achieved? Through the two-phase commit.
We analyze how the two-phase commit achieves this by looking at three scenarios:
1. Prepare phase: mysqld crashes before the redo log has been flushed to disk.
2. Prepare phase: mysqld crashes after the redo log has been flushed to disk but before the binlog has been flushed.
3. Commit phase: mysqld crashes after the binlog has been flushed to disk.
In the first case, the redo log never reached disk, so none of the transaction's updates were written to disk and the consistency of the database is not affected.
In the second case, the redo log has been written but the binlog has not, and the transaction is in the TRX_STATE_PREPARED state. Should it be committed or rolled back?
In the third case, both the redo log and the binlog have reached disk, but the undo state has not been updated. The redo log and the binlog agree with each other, so should the transaction be committed?
To answer these questions, we look at what mysqld does when it restarts after a crash, together with the key source code.
In the third case, the binlog events of the not-yet-committed transaction can be collected from the binlog, so the transaction must be committed.
In the second case, the binlog was never written, so the transaction must be rolled back to keep the database consistent.
How to determine whether a transaction is committed or rolled back after an abnormal restart
1. Read the binlog and collect the events of transactions that had not finished committing when the crash occurred; // info->commit_list contains these elements
2. If a transaction is found there, it is committed; otherwise it is rolled back.
The source code that decides whether a transaction is committed or rolled back is as follows:
So far we have covered the basic flow of the two-phase commit and how, after the server crashes, MySQL's restart recovery keeps the binlog and the data consistent.
In short, for an XA transaction interrupted by a crash: if its binlog has reached disk, the transaction is committed; if its binlog has not reached disk, the transaction is rolled back.
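The rule can be written down in a few lines. The Python sketch below is only an illustration of the decision, not the MySQL implementation: the function `recover` and its arguments are hypothetical, and it assumes each transaction is identified by an XID that can be looked up in the set collected from the binlog (the role played by info->commit_list).

```python
def recover(prepared_xids, binlog_xids):
    """Decide the fate of each prepared transaction after a crash.

    prepared_xids: XIDs the storage engine reports as still in prepared state.
    binlog_xids:   XIDs found while scanning the binlog.
    """
    decisions = {}
    for xid in prepared_xids:
        # Binlog reached disk -> commit; otherwise -> roll back.
        decisions[xid] = "commit" if xid in binlog_xids else "rollback"
    return decisions


if __name__ == "__main__":
    # XID 101 made it into the binlog before the crash, XID 102 did not.
    print(recover([101, 102], {101}))
```

The binlog acts as the single source of truth here. That fits with replication: a transaction that is in the binlog may already have been sent to a slave, so rolling it back on the master would make the two diverge.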
Rollback process after an abnormal restart
innobase_rollback_by_xid
rollback_by_xid
trx_rollback_resurrected
trx_rollback_active
row_undo
{
  // get the undo record from the rollback page
  // analyze the undo record type
  if (insert)
    row_undo_ins
  else
    row_undo_mod
}
Commit process after an abnormal restart
commit_by_xid
trx_commit_for_mysql
Binlog write interfaces
handler.cc: binlog_log_row
sql/binlog.cc: commit
mysys/my_sync: my_sync
sql/binlog.cc: sync_binlog_file
handler/ha_innodb.cc: innobase_xa_prepare
The binlog is a log file that MySQL introduced to support master-slave replication. It contains the set of events that cause data changes.
The slave requests the binlog from the master and replays the log events to rebuild the data on the slave, so the binlog is the slave's source of data.
As a result, the master only needs to keep its binlog consistent with its local data to ensure that master and slave data are consistent (ignoring inconsistencies caused by network transmission).
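As a rough illustration of "the slave's data comes from the binlog", the Python sketch below replays a list of row events onto an empty table. The event format (operation, primary key, row) and the function `apply_events` are invented for this example and do not reflect the real binlog format.

```python
def apply_events(table, events):
    """Replay row events onto a dict keyed by primary key (hypothetical format)."""
    for op, key, row in events:
        if op in ("insert", "update"):
            table[key] = row
        elif op == "delete":
            table.pop(key, None)
        else:
            raise ValueError(f"unknown event type: {op}")
    return table


if __name__ == "__main__":
    # Events the master recorded, in commit order.
    binlog = [
        ("insert", 1, ("Sanzhang", "Zhang San")),
        ("insert", 2, ("Sili", "Li Si")),
        ("delete", 2, None),
    ]
    slave = apply_events({}, binlog)
    print(slave)
```

If the master's own data and its binlog ever disagreed, the slave would faithfully reproduce the binlog and diverge from the master, which is why the two-phase commit ties them together.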
References
1. "High Performance MySQL"
2. MySQL transaction commit process
MySQL Transaction commit process (ii)