The implementation principle of mysqldump


We can understand how mysqldump works by enabling the general log and examining the statements it executes during a full backup.

Open General Log

mysql> set global general_log=on;

The storage path of the general log can be viewed with the following command:

mysql> show variables like '%general_log_file%';
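For example (the file name below is illustrative; it depends on your host name and data directory):

mysql> show variables like '%general_log_file%';
+------------------+---------------------------+
| Variable_name    | Value                     |
+------------------+---------------------------+
| general_log_file | /var/lib/mysql/mysql2.log |
+------------------+---------------------------+
1 row in set (0.00 sec)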

Perform a full backup:

# mysqldump --master-data=2 -R --single-transaction -A -phello > 3306_20160518.sql

Where:

--master-data=2 generates the CHANGE MASTER statement in the backup file as a comment. In this case, it is:

-- CHANGE MASTER TO MASTER_LOG_FILE='mysql2-bin.000049', MASTER_LOG_POS=587;

If the value is set to 1, the CHANGE MASTER statement is generated as an executable statement rather than a comment.
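For comparison, a sketch of what the dump header contains in each case (positions taken from the example above):

-- With --master-data=2 (commented out, for reference only):
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql2-bin.000049', MASTER_LOG_POS=587;

-- With --master-data=1 (executable, applied when the dump is replayed on a slave):
CHANGE MASTER TO MASTER_LOG_FILE='mysql2-bin.000049', MASTER_LOG_POS=587;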

-R backs up stored procedures and functions.

--single-transaction obtains a consistent backup of InnoDB tables.

-A is equivalent to --all-databases.

Let's look at the contents of the general log:

160518 11:00:59    Connect   root@localhost on
                   Query     /*!40100 SET @@SQL_MODE='' */
                   Query     /*!40103 SET TIME_ZONE='+00:00' */
                   Query     FLUSH /*!40101 LOCAL */ TABLES
                   Query     FLUSH TABLES WITH READ LOCK
                   Query     SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
                   Query     START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */
                   Query     SHOW VARIABLES LIKE 'gtid\_mode'
                   Query     SHOW MASTER STATUS
                   Query     UNLOCK TABLES
                   Query     SELECT LOGFILE_GROUP_NAME, FILE_NAME, TOTAL_EXTENTS, INITIAL_SIZE, ENGINE, EXTRA FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'UNDO LOG' AND FILE_NAME IS NOT NULL GROUP BY LOGFILE_GROUP_NAME, FILE_NAME, ENGINE ORDER BY LOGFILE_GROUP_NAME
                   Query     SELECT DISTINCT TABLESPACE_NAME, FILE_NAME, LOGFILE_GROUP_NAME, EXTENT_SIZE, INITIAL_SIZE, ENGINE FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'DATAFILE' ORDER BY TABLESPACE_NAME, LOGFILE_GROUP_NAME
                   Query     SHOW DATABASES
                   Query     SHOW VARIABLES LIKE 'ndbinfo\_version'

Among them, the following points are the most important:

1. FLUSH /*!40101 LOCAL */ TABLES

Closes all open tables, forces all tables in use to be closed, and flushes the query cache.

2. FLUSH TABLES WITH READ LOCK

This performs the flush tables operation again and acquires a global read lock. Many readers may wonder why the two commands seem redundant: why not acquire the lock when the first FLUSH TABLES is executed?

Here is the explanation from the mysqldump source code:

/*
  We do first a FLUSH TABLES. If a long update is running, the FLUSH TABLES
  will wait but will not stall the whole mysqld, and when the long update is
  done the FLUSH TABLES WITH READ LOCK will start and succeed quickly. So,
  FLUSH TABLES is to lower the probability of a stage where both mysqldump
  and most client connections are stalled. Of course, if a second long
  update starts between the two flushes, we have that bad stall.
*/

In short, the preliminary FLUSH TABLES lowers the probability that a lengthy statement keeps FLUSH TABLES WITH READ LOCK waiting for its lock while the pending lock request, in turn, blocks all other client operations.
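A minimal sketch of the bad case the comment describes, assuming a large InnoDB table t so that the UPDATE runs for a long time:

session 1> UPDATE t SET c = c + 1;            -- long statement, still running
session 2> FLUSH TABLES WITH READ LOCK;       -- waits for session 1 to finish
session 3> SELECT * FROM t LIMIT 1;           -- stalls behind the pending flush

The preliminary FLUSH TABLES waits for session 1 in the same way, but while it waits it does not stall session 3; once it completes, the subsequent FLUSH TABLES WITH READ LOCK has very little left to flush and acquires the lock quickly.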

3. SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ

This sets the transaction isolation level of the current session to REPEATABLE READ (RR), which avoids non-repeatable reads and phantom reads.
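The effect can be verified in the session (on the 5.6 series used here the variable is tx_isolation):

mysql> SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT @@tx_isolation;
+-----------------+
| @@tx_isolation  |
+-----------------+
| REPEATABLE-READ |
+-----------------+
1 row in set (0.00 sec)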

4. START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */

This obtains a snapshot of the current database; it is produced by the --single-transaction option of mysqldump.

It only applies to tables using a transactional storage engine, which in MySQL means InnoDB.

Note: START TRANSACTION and START TRANSACTION WITH CONSISTENT SNAPSHOT are not the same. The latter opens the transaction and establishes a consistent snapshot at that very moment.

Let's look at the official documentation's description of it.

How to understand it?

In a nutshell, it opens the transaction and, in effect, performs an immediate read of every table: a SELECT * FROM table executed at any later point in the transaction returns the data as it was at the moment START TRANSACTION WITH CONSISTENT SNAPSHOT was executed.

Note that WITH CONSISTENT SNAPSHOT is only effective at the RR isolation level.

The following example shows the difference between START TRANSACTION WITH CONSISTENT SNAPSHOT and plain START TRANSACTION.

Note: session 2 has autocommit enabled.
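A sketch of the two cases, using a hypothetical InnoDB table t; each statement of session 2 commits immediately because of autocommit:

-- Case 1: START TRANSACTION WITH CONSISTENT SNAPSHOT
session 1> START TRANSACTION WITH CONSISTENT SNAPSHOT;  -- snapshot taken here
session 2> INSERT INTO t VALUES (2);                    -- committed by autocommit
session 1> SELECT * FROM t;                             -- row 2 is NOT visible

-- Case 2: plain START TRANSACTION
session 1> START TRANSACTION;                           -- no snapshot yet
session 2> INSERT INTO t VALUES (2);                    -- committed by autocommit
session 1> SELECT * FROM t;                             -- the first read takes the
                                                        -- snapshot, so row 2 IS visible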

In the first case, session 1 opens the transaction with START TRANSACTION WITH CONSISTENT SNAPSHOT; in the second, with plain START TRANSACTION.

As can be seen, with plain START TRANSACTION, a row inserted and committed by session 2 is visible to session 1 (provided session 2's insert happens before session 1's first SELECT).

With START TRANSACTION WITH CONSISTENT SNAPSHOT, session 2's insert is not visible to session 1, even when it commits before session 1's first SELECT.

5. SHOW MASTER STATUS

This is produced by --master-data. It records the binlog status at the start of the backup, namely MASTER_LOG_FILE and MASTER_LOG_POS.
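Its output looks like the following (values match the CHANGE MASTER comment shown earlier; the Executed_Gtid_Set column of 5.6 is omitted here):

mysql> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| mysql2-bin.000049 |      587 |              |                  |
+-------------------+----------+--------------+------------------+
1 row in set (0.00 sec)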

6. UNLOCK TABLES

Releases the global read lock.

Because there are only the following four databases in my instance:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.03 sec)

Looking at the backup, you can find that only mysql and test are backed up; information_schema and performance_schema are not.

Here is the log output for backing up mysql and test.

Because there is too much log output, only the log for the test database is shown here. The test database contains two tables, test and test1.

                   Init DB   test
                   Query     SHOW CREATE DATABASE IF NOT EXISTS `test`
                   Query     SAVEPOINT sp
                   Query     show tables
                   Query     show table status like 'test'
                   Query     SET SQL_QUOTE_SHOW_CREATE=1
                   Query     SET SESSION character_set_results = 'binary'
                   Query     show create table `test`
                   Query     SET SESSION character_set_results = 'utf8'
                   Query     show fields from `test`
                   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`
                   Query     SET SESSION character_set_results = 'binary'
                   Query     use `test`
                   Query     select @@collation_database
                   Query     SHOW TRIGGERS LIKE 'test'
                   Query     SET SESSION character_set_results = 'utf8'
                   Query     ROLLBACK TO SAVEPOINT sp
                   Query     show table status like 'test1'
                   Query     SET SQL_QUOTE_SHOW_CREATE=1
                   Query     SET SESSION character_set_results = 'binary'
                   Query     show create table `test1`
                   Query     SET SESSION character_set_results = 'utf8'
                   Query     show fields from `test1`
                   Query     SELECT /*!40001 SQL_NO_CACHE */ * FROM `test1`
                   Query     SET SESSION character_set_results = 'binary'
                   Query     use `test`
                   Query     select @@collation_database
                   Query     SHOW TRIGGERS LIKE 'test1'
                   Query     SET SESSION character_set_results = 'utf8'
                   Query     ROLLBACK TO SAVEPOINT sp
                   Query     RELEASE SAVEPOINT sp
                   Query     use `test`
                   Query     select @@collation_database
                   Query     SET SESSION character_set_results = 'binary'
                   Query     SHOW FUNCTION STATUS WHERE Db = 'test'
                   Query     SHOW CREATE FUNCTION `mycat_seq_currval`
                   Query     SHOW PROCEDURE STATUS WHERE Db = 'test'
                   Query     SET SESSION character_set_results = 'utf8'
                   Quit

From the above output you can see:

1. The core of the backup is the SELECT /*!40001 SQL_NO_CACHE */ * FROM `test1` statement.

This statement queries all the data in table test1; the corresponding INSERT statements are generated from it in the backup file.

The effect of SQL_NO_CACHE is that the query result is not cached in the query cache.
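In the dump file this becomes a block like the following (the row data is purely illustrative):

--
-- Dumping data for table `test1`
--

LOCK TABLES `test1` WRITE;
/*!40000 ALTER TABLE `test1` DISABLE KEYS */;
INSERT INTO `test1` VALUES (1,'a'),(2,'b');
/*!40000 ALTER TABLE `test1` ENABLE KEYS */;
UNLOCK TABLES;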

2. SHOW CREATE DATABASE IF NOT EXISTS `test`, SHOW CREATE TABLE `test1`

These generate the database creation statement and the table creation statement.

3. SHOW TRIGGERS LIKE 'test1'

As you can see, even without any extra option, triggers are backed up by default.

4. SHOW FUNCTION STATUS WHERE Db = 'test'

SHOW CREATE FUNCTION `mycat_seq_currval`

SHOW PROCEDURE STATUS WHERE Db = 'test'

These are used to back up stored procedures and functions (produced by the -R option).

5. A savepoint is set, and after each table has been dumped, the dump rolls back to that savepoint.

Why is this done?

The transaction opened earlier by START TRANSACTION WITH CONSISTENT SNAPSHOT can only be ended by COMMIT or ROLLBACK; ROLLBACK TO SAVEPOINT sp does not end it.

What this actually achieves is that DDL operations against tables that have already been dumped are not blocked for the rest of the backup.

/*
  ROLLBACK TO SAVEPOINT in --single-transaction mode to release metadata
  lock on table which was already dumped. This allows to avoid blocking
  concurrent DDL on this table without sacrificing correctness, as we
  won't access table second time and dumps created by --single-transaction
  mode have validity point at the start of transaction anyway.
  Note that this doesn't make --single-transaction mode with concurrent
  DDL safe in general case. It just improves situation for people for whom
  it might be working.
*/

Here are some concrete tests.

First case:

Session 1 starts a transaction and queries the test table; session 2 then issues an ADD COLUMN, which hangs.
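A sketch of this case (the column name c2 is made up for the test):

session 1> START TRANSACTION WITH CONSISTENT SNAPSHOT;
session 1> SELECT * FROM test;                  -- acquires a metadata lock on test
session 2> ALTER TABLE test ADD COLUMN c2 INT;  -- hangs: Waiting for table metadata lock

The SELECT inside the open transaction holds its metadata lock until the transaction ends, so the DDL cannot proceed.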

Second case:

Session 1 starts a transaction without touching the table; session 2 then issues an ADD COLUMN, which succeeds.
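A sketch of the second case:

session 1> START TRANSACTION WITH CONSISTENT SNAPSHOT;  -- no table touched yet
session 2> ALTER TABLE test ADD COLUMN c2 INT;          -- succeeds immediately

Since session 1 has not read any table, it holds no metadata lock for the DDL to wait on.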

The third case:

Mimic mysqldump's backup procedure, pausing at a breakpoint.

Note that the DDL is issued after SELECT * FROM test has executed. If it were issued before, the DDL would succeed, as the second case above shows.

At this point the DDL blocks; it can only continue to execute after ROLLBACK TO SAVEPOINT sp has been performed.
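A sketch of the sequence, mirroring what mysqldump does for each table (savepoint name sp as in the log above):

session 1> START TRANSACTION WITH CONSISTENT SNAPSHOT;
session 1> SAVEPOINT sp;
session 1> SELECT * FROM test;                  -- table dumped; metadata lock now held
session 2> ALTER TABLE test ADD COLUMN c2 INT;  -- hangs on the metadata lock
session 1> ROLLBACK TO SAVEPOINT sp;            -- releases locks taken after sp
-- session 2's ALTER now completes, while session 1's snapshot stays open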

Thus, ROLLBACK TO SAVEPOINT does indeed improve DDL concurrency during the backup.

However, note that if the DDL happens before SELECT * FROM test, then, as demonstrated in the second case, the DDL succeeds, and querying the test table afterwards inside the transaction reports the following error:

mysql> select * from test;
ERROR 1412 (HY000): Table definition has changed, please retry transaction

Correspondingly, mysqldump reports the following error:

mysqldump: Error 1412: Table definition has changed, please retry transaction when dumping table `test` at row: 0

Summary:

1. The essence of mysqldump is to fetch table data with SELECT * FROM tab.

2. START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */ must be placed between FLUSH TABLES WITH READ LOCK and UNLOCK TABLES. If it were placed before FLUSH TABLES WITH READ LOCK, the DML executed between the two statements would be missing from the backup; if it were placed after UNLOCK TABLES, the DML executed in between would be inserted again when the backup is replayed on a slave.

3. mysqldump is only suitable for off-peak hours. If data is modified very frequently during the backup, the undo tablespace grows larger and larger. By default the undo log is stored in the shared tablespace (ibdata), which, once it has grown, never shrinks.

4. mysqldump is still relatively inefficient: the snapshot opened by START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */ can only be released after all tables have been dumped. It would be more efficient to commit once after each table is dumped, freeing the undo space occupied by the snapshot as early as possible, but then a consistent backup across all tables would no longer be possible.

5. Why is there no COMMIT after the backup completes?

/*
  No reason to explicitly COMMIT the transaction, neither to explicitly
  UNLOCK TABLES: these will automatically be done by the server when we
  disconnect now. Saves some code here, some network trips, adds nothing
  to the server.
*/

Reference:

http://tencentdba.com/blog/mysqldump-backup-principle/

