Implementation principle of mysqldump and implementation principle of mysqldump
MySQL backup can be divided into the following two types:
1. Cold backup
2. Hot Standby
Among them, cold backup, as its name implies, is to turn off the database and use operating system commands to copy database-related files. Hot Standby refers to online Hot Standby, that is, backing up the database without shutting down the database. The actual production is basically the latter.
Hot Backup can also be divided into two methods:
1. Logical backup
2. Physical backup
For the former, the commonly used tool is mysqldump provided by MySQL, and for the latter, the commonly used tool is XtraBackup provided by Percona.
For databases with relatively small scale and not busy businesses, mysqldump is generally used.
So what is the backup principle of mysqldump?
Aside from the source code, we can open the general log and view the commands executed during mysqldump full-database backup to understand the principle behind mysqldump.
Open general log
mysql> set global general_log=on;
The storage path of general log can be viewed using the following command:
Mysql> show variables like '% general_log_file % ';
Perform full-Database Backup
# Mysqldump -- master-data = 2-R -- single-transaction-A-phello> 3306_20160518. SQL
Where
-- Master-data: 2 indicates that the comment of change master is generated in the backup file. In this example
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql2-bin.000049', MASTER_LOG_POS=587;
If this value is set to 1, the command for changing the MASTER is generated instead of the comment.
-R backup stored procedures and functions
-- Single-transaction: obtain consistent backup of InnoDB tables.
-A is equivalent to -- all-databases.
Let's take a look at the content in general log.
160518 11:00:59 14 Connect root@localhost on 14 Query /*!40100 SET @@SQL_MODE='' */ 14 Query /*!40103 SET TIME_ZONE='+00:00' */ 14 Query FLUSH /*!40101 LOCAL */ TABLES 14 Query FLUSH TABLES WITH READ LOCK 14 Query SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ 14 Query START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */ 14 Query SHOW VARIABLES LIKE 'gtid\_mode' 14 Query SHOW MASTER STATUS 14 Query UNLOCK TABLES 14 Query SELECT LOGFILE_GROUP_NAME, FILE_NAME, TOTAL_EXTENTS, INITIAL_SIZE, ENGINE, EXTRA FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'UNDO LOG' AND FILE_NAME IS NOT NULL GROUP BY LOGFILE_GROUP_NAME, FILE_NAME, ENGINE ORDER BY LOGFILE_GROUP_NAME 14 Query SELECT DISTINCT TABLESPACE_NAME, FILE_NAME, LOGFILE_GROUP_NAME, EXTENT_SIZE, INITIAL_SIZE, ENGINE FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'DATAFILE' ORDER BY TABLESPACE_NAME, LOGFILE_GROUP_NAME 14 Query SHOW DATABASES 14 Query SHOW VARIABLES LIKE 'ndbinfo\_version'
Among them, the more important are the following:
1. FLUSH /*! 40101 LOCAL */TABLES
Closes all open tables, forces all tables in use to be closed, and flushes the query cache.
2. FLUSH TABLES WITH READ LOCK
Execute the flush tables operation and add a global read lock. Many shoes may be curious. These two commands seem to be repeated. Why didn't I add the lock when I first executed the flush tables operation? In fact, the reason for doing so is to minimize the impact of locking.
The global read lock is added to allow read only, and update operations are not allowed.
3. SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
Set the transaction isolation level of the current session to RR to avoid repeated read and phantom read.
4. start transaction /*! 40100 with consistent snapshot */
Obtain the snapshot of the current database, which is determined by -- single-transaction in mysqldump.
This is only applicable to tables that support transactions. In MySQL, only Innodb is supported.
Note: The start transaction and start transaction with consistent snapshot are not the same,
Start transaction with consistent snapshot is equivalent to executing the SELECT Operation on each Innodb table after the start transaction is executed.
If you do not understand the transaction, the child shoes may think that this will be more difficult. In fact, the so-called unrepeatable read and phantom read can be simply understood as that in the same transaction, the two SELECT statements have different results.
Start transaction with consistent snapshot is used because the backup time of each table is different. Therefore, operations on the second table must be performed during the backup of the first table, it is not reflected in the SELECT Operation executed when the second table starts the backup. (Note: The underlying implementation of mysqldump backup is select * from tab ). This cannot be implemented using start transaction.
5. SHOW MASTER STATUS
This is determined by -- master-data. It records the status information of binlog when the backup is started, including MASTER_LOG_FILE and MASTER_LOG_POS.
6. UNLOCK TABLES
Release the lock.
Because my database only has the following four databases:
mysql> show databases;+--------------------+| Database |+--------------------+| information_schema || mysql || performance_schema || test |+--------------------+4 rows in set (0.03 sec)
During the backup, we can find that only mysql and test are backed up, And information_schema and cece_schema are not backed up.
Let's take a look at the log output information for backing up mysql and test,
Because there are too many log output information, only the log information of the test database is selected here. There are two tables in the test Database: test and test1.
14 Init DB test 14 Query SHOW CREATE DATABASE IF NOT EXISTS `test` 14 Query SAVEPOINT sp 14 Query show tables 14 Query show table status like 'test' 14 Query SET SQL_QUOTE_SHOW_CREATE=1 14 Query SET SESSION character_set_results = 'binary' 14 Query show create table `test` 14 Query SET SESSION character_set_results = 'utf8' 14 Query show fields from `test` 14 Query SELECT /*!40001 SQL_NO_CACHE */ * FROM `test` 14 Query SET SESSION character_set_results = 'binary' 14 Query use `test` 14 Query select @@collation_database 14 Query SHOW TRIGGERS LIKE 'test' 14 Query SET SESSION character_set_results = 'utf8' 14 Query ROLLBACK TO SAVEPOINT sp 14 Query show table status like 'test1' 14 Query SET SQL_QUOTE_SHOW_CREATE=1 14 Query SET SESSION character_set_results = 'binary' 14 Query show create table `test1` 14 Query SET SESSION character_set_results = 'utf8' 14 Query show fields from `test1` 14 Query SELECT /*!40001 SQL_NO_CACHE */ * FROM `test1` 14 Query SET SESSION character_set_results = 'binary' 14 Query use `test` 14 Query select @@collation_database 14 Query SHOW TRIGGERS LIKE 'test1' 14 Query SET SESSION character_set_results = 'utf8' 14 Query ROLLBACK TO SAVEPOINT sp 14 Query RELEASE SAVEPOINT sp 14 Query use `test` 14 Query select @@collation_database 14 Query SET SESSION character_set_results = 'binary' 14 Query SHOW FUNCTION STATUS WHERE Db = 'test' 14 Query SHOW CREATE FUNCTION `mycat_seq_currval` 14 Query SHOW PROCEDURE STATUS WHERE Db = 'test' 14 Query SET SESSION character_set_results = 'utf8' 14 Quit
From the above output, we can see that:
1. The core of backup is SELECT /*! 40001 SQL _NO_CACHE */* FROM 'test1' statement.
This statement queries all data in table test1 and generates an insert statement in the backup file.
The function of SQL _NO_CACHE is that the query results are not cached in the query cache.
2. show create database if not exists 'test', show create table 'test1'
Generate the database creation statement and table creation statement.
3. show triggers like 'test1'
We can see that if the-R parameter is not added, the trigger will be backed up by default.
4. show function status where Db = 'test'
Show create function 'mycat _ seq_currval'
Show procedure status where Db = 'test'
Used to back up stored procedures and functions.
5. Set the SAVEPOINT, back up each table, and roll back to the SAVEPOINT.
I don't know why I need to set it like this. It seems completely unnecessary,
Because the TRANSACTION that was previously enabled through start transaction with consistent snapshot can only end through commit or rollback, rather than rollback to savepoint sp.
Summary:
1. The essence of mysqldump is to use the select * from tab to obtain table data.
2. start transaction /*! 40100 with consistent snapshot */must be placed between the flush tables with read lock and the unlock tables. Putting it before will cause start transaction /*! The DML statements executed between 40100 with consistent snapshot */and flush tables with read lock are lost. After they are placed, the data will be inserted repeatedly from the database.
3. mysqldump is only applicable to low-cost businesses. If data operations are performed frequently during the backup process, the Undo tablespace becomes larger and larger. The undo tablespace is stored in the shared tablespace by default, the feature of ibdata is that it will not contract once it increases.
4. The efficiency of mysqldump is still relatively low. start transaction /*! 40100 with consistent snapshot */it can only end after all the tables are backed up. In fact, it is highly efficient to submit a table after the backup, in this way, the space occupied by the Undo tablespace snapshot can be released as soon as possible. However, by doing so, consistent backup of all tables cannot be achieved.
Refer:
Http://tencentdba.com/blog/mysqldump-backup-principle/