Backup is the last line of defense for data security. In any data-loss scenario a backup may not recover every last byte (that depends on the backup cycle), but it at least minimizes the loss. Two metrics are used to measure backup and recovery: the recovery point objective (RPO) and the recovery time objective (RTO); the former describes up to which point in time data can be recovered, while the latter describes how long the recovery will take. This article focuses on MySQL backup: the principles behind several backup methods, including file system snapshots (LVM), the logical backup tools mysqldump and mydumper, and the physical backup tool XtraBackup, together with a detailed look at the pros and cons of each approach and the problems you may run into.
Cold backup
The simplest approach is to shut down the MySQL server, copy all the files under its data directory, and then copy that directory to the machine that needs to be restored. This is certainly convenient, but it is essentially unusable in production: every machine is expected to serve traffic, and even slaves often provide read-only service, so shutting MySQL down and suspending service just to take a backup is unrealistic. The counterpart of a cold backup is a hot backup, that is, a backup taken without affecting the service MySQL provides; hot backup is the focus of this article. A minimal cold-backup sketch is shown below.
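For completeness, here is what a cold backup amounts to in practice; the service name, data directory, and destination host are assumptions for illustration.

```bash
# Cold backup sketch: the server must be stopped so the files are quiescent.
systemctl stop mysqld                                         # assumed service name
rsync -a /var/lib/mysql/ backup-host:/backups/mysql-cold/     # assumed datadir and target
systemctl start mysqld                                        # bring the service back
```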
Snapshot backup
The first hot backup method to describe is the snapshot backup, which backs up the database through a snapshot feature provided by the file system or volume manager. The basic idea is to keep all database files on the same partition and then take a snapshot of that partition; on Linux this is done with LVM (Logical Volume Manager). LVM creates snapshots using copy-on-write technology: a snapshot is a logical copy of the entire volume at an instant, conceptually similar to MVCC in the InnoDB storage engine, except that the LVM snapshot works at the file system level while MVCC works at the database level (and only for the InnoDB engine). LVM keeps a snapshot reserve area; when data on the original volume changes, LVM copies the affected blocks to the reserve area before the change is written. In short, the snapshot area retains the old version of every block as it was at the moment the snapshot was taken, so for a database with very few updates the snapshot is also very small. For MySQL, to use snapshot backups you place the data files and log files on one logical volume and then snapshot and back up that volume. Because the snapshot exists only locally, it is lost if the local disk is damaged; snapshot backups are therefore better suited to recovering from operator mistakes: you can quickly roll the database back to the point at which the snapshot was taken and then apply binary logs to reach a specified point in time. The basic procedure is sketched below.
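A minimal sketch of the snapshot procedure, assuming an LVM setup where the data directory lives on /dev/vg0/mysql_data; the volume names, sizes, mount point, and destination host are all assumptions for illustration.

```bash
# 1) In a mysql session, take a global read lock so the on-disk files are consistent:
#      mysql> FLUSH TABLES WITH READ LOCK;
#
# 2) While that session stays open (keeping the lock), create the snapshot:
lvcreate --snapshot --size 10G --name mysql_snap /dev/vg0/mysql_data
#
# 3) Back in the mysql session, release the lock; the snapshot has frozen the
#    state as of the lock point:
#      mysql> UNLOCK TABLES;
#
# 4) Mount the snapshot, copy the files off-host, then drop the snapshot:
mkdir -p /mnt/mysql_snap
mount -o ro /dev/vg0/mysql_snap /mnt/mysql_snap
rsync -a /mnt/mysql_snap/ backup-host:/backups/mysql-snap/
umount /mnt/mysql_snap
lvremove -f /dev/vg0/mysql_snap
```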
Logical backup
Because of these drawbacks, cold backups and snapshot backups are rarely used in production; what is used far more are the logical and physical backup tools that MySQL supports. This section covers logical backup. MySQL officially provides the logical backup tool mysqldump; it is good enough for many purposes, but it is single-threaded and therefore slow. The community provides a better logical backup tool, mydumper, whose main advantage is multi-threaded backup and hence much faster backups.
Mysqldump
When using mysqldump for backups, two key parameters have to be mentioned (an example invocation follows):
--single-transaction: issue START TRANSACTION before the backup begins in order to obtain a consistent backup; this is effective only for transactional engines such as InnoDB.
--master-data=2: record the binlog position of the consistent backup; with the value 2 the CHANGE MASTER TO statement is written into the dump as a comment.
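A typical invocation, with credentials and the backup path chosen for illustration; the extra --routines/--triggers/--events flags are optional but commonly added.

```bash
# Consistent logical dump of an InnoDB-heavy instance.
mysqldump -uroot -p \
  --single-transaction \
  --master-data=2 \
  --routines --triggers --events \
  --all-databases > /backups/full_$(date +%F).sql
```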
To understand how mysqldump works, you must treat transactional tables (InnoDB) and non-transactional tables (such as MyISAM) differently, because the backup process handles them differently. And, so far, we cannot avoid MyISAM tables entirely: even if every business table is InnoDB, the system tables in the mysql schema still use MyISAM.
The basic process of backup is as follows:
1. Issue FTWRL (FLUSH TABLES WITH READ LOCK), which makes the whole instance read-only (all writes are blocked)
2. Open a consistent snapshot (START TRANSACTION WITH CONSISTENT SNAPSHOT), capturing the state at this moment (effective only for InnoDB tables)
3. Back up the non-InnoDB table data (the .frm, .MYI, .MYD files, etc.)
4. Once the non-InnoDB tables have been backed up, release the FTWRL lock
5. Back up the InnoDB table data inside the snapshot
6. Backup complete.
For the whole process you can refer to my colleague's diagram (not reproduced here), bearing in mind that it only illustrates the InnoDB part of the backup: in fact, by the time UNLOCK TABLES runs, the non-InnoDB tables have already been backed up, and the t1, t2, and t3 dumped afterwards are all, in essence, InnoDB tables. Also, since 5.6, mysqldump uses a savepoint mechanism to release the MDL lock on each table as soon as that table has been dumped, avoiding holding a lock on any single table for too long.
You may wonder why the lock can be released before the InnoDB tables are backed up. This takes advantage of the InnoDB engine's MVCC mechanism: once the consistent snapshot is opened, the data as of that moment can be read no matter how long the backup takes, right up until the transaction ends (commits). A sketch of how to observe this statement sequence is shown below.
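One way to see the sequence for yourself is to wrap the dump in the general query log; the statements listed in the comments are my paraphrase of typical 5.6-era behaviour, not verbatim output, and may differ between versions.

```bash
mysql -uroot -p -e "SET GLOBAL general_log = ON;"
mysqldump -uroot -p --single-transaction --master-data=2 --all-databases > /backups/full.sql
mysql -uroot -p -e "SET GLOBAL general_log = OFF;"

# The log roughly shows the pattern described above:
#   FLUSH TABLES WITH READ LOCK
#   SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
#   START TRANSACTION WITH CONSISTENT SNAPSHOT
#   SHOW MASTER STATUS          -- the consistent binlog position (--master-data)
#   UNLOCK TABLES               -- released before any InnoDB data is dumped
#   SAVEPOINT / SELECT ... FROM each InnoDB table / ROLLBACK TO SAVEPOINT
```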
Mydumper
The principle of mydumper is similar to that of mysqldump; the biggest difference is that it introduces multi-threaded backup, with each backup thread dumping a subset of the tables (the concurrency granularity can even go down to the row level). The biggest problem to solve here is how to guarantee the consistency of the backup, and the key is again FTWRL. For non-InnoDB tables, their backup must finish before the lock is released. For InnoDB tables, every thread must obtain the same consistent position, which is also done while the global lock is held: since no writes can happen during that window, all threads record the same binlog position. So the basic flow mirrors mysqldump's: take FTWRL, have each worker open a consistent snapshot and record the position, dump the non-InnoDB tables under the lock, release the lock, and then dump the InnoDB tables inside the snapshots. An example invocation is shown below.
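A typical multi-threaded dump, with credentials and output directory as placeholders; the short flag spellings (-t for threads, -o for the output directory) are from the mydumper builds I have used, so check mydumper --help for your version.

```bash
mydumper -u root -p 'secret' -h 127.0.0.1 -t 4 \
  -o /backups/mydumper_$(date +%F)
```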
Physical Backup (Xtrabackup)
Compared with a logical backup, which extracts all records through queries, a physical backup is more direct: it copies the database files and logs, and is therefore faster. Admittedly, both the open-source mydumper and the newest official backup tool (mysqlpump, from 5.7.11) support multi-threaded backup, so the speed gap may narrow further, but at least in production environments today physical backup is still used more. Because xtrabackup itself only backs up InnoDB tables, the tool we actually use in production is innobackupex, which is a wrapper around xtrabackup: the innobackupex script backs up the non-InnoDB tables and invokes the xtrabackup command to back up the InnoDB tables. The basic innobackupex process is as follows (an example invocation follows the steps):
1. Start a redo-log copy thread, which continuously copies the redo log starting from the latest checkpoint;
2. Start the ibd file copy threads to copy the InnoDB table data;
3. When the ibd file copy finishes, notify innobackupex to issue FTWRL and obtain a consistent position;
4. Back up the non-InnoDB tables (system tables) and the .frm files;
5. Because no new transactions can commit at this point, wait for the redo-log copy to catch up;
6. Once the latest redo log has been copied, the InnoDB data and the non-InnoDB data are both up to date;
7. Record the binlog position; the database state is consistent at this point;
8. Release the lock; the backup is finished.
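A typical run, with credentials and the backup directory as placeholders; innobackupex creates a timestamped subdirectory under the target path by default, and the directory name shown in the prepare step is only an example.

```bash
# Take a full physical backup (Percona XtraBackup's innobackupex wrapper).
innobackupex --user=root --password='secret' /data/backups/

# Before restoring, "prepare" the backup: apply the redo log copied during
# the backup so the data files are consistent.
innobackupex --apply-log /data/backups/2016-01-01_12-00-00/
```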
Improvements to Xtrabackup
As the logical and physical backups described above show, whatever tool is used depends heavily on FTWRL to obtain a consistent position. This lock is very damaging, because for as long as it is held the whole database is essentially unable to serve writes. In addition, FTWRL needs to close tables, so a long-running query will make FTWRL wait, which in turn blocks DML for even longer. Even on a slave, taking the global lock stalls the SQL thread that replays updates from the master and therefore causes replication delay. From the earlier analysis, the time FTWRL is held depends mainly on the amount of non-InnoDB data: if there is a lot of it, that part of the backup is slow and the lock is held for a long time. Even if every business table is InnoDB, the MyISAM system tables in the mysql schema still force the lock to be held for some time. To address this, Percona improved the MySQL server layer by introducing backup locks: specifically, non-InnoDB data is backed up under "LOCK TABLES FOR BACKUP", and a consistent position is obtained under "LOCK BINLOG FOR BACKUP", minimizing the damage the backup does to the service. Let's look at what these two locks do and how they differ from FTWRL (an SQL sketch follows the summary):
LOCK TABLES FOR BACKUP
Role: protects the data-copy phase.
1. Updates to non-InnoDB tables are blocked
2. DDL on all tables is blocked
Optimizations over FTWRL:
1. It is not blocked by long-running queries (it does not need to close tables)
2. Reads and updates on InnoDB tables are not blocked at all. This is very important: if all business tables are InnoDB, DML is completely unaffected during the backup
Released with UNLOCK TABLES
LOCK BINLOG FOR BACKUP
Role: obtains a consistent position.
1. Any operation that would advance the binlog position is blocked
Optimizations over FTWRL:
1. DDL and updates are still allowed, as long as they do not write to the binlog
Released with UNLOCK BINLOG
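A minimal sketch of the lock sequence, runnable only against a server that implements backup locks (Percona Server); a real backup tool copies files while its session holds each lock, which the comments stand in for here.

```bash
mysql -uroot -p <<'SQL'
LOCK TABLES FOR BACKUP;   -- non-InnoDB writes and all DDL blocked; InnoDB DML keeps running
-- ... a backup tool would copy non-InnoDB data and .frm files here ...
LOCK BINLOG FOR BACKUP;   -- freeze the binlog position
SHOW MASTER STATUS;       -- the consistent position recorded with the backup
-- ... and wait here for its redo-log copy to catch up ...
UNLOCK BINLOG;
UNLOCK TABLES;
SQL
```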