MySQL backup principles

Source: Internet
Author: User
Tags mysql backup
Backup is the last line of defense for data security. For any data loss scenario, although backup may not be able to restore of the data (depending on the backup cycle), it can at least minimize the loss. There are two important indicators for measuring Backup Recovery: recovery point objective (RPO) and recovery time objective (RTO). The former focuses on the extent to which recovery can be achieved, while the latter focuses on recovery.

Backup is the last line of defense for data security. For any data loss scenario, although backup may not be able to restore of the data (depending on the backup cycle), it can at least minimize the loss. There are two important indicators for measuring Backup Recovery: recovery point objective (RPO) and recovery time objective (RTO). The former focuses on the extent to which recovery can be achieved, while the latter focuses on recovery.

Backup is the last line of defense for data security. For any data loss scenario, although backup may not be able to restore of the data (depending on the backup cycle), it can at least minimize the loss. There are two important indicators for measuring Backup Recovery: recovery point objective (RPO) and recovery time objective (RTO). The former focuses on how much recovery can be achieved, the latter focuses on how long recovery takes. This article mainly discusses the MySQL backup solution and focuses on the principles of several backup methods, including File System Snapshot (LVM), logical backup tool Mysqldump, Mydumper, and physical backup tool Xtrabackup, at the same time, we will explain in detail the advantages and disadvantages of several solutions and possible problems.

Cold backup
The simplest backup method is to close the MySQL server and copy and save all the files under the data Directory. To restore the files, copy the directory to the machine to be restored. This method is indeed convenient, but it does not work in the production environment. Because all machines provide services, even Slave sometimes needs to provide read-only services, it is unrealistic to disable the MySQL service stop backup. One concept that corresponds to cold backup is hot backup, which backs up data without affecting MySQL's external services. Hot Backup is the focus of this article.

Snapshot Backup
The hot backup is snapshot backup, which backs up the database through the snapshot function supported by the file system. The principle of backup is to put all the database files in the same partition and then take snapshots of the partition. For Linux, it needs to be implemented through LVM (Logical Volumn Manager. LVM uses the copy-on-write technology to create snapshots. For example, the logical copy of the entire volume at a certain moment is similar to the MVCC of the innodb Storage engine in the database, however, LVM snapshots are at the file system level, while MVCC is at the database level, and only supports the innodb Storage engine. LVM has a snapshot reserved area. If the raw volume data changes, LVM ensures that the affected blocks are copied to the snapshot reserved area before any change is written. Simply put, the snapshot area retains all the consistent old data at the beginning of the snapshot point. For databases with few updates, snapshots are also very small. For MySQL, data files and log files must be stored in a logical volume for snapshot backup. Because Snapshot Backup can only be performed locally, if the local disk is damaged, the snapshot is damaged. Snapshot Backup is more inclined to prevent misoperation. You can quickly restore the database to the time point when the snapshot is generated, and then restore the database to the specified time point using binary logs. The basic principles are as follows:

Logical backup
Cold backup and snapshot backup are rarely used in the production environment due to their drawbacks. They are mostly used by MySQL's built-in logical backup and physical backup tools. This section focuses on logical backup, mySQL officially provides the Mysqldump logical backup tool. Although it is good enough, the single-thread backup is slow. Mydumper, a better logical backup tool, is provided in the community. Its advantages are mainly reflected in multi-threaded backup and the backup speed is faster.

Mysqldump
Mysqldump is used for backup and has to provide two key parameters:
-- Single-transaction: run the start transaction command before starting the backup to obtain consistent backup. this parameter is only valid for the innodb Storage engine.
-- Master-data = 2: used to record consistent backup points.
To understand how Mysqldump works, you must treat the transaction table (innodb) and non-transaction table (such as myisam) differently, because the backup process is closely related to this. Furthermore, we have been unable to avoid myisam tables so far, even if all our business tables are innodb, because the myisam table is still used in the system tables in the mysql database. The basic backup process is as follows:
1. Call FTWRL (flush tables with read lock) to disable read/write globally.
2. Enable snapshot reading to obtain the snapshot at this time (only for innodb tables)
3. Back up non-innodb table data (*. frm, *. myi, *. myd, etc)
4. Release the FTWRL lock after non-innodb table backup is completed
5. Backup innodb table data one by one
6. the backup is complete.
For the whole process, you can refer to a figure from my colleague. However, this figure only takes into account the backup of the innodb table. In fact, the non-innodb table has been backed up before the unlock tables execution is complete, the subsequent t1, t2, and t3 are essentially innodb tables, and the 5.6 mysqldump uses the retention point mechanism to release the MDL lock on a table after each backup, avoid locking a table for a longer time. For more information, see my previous blog: flush table with read lock.
You may have a question about how to release the lock before backing up the innodb table. This actually uses the MVCC mechanism of the innodb engine to enable snapshot reading, you can obtain consistent data at that time, no matter how long it takes to back up until the end of the entire transaction (commit.

Mydumper
The principle of Mydumper is similar to that of Mysqldump. The biggest difference is that multi-threaded backup is introduced. Each backup thread backs up a part of the table. Of course, the concurrency granularity can be Row-level to achieve multi-threaded backup. The biggest problem to be solved here is that the key to ensuring backup consistency lies in FTWRL. For non-innodb tables, you must back up the tables before releasing the lock. For the innodb table, make sure that multiple threads can obtain the consistent position. This action must also be completed during the holding of the Global lock, because the database does not read or write at this time, and the point consistency can be ensured. The basic process is as follows:


Physical backup (Xtrabackup)
Compared with logical backup, the physical backup is more direct than querying and extracting all records in data. Copying database files and logs is faster. Of course, both open-source Mydumper and the latest official backup tool (mysqlpump in 5.7.11) support multi-threaded backup, so the speed difference may be further reduced, at least from the current production environment, physical backup is still widely used. Because Xtrabackup supports innodb table backup, the tool we use in the actual production environment is innobackupex, which is a layer encapsulation of xtrabackup. The innobackupex script is used to back up non-InnoDB tables and the xtrabackup command is called to back up InnoDB tables. The basic process of innobackupex is as follows:
1. Enable the redo log copy thread to copy the redo log sequentially from the latest checkpoint;
2. Enable the idb file copy thread to copy innodb table data.
3. The copy of the idb file ends and FTWRL is called to obtain the consistent position.
4. Back up non-innodb tables (system tables) and frm files
5. Because no new transaction is committed at this time, wait until the redo log copy is complete.
6. After the latest redo log is copied, the innodb table and non-innodb table data are both up-to-date.
7. Obtain the binlog point. In this case, the database status is consistent.
8. Release the lock and the backup is complete.

Improvement of Xtrabackup
From the logic backup and physical backup described above, both tools depend heavily on FTWRL to obtain consistent points. This lock is highly lethal, because during the lock period, the entire database cannot provide write services externally. In addition, because FTWRL needs to close the table, if there is a large query, it will lead to FTWRL waiting, which leads to a longer DML blocking time. Even in the standby database, some SQL threads copy updates from the primary database. When the master database is locked globally, the Standby database may be delayed. According to the previous analysis, the FTWRL lock held mainly depends on the data volume of the non-innodb table. If the data volume of the non-innodb table is large and the backup is slow, the lock will take a long time. Even if all the tables are innodb tables, a certain period of time will be locked due to the existence of mysql database system tables. To solve this problem, Percona improved the Mysql Server layer and introduced backup lock. Specifically, it used the "lock tables for backup" command to back up non-innodb table data; use "lock binlog for backup" to obtain consistency points and minimize service damage caused by database BACKUP. Let's take a look at the difference between the two locks and FTWRL:

LOCK TABLES FOR BACKUP
Role: Back up data
1. Prohibit Non-innodb table updates
2. Disable ddl for all tables
Optimization points:
1. will not be blocked by large queries (close tables)
2. Reading and updating innodb tables will not be blocked. This is very important. If all business tables are innodb, DML will not be damaged during the backup process.
UNLOCK TABLES

LOCK BINLOG FOR BACKUP
Purpose: Obtain the consistent position.
1. Do not update the position.
Optimization points:
1. Allow DDl and update until binlog is written.
UNLOCK BINLOG

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.