Xtrabackup Execution Process

Operations performed on the database during a full backup
Source: https://www.cnblogs.com/digdeep/p/4946230.html


During a full backup, xtrabackup runs two threads in total: one copies the redo log and the other copies the data files. The statements it issues against the server are the following.

    • SET SESSION lock_wait_timeout = 31536000: if some session holds a table lock via LOCK TABLES, is running DDL, or is in the middle of a large transaction, then FLUSH TABLES and FLUSH TABLES WITH READ LOCK will block. The lock wait timeout is set to a very large value so that innobackupex does not fail and exit because acquiring the global lock times out.

    • FLUSH NO_WRITE_TO_BINLOG TABLES: closes all open tables, forces tables in use to be closed, and flushes the query cache and prepared statement cache; cached query results are also removed from the query cache. By default a FLUSH statement is written to the binlog; NO_WRITE_TO_BINLOG suppresses that, and checking the binlog confirms that nothing is recorded there.

    • FLUSH TABLES WITH READ LOCK: closes all open tables and takes a global read lock on every table in every database (only SELECT is possible; no other operations). This is the most convenient way to get a consistent database state when taking a backup. If transactions are running, the statement hangs and waits; it does not roll them back. It does not, however, prevent the server from inserting into log tables such as general_log and slow_log.

    • FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS: flushes the InnoDB redo log from the log buffer to disk so that the final redo records can be copied. Put plainly: while the global read lock is held, the non-transactional tables are copied and SHOW MASTER STATUS is used to obtain the binlog position; the redo log buffer is then flushed to disk, and the redo-copy thread copies the last of the redo log. Why is the resulting data complete? Because the global read lock is held until UNLOCK TABLES releases it, no further writes can come in before that point.

    • UNLOCK TABLES: releases the global read lock.

    • The following operations are performed between FLUSH TABLES WITH READ LOCK and UNLOCK TABLES (the full statement sequence is sketched after this list):
      A. Copy all non-transactional tables, such as the MyISAM system tables.
      B. Flush the redo log buffer to disk.
      C. Copy the final portion of the redo log.
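
      Putting the bullets above together, the statement sequence that innobackupex sends to the server during a full backup looks roughly like this. It is a reconstruction from the description above, not an exact trace; the statements and their order can differ between xtrabackup versions.

        SET SESSION lock_wait_timeout = 31536000;  -- avoid failing while waiting for locks
        FLUSH NO_WRITE_TO_BINLOG TABLES;           -- close open tables without writing to the binlog
        FLUSH TABLES WITH READ LOCK;               -- global read lock; non-InnoDB files are copied now
        SHOW MASTER STATUS;                        -- record the binlog file and position
        FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS;      -- flush the InnoDB redo log buffer to disk
        -- the redo-copy thread copies the last of the redo log here
        UNLOCK TABLES;                             -- release the global read lock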

    • Xtrabackup backup process
      1. Connect to MySQL and check the version.
      2. Read the configuration file to locate the data files and redo log files (a query to cross-check these locations is sketched after this list).
      3. Start a thread that scans the redo log and copies newly generated redo records into xtrabackup's log file.
      4. Copy the shared tablespace files and the InnoDB .ibd data files.
      5. Close all open tables, acquire the global read lock, and start copying the non-InnoDB tables and files.
      6. Copy the last of the redo log into the xtrabackup log file.
      7. Release the global read lock.
      8. Record the binlog position and finish the backup.
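
      The data and log file locations that step 2 reads from the configuration can be cross-checked against the running server. A minimal sketch; the variable list is only a reasonable guess at what matters for the backup:

        SHOW VARIABLES
        WHERE Variable_name IN ('datadir', 'innodb_data_file_path',
                                'innodb_data_home_dir', 'innodb_log_group_home_dir');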

    • When a full backup is prepared (restored), no operations are performed against the database and nothing is recorded in the general log.

    • Incremental backup differs from a full backup only in the following ways:
      A. Before copying the InnoDB tables incrementally, it checks for the changed-page tracking plugin:
      SELECT 'innodb_changed_pages', COUNT(*) FROM information_schema.plugins WHERE plugin_name LIKE 'innodb_changed_pages';

      B. The incremental backup is done with a full scan; the shared tablespace and .ibd files are backed up as incremental files with the .delta suffix.
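
      A related check, not something xtrabackup itself runs, shows whether the changed-page tracking plugin (a Percona Server feature) is not only installed but active:

        SELECT plugin_name, plugin_status
        FROM information_schema.plugins
        WHERE plugin_name LIKE 'innodb_changed_pages';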
    • ======================
    • 6. innobackupex option optimization / best practices

      6.1 FTWRL lock optimization: when non-InnoDB tables are backed up, the global lock FLUSH TABLES WITH READ LOCK locks the whole instance. If a long query is running, FTWRL cannot be acquired and blocks, and while it is pending it in turn blocks all DML. At that point, even killing the FTWRL statement does not undo the blocking. In addition, once FTWRL has been acquired, the whole instance stays locked while the non-transactional files are copied. (While the non-transactional engine files are being copied, the InnoDB transactional engine is blocked as well, along with every other non-transactional engine.) The FTWRL window should therefore be kept as short as possible.

      1> Prevent blocking. innobackupex provides several options to avoid blocking:

      --ftwrl-wait-timeout=# (replaces --lock-wait-timeout): "This option specifies time in seconds that innobackupex should wait for queries that would block FTWRL before running it. If there are still such queries when the timeout expires, innobackupex terminates with an error. Default is 0, in which case innobackupex does not wait for queries to complete and starts FTWRL immediately."

      --ftwrl-wait-threshold=# (replaces --lock-wait-threshold): "This option specifies the query run time threshold which is used by innobackupex to detect long-running queries with a non-zero value of --ftwrl-wait-timeout. FTWRL is not started until such long-running queries exist. This option has no effect if --ftwrl-wait-timeout is 0. Default value is 60 seconds."

      --lock-wait-timeout=60: if there is a long query when FTWRL is about to run, wait up to 60 seconds for it. If it finishes within 60 seconds, FTWRL is executed; if not, innobackupex reports an error and quits. The default is 0.

      --lock-wait-threshold=10: how long an SQL statement must have been running to count as a long query. Long queries are waited for up to --lock-wait-timeout seconds.

      --kill-long-queries-timeout=10: after FTWRL is issued, wait this many seconds and then kill any queries still blocking it. The default is 0, i.e. do not kill anything.

      --kill-long-query-type={all|select}: whether to kill only SELECT statements or all types of long-running statements.

      We do not need all of these options; usually --lock-wait-timeout=60 alone is enough. Note the difference between the --lock-* and --kill-* options: the former wait a number of seconds before running FTWRL and fail with an error if the wait is exceeded, while the latter run FTWRL and kill the blockers once the timeout expires.

      2> Shorten the FTWRL window. --rsync shortens the time the lock is held while non-transactional tables are copied; it helps when many databases and tables have to be backed up: "Uses the rsync utility to optimize local file transfers. When this option is specified, innobackupex uses rsync to copy all non-InnoDB files instead of spawning a separate cp for each file, which can be much faster for servers with a large number of databases or tables. This option cannot be used together with --stream."
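
      The blocking scenario described in 6.1 is easy to reproduce by hand. A minimal illustration, assuming a hypothetical InnoDB table t:

        -- session 1: a long-running query keeps its tables open
        SELECT SLEEP(600) FROM t;
        -- session 2: FTWRL now blocks, waiting for session 1 to finish
        FLUSH TABLES WITH READ LOCK;
        -- session 3: while FTWRL is pending, even new queries against t block as well
        SELECT COUNT(*) FROM t;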
      3> Parallel optimization. --parallel=# parallelizes the backup phase, the compression/decompression and encryption/decryption phases, and --apply-log / --copy-back: "On backup, this option specifies the number of threads the xtrabackup child process should use to back up files concurrently. The option accepts an integer argument. It is passed directly to xtrabackup's --parallel option. See the xtrabackup documentation for details."

      4> Memory optimization. --use-memory=# is used in the crash-recovery stage, i.e. together with --apply-log: "This option accepts a string argument that specifies the amount of memory in bytes for xtrabackup to use for crash recovery while preparing a backup. Multiples are supported providing the unit (e.g. 1MB, 1GB). It is used only with the option --apply-log. It is passed directly to xtrabackup's --use-memory option. See the xtrabackup documentation for details."

      5> Backing up a slave. --safe-slave-backup: "Stop slave SQL thread and wait to start backup until Slave_open_temp_tables in SHOW STATUS is zero. If there are no open temporary tables, the backup will take place, otherwise the SQL thread will be started and stopped until there are no open temporary tables. The backup will fail if Slave_open_temp_tables does not become zero after --safe-slave-backup-timeout seconds. The slave SQL thread will be restarted when the backup finishes." --safe-slave-backup-timeout=#: how many seconds --safe-slave-backup should wait for Slave_open_temp_tables to become zero (default 300). --slave-info: "This option is useful when backing up a replication slave server. It prints the binary log position and name of the master server. It also writes this information to the xtrabackup_slave_info file as a CHANGE MASTER command. A new slave for this master can be set up by starting a slave server on this backup and issuing a CHANGE MASTER command with the binary log position saved in the xtrabackup_slave_info file."

    • 7. Backup principle

      1) innobackupex is a Perl script; it calls xtrabackup to back up the InnoDB data. xtrabackup is a C program that links against the InnoDB function library and the MySQL client library: the InnoDB library provides the ability to apply redo logs to data files, and the client library provides command-line argument parsing. innobackupex backs up InnoDB by calling xtrabackup --backup and xtrabackup --prepare; we do not need to call xtrabackup directly, so using innobackupex is more convenient.

      xtrabackup changes to the datadir directory and completes the backup with two threads:

      1> log-copy thread: when the backup starts, this background thread watches the redo log (checking it every second) and copies new redo records into the backup file xtrabackup_logfile. If redo is generated extremely fast, the log-copy thread may fail to keep up, and xtrabackup reports an error when a redo log file switch overwrites records that have not yet been copied.

      2> data-file-copy thread: another thread copies the data files. This is not a plain file copy: xtrabackup uses the InnoDB library to open the data files the way the InnoDB engine would, reads and copies them one page at a time, and verifies each page. If the verification fails, the read is retried up to ten times.
      When the data-file copy finishes, xtrabackup stops the log-copy thread and writes the file xtrabackup_checkpoints, which records the backup type and the LSN at the start and end of the backup. The xtrabackup_binlog_info file produced by the backup holds the binlog position at the moment the backup completed, for example: mysql-bin.000002 120.

      So the backup records the LSN at which it starts, then one thread copies the data files while another watches the redo log and copies the redo generated during the backup. Although the copied data files are clearly not consistent on their own, applying the redo log collected during the backup through InnoDB's crash-recovery code yields data that is consistent as of the moment the backup finished.

      Note that copying the data files is really two phases: copying the InnoDB transactional data files, which needs no lock, and copying the non-transactional data files and the .frm table definition files, which requires FTWRL first and therefore blocks the whole instance while it runs.

      During an incremental backup, the tables are fully scanned and each page's LSN is compared with the LSN of the previous backup; if the page's LSN is larger, the page is copied into a table_name.ibd.delta file. At restore time, the delta files and the redo log are applied on top of the full backup's data files. When restoring a chain of incrementals, every incremental except the last one can only be rolled forward during the apply and must not be rolled back, because a transaction that was uncommitted in one incremental may have been committed by the next one. If it were rolled back while applying an earlier incremental, applying the next incremental would obviously fail, since it cannot commit a transaction that has already been rolled back.
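
      The start and end LSNs recorded in xtrabackup_checkpoints can be compared with the server's current log sequence number, which is shown in the LOG section of the InnoDB status output. A quick way to look at it:

        SHOW ENGINE INNODB STATUS;
        -- look for "Log sequence number" in the LOG section of the output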
    • ==========
    • 1. Set a timeout

      To avoid waiting indefinitely, xtrabackup sets a timeout. It provides the following parameters for this:

      --lock-wait-timeout=seconds: once FLUSH TABLES WITH READ LOCK has been blocked for longer than the specified time, xtrabackup returns an error and exits. The default is 0, i.e. it fails and returns immediately once it is blocked.

      --lock-wait-query-type=all|update: specifies which kinds of statements FLUSH TABLES WITH READ LOCK should wait for, and --lock-wait-threshold=seconds sets the run-time threshold beyond which a statement counts as a long query. If a blocking statement is not of the specified type, or the wait exceeds the configured time, xtrabackup returns an error. With update, UPDATE/ALTER/REPLACE/INSERT statements are waited for; all means all statements.

      2. Kill other blocking threads

      Kill the threads that are blocking FLUSH TABLES WITH READ LOCK:

      --kill-long-queries-timeout=seconds.

      --kill-long-query-type=all|select: the default is all. With select, only SELECT statements are killed; if FLUSH TABLES WITH READ LOCK is blocked by an UPDATE statement, xtrabackup leaves it alone. (A sketch of the kind of statements this targets follows below.)
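
      What --kill-long-queries-timeout and --kill-long-query-type act on are simply long-running statements visible in the process list. A rough, illustrative equivalent of what a --kill-long-query-type=select pass would target (innobackupex does this internally; the 10-second value is just the example timeout used above):

        SELECT id, time, info
        FROM information_schema.processlist
        WHERE command = 'Query'
          AND time > 10              -- the --kill-long-queries-timeout threshold
          AND info LIKE 'SELECT%';   -- only SELECTs when --kill-long-query-type=select
        -- each returned id could then be terminated with: KILL QUERY <id>;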

      When backing up a database, operations staff should choose the right xtrabackup version to avoid this kind of blocking. When I used xtrabackup to back up a slave, I also ran into a deadlock with the SQL thread; MariaDB was doing parallel replication. The deadlock information is as follows:
