MySQL Online DDL: Improvements and Application
This article analyzes the implementation principle of Online DDL and its precautions. For any DDL operation, the performer must run a pre-test, or at least clearly understand whether the operation's impact on the database is within tolerable limits during business hours. For DDL on large tables in particular, pay close attention to the server's I/O, memory, and CPU usage (every DBA has shed tears over a DDL statement on a large table).
If this article is reprinted, please cite the source blog: www.cnblogs.com/xinysu/. The copyright belongs to xiaoradish at cnblogs. Thank you for your support!
1 Early DDL implementation principle (before 5.6.7)
In the early days, InnoDB supported executing DDL statements through two methods: copy table and inplace. The principles are as follows:
- Copy table
- Create a temporary table and apply the DDL statement to it.
- Lock the original table: DML is not allowed, but queries are allowed.
- Copy data row by row from the original table to the temporary table (this process is not sorted).
- When the copy is complete, disable reads on the original table as well; at this point the original table provides neither read nor write service.
- Perform the rename operation to complete the DDL process.
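As a rough illustration only (the server performs these steps internally while holding the locks described above; the temporary table name and added column here are invented for the example), the copy table method is roughly equivalent to this manual sequence:

```sql
-- 1. Temporary table with the altered definition (adding a hypothetical column)
CREATE TABLE tbddl_tmp LIKE tbddl;
ALTER TABLE tbddl_tmp ADD COLUMN extra_col INT;

-- 2. Row-by-row, unsorted copy while the original table blocks DML
INSERT INTO tbddl_tmp (SELECT t.*, NULL FROM tbddl t);

-- 3. Swap names to complete the DDL
RENAME TABLE tbddl TO tbddl_old, tbddl_tmp TO tbddl;
```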
- Inplace method (fast index creation; only for creating and dropping indexes)
- Create a temporary frm file.
- Lock the original table: DML is not allowed, but queries are allowed.
- Scan the data in clustered-index order, extract the required index column data, sort it, and insert it into the new index pages.
- Disable reads on the original table; at this point the original table provides neither read nor write service.
- Perform the rename operation, replace the frm file, and complete the DDL process.
Inplace is a major improvement over copy table: instead of copying the entire table, it only builds the required index pages within the original ibd file. This saves a great deal of I/O compared with copy table, greatly speeds up the DDL statement, and shortens the window during which the table cannot serve writes. However, inplace only supports index creation and deletion; all other DDL operations still fall back to the copy table method.
For an online business database, both the copy table and the inplace methods still have an obvious drawback: during the operation, the tables involved do not provide write service! INSERT, UPDATE, and DELETE against the table cannot be executed; only SELECT is supported.
2 Online DDL implementation principle
When DDL runs on a table, the table may be unable to serve writes for minutes or even hours, and performance and response times degrade. To improve this situation, MySQL introduced Online DDL in version 5.6.7. (This article refers to the official 5.7 documentation for organization and testing.) Online DDL still includes the copy and inplace methods: DDL statements that do not support Online DDL use the COPY method, while those that do use the INPLACE method. Inplace is further divided into two categories according to whether the table must be rebuilt, that is, whether the row record format changes. If the row record format is modified, the table must be rebuilt, for example when changing a column type or adding or removing a column. If only table metadata changes and the row record format does not, the table does not need to be rebuilt, for example when dropping an index, setting a default value, or renaming a column. For details, see section 4. How is Online DDL implemented internally? (Reference: http://blog.itpub.net/22664653/viewspace-2056953/) There are three phases: PREPARE, EXECUTE, and COMMIT.
- PREPARE
- Create a new temporary frm file
- Hold the EXCLUSIVE MDL lock, prohibiting reads and writes
- Determine the execution method (copy, rebuild, or no-rebuild) based on the ALTER type
- Update the in-memory objects of the data dictionary
- If a rebuild is required, allocate a row_log object to record incremental changes
- If a rebuild is required, generate a new temporary ibd file
- EXECUTE
- If only metadata is modified:
- No operation in this phase
- Otherwise:
- Downgrade the EXCLUSIVE MDL lock to allow reads and writes (the copy method still does not allow writes)
- Record the incremental changes generated during the DDL in the row-log (only needed for the rebuild type)
- Scan the clustered index of the old table, record by record
- Traverse the clustered and secondary indexes of the new table and process them one by one:
- Construct the corresponding index entries from each record
- Insert the constructed entries into a sort_buffer block
- Merge the sort_buffer block into the new index
- Apply the operations in the row-log to the new temporary table, up to the last block
- COMMIT
- Upgrade to the EXCLUSIVE MDL lock, prohibiting reads and writes
- Replay the last portion of the row_log increments
- Update the InnoDB data dictionary tables
- Commit the transaction and write the redo log
- Update statistics
- Rename the temporary ibd and frm files
- DDL completed
Pay attention to the row-log, which records the data changes made to the table while the DDL executes. It is what makes the table available for concurrent access: during the EXECUTE phase, write service is provided normally without blocking, and the row-log is finally applied to the new table. In a test on version 5.7.17, it was found that for DDL that supports inplace and requires a rebuild, data changes occurring during the DDL are flushed directly into the original ibd file. In the test environment, a column was dropped from the large table testddl; rows were inserted during this process, and the original ibd file grew by about 1 GB.
One open question: the row-log probably does not record the full modified row records, since that would be too slow; my preliminary guess is that it records only the primary key, and the new table is then looked up by primary key.
Online DDL effectively reduces the impact on the database during the DDL period:
- During Online DDL, queries and DML can in most cases execute normally, and the table-lock time is greatly reduced, preserving database scalability as much as possible;
- In-place DDL operations avoid the disk I/O and CPU cost of rebuilding the table, reducing the overall load on the database so that it maintains high performance and throughput during the DDL;
- In-place DDL operations also consume less buffer pool than operations that copy to a temporary file. Previously, copying data to a temporary table occupied buffer pool space and evicted frequently accessed data from memory, causing a temporary performance dip during the DDL.
3 Online DDL parameters and options
3.1 innodb_online_alter_log_max_size
When DML occurs during an online DDL, the data changes are recorded in the row-log, whose size is capped by innodb_online_alter_log_max_size. The default is 128 MB. When the table is large and heavily written, you can raise this parameter for the duration of the DDL to avoid error 1799 (the row-log overflowing).
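As a sketch (the 512 MB value is a placeholder; size it to your workload), the variable can be raised dynamically before running the DDL:

```sql
-- Check the current row-log cap (default 128 MB)
SHOW VARIABLES LIKE 'innodb_online_alter_log_max_size';

-- Raise it for a large, busy table; if the row-log overflows
-- during the DDL, the statement aborts with error 1799.
SET GLOBAL innodb_online_alter_log_max_size = 536870912;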
3.2 Online DDL syntax
- ALTER TABLE tbl_name ... , ALGORITHM [=] {DEFAULT | INPLACE | COPY}, LOCK [=] {DEFAULT | NONE | SHARED | EXCLUSIVE}
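A minimal example of the syntax above, using the tbddl table from the tests later in this article: request an in-place change and demand full concurrency. If the combination is unsupported, MySQL refuses the statement instead of silently downgrading.

```sql
ALTER TABLE tbddl
    ADD COLUMN su INT,
    ALGORITHM = INPLACE,
    LOCK = NONE;
```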
3.3 LOCK option
This option adjusts how the DDL locks the table. There are four values in total.
LOCK=EXCLUSIVE
- Takes an exclusive lock (X lock) on the entire table; neither queries nor modifications are allowed
LOCK=SHARED
- Takes a shared lock (S lock) on the entire table; queries are allowed, but data-change operations are not
LOCK=NONE
- No lock restrictions are added; both queries and data-change operations are allowed. This mode has the best concurrency.
LOCK=DEFAULT
- Used by default when no LOCK is specified
- Based on the DDL operation type, takes the minimum lock that still supports as many queries and DML operations as possible
- First determines whether the operation can use NONE mode; if not, whether it can use SHARED mode; if not, EXCLUSIVE mode
3.4 ALGORITHM option
The impact of DDL on database performance is largely determined by how the operation is carried out: whether it can run in place, whether it requires a COPY, and whether the entire table must be rebuilt. For example, on the same table, modifying or adding a default value does not touch the table data, so it completes within 1 s. Adding an index takes tens of seconds, since index pages must be built and the frm file modified, but the table data does not need to be rebuilt. Modifying a column's data type may take minutes or longer, because the entire table must be rebuilt, consuming large amounts of CPU, I/O, and buffer pool during execution. Whether the DDL uses INPLACE, COPY, or a REBUILD can be influenced by specifying ALGORITHM. (Note that not all DDL statements support in-place; see section 4 for details.)
ALGORITHM=INPLACE
ALGORITHM=COPY
Of the two options, INPLACE is better than COPY, because INPLACE does not need to log undo and redo for copying the whole table, and it allows DML operations during execution.
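One practical use of the option: specifying ALGORITHM=INPLACE explicitly acts as a guard. If the requested change can only run as COPY, the statement fails with an error rather than quietly copying the table. A sketch against the test table (IX_PROID is the index created in section 5):

```sql
-- Changing a column type requires COPY, so forcing INPLACE fails with an error
ALTER TABLE tbddl MODIFY COLUMN ItemId INT, ALGORITHM = INPLACE;

-- Dropping an index is metadata-only, so INPLACE succeeds
ALTER TABLE tbddl DROP INDEX IX_PROID, ALGORITHM = INPLACE;
```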
4 Online DDL syntax
Online DDL has different execution rules for different DDL statements. The table in the official documentation describes the support for each statement in detail. Column descriptions:
- In-Place?
- Meaning: does the statement support ALGORITHM=INPLACE?
- Rebuilds Table?
- Meaning: will the table be rebuilt?
- A table can be rebuilt in two ways: in place, or by copying to a temporary file
- If ALGORITHM=INPLACE is supported, the rebuild happens in place (pale yellow mark in the original table)
- If ALGORITHM=INPLACE is not supported, the data is copied to a temporary file for modification, and UPDATE, DELETE, and INSERT are not supported during the operation (dark brown mark in the original table)
- Permits Concurrent DML?
- Meaning: can DML statements on the table run concurrently during the DDL operation?
- DML is not supported while adding a spatial index or full-text index
- When concurrent DML is allowed, the LOCK option controls whether queries and/or modifications are provided:
- LOCK=NONE supports queries and UPDATE, INSERT, DELETE
- LOCK=SHARED supports queries only
- Only Modifies Metadata?
- Meaning: does the statement modify metadata only?
5 test records
5.1 Analysis of four typical DDL operations
Classifying by whether INPLACE is supported, whether a REBUILD is required, and whether only metadata is modified, one DDL statement of each type was selected for testing. Considering VARCHAR's variable-length behavior, extra VARCHAR tests were added.
5.1.1 DDL test content
- Test DB environment: table name tbddl, table size about 1 GB
- Test process: open a transaction with a query but do not commit => execute the DDL => commit the query transaction => execute DML => open a transaction, execute DML without committing => commit the DML
- Test DDL SQL
- ALTER TABLE tbddl MODIFY COLUMN ItemId VARCHAR(20);
- ALTER TABLE tbddl ADD xinysu INT;
- CREATE INDEX IX_PROID ON tbddl (providerid);
- ALTER TABLE tbddl ALTER COLUMN xinysu SET DEFAULT 123456;
- ALTER TABLE tbddl MODIFY COLUMN ItemId VARCHAR(50); # utf8 character set: three bytes per character, 50 characters is 150 bytes, less than 256 bytes
- ALTER TABLE tbddl MODIFY COLUMN ItemId VARCHAR(100); # utf8 character set: three bytes per character, 100 characters is 300 bytes, more than 256 bytes
- Test focus
- Enabling and disabling old_alter_table
- What locks are held in the prepare and commit phases?
- What locks are held in the execute phase?
- Server performance during execution (zabbix Monitoring)
- Database concurrency during execution (sysbench stress test)
5.1.2 DDL test conclusions
The testing process is omitted here; the results are pasted directly, and you can reproduce whichever cases interest you. VARCHAR is declared in characters, and each character's size depends on the character set; utf8 uses three bytes per character. When the maximum VARCHAR byte length is less than 256, only one byte is needed to store the actual length; when it is 256 or more, two bytes are needed. For example, for VARCHAR(10) in utf8, storing N characters (0 <= N <= 10) occupies N * 3 + 1 bytes; for VARCHAR(100) in utf8, storing N characters (0 <= N <= 100) occupies N * 3 + 2 bytes. With this understood, the DDL handling of growing or shrinking a column length becomes clear. Suppose column VARCHAR(M) is changed to VARCHAR(N), with the utf8 character set:
- When 3M < 256 and 3N < 256, the length prefix stays at 1 byte, the row records do not change, and only metadata needs to be modified;
- When 3M >= 256 and 3N >= 256, the length prefix stays at 2 bytes, the row records do not change, and only metadata needs to be modified;
- When 3M < 256 and 3N >= 256, the length prefix must grow from 1 byte to 2, the row records must change, and Online DDL uses the copy table method;
- When 3M >= 256 and 3N < 256, the length prefix must shrink from 2 bytes to 1, the row records must change, and Online DDL uses the copy table method.
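The four cases above reduce to a few lines of arithmetic. The following Python sketch (the helper names are mine, not MySQL's) computes the length-prefix width for a utf8 VARCHAR(M) column and decides whether a resize can be metadata-only:

```python
def length_prefix_bytes(chars: int, bytes_per_char: int = 3) -> int:
    """Bytes InnoDB needs to store the actual length of a VARCHAR value:
    1 if the maximum byte length is below 256, otherwise 2."""
    return 1 if chars * bytes_per_char < 256 else 2

def metadata_only_resize(m: int, n: int, bytes_per_char: int = 3) -> bool:
    """A VARCHAR(M) -> VARCHAR(N) change avoids the copy table method
    only when the length-prefix width is unchanged."""
    return length_prefix_bytes(m, bytes_per_char) == length_prefix_bytes(n, bytes_per_char)

print(metadata_only_resize(20, 50))    # 60 -> 150 bytes: prefix stays 1 byte
print(metadata_only_resize(50, 100))   # 150 -> 300 bytes: prefix grows 1 -> 2
```

This mirrors the two VARCHAR test statements in section 5.1.1: VARCHAR(50) stays under 256 bytes, VARCHAR(100) crosses it.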
5.2 Multiple DDL operations on the same table
Before Online DDL, the practice was to merge all DDL statements for a table into one SQL statement, avoiding the repeated rebuilds and repeated locking that prolong DML unavailability. With the introduction of Online DDL, two things change:
- Except for DDL statements that do not support inplace, DDL statements no longer hold an X lock throughout execution; the table still serves DML.
- Lock granularity: within a single merged DDL statement, locking is escalated to the strictest level any sub-operation requires.
- Ease of maintenance also matters.
Based on testing, I recommend splitting the statements into the following three groups (personal suggestion, for reference only).
- Why keep copy table statements separate?
- Because DML is not allowed during this type of operation, merge these into one DDL statement of their own, and do not merge them with inplace DDL statements;
- Why split inplace statements into two groups?
- Ease of maintenance
- Metadata-only DDL statements finish quickly; to keep management and maintenance simple, they are grouped into one statement of their own rather than pasted together with everything else.
- Rebuild statements are merged together to avoid repeated REBUILDs wasting disk I/O and CPU resources.
For example, suppose a project launch requires modifying table tbddl: change one column from INT to VARCHAR, add three columns, two indexes, and two default values, and grow the length of two columns. As separate statements:
- ALTER TABLE tbddl MODIFY COLUMN ItemId VARCHAR(20);
- ALTER TABLE tbddl ADD su INT;
- ALTER TABLE tbddl ADD xin VARCHAR(40);
- ALTER TABLE tbddl ADD yu INT;
- CREATE INDEX IX_SU ON tbddl (su);
- CREATE INDEX IX_YU ON tbddl (yu);
- ALTER TABLE tbddl ALTER COLUMN CreatedById SET DEFAULT 123456;
- ALTER TABLE tbddl ALTER COLUMN ItemID SET DEFAULT 654321;
- ALTER TABLE tbddl MODIFY COLUMN CreatedByName VARCHAR(70);
- ALTER TABLE tbddl MODIFY COLUMN ModifiedByName VARCHAR(100);
Based on the tests, the suggested grouping is:
- ALTER TABLE tbddl MODIFY COLUMN ItemId VARCHAR(20);
- ALTER TABLE tbddl ADD su INT, ADD xin VARCHAR(40), ADD yu INT, MODIFY COLUMN ModifiedByName VARCHAR(100), ADD INDEX IX_SU (su), ADD INDEX IX_YU (yu);
- ALTER TABLE tbddl ALTER COLUMN CreatedById SET DEFAULT 123456, ALTER COLUMN ItemID SET DEFAULT 654321, MODIFY COLUMN CreatedByName VARCHAR(70);
5.3 Database performance exception handling during DDL execution
During DDL execution, pay close attention to the database server's CPU and I/O, and monitor the connection pool and slow queries. What should you do if an exception occurs mid-execution? Suppose a new column is being added to tbddl and the operation turns out to affect online service and must be stopped urgently. Perform the following steps:
- SHOW PROCESSLIST;
- KILL <process id>;
5.4 Database downtime during DDL execution
If the server goes down during a DDL, what is the impact on database recovery and startup? Do the temporary files still exist? Will the unfinished DDL be executed automatically during recovery, and if so, what should be done? If not, does manually re-running it cause problems? In version 5.7.17, the four types of DDL statements were tested with the database crashing mid-DDL. The interrupted DDL does not affect recovery and startup, and the unfinished DDL statement is not re-executed automatically. Because the crash leaves no chance to clean up, the temporary files still exist after recovery. Since the DDL never committed, neither the data dictionary nor the table's metadata was modified, so manually re-running the DDL statement raises no conflict. (This differs from the analysis in some other blog posts; this test was on 5.7.17.) The testing process is not described here, only the conclusion; interested readers are welcome to test it themselves and discuss.
5.5 DDL impact on master/slave replication
Suppose a DDL statement takes 10 hours on the master, of which, excluding time waiting for the metadata lock, the rebuild or inplace work takes 5 hours. When the slave applies the relay log with a single SQL thread, you must consider slave lag. While the DDL runs on the master it has not committed, so nothing is sent to the slave, and the slave keeps applying other data changes normally; so far so good. But after the DDL commits on the master, its binlog event is shipped to the slave's relay log by IO_THREAD, and the slave's single-threaded SQL_THREAD must then spend at least 5 hours applying it. During those hours, the binlog the master keeps generating cannot be applied, and the slave lags badly. Whether this is acceptable must be factored into the impact assessment of the DDL. If a large slave lag is unacceptable, what can be done? One approach is to enable parallel replication on the slave. Notes on enabling parallel replication:
- Usage notes
- When the slave lags behind the master, enabling this feature allows multi-threaded parallel apply
- Enabling it is not recommended for databases with low business volume
- Configuration notes
- Set master_info_repository and relay_log_info_repository to TABLE. By default this state is written to the master.info and relay-log.info files, and the flush frequency of these two files has a large impact on performance; the difference can be 20% to 50%
- Setting slave_parallel_workers to the number of cores on the slave is recommended
- slave_parallel_type
- DATABASE: transactions against different databases replay in parallel on the slave
- LOGICAL_CLOCK: transactions that group-committed on the master replay in parallel on the slave; use this setting to reduce DDL-induced lag
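As a sketch of the configuration just described (MySQL 5.7 syntax; the worker count of 8 is a placeholder for the slave's core count), replication must be stopped before the repositories and parallel type can be changed:

```sql
STOP SLAVE;
SET GLOBAL master_info_repository    = 'TABLE';
SET GLOBAL relay_log_info_repository = 'TABLE';
SET GLOBAL slave_parallel_type       = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers    = 8;   -- placeholder: match slave core count
START SLAVE;
```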
6 Online DDL considerations
- Disk space
- Is datadir space sufficient during a rebuild?
- The ibd file will be copied, so make sure there is enough space.
- Is innodb_online_alter_log_max_size large enough during a rebuild?
- During the rebuild, concurrent DML generates row-log records of the changed rows; make sure the log is large enough to hold them.
- When using inplace, is tmpdir space sufficient?
- Is the DDL's effect on slave lag acceptable?
- Because the online DDL on the master does not commit until the end, other concurrent operations replicate to the slave normally during it.
- Only after the master commits is the DDL replicated to the slave.
- Since the slave's SQL_THREAD is single-threaded, a DDL that takes one hour to execute makes the slave lag by an hour or more.
- If such lag is unacceptable, parallel replication can be used to mitigate it.
- Does the row-log check for duplicate values or modification conflicts?
- Yes, it checks against the primary key and unique constraints.
- How to stop an in-flight DDL, whether copy table or inplace
- SHOW FULL PROCESSLIST;
- KILL <id>; # the id of the DDL SQL statement
- After the kill, the DDL statement can be run again normally without conflict; the temporary ibd and frm files it created are deleted automatically.
- Downtime during a copy table or inplace DDL
- In both cases, the unfinished DDL statement will not resume execution.
- However, the temporary frm and ibd files it generated are not deleted; they can be removed manually, and even if they are not, re-running the DDL is unaffected.
- Still, it is recommended to delete the useless temporary files after the mysql service recovers.
- For multiple DDL statements on the same table, do not execute them one by one.
- Merge them according to whether inplace is supported and whether a rebuild is required.
- How to view DDL progress (unsolved)
- If there is a rebuild, progress can be estimated from the growth of the ibd file; but for pure inplace, how can it be checked? Is there a better way? Does performance_schema provide a corresponding query?
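On 5.7, performance_schema does in fact expose ALTER TABLE progress through its stage instruments; a sketch (instrument and table names per the 5.7 manual):

```sql
-- Enable the InnoDB alter-table stage instruments and the stage consumers
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME LIKE 'stage/innodb/alter%';
UPDATE performance_schema.setup_consumers
   SET ENABLED = 'YES'
 WHERE NAME LIKE 'events_stages_%';

-- While the ALTER runs in another session, poll its progress
SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED
  FROM performance_schema.events_stages_current;
```

WORK_COMPLETED versus WORK_ESTIMATED gives a rough percentage, covering both the rebuild and the inplace phases.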