[Reprint]github online Change MySQL table structure tool gh-ost

Source: Internet
Author: User
Tags table definition

GitHub has officially announced the release of Gh-ost:github's MySQL no-trigger online Change table Definition tool in open source mode!

Gh-ost was developed by GitHub in recent months to address a frequently encountered problem: changing product requirements constantly require changes to the MySQL table structure. Gh-ost changes the structure of the online table by means of a small, controllable, auditable and easy-to-operate scheme.

Before introducing Gh-ost, let's look at the various existing scenarios and why you should develop a new tool for yourself.

Existing online modification table definition scheme

Currently, the task of modifying the table definition online is accomplished in these three ways:

    • Modify the table definition from the library, and then promote it to the new Main library.
    • The InnoDB online DDL feature via MySQL.
    • Use the Modify Table Definition tool. Now the most popular is the Percona company's Pt-online-schema-change and Facebook's OSC, others use LHM or the earliest oak-online-alter-table.

There are other examples such as Galera cluster's rolling Schema Upgrade, or non-INNODB engine tables. GitHub's MySQL database uses a master-slave replication architecture, using a reliable InnoDB engine.

Why did we decide to design a new solution instead of just choosing one of the options above? Existing solutions have their own limitations, and the following is a simple analysis of their shortcomings. We will mainly analyze the shortcomings of the trigger-based online modification table Definition tool.

    • We've been using Pt-online-schema-change for years. However, as our data grows and our business pressure increases, we are confronted with more and more problems, even to the point where many modifications are considered "dangerous operations". There are some actions that you can only dare to perform during non-business hours or weekends, and the rest will always cause MySQL to stop service. All existing online modification table definition tools use MySQL triggers to migrate data, so there are some problems.

What's wrong with a trigger-based solution?

All tools used to modify table definitions are similar: Create a temporary table with the same definition as the original table, change the table definition when there is no data on it, and then copy the data from the original table to the temporary table incrementally, and continue to manipulate the data on the original table in progress (all inserts applied to the original table , delete, update operations) are also applied. When the tool copies all the data and the data on both sides is synchronized, it replaces the original table with this temporary table. The modification process is over.

Like Pt-online-schema-change, LHM and oak-online-alter-table, these tools use synchronous replication, and every data modification to a table is immediately applied to the temporary table in the same transaction. The Facebook tool uses asynchronous mode, which is written in a modified log table and then taken out to execute, applying the modification to the temporary table. These tools all use triggers to extract actions that are applied on the target table.

Triggers are stored procedures that are triggered when an insert, delete, modify operation is on a table. Triggers can include a lot of statements that are run in the same transaction space as the one that raised the trigger, thus guaranteeing the atomicity of these operations.

In general, triggers, especially trigger-based table definition modification operations, have the following problems:

  • Triggers are stored procedures, which are interpreted code, and MySQL does not precompile. By hard embedding them into the transaction space of a business operation, each operation performed on the table you want to modify increases the cost of the command parsing and interpreting.
  • Lock: The trigger shares the same transaction space as the action statement, and when the action statement releases the lock on the original table, the trigger then releases the lock on the other table. The consequences of such behavior are particularly severe in synchronous mode. Lock contention on the main library is directly related to write concurrency. We have been in a production environment where there has been an almost complete lock-up in the lock competition, completely inaccessible to the table or the entire database. Another lock caused by a trigger is a lock on the metadata when the trigger is created or destroyed. When we finished modifying the table definition and removed the trigger from the busy table, we had a situation where we could not provide the service for a few 10 seconds or even a few minutes.
  • Cannot pause: When the main library business load starts to increase, you may want to pause or cancel tasks that have not yet been completed to modify the table definition. But the trigger-based scenario has no way of doing this. You may be able to pause the operation of a row copy, but you cannot pause the trigger, because stopping the trigger causes the data to be dropped in the temporary table. Therefore, the trigger must always be in a working state throughout the process. On some busy servers, we have seen that even if the online operation is completely stopped, the main library is still being dragged to death by triggers.
  • Concurrent modifications: We all want to be able to modify the definition of multiple tables at the same time. Given the cost of the triggers analyzed above, we are not afraid to change the definition of multiple tables at the same time as the trigger pattern, and we have not heard of any company actually doing this online.
  • Test: You may want to test the feasibility of modifying the scheme and evaluate the load. A trigger-based scenario can only be simulated from the library through statement-based replication, because the copy operation from the library is single-threaded (that is, using multithreaded replication scenarios, in most cases), so that it is far from simulating the actual situation in the modification process on the main library.

Gh-ost

GH-OST is the acronym for GitHub's online Schema transmogrifier/transfigurator/transformer/thingy, meaning GitHub's on-Line table definition Converter.

Gh-ost has the following features:

    • No trigger
    • Lightweight
    • Can be paused
    • Dynamic controllable
    • Can be audited
    • Can be tested
    • Reliable

No trigger

Gh-ost does not use a trigger, it tracks the binary log file, and after the modification of the original table is committed, the modified content is applied asynchronously to the temporary table.

Gh-ost expects the binary to use the row-based log format, but this does not mean that if a statement-based log format is used on the main library, it cannot be used to modify the table definition online. In fact, the way we used to do this is to turn the log statement pattern from the library to the row mode, and then read the log from the library. It's not complicated to take one of these from the library.

Lightweight

Because there is no need to use triggers, gh-ost decouples the load defined by the modified table from the normal business load. It does not need to consider concurrent operations and competition on the modified tables, which are serialized in the binary log, gh-ost only the temporary table, completely unrelated to the original table. In fact, Gh-ost also serializes the write operation of the row copy to the binary log, so that for the main library there is only one connection that writes data to the temporary table sequentially, which is quite different from the common ETL.

Can be paused

Because all writes are gh-ost generated, and the read binary itself is an asynchronous operation, Gh-ost is completely able to pause all writes to the main library when paused. Pausing means that there are no writes and updates to the main library. However, the Gh-ost also has an internal status tracking table that writes the heartbeat information to the table even in a paused state, with negligible write volumes.

Gh-ost provides more features than a simple pause, which can be done in addition to pausing:

    • Load: A function similar to Pt-online-schema-change, the user can set the threshold value of MySQL indicator, such as setting threads_running=30.
    • Replication delay: Gh-ost built-in heartbeat feature to check for replication latency. The user can specify which delay to view from the library, and the Gh-ost default is to directly view the one from the library on which it is connected.
    • Command: The user can write some commands, depending on the output results to decide whether to start the operation. For example: SELECT HOUR (now ()) between 8 and 17.

All of the above indicators can be modified dynamically even in the process of modifying the table definition.

    • Flag bit file: Generate a flag bit file, Gh-ost will immediately pause. Delete the file, Gh-ost will resume work again.
    • User command: Connect the gh-ost over the network and let it pause by command.

Dynamic controllable

If another tool produces a higher load during the modification, the DBA has to stop it and modify the configuration, such as making a copy of the data smaller and then modifying the process from the beginning. Such repeated operations cost a lot.

Gh-ost gets the command by listening to a TCP or UNIX socket file. Even with ongoing modifications, the user can issue commands to the Gh-ost to modify the configuration, such as:

    • Echo Throttle | Socat-/tmp/gh-ost.sock: This is the pause command. You can also enter No-throttle
    • Modify the operating parameters, Gh-ost can accept such modifications to change its behavior: chunk-size=1500, max-lag-millis=2000, max-load=thread_running=30

Can be audited

Using the same interface described above, you can also view the status of the Gh-ost, view the current task progress, the main configuration parameters, the relevant MySQL instance, and so on. This information can be obtained by sending commands over the network, thus giving operators a great deal of flexibility, and if you are using a different tool, you can usually only share the screen or keep track of the latest content of the log file.

Can be tested

The operation of reading the contents of a binary file does not increase the load on the main library at all, and the operation of modifying the table structure from the library is very similar to that of the main library (although not exactly the same, but mostly the same).

The gh-ost comes with the--test-on-replica option to support the test feature, which allows you to run a modify table structure operation from the library, pauses the master-slave copy at the end of the operation, keeps both tables in sync, ready, and then switches the tables back on. This allows the user to take a leisurely look at the two tables and compare them.

We do this on GitHub in the production environment Test Gh-ost: We have a number of designated production from the library, not service on the above, but the cycle of constantly changing all the table definitions to change. For every table in our production environment, as small as hundreds of GB in size, it is modified by modifying the storage engine (ENGINE=INNODB) so that it does not really modify the table structure. At the end of each such modification we will stop the master-slave copy, and then the original table and the temporary table of the full amount of data each to do a checksum, and then compare two checksum, they are required to be consistent. Then we resume the master-slave copy, and then continue to test the next table. Every table in our production environment is so used Gh-ost has done several modifications to the library.

Reliable,

All of the above and what is not mentioned is to give you confidence in your ability to gh-ost. After all, people have been doing this for years with similar tools, and Gh-ost is just a new tool.

  • We tested the gh-ost from the library and we successfully tried thousands of times from the library before we made the first real change to the main library. So, please also start testing from the library, verify that the data is intact, and then use it in the production environment. We hope you can go and try.
  • After you have executed the gh-ost, you may see the load of the main library getting higher, so you can issue a pause command. Use the echo throttle command to generate a file to see if the load on the main library will become normal again. Try these commands, and you'll know how you can control the behavior and your heart will settle a lot.
  • You initiated a modification operation, and then estimated that the completion time is 2 o'clock in the morning, but you are very concerned about the last switch operation, very want to see it switch, what can I do? Only one flag bit file is required to tell Gh-ost to postpone the switch, so that Gh-ost will only finish copying the data, but will not switch tables. It also continues to synchronize data, keeping the data in the staging table in sync. After you return to the office the next morning, delete the flag file or send a command to Gh-ost Echo Unpostpone, it will switch. We don't want the software to force us to watch it do things, it should liberate us and let people do what they want to do.
  • When it comes to estimating completion time, the--exact-rowcount option is useful. At the very beginning, to make a costly select COUNT (*) operation on the target table to find out how many rows of data to copy, Gh-ost will have a more accurate estimate of how much work it will do. Next, during the copy process, it will continually try to update this estimate. Because the estimated time to complete is always changing, the percentage that has been completed is rather accurate. If you have had a very painful experience, watching that you have finished 99%, but the rest of the operation continues for one hours, you will enjoy the functionality that we provide.

Gh-ost working mode

Gh-ost can connect to multiple MySQL instances while working, and also get binary log events by connecting one of the instances from the library. Depending on your configuration, the database cluster schema, and where you want to perform the modification, there are a number of different modes of operation.

1, connect from the library, modify on the main library

This is the default working mode of Gh-ost, which will look at the library situation, find the main library of the cluster and connect to it. The specific steps to modify the operation are:

    • Read and write row data on the main library;
    • To read the binary log events from the library and apply the changes to the main library;
    • View table format, field, primary key, total number of rows, etc. from the library;
    • Read the Gh-ost internal event log (such as heartbeat) from the library;
    • Complete table switching on the main library;

This mode can be used if the binary log format of the main library is statement. But from the library must be paired with the Enable binary log (Log_bin, log_slave_updates), also set to ROW format (binlog_format=row), in fact Gh-ost will help you to do these settings from the library.

In fact, even if you change from library to row format, this is still the least intrusive mode of operation for the main library.

2, connected to the main library

If you are not from the library, or do not want to operate from the library, it is also possible to use the main library directly. Gh-ost will do all the work directly on the main library. The master-slave replication delay can still be viewed above.

    • The main library must produce a binary log in row format;
    • You must use the--allow-on-master option to turn on this mode when starting gh-ost.

3. Modify and test from the library

This mode will be modified from the library. Gh-ost will still connect to the main library, but all operations are made from the library and will not have any impact on the main library. During the operation, the Gh-ost is also paused occasionally, so that the data from the library can be kept up to date.

    • The--migrate-on-replica option allows Gh-ost to modify the table directly from the library. The final switchover process is also done in a state of normal replication from the library.
    • --test-on-replica indicates that the operation is for testing purposes only. The replication is stopped before the final switchover operation. The original table and the temporary table will switch to each other, then switch back, and eventually the original table is not passive. When the master-slave replication is paused, you can examine and compare the data in both tables.

Application of Gh-ost in GitHub

The table modification operations for GitHub production environments are now all done with Gh-ost. Every day whenever there is a need to come, it will run, sometimes a day will be done many times. Because it has audit and control functions, we can also integrate it with our chatops system. The technician can have a very clear understanding of the progress of its work, so it can control its behavior. Various indicators and events are collected in the production environment, allowing you to visualize the operation in a graphical manner.

Open source

Gh-ost is published to the open source community under the MIT license Agreement.

Although it is now stable, there are some aspects that want to continue to improve. GitHub now publishes it in the hope of getting involved and contributing to the community. They will also keep the community's advice on the public.

[Reprint]github online Change MySQL table structure tool gh-ost

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.