PagerDuty, an emerging internet start-up, is a product that sends alerts from the server side through on-screen notifications, phone calls, SMS messages, email, and more. Companies such as AdMob, 37signals, Stack Overflow, and Instagram have adopted PagerDuty as their notification and incident-handling tool. Doug Barth, the author of the original article, shared how PagerDuty migrated its existing MySQL system to XtraDB Cluster, along with the pros and cons of the process.
Six months ago, PagerDuty successfully migrated its existing MySQL systems to XtraDB Cluster running on Amazon EC2.
Configuration of the old system
From a configuration perspective, it was a fairly typical MySQL environment:
A pair of Percona servers wrote data to a DRBD volume.
The DRBD volume was backed by EBS volumes on both the primary and the secondary machine.
Two synchronously replicated databases were set up so that the business could fail over seamlessly to the secondary server whenever the primary had a problem.
A set of asynchronous replicas was configured for disaster recovery, backups, and out-of-band maintenance.
Problems
After years of dependable service, the old system began to struggle as reliability problems became more and more prominent. On top of that, every primary-server switch was painful: to switch the DRBD primary, we had to stop MySQL on the primary, take the DRBD volume offline, promote the secondary to primary, remount DRBD, and finally restart MySQL. The whole procedure caused a service outage, because MySQL came back up with a cold buffer pool that needed time to warm before the system could handle its normal load again.
We tried to shorten the outage with Percona's buffer pool restore feature (buffer-pool-restore), but it was slow with a buffer pool as large as ours, and it added extra system resource overhead. Another problem was that whenever an unplanned primary switch occurred, the asynchronous replicas stopped working and had to be restarted by hand.
Reasons for embracing XtraDB Cluster
XtraDB Cluster's characteristics: compared with the previous two-machine setup, the cluster runs three machines at the same time, replicating synchronously between them, so failover time is greatly reduced.
It supports multiple masters online at the same time, each with a warm buffer pool. Asynchronous replicas can use any node as their master and can be moved between nodes without interrupting replication.
The automatic node-provisioning mechanism fits well with our existing automation. After configuring a new node, we only need to hand it the address of an existing node; the new node automatically receives a copy of the data set and, once synchronized, joins the pool of masters.
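As a rough sketch of what that provisioning relies on: joining is governed by a few standard Galera/XtraDB Cluster settings, which can be inspected on a running node (the variables below are generic, not PagerDuty's actual configuration):

    SHOW GLOBAL VARIABLES LIKE 'wsrep_cluster_address';  -- gcomm:// list of existing nodes to join
    SHOW GLOBAL VARIABLES LIKE 'wsrep_sst_method';       -- how the full data set is copied to the new node (e.g. xtrabackup)
    SHOW GLOBAL VARIABLES LIKE 'wsrep_node_address';     -- the address this node advertises to the cluster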
Preparation
Some preparation was needed before XtraDB Cluster could be plugged into the existing system. Part of it was simple MySQL tuning; the rest involved more fundamental operational changes.
Operations on the MySQL side (a sketch of these follows the list):
Make sure only InnoDB tables are used and that every table has a primary key.
Make sure the query cache is not used, because the cluster does not support it.
Switch replication from statement-based to row-based.
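A minimal sketch of these checks and switches in SQL, assuming the MySQL 5.5/5.6 generation in use here (the excluded schemas are just the system ones):

    -- Find InnoDB tables that still lack a primary key
    SELECT t.table_schema, t.table_name
      FROM information_schema.tables t
      LEFT JOIN information_schema.table_constraints c
        ON c.table_schema = t.table_schema
       AND c.table_name = t.table_name
       AND c.constraint_type = 'PRIMARY KEY'
     WHERE t.engine = 'InnoDB'
       AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
       AND c.constraint_name IS NULL;

    -- Turn the query cache off (also make it permanent in my.cnf)
    SET GLOBAL query_cache_type = OFF;
    SET GLOBAL query_cache_size = 0;

    -- Switch replication to row-based (also set binlog_format = ROW in my.cnf)
    SET GLOBAL binlog_format = 'ROW';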
Beyond these MySQL-side operations, the application side needed the following changes so that it could be tested independently of the DRBD servers:
Use a distributed lock mechanism, because MySQL locks (for example, those taken by SELECT FOR UPDATE) are local to a single cluster node.
So we replaced MySQL locks with ZooKeeper locks.
Because written data has to be synchronized to every node, we changed the processing logic to replace single large-scale writes with many smaller ones (see the sketch below).
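As an illustration of that batching change (the table and column names are hypothetical, not PagerDuty's schema): instead of one huge statement that every node must certify and apply as a single write set, the same work runs in bounded chunks:

    -- Before: one large write set
    -- DELETE FROM events_archive WHERE created_at < '2013-01-01';

    -- After: many small write sets; repeat until ROW_COUNT() returns 0
    DELETE FROM events_archive
     WHERE created_at < '2013-01-01'
     LIMIT 10000;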
Choosing how to make schema changes
Schema changes behave quite differently in XtraDB Cluster. It offers two ways to apply them: total order isolation (TOI) and rolling schema upgrade (RSU).
In RSU mode, nodes are updated one at a time: while a DDL statement runs, the node desynchronizes from the cluster and rejoins once the statement completes. However, this mode proved destabilizing for us: replicated writes are buffered until the DDL statement finishes, and the large flush of buffered data afterwards caused unavoidable problems.
In contrast, TOI applies the update on every node at the same point in the replication stream, blocking cluster communication until the update completes. After weighing the two, we decided to adopt TOI; keeping each change short meant the cluster was blocked only briefly rather than suffering a longer outage.
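Galera exposes this choice as a server variable, so switching between the two modes is just a setting (TOI is the default):

    SET GLOBAL wsrep_OSU_method = 'TOI';    -- DDL replicates in total order and briefly blocks the whole cluster
    -- SET GLOBAL wsrep_OSU_method = 'RSU'; -- DDL runs locally while the node desyncs, then the node rejoins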
Migration process
First, we built a cluster alongside the current system as an asynchronous slave of the existing DRBD database. Once the slave cluster was receiving all write operations, we could stress-test it to see how much load it could carry, and collect and analyze the relevant data.
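Attaching the cluster as an asynchronous slave of the DRBD primary is ordinary MySQL replication; a sketch, with the host, credentials, and binlog coordinates as placeholders:

    -- Run on one node of the new cluster
    CHANGE MASTER TO
      MASTER_HOST = 'drbd-primary.example.internal',
      MASTER_USER = 'repl',
      MASTER_PASSWORD = 'replication-password',
      MASTER_LOG_FILE = 'mysql-bin.000001',
      MASTER_LOG_POS = 4;
    START SLAVE;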
After a series of benchmark tests, we found two tuning details that helped the new system perform consistently with the old one (a sketch follows the list):
Setting innodb_flush_log_at_trx_commit to 0 or 2 gives the best write performance. Since all changes are replicated to the three nodes, data is not lost even if a single node fails.
innodb_log_file_size needs to be set to a larger value; we set it to 1GB.
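As a sketch of how those two settings are applied: the first is dynamic, while the second has to go into my.cnf and only takes effect after a node restart on this MySQL generation:

    -- Relax per-commit log flushing; durability comes from synchronous replication to the other nodes
    SET GLOBAL innodb_flush_log_at_trx_commit = 2;

    -- innodb_log_file_size = 1G belongs in my.cnf; verify the running value with:
    SHOW GLOBAL VARIABLES LIKE 'innodb_log_file_size';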
Once the test results for the XtraDB cluster looked satisfactory, we moved on to the actual switchover.
First, we set up the configuration in all test environments so that, if the cluster went down, we could quickly recover it to a single-node cluster. We prepared specific operating procedures and ran the relevant stress tests.
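One common way to perform that kind of single-node recovery, assuming Galera's standard provider options, is to force a surviving node to form a new primary component by itself:

    -- Run on the surviving node to bootstrap a new single-node cluster
    SET GLOBAL wsrep_provider_options = 'pc.bootstrap=YES';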
After making the cluster a slave of the existing system's two DRBD servers, we also re-pointed the remaining replicas (for example, the disaster recovery and backup servers) to replicate from the cluster. When everything was ready, we performed an ordinary slave promotion to switch the system onto the new environment.
The architecture before and after the switch is shown in the figure below:
Advantages after the switch
We can now restart and upgrade nodes of a running cluster without the communication disruption this used to cause. Schema changes go through successfully in TOI mode using pt-online-schema-change. Write-conflict handling is also better: when a conflict is detected, XtraDB Cluster returns a deadlock error (the same error is also returned while a DDL statement is executing in TOI mode). The conflict error makes the application server return a 503, our load-balancing layer catches it, and the write request is retried on another server.
Disadvantages after the switch
Some of the cluster's key status counters are stateful; for example, they reset to 0 after a SHOW GLOBAL STATUS command. This makes it hard to do important monitoring based on those counters, such as tracking flow control, because the resets prevent an accurate picture of the system's state (the problem is fixed in Galera 3.x, which XtraDB Cluster 5.6 uses). In addition, when a write conflict occurred, our ORM's MySQL adapter (ActiveRecord) swallowed the exception thrown by the interactive statement.
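The counters in question are the standard Galera flow-control status variables, read like this:

    SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';
    -- e.g. wsrep_flow_control_paused and wsrep_flow_control_sent; on this Galera generation
    -- the values reset after being read, so monitoring has to work around the reset.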
The cold buffer pool problem still needs further work. At present, each application server connects to a local HAProxy instance, which forwards its connections to a single cluster node. For planned maintenance, we can only shift traffic slowly onto another node to warm its buffer pool before it takes over the full system load. In the future, we plan to move to a fully multi-master setup so that every node keeps a warm buffer pool.