MariaDB Parallel Replication Parallel replication

Source: Internet
Author: User

Official Document: Https://mariadb.com/kb/en/mariadb/parallel-replication


Starting with version 10.0.5, MARIADB began to support parallel replication

MariaDB10.0 can perform queries and copy operations in parallel from the server, this article will explain how it is implemented and what you can do to tune it.


Note: The version of MariaDB on the master-slave server must be a later version of 10.0.5 and 10.0.5 to enable parallel replication


Parallel Replication Overview--Parallel Replication overview


MariaDB replication is done in three steps:

1. Read the change events from the service's IO thread to the main library and place the sequence of the read events into the relay log

2. A SQL thread from the server reads one event in relay log at a time

3.SQL threads apply the events in relay log in turn


In versions prior to MariaDB 10, the third step was performed through SQL threads, which means that only one event can be executed at a time, and replication is inherently single-threaded.

In versions after MariaDB 10, the third step could be through a set of independent replication threads, because multiple events can be replicated in parallel at one time, improving replication performance.


How to enable parallel slave--how do I turn on parallel replication

The my.cnf Specifies slave-parallel-threads = n is passed as a parameter to MySQL.

Parallel replication can also be used for multi-source connection Settings "@ @connection_name. Slave-parallel-mode" To none, which is set @ @connection_name. Slave-parallel-mode = None


The value of Slave_parallel_threads determines how many worker threads in the pool will be created to perform parallel replication events from the server

If the value is 0, the default is to use the old copy method (that is, slave SQL thread to replay event events)

Typically, if the value is not 0, it should be set to at least twice times the number of links to the multi-source master server. A connection with a worker thread is of little significance, as this will not only increase the burden of internal thread traffic (that is, the traffic between SQL threads and worker threads), and when a connection uses a worker thread, events cannot be replicated in parallel.

The value of the slave_parallel_threads can be dynamically adjusted to make changes without restarting MySQL, but when the change occurs, all connections from the library are stopped.


What can is run in parallel--how to run parallel replication


What are the ways of parallel replication?

Two ways of In-order order and Out-of-order disorder

Orderly Way

The transactions are executed in parallel, but the commit order is sorted to ensure that the transaction commits from the library are consistent with the order of the transactions on the main library. Only those transactions that are automatically identified as not causing conflicts will be executed in parallel, meaning that parallel replication is completely transparent to the application.


Unordered mode

Unordered means that the order in which transactions are executed and committed from the library may not be consistent with the main library, that is, the application must have an issue that tolerates an inconsistent order of data updates on the master and slave servers, and this approach requires the application to ensure that there is no conflict between transactions. Unordered mode is only used when the Gtid pattern and application explicitly specify the use of unordered, and the replication domain is part of the Gtid.


Conservative mode of In-order parallel-optimistic parallel copy mode (optimistic)

Optimistic mode, supported from MARIADB version 10.1.3.


This pattern provides a large number of parallel application slave, while still preserving the exact transactional semantics from the application's point of view.

Turn on the Use configuration option--slave-parallel-mode=optimistic.


Any transaction DML (INSERT/update/delete) can run in parallel, reaching the limit @ @slave_domain_parallel_threads. This may cause a conflict in slave if two transactions attempt to modify the same row, detect any such conflicts, and the latter of the two transactions is rolled back, allowing the former to continue, once the former has been completed, the latter's transaction is re-tried.


The optimistic mode works in this way because the server will have some conflicting optimistic assumptions, and this extra work takes a rollback after a transaction conflict when the most trades are run reasonably parallel.


There is some inspiration to try to avoid unnecessary conflicts if a row lock wait is performed on a transaction in master. It will no longer run slave in parallel,

A transaction can also be explicitly identified as a potential conflict on master by setting the variable @ @skip_parallel_replication. This heuristic may be added later in the MARIADB version (not currently supported)

This is the next option--slave-parallel-mode called aggressive "aggressive" mode. When these heuristics are disabled, allow more transactions to be applied in parallel.


Non-transactional DML and DDL insecure optimistic applications are parallel because it can no longer be rolled back in case of conflict. Therefore, in optimistic mode, non-transactional (such as MyISAM) updates do not apply to parallel or earlier events. (However, it may be appropriate to update after a MyISAM update in parallel), and DDL statements do not apply to any other transaction, either early or late.


Different transaction types can be determined in the Mysqlbinlog output, for example:

#150324 13:06:26 Server ID 1 end_log_pos 6881 GTID 0-1-42 DDL

...

#150324 13:06:26 Server ID 1 end_log_pos 7816 GTID 0-1-47

...

#150324 13:06:26 Server ID 1 end_log_pos 8177 GTID 0-1-49 Trans

/*!100101 SET @ @session. skip_parallel_replication=1*//*!*/;

...

#150324 13:06:26 Server ID 1 end_log_pos 9836 GTID 0-1-59 Trans waited


Gtid 0-1-42 indicates DDL. Gtid 0-1-47 is marked as non-transactional DML. While Gtid 0-1-49 is a transactional DML ("trans" keyword), Gtid 0-1-49 runs on master @@ Skip_parallel_replication,gtid 0-1-59 is transactional, There is a row waiting for a DML lock on master. ("waited" keyword).


Conservative mode of In-order parallel-conservative parallel replication (Conservative) mode


The default is conservative mode, the only mode available in version 10.0, which can be turned on using--slave-parallel-mode=conservative.

In conservative mode, parallel replication uses "group submissions" on master to discover the slave of potential parallel application events.

On Mater If two things are committed in a group and they are all written into the same commit ID, the binlog of such events must not conflict with each other, and they can schedule the schedule to be run by replication to a different thread.


Two transactions that are committed separately by the Lord May conflict (for example, modifying one row of the same table). Therefore, the worker applying the second transaction does not start immediately.

But wait until the first transaction begins to commit the step, at which point it is safe to start a second transaction because it can no longer disturb the first execution.


This is an example of the Mysqlbinlog output, which shows that Gtid events is marked with a commit Id,gtid 0-1-47 no commit ID, cannot run in parallel,

Gtid 0-1-48 and 0-1-49 have the same commit ID 630, which can be replicated in parallel on another slave.


Binlog information is as follows:

#150324 12:54:24 Server ID 1 end_log_pos 20052 GTID 0-1-47 Trans

...

#150324 12:54:24 Server ID 1 end_log_pos 20212 GTID 0-1-48 cid=630 Trans

...

#150324 12:54:24 Server ID 1 end_log_pos 20372 GTID 0-1-49 cid=630 Trans


In either case, when two transactions arrive at a low-level committed point and determine the order of submission, the two commits occur in the same order on the Mater, and the operation is transparent to the application.

If more transactions are committed in a group submission, you can increase the chances of parallel replication to the slave.

This can be adjusted using Binlog_commit_wait_count and BINLOG_COMMIT_WAIT_USEC variables, for example, if the application can tolerate an additional delay of 50 milliseconds in master, you can set the Binlog_commit_ Wait_usec = 50000 and Binlog_commit_wait_count = 20, in a usable time, parallel replication can get up to 20 transaction processing.

But be careful not to set the binlog_commit_wait_usec too high, because this could cause the application to slow down and run a lot of serial small transactions one after another.


Note: There is still an opportunity to increase parallel speeds on master even without the parallelism of group submissions.

Because the actual submission steps for different transactions can be run. Particularly effective on slave, Binlog Open (Log_slave_updates=1)

Even if slave is configured for crash repair security (sync_binlog=1 and Innodb_flush_log_at_trx_commit=1), this group submission may be on slave


In--slave-parallel-mode=minimal mode, only the transaction commit step is applied in parallel. All other transactional replication occurs consecutively.


Out-of-order parallel--unordered parallel replication

Out-of-order parallel replication is used only in Gtid mode, when a different replication domain gtid is used, the replication domain is a variable gtid_domain_id that is used in dba/applications.

There are two transaction gtids in parallel replication, different domain_id are set on different threads, and allow execution to be completely independent of this, it is the application's responsibility to set different domain_ids only for those truly independent transactions,

and ensure that they do not conflict, the application must also work properly, even if there are different domain_id in the transaction, which are considered in different order, between slave and master, or in different slave.


Because the application can explicitly provide more opportunities to run transactions concurrently. Rather than the server can be determined automatically, parallel replication in the order of transactional parallel replication may automatically determine itself


A simple but effective usage is to run a long-running statement in a separate replication domain, such as change table. This allows replication of other transactions to continue uninterrupted:

SET SESSION gtid_domain_id=1

ALTER TABLE T ADD INDEX Myidx (b)

SET SESSION gtid_domain_id=0


Typically, a long-running ALTER TABLE statement or other query will stop all subsequent transactions. will cause slave to lag behind master at least a long time to execute this long-running query.

Sets the replication domain ID through unordered parallel replication. can be avoided. When you run ALTER TABLE, the dba/application must ensure that no conflicting transactions will be replicated.


Another common condition for out-of-order parallel replication is Yudoyan replication. Assuming we have two different master M1 M2, we take advantage of the multi-source replication S1 as a M1 and M2 SLAVE,S1 receive events from the M2 event parallel M1.

If we now have a one-third-level slave, copy master from S1 S2, we want S2 to also be able to apply events originating from M1 and M2 parallel events. This can be achieved from sequential parallel replication, which is different by setting the M1 and M2 gtid_domain_id.



Note that there is no special restriction that you can use out-of-order parallel replication, so that the only restriction on the same database/schema, even in the same table, is that the operation must be non-conflicting, that they must be able to apply in any sequence and eventually get the same result.



When an unordered parallel copy is used,

The current slave position is the master location

In master Binlog at any one time, become multidimensional-each replication domain can reach a different point,

The current position can be seen in the Gtid_slave_pos variable. When slave is stopped, restarted, or switched to replicate from different master using change master, MARIADB automatically handles restarting each replication domain at Binlog appropriate point.


When--slave-parallel-mode=minimal (or none) mode, unordered parallel replication is off.


Checking worker thread status in Show Processlist--check the thread state of the display list

In show Processlist, the worker threads will list "System User", their status will show the query they are currently working on, or it can display one of them:


"Waiting for work from main SQL threads". "Wait for work from the primary SQL thread".

This means that the worker thread is idle and there is no time for work to be used for it.

"Waiting for prior transaction to start commit before starting next transaction". "Wait until the next trade starts before you commit".

This means that the first batch of committed transactions must be completed in the primary. This worker thread waits for it to happen before it can start working in the following batch.

This worker thread was waiting for, happen before it can start working on the following batch.

"Waiting for prior transaction to commit". "Awaiting priority transaction submission".

This means that the transaction has been executed by the worker thread, and in order to ensure the order is committed, the worker thread waits for the commit, knowing that the last transaction is ready before it is committed.



Expected performance gain--performance expectations test article


Here's a performance article about 10 times times improved when using parallel replication

Article address: http://kristiannielsen.livejournal.com/18435.html.


slave-parallel-max-queued Configuration parameters:

The variable slave_parallel_max_queued is the only meaningful, assuming that the parallel replication technique is used (the Slave_parallel_threads parameter is greater than 0 o'clock), and when parallel replication is used, SQL threads Pre-read the event in Relay-log, queue time in memory, and in parallel replication, looking for opportunities for parallel execution of events. The @ @slave_parallel_max_queued variable is set to the SQL thread to read how much memory limits the log, looking for such an opportunity. Each thread is limited, so this read-ahead value is set @ @slave_parallel_threads variable is worth the


If this value is set too high and slave is too far behind master (such as gigabytes of Binlog), then the SQL thread can read quickly, and the Binlog event that fills a lot of memory is faster than the worker thread consumes


On the other hand, if the value is set too low, the SQL thread may not have enough space to keep enough of the event queue busy threads, which can degrade performance.


Note that the @ @slave_parallel_max_queued is not a hard limit because the current execution of the Binlog event always needs to be kept in memory.

For example, at least two events per worker thread can always be queued in memory, regardless of whether slave_parallel_threads is multiple


Typically, slave_parallel_threads should be set large enough that SQL threads take advantage of all possible parallelism to read far enough Binlog, and in normal operation, slave will not want to lag too far behind, so a large number of data queues will not have to be put in memory. So Slave_parallel_threads may set a fairly high (such as hundreds of KB) without limiting throughput. It should be set low enough that slave_parallel_threads * slave_parallel_max_queued does not cause the server to be running out of memory.


Slave_domain_parallel_threads Configuration parameters:

Share replication worker threads in all multi-source primary connections, in all replication domains that can be replicated in parallel using sequential replication.

If a primary connection or a replication domain is currently executing processing a long query, it is possible that it will be assigned to all worker threads in the pool and can only wait for their long-running queries to complete.

Delay any master master library connection or replication domain, which will have to wait for the replication domain worker to become idle.


You can avoid values below the Slave_parallel_threads setting by setting the value of the slave_domain_parallel_threads variable.

When the setting is not equal to 0 o'clock, each replication domain in a primary connection can retain multiple threads for up to one time period.

The rest, reaching the value of the Slave_parallel_threads parameter, has no other primary connection or replication domain to use in parallel replication.


The slave_domain_parallel_threads variable can be dynamically modified and can be changed without restarting the service. All slave must stop changing.


This article from the "Flying Ants" blog, declined to reprint!

MariaDB Parallel Replication Parallel replication

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.