High-performance MySQL master-slave architecture replication principle and configuration detailed

Source: Internet
Author: User

Reprint: http://blog.csdn.net/hguisu/article/details/7325124

Review "high-performance MySQL" copy.

1 Replication Overview

MySQL's built-in replication capabilities are the foundation for building large, high-performance applications. The distribution of MySQL data across multiple systems is done by copying data from one of the MySQL hosts to the other host (slaves) and re-executing it again. One server acts as the primary server during replication, while one or more other servers act as slave servers. The primary server writes the update to the binary log file and maintains an index of the file to track the log loop. These logs can record updates that are sent to the slave server. When a primary server is connected from the server, it notifies the primary server where the last successful update was read from the server in the log. Receive any updates from the server from then on, and then block and wait for the primary server to notify the new updates.

Note that when you make a copy, all updates to the tables in the replication must be made on the primary server. Otherwise, you must be careful to avoid conflicts between updates to tables on the primary server and updates made to tables on the server.

1.1 MySQL Supported Replication types:

(1): statement-based replication: SQL statements executed on the primary server execute the same statement from the server. MySQL uses statement-based replication by default and is more efficient.
Row-based replication is automatically selected as soon as it is found that it cannot be copied accurately.
(2): Row-based replication: Copy the changed content past, rather than executing the command from the server again. Support Starting from mysql5.0
(3): Mixed-type replication: statement-based replication is used by default, and row-based replication occurs when a statement-based, inaccurate copy is found.

1.2. Replication-Resolved issues

MySQL replication technology has some of the following features:
(1) Data distribution (distribution)
(2) Load balancing (load Balancing)
(3) Backup (Backups)
(4) High-availability and fault-tolerant lines of availability and failover

1.3 How replication works

As a whole, there are 3 steps to replication:

(1) Master changes the record to binary log (these are called binary log events, binary logs event);
(2) Slave copies the binary log events of master to its trunk logs (relay log);
(3) Slave redo the event in the trunk log and change the data to reflect its own.

Describes the process of replication:

The first part of the process is the master record binary log. Master records these changes in two logs before each transaction update data is complete. MySQL writes the transaction serially to the binary log, even if the statements in the transaction are cross-executed. After the event is written to the binary log, master notifies the storage engine to commit the transaction.
The next step is to slave copy the binary log of master to its own trunk logs. First, slave starts a worker thread--i/o thread. The I/O thread opens a normal connection on master and then starts Binlog dump process. Binlog dump process reads the event from the binary log of master, and if it has been followed by master, it sleeps and waits for master to produce a new event. The I/O thread writes these events to the relay log.
The SQL slave thread (SQL slave thread) handles the last step of the process. The SQL thread reads events from the log and replays the events in them to update the slave data so that it is consistent with the data in master. As long as the thread is consistent with the I/O thread, the trunk log is typically located in the OS cache, so the overhead of the trunk log is minimal.
In addition, there is a worker thread in master: As with other MySQL connections, slave opening a connection in master will also cause master to start a thread. The replication process has a very important limitation-replication is serialized on slave, meaning that parallel update operations on Master cannot operate concurrently on slave.

2. Replication Configuration

There are two MySQL database servers master and Slave,master as the primary server, slave is the same as the data information from the server, initial state, Master and slave, when the data in master is changed, Slave also follow the corresponding changes, so that the master and slave data information synchronization, to achieve the purpose of backup.

Points:
The medium responsible for transmitting various modification actions in the main and from the server is the binary change log of the primary server, which records the various modifications that need to be transmitted to the slave server. Therefore, the primary server must activate the binary logging feature. The slave must have sufficient permission to connect to the primary server and request that the primary server transfer the binary change log to it.
Environment:
The MySQL database version of Master and slave is the same as 5.0.18
Operating system: Unbuntu 11.10
IP Address: 10.100.0.100

2.1. Create a Copy Account

1. Establish a backup account in Master's database: Each slave uses the standard MySQL username and password to connect to master. The user who makes the copy operation grants replication slave permissions. The password for the user name is stored in the text file Master.info

The command is as follows:
MySQL > GRANT REPLICATION slave,reload,super on * *
to [email protected] ' 10.100.0.200 '
Identified by ' 1234 ';

Create an account backup, and only allow to log in from the 10.100.0.200 address, the password is 1234.

(If the new old password algorithm for MySQL version is different, you can set: Set password for ' backup ' @ ' 10.100.0.200 ' =old_password (' 1234 '))

2.2. Copy data

(If you are completely new to installing MySQL master server, this step is not required.) Because the newly installed master and slave have the same data)

Shut down the master server, copy the data from master to the B server, synchronize the data in master and slave, and ensure that write operations are not performed on the master and slave servers until all the setup operations are complete. Make the data in both databases must be the same!

2.3. Configure Master

Next, the master is configured, including opening the binary log, specifying a unique SERVR ID. For example, add the following values to the configuration file:

Server-id=1
Log-bin=mysql-bin

Server-id: ID value for primary server A
Log-bin: Binary Change Day value

Restart MASTER, run the show Master STATUS, and the output is as follows:

2.4, Configuration slave

Slave configuration is similar to master, you also need to restart slave mysql. As follows:
Log_bin = Mysql-bin
server_id = 2
Relay_log = Mysql-relay-bin
Log_slave_updates = 1
READ_ONLY = 1
SERVER_ID is a must, and the only. Slave there is no need to open the binary log, but in some cases it must be set, for example, if Slave is the master of the other slave, you must set the Bin_log. Here we have the binary log turned on, and the display is named (the default name is hostname, but if the hostname changes there will be a problem).
Relay_log configures the trunk log, log_slave_updates indicates that slave writes the copy event into its own binary log (which you will see later).
Some people turn on the slave binary log, but do not set log_slave_updates, and then see if slave's data changes, which is a bad configuration. So, try to use READ_ONLY, which prevents the data from changing (except for special threads). However, read_only is useful, especially for applications that need to create tables on slave.

2.5. Start slave

The next step is to have slave connect to master and begin to redo the events in the master binary log. Instead of doing this with a configuration file, you should use the change MASTER to statement, which completely supersedes the modification of the configuration file, and it can specify a different MASTER for slave, without having to stop the server. As follows:

mysql> change MASTER to master_host= ' Server1 ',

Master_user= ' Repl ',

Master_password= ' P4ssword ',

Master_log_file= ' mysql-bin.000001 ',

master_log_pos=0;

The value of Master_log_pos is 0 because it is the starting position of the log.

You can use the show SLAVE status statement to see if the SLAVE settings are correct:

mysql> SHOW SLAVE Status\g

1. Row ***************************

Slave_io_state:

Master_host:server1

Master_user:repl

master_port:3306

Connect_retry:60

master_log_file:mysql-bin.000001

Read_master_log_pos:4

relay_log_file:mysql-relay-bin.000001

Relay_log_pos:4

relay_master_log_file:mysql-bin.000001

Slave_io_running:no

Slave_sql_running:no

..... omitted ...

Seconds_behind_master:null

Slave_io_state, slave_io_running, and slave_sql_running are no

Indicates that slave has not started the replication process. The location of the log is 4 instead of 0, because 0 is only the start of the log file, not the log location. In fact, MySQL knows the location of the first event is 4.

In order to start copying, you can run:

Mysql> START SLAVE;

Run Show SLAVE status to view the output:

mysql> SHOW SLAVE Status\g

1. Row ***************************

Slave_io_state:waiting for Master to send event

Master_host:server1

Master_user:repl

master_port:3306

Connect_retry:60

master_log_file:mysql-bin.000001

read_master_log_pos:164

relay_log_file:mysql-relay-bin.000001

relay_log_pos:164

relay_master_log_file:mysql-bin.000001

Slave_io_running:yes

Slave_sql_running:yes

..... omitted ...

seconds_behind_master:0

Here the main is to see:
Slave_io_running=yes
Slave_sql_running=yes

Slave I/O and SQL threads have started running, and seconds_behind_master is no longer null. The location of the log is increased, which means that some events have been acquired and executed. If you make changes on master, you can see the changes in the location of the various log files on the slave, as well as you can see the changes in the data in the database.

You can view the status of Master and slave on-line. On master, you can see the connection created by the slave I/O thread:

Enter show Processlist\g on master;

Mysql> Show Processlist \g

1. Row ***************************

Id:1

User:root

host:localhost:2096

Db:test

Command:query

time:0

State:null

Info:show processlist

2. Row ***************************

Id:2

User:repl

host:localhost:2144

Db:null

Command:binlog Dump

time:1838

State:has sent all binlog to slave; Waiting for Binlog to be updated

Info:null

2 rows in Set (0.00 sec)

Line 2 is the connection to the I/O thread that handles slave.

Run the statement on the slave server:

Mysql> Show Processlist \g

1. Row ***************************

Id:1

User:system User

Host:

Db:null

Command:connect

time:2291

State:waiting for Master to send event

Info:null

2. Row ***************************

Id:2

User:system User

Host:

Db:null

Command:connect

time:1852

State:has read all relay log; Waiting for the slave I/O thread to update it

Info:null

3. Row ***************************

Id:5

User:root

host:localhost:2152

Db:test

Command:query

time:0

State:null

Info:show processlist

3 Rows in Set (0.00 sec)

Row 1 is the I/O thread state and row 2 is the SQL thread state.

2.5. Add a new slave server

If master has been running for a long time, you want to synchronize data with the newly installed slave, even if it does not have master data.
At this point, there are several ways to get slave to start from another service, such as copying data from Master, cloning from another slave, and starting a slave from the most recent backup. When synchronizing with master, you need three things: Slave
(1) Data snapshot at a certain time of master;
(2) The master current log file, and the byte offset at the time the snapshot was generated. These two values can be called log file coordinate, because they determine the location of a binary log, and you can use the show MASTER Status command to find the coordinates of the log file;
(3) The binary log file of master.
You can clone a slave in the following ways:
(1) Cool copy (Cold copy)
Stop master, copy Master's files to slave, and restart master. The disadvantage is obvious.
(2) Hot copy (warm copy)
If you use only the MyISAM table, you can use the mysqlhotcopy copy even if the server is running.
(3) using mysqldump
Using mysqldump to get a snapshot of a data can be divided into the following steps:
<1> Lock Table: If you do not have the lock table, you should lock the table, prevent other connections to modify the database, otherwise, you can get the data is inconsistent. As follows:
Mysql> FLUSH TABLES with READ LOCK;
<2> in another connection, create a dump of the database you want to replicate with mysqldump:
shell> mysqldump--all-databases--lock-all-tables >dbdump.db
<3> release lock on table.
Mysql> UNLOCK TABLES;

3. In-depth understanding of replication

Some basic things about replication have been discussed, and the following is an in-depth discussion of replication.

3.1. Statement-based replication (statement-based Replication)

MySQL 5.0 and earlier versions only support statement-based replication (also called logical replication, logical replication), which is not common in databases. Master records the query that changes the data, and then slave reads the event from the log, and executes it, the same SQL statements as the statement that master executes.
The advantage of this approach is that it is simple to implement. In addition, the binary log of statement-based replication can be well compressed, and the amount of data in the log is small and consumes less bandwidth-for example, a query that updates gigabytes of data requires only dozens of-byte binary logs. And Mysqlbinlog is very convenient for statement-based log processing.
However, statement-based replication is not as simple as it seems, because some query statements depend on the specific conditions of master, for example, master and slave may have different times. So, MySQL's binary log format is not just a query statement, but also includes some metadata information, such as the current timestamp. Even so, there are some statements, such as the current user function, that cannot be copied correctly. In addition, stored procedures and triggers are also an issue.
Another problem is that statement-based replication must be serialized. This requires a lot of special code, configuration, such as InnoDB's Next-key lock, and so on. Not all storage engines support statement-based replication.

3.2. Record-based replication (row-based Replication)

MySQL adds record-based replication, recording the actual data changes in the binary log, similar to how some other DBMS implementations do. There are pros and cons to this approach. The advantage is that you can work correctly with any statement, and some statements are more efficient. The main drawback is that binary logs can be large and not intuitive, so you can't use Mysqlbinlog to view binary logs.
For some statements, record-based replication can work more efficiently, such as:
Mysql> INSERT into Summary_table (col1, col2, Sum_col3)
-SELECT col1, col2, sum (col3)
From enormous_table
GROUP by col1, col2;
Assume that there are only three unique combinations of col1 and col2, but the query scans many rows of the original table, but returns only three records. At this point, record-based replication is more efficient.
On the other hand, the following statement, statement-based replication is more efficient:
mysql> UPDATE enormous_table SET col1 = 0;
The cost of using record-based replication at this time is very high. Because both of these methods do not work well for all situations, MySQL 5.1 supports dynamic exchange before statement-based replication and record-based replication. You can control it by setting the session variable Binlog_format.

3.3. Copying related documents

In addition to binary logs and relay log files, there are other replication-related files. As follows:

(1) Mysql-bin.index

Once the server opens the binary log, it produces a file with the same name as the 2nd log file, but ending with. Index. It is used to track which binary log files exist on the disk. MySQL uses it to locate binary log files. The contents of it are as follows (on my machine):

(2) Mysql-relay-bin.index

The file functions like Mysql-bin.index, but it is for the trunk log, not the binary log. The contents are as follows:
. \mysql-02-relay-bin.000017
. \mysql-02-relay-bin.000018

(3) Master.info

Saves information about the master. Do not delete it, otherwise, slave cannot connect to master after a reboot. The content is as follows (on my machine):

I/O thread updates the Master.info file with the following (on my machine):

. \mysql-02-relay-bin.000019

254

mysql-01-bin.000010

286

0

52813

(4) Relay-log.info

Contains information about the current binary and trunk logs in slave.

3.4. Send replication events to other slave

When setting log_slave_updates, you can let slave play the Master of other slave. At this point, slave writes the events executed by the SQL thread to its own binary log (binary logs), and then its slave can get the events and execute it. As follows:

3.5. Copy filter (Replication Filters)

Replication filtering allows you to replicate only a subset of the data in the server, with two replication filters: Filtering events from the binary log on master, and filtering events in the trunk log on slave. As follows:

4. Common topological structure of replication

The architecture of replication has the following basic principles:
(1) Each slave can have only one master;
(2) Each slave can only have a unique server ID;
(3) Each master can have a lot of slave;
(4) If you set Log_slave_updates,slave to be the master of other slave, it will spread the update of master.

MySQL does not support multi-master server replication (multimaster Replication)-that is, a slave can have more than one master. However, with a few simple combinations, we can build a flexible and powerful replication architecture.

4.1, single master and multi-slave

It is easiest to make a replication system from a master and a slave. The slave does not communicate with each other and only communicates with master.

In real-world scenarios, MySQL replication over 90% is a master copy to one or more slave schema patterns, primarily for low-cost, scalable solutions to database-side applications with large read-stress ratios. Because as long as the pressure of master and slave is not too great (especially the slave pressure), the delay of asynchronous replication is generally very small. Especially since the slave end of the replication mode changed to two thread processing, but also reduced the slave end of the delay problem. The benefit is that the real-time requirements of the data is not particularly critical application, only through the cheap pcserver to expand the number of slave, the reading pressure spread to multiple slave machines, Can solve the reading performance bottleneck of the database by dispersing the reading pressure of the single database server, after all, the reading pressure in most database application system is much higher than the writing pressure. This largely solves the current database pressure bottleneck of many small and medium-sized websites, and even some large websites are using similar schemes to solve the database bottleneck.

As follows:

This structure can be taken if the write operation is small and the read operation is very good. You can distribute the read operations to other slave, thereby reducing the pressure on the master. However, when slave is increased to a certain number, the slave load on master and the bandwidth of the network become a serious problem.
This structure is simple, but it is flexible enough to meet the needs of most applications. Some suggestions:
(1) Different slave play different roles (e.g. using different indexes, or different storage engines);
(2) using a Slave as the backup master, copy only;
(3) Use a remote slave for disaster recovery;

It should be clear to everyone, from a master node can be copied out of multiple slave nodes, some might think, that one slave node can be copied from more than the master node? At least for the time being, MySQL is not going to make it, and it is unclear whether it will support it in the future.

MySQL does not support a slave node from multiple master nodes to replicate the schema, mainly to avoid the problem of conflict, to prevent data conflicts between multiple data sources, resulting in inconsistencies in the final data. But I've heard that someone has developed a patch that allows MySQL to support a slave node to replicate from multiple master nodes as data sources, which is the benefit of MySQL's Open source nature.

4.2. Active mode Master-master (Master-master in active-active mode)

Master-master replicates two servers, both master and slave of the other server. In this way, changes made by either party are applied to the other party's database through replication.

Perhaps some readers will have a concern, so that after the replication environment, will not cause two mysql between the replication? In fact, MySQL has long thought of this, so in MySQL Binarylog recorded the current MySQL Server-id, and this parameter is also the time we build mysqlreplication must be explicitly specified, Moreover, the Server-id parameter values of master and slave are more inconsistent than necessary to make the mysqlreplication build successfully. Once you have the value of Server-id, it is easy for MySQL to tell which mysqlserver the change originated from, so it is easy to avoid a recurring replication situation. Also, if we do not open the slave binarylog option (--log-slave-update), MySQL will not record changes in the replication process to Binarylog, not to mention the possibility of a recurring replication situation.


Active Master-master replication has some special uses. For example, two parts that are geographically distributed require a copy of their own writable data. The biggest problem with this structure is the update conflict. Assume that a table has only one row (one column) of data, with a value of 1, and if two servers execute the following statements, respectively:
Execute on the first server:
mysql> UPDATE tbl SET col=col + 1;
Execute on the second server:
mysql> UPDATE tbl SET col=col * 2;
So what's the result? One server is 4 and the other server is 3, but this does not produce an error.
In fact, MySQL does not support a number of other DBMS-supported multi-master server replication (multimaster Replication), which is a significant limitation of the replication functionality of MySQL (the difficulty of a multi-master server is to resolve the update conflict), but if you really have that need, You can use MySQL Cluster and combine Cluster and replication to build a powerful, high-performance database platform. However, there are other ways to emulate this multi-master server replication.

4.3. Active-Passive mode master-master (Master-master in active-passive mode)

This is a master-master structure change, it avoids the disadvantage of m-m, in fact, this is a fault-tolerant and high-availability system. The difference is that one of the services can only be read-only.

4.4 Cascade Replication Architecture Master–slaves-slaves

In some applications, it may be that the pressure difference between reading and writing is large, the reading pressure is particularly large, and a master may need 10 or more slave to support the pressure of the reading. At this time, master will be more difficult, because only the Slaveio thread is more, so write a little bit more pressure, the master side because replication will consume more resources, it is easy to cause replication delay.

How to solve this problem? At this point, we can use MySQL to record the binarylog information on the slave side of the change that was generated, that is, to open the-log-slave-update option. Then, a two-level (or more-level) copy is required to reduce the stress on the master side as a result of replication. That is, we first replicate from master through a handful of MySQL machines, which we would call the first level slave cluster, and then the other slave to replicate from the first level slave cluster. From the first level slave to replicate the slave, I call it the second level slave cluster. If necessary, we can continue to add more layers of replication. This makes it easy for us to control the number of slave attached to each mysql. This architecture I call master-slaves-slaves architecture.

This multi-tiered replication architecture makes it easy to address the risk of the master side becoming a bottleneck because of the fact that the secondary slave is too many. Shows the replication architecture for multi-level replication.

Of course, if the conditions allow, I would prefer to recommend that you break into multiple replication clusters to solve

The above bottleneck problem. After all, slave did not reduce the amount of write, all slave actually still applied all the data change operations and did not reduce any write Io. Conversely, the more slave, the greater the total amount of write Io for the entire cluster, the more we don't have a very obvious feeling, simply because it's scattered over multiple machines, so it's not easy to show.

In addition, increasing the cascade level of replication, the same change to the bottom of the slave need to pass the MySQL will also be more, the same may cause a longer delay risk.

And if we were to solve it by splitting the cluster, it would probably be much better, and of course, a split cluster would require more complex technologies and more complex application architectures.

4.5. Master-master structure with Slave server (master-master with slaves)

The advantage of this structure is that it provides redundancy. Geographically distributed replication structure, it does not have a single node failure problem, and can also put read-intensive requests on the slave.

cascading replication, to a certain extent, does solve the problem that Master has become a bottleneck because of the excessive number of slave attached to it, but he does not solve the problem of manual maintenance and the possibility of re-building replication after an exception needs to be switched. This naturally brings out the replication architecture that Dualmaster and cascade replication, which I call the master-master-slaves architecture

Compared to the master-slaves-slaves architecture, the difference is simply replacing the first-level slave cluster with a separate master, as an alternate master, and then copying from this alternate master to a slave cluster.

The greatest benefit of this dualmaster architecture is that it avoids the impact of the primary master's write operations from the replication of the slave cluster. While the master master needs to switch, there is basically no case of re-replication. However, this architecture also has a disadvantage, that is, the backup master may become a bottleneck, because if the subsequent slave cluster is larger, the standby master may be a bottleneck because of excessive SLAVEIO thread requests. of course, the standby master does not provide any read service, the bottleneck is not particularly high probability, if there is a bottleneck, you can also re-cascade replication after the standby master, set up a multilayer slave cluster. Of course, the more levels of Cascade replication, the more noticeable the data latency that may occur in the slave cluster, so before considering multi-level replication, you need to evaluate the impact of the data latency on the application system.

Reference: "High Performance MySQL"

High-performance MySQL master-slave architecture replication principle and configuration detailed

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.