Understanding MySQL-Replication)

Source: Internet
Author: User
1. Overview of Replication

1.1 problems solved by Replication
Data replication technology has the following features:
(1) Data Distribution
(2) load balancing)
(3) backup
(4) high availability and fault tolerance

1.2 How replication works
From a high-level perspective, replication is divided into three steps:
(1) The master will record changes to binary logs (these records are called binary log events and binary log events );
(2) slave copies the binary log events of the master to its relay log );
(3) slave redo the events in the relay log, which will change and reflect its own data.

Describes the process:

 

The first part of this process is that the master records binary logs. Before each Transaction Completes data update, the master will record these changes in the second log. MySQL writes transactions into binary logs in sequence, even if the statements in the transaction are executed in a crossover manner. After the event is written to the binary log, the master notifies the storage engine to submit the transaction.
The next step is to copy the binary log of the master to its own relay log. First, slave starts a working thread-I/O thread. The I/O thread opens a normal connection on the master and starts the binlog dump process. The Binlog dump process reads the event from the binary log of the master. If it has already followed the master, it will sleep and wait for the master to generate a new event. The I/O thread writes these events to the relay log.
The last step of SQL slave thread processing this process. The SQL thread reads the event from the relay log, updates the slave data, and makes it consistent with the data in the master. As long as the thread is consistent with the I/O thread, the relay log is usually located in the OS cache, so the overhead of the relay log is very small.
In addition, there is also a working thread in the master: Like other MySQL connections, opening a connection in the master will also enable the master to start a thread. There is an important limitation in the replication process-replication is serialized on the slave, that is, the parallel update operations on the master cannot be performed in parallel on the slave.

 

2. Experience MySQL Replication
MySQL replication is a simple process. However, there are some changes in the basic steps according to specific application scenarios. The simplest scenario is a newly installed master and slave. The entire process is as follows:
(1) create a copy account on each server;
(2) Configure master and slave;
(3) Slave connects to the master to start replication.

2.1 create a replication account
Each slave uses a standard MySQL user name and password to connect to the master. The replication slave permission is granted to the user who performs the REPLICATION operation. The username and password are stored in the master.info text file. Assume that you want to create a repl user as follows:
Mysql> grant replication slave, replication client on *.*
-> TO repl @ '192. 192.% 'identified by 'p4ssword ';

2.2 configure master
Next, configure the master, including opening the binary log and specifying a unique servr ID. For example, add the following values to the configuration file:
[Mysqld]
Log-bin = mysql-bin
Server-id = 10
Restart the master and run show master status. The output is as follows:

 

 

2.3 configure slave
The Slave configuration is similar to that of the master. You also need to restart the slave MySQL. As follows:
Log_bin = mysql-bin
Server_id = 2
Relay_log = mysql-relay-bin
Log_slave_updates = 1
Read_only = 1
Server_id is required and unique. Slave does not need to enable binary logs, but in some cases, it must be set. For example, if slave is another slave master, bin_log must be set. Here, we enable binary logs and display the name (the default name is hostname, but if the hostname is changed, the problem will occur ).
Relay_log configures the relay log. log_slave_updates indicates that slave writes the replication event into its own binary log (which will be useful later ).
Some people have enabled the slave binary log, but have not set log_slave_updates, and then check whether the slave Data has changed. This is an incorrect configuration. Therefore, use read_only whenever possible, which prevents data changes (except for special threads ). However, read_only is very useful, especially for applications that need to create tables on slave.

 

2.4 start slave

Next, let slave connect to the master and begin to redo the events in the master binary log. Instead of using the configuration file for this operation, you should use the change master to statement. This statement can completely replace the modification TO the configuration file, and it can specify different masters for slave, you do not need to stop the server. As follows:

Mysql> change master to MASTER_HOST = 'server1 ',

-> MASTER_USER = 'repl ',

-> MASTER_PASSWORD = 'p4ssword ',

-> MASTER_LOG_FILE = 'mysql-bin.000001 ',

-> MASTER_LOG_POS = 0;

The value of MASTER_LOG_POS is 0 because it is the start position of the log. Then, you can use the show slave status statement to check whether the slave settings are correct:

Mysql> show slave status \ G

* *************************** 1. row ***************************

Slave_IO_State:

Master_Host: server1

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000001

Read_Master_Log_Pos: 4

Relay_Log_File: mysql-relay-bin.000001

Relay_Log_Pos: 4

Relay_Master_Log_File: mysql-bin.000001

Slave_IO_Running: No

Slave_ SQL _Running: No

... Omitted...

Seconds_Behind_Master: NULL

Slave_IO_State, Slave_IO_Running, and Slave_ SQL _Running indicate that the replication process has not started yet. The log location is 4 rather than 0, because 0 is the start position of the log file, not the log location. In fact, MySQL knows that the first event is located at 4.

To start replication, you can run:

Mysql> start slave;

Run show slave status to view the output result:

Mysql> show slave status \ G

* *************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: server1

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000001

Read_Master_Log_Pos: 164

Relay_Log_File: mysql-relay-bin.000001

Relay_Log_Pos: 164

Relay_Master_Log_File: mysql-bin.000001

Slave_IO_Running: Yes

Slave_ SQL _Running: Yes

... Omitted...

Seconds_Behind_Master: 0

Note that slave's I/O and SQL threads have started to run, and Seconds_Behind_Master is no longer NULL. The log location is increased, meaning that some events are obtained and executed. If you modify the log file on the master, you can view the changes in the location of various log files on the slave, and you can also see the changes in the database data.

You can view the status of the master and slave threads. On the master, you can see the connection created by the slave I/O thread:

Mysql> show processlist \ G

* *************************** 1. row ***************************

Id: 1

User: root

Host: localhost: 2096

Db: test

Command: Query

Time: 0

State: NULL

Info: show processlist

* *************************** 2. row ***************************

Id: 2

User: repl

Host: localhost: 2144

Db: NULL

Command: Binlog Dump

Time: 1838

State: Has sent all binlog to slave; waiting for binlog to be updated

Info: NULL

2 rows in set (0.00 sec)

Row 2 is the connection to the slave I/O thread.
Run this statement on slave:

Mysql> show processlist \ G

* *************************** 1. row ***************************

Id: 1

User: system user

Host:

Db: NULL

Command: Connect

Time: 2291

State: Waiting for master to send event

Info: NULL

* *************************** 2. row ***************************

Id: 2

User: system user

Host:

Db: NULL

Command: Connect

Time: 1852

State: Has read all relay log; waiting for the slave I/O thread to update it

Info: NULL

* *************************** 3. row ***************************

Id: 5

User: root

Host: localhost: 2152

Db: test

Command: Query

Time: 0

State: NULL

Info: show processlist

3 rows in set (0.00 sec)

Row 1 is the I/O thread state, and row 2 is the SQL thread state.

2.5 initialize slave from another master
We have discussed the assumption that you are a newly installed master and slave. Therefore, slave and master have the same data. However, this is not the case in most cases. For example, your master may have been running for a long time, and you want to synchronize the data of the newly installed slave, even if it does not have the data of the master.
There are several ways to start slave from another service, such as copying data from the master, cloning from the other slave, and starting from the latest backup. Three things are required for synchronization between Slave and master:
(1) Data snapshots at a certain time point of the master;
(2) The current log file of the master and the byte offset when the snapshot is generated. These two values can be called log file coordinates (log file coordinate), because they determine the location of a binary log, you can use the show master status Command to find the coordinates of the log file;
(3) binary log file of the master node.

You can use the following methods to create an slave:
(1) cold copy (cold copy)
Stop the master, copy the master files to slave, and restart the master. The disadvantage is obvious.
(2) hot copy (warm copy)
If you only use the MyISAM table, you can use mysqlhotcopy to copy the table, even if the server is running.
(3) Use mysqldump
Using mysqldump to get a data snapshot can be divided into the following steps:
<1> lock table: if you have not locked the table, you should lock the table to prevent other connections from modifying the database. Otherwise, the data you get may be inconsistent. As follows:
Mysql> flush tables with read lock;
<2> Use mysqldump to create a dump of the database you want to copy on another connection:
Shell> mysqldump -- all-databases -- lock-all-tables> dbdump. db
<3> release the lock on the table.
Mysql> unlock tables;

3. In-depth Replication
We have discussed some basic information about replication. Next we will discuss it in depth.

3.1 Statement-Based Replication)
MySQL 5.0 and earlier versions only support statement-based replication (also called logical replication and logical replication), which is not common in databases. The master records the query of changed data. Then, slave reads the event from the relay log and runs it. These SQL statements are the same as the statements executed by the master.
This method is easy to implement. In addition, statement-based replication of binary logs can be well compressed, and the log data volume is small, consuming less bandwidth-for example, A query that updates GB Data requires only dozens of bytes of binary logs. Mysqlbinlog is very convenient for statement-based log processing.
 
However, statement-based replication is not as simple as it looks, because some query statements depend on specific conditions of the master. For example, the master and slave may have different times. Therefore, the format of MySQL binary logs is not only a query statement, but also metadata information, such as the current timestamp. Even so, some statements, such as the current user function, cannot be copied correctly. In addition, stored procedures and triggers are also a problem.
Another problem is that statement-based replication must be serialized. This requires a lot of special code, such as the next-key lock of InnoDB. Not all storage engines Support statement-based replication.

3.2 record-Based Replication)
MySQL adds record-based replication and records changes in actual data in binary logs, which is similar to other DBMS implementations. This method has advantages and disadvantages. The advantage is that it can work correctly on any statement, and some statements are more efficient. The main drawback is that binary logs may be large and not intuitive. Therefore, you cannot use mysqlbinlog to view binary logs.
For some statements, record-based replication can work more effectively, such:
Mysql> insert into summary_table (col1, col2, sum_col3)
-> SELECT col1, col2, sum (col3)
-> FROM enormous_table
-> Group by col1, col2;
Assume that there are only three unique combinations of col1 and col2. However, this query scans many rows in the original table, but returns only three records. In this case, record-based replication is more efficient.
On the other hand, the following statements make statement-based replication more effective:
Mysql> UPDATE enormous_table SET col1 = 0;
In this case, the cost of using record-based replication is very high. Because the two methods cannot be well handled in all cases, MySQL 5.1 supports dynamic exchange before statement-based replication and record-based replication. You can control it by setting the binlog_format session variable.

3.3 copy Related Files
In addition to binary logs and relay log files, there are other files related to replication. As follows:
(1) mysql-bin.index
Once binary logs are enabled on the server, a file with the same name as the binary log file but ended with. index is generated. It is used to track which binary log files exist on the disk. MySQL uses it to locate binary log files. Its content is as follows (on my machine ):

(2) mysql-relay-bin.index
The file features similar to the mysql-bin.index, but it is for relay logs, not binary logs. The content is as follows:
. \ Mysql-02-relay-bin.000017
. \ Mysql-02-relay-bin.000018
(3) master.info
Save information about the master. Do not delete it. Otherwise, you cannot connect to the master after the slave is restarted. The content is as follows (on my machine ):

The I/O thread updates the master.info file. The content is as follows (on my machine ):

. \ Mysql-02-relay-bin.000019

254

Mysql-01-bin.000010

286

0

52813

(4) relay-log.info
Contains the current binary log and relay log information in the slave.

 

3.4 send replication events to other slave instances
When log_slave_updates is set, you can have slave Act as the master of other slave instances. In this case, slave writes the SQL thread-executed events to its own binary log. Then, its slave can get these events and execute them. As follows:

 

3.5 Replication Filters)
Replication filtering allows you to copy only part of the data on the server. There are two replication filtering methods: filter events in binary logs on the master node, and filter events in relay logs on the slave node. As follows:

 

 

4. Common topology of Replication
The replication architecture has the following basic principles:
(1) Each slave can have only one master;
(2) Each slave can have only one unique server ID;
(3) Each master can have many slave instances;
(4) If you set log_slave_updates, slave can be the master of other slave, thus spreading master updates.

MySQL does not support multi-master Replication-that is, one slave can have multiple masters. However, through some simple combinations, we can build a flexible and powerful replication architecture.

4.1. Single master and multi-slave
A replication system composed of a master and a slave is the simplest case. Slave does not communicate with each other and can only communicate with the master. As follows:

This structure can be used if the number of write operations is small and the number of read operations is very high. You can distribute read operations to other slave instances to reduce the pressure on the master node. However, when slave increases to a certain number, slave's load on the master and network bandwidth will become a serious problem.
This structure is simple, but flexible enough to meet the needs of most applications. Some suggestions:
(1) Different slave plays different roles (for example, using different indexes or different storage engines );
(2) Use a Server Load balancer instance as the slave master and only copy the slave database;
(3) Use a remote slave for disaster recovery;
4.2 Active Mode Master-Master (Master-Master in Active-Active Mode)
Two Master-Master replication servers are both the master and the slave of the other server.

Active Master-Master replication has some special functions. For example, both geographically distributed parts require their own writable data copies. The biggest problem with this structure is the update conflict. Assume that a table has only one row (one column) and its value is 1. If the two servers execute the following statement simultaneously:
Run the following command on the first Server:
Mysql> UPDATE tbl SET col = col + 1;
Run the following command on the second server:
Mysql> UPDATE tbl SET col = col * 2;
So what is the result? One server is 4 and the other server is 3, but this does not produce errors.
In fact, MySQL does not support multi-master Replication supported by other DBMS ), this is a huge limitation of the MySQL replication function (the difficulty of Multi-master servers lies in resolving update conflicts). However, if you have such requirements, you can use MySQL Cluster, by combining Cluster and Replication, you can build a powerful and high-performance database platform. However, you can simulate the replication of multiple master servers in other ways.

4.3 Active-Passive Mode Master-Master (Master-Master in Active-Passive Mode)
This is a change in the master-master structure, which avoids the disadvantages of the M-M, in fact, it is a fault tolerance and high availability system. The difference is that one service can only perform read-only operations.

4.4 Master-Master structure with slave servers (Master-Master with Slaves)
This structure provides redundancy. Geographically distributed replication structure, which does not have a single node failure problem, and can also put read-intensive requests on slave.

 

References: High Performance MySQL

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.