MySQL replication (II)


As we saw in previous articles, binary logs play an important role in replication. This article focuses on the core component behind MySQL replication: the true nature of the binary log.

Binary log structure

Conceptually, the binary log is a series of binary log events. Physically it consists of a set of binlog files plus a binlog index file; the binlog file the server is currently writing to is called the active binlog. The file names are defined by the log-bin and log-bin-index options in the configuration file.
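
A minimal my.cnf sketch, assuming a base name of master-bin (both names are illustrative, not defaults):

# Base name for the binlog files and the binlog index file
[mysqld]
log-bin       = master-bin
log-bin-index = master-bin.index

With this in place, SHOW BINARY LOGS lists the binlog files recorded in the index file.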

Each binlog file is composed of several binlog events, starting with the Format_description event and ending with the Rotate event.

The Format_description event contains information about the server that wrote the binlog file and key information about the file's status. If the server is shut down or restarted, it creates a new binlog file and writes a new Format_description event to it. This event is needed because changes, such as a server upgrade, can occur while the server is shut down or restarted. After the server finishes writing a binlog file, it adds a Rotate event to the end of the file; this event contains the file name of the next binlog file and the position where reading should start.

Apart from the Format_description and Rotate events, the events of a binlog file are managed in groups. For a transactional storage engine, each group corresponds to a transaction; for a non-transactional storage engine, each statement is a group by itself. In general, each group is executed in full or not at all: if the Slave is stopped in the middle of a group for some reason, replication restarts from the beginning of the group rather than from the last statement executed.

Binlog event structure

Binary log version 4 (binlog format 4) was introduced in MySQL 5.0 and was specifically designed to be extensible. This article mainly discusses binary log version 4. (MySQL 3.23, 4.0, and 4.1 use binary log version 3.)

Each binlog event consists of three parts:

  • Common header: fixed size; contains the basic information about the event. The most important fields are the event type and the event size.
  • Post header: fixed size; its contents depend on the specific event type.
  • Event body: variable size; stores the main data of the event, which varies with the event type.

Take a look at the Format_description event:

  • Binlog file format version
  • Server version string: generally made up of three parts: the version number, a hyphen, and any build options. Example: 5.1.42-debug-log
  • Common header length: stores the length of the common header. Because this field lives in the Format_description event, the value can vary between binlog files. The stored length applies to all events except the Format_description and Rotate events, whose common header lengths are fixed: the Format_description event's length stays fixed so that a server of any version can read the event, and the Rotate event's length stays fixed because it is the first event used when a Slave connects to a Master.
  • Post header lengths: the post header length of each event type is fixed within a binlog file. This field stores an array of the post header lengths of all event types. Because different servers may support different numbers of event types, the number of event types is stored just before this field.
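
Because the Format_description event is always the first event in a binlog file, these fields can be inspected from the client. A minimal sketch; the first row returned is the Format_description event (displayed with Event_type Format_desc), and its Info column shows the server version string and the binlog version:

master> SHOW BINLOG EVENTS LIMIT 1\G
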
Use events to record database changes

First, because the binary log is a shared resource, all threads write to it. To prevent two threads from updating the binary log at the same time, a thread must acquire a mutex, Lock_log, before writing, and release it after writing.

All statements that update the database are written to the binary log as Query events. In addition to the statement actually executed, a Query event contains the additional context information needed to execute the statement. The following items describe how that context information is recorded.

  • Current database: a special field in the Query event records the current database.
  • User-defined variable values: a User_var event records the name and value of a single user-defined variable.
  • Seed of the RAND function: a Rand event records the random seed used by the RAND function.
  • Current time: the NOW, CURDATE, CURTIME, UNIX_TIMESTAMP, and SYSDATE functions use the current time; a timestamp stored in the event indicates when the statement started to execute.
  • Value inserted into an AUTO_INCREMENT column: an Intvar event records the value of the table's auto-increment counter before the statement executed.
  • Return value of LAST_INSERT_ID: when the statement calls LAST_INSERT_ID, an Intvar event records the function's return value.
  • Thread ID: mainly relevant for the handling of temporary tables; the thread ID is stored in the Query event as a separate field.

*The SYSDATE function returns the time at which the function itself is executed, which differs from NOW: NOW returns the time at which the statement started executing. SYSDATE is therefore not safe for replication and should be used as little as possible.
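
A minimal sketch of the difference, assuming a hypothetical table t with a DATETIME column created_at:

# Unsafe: SYSDATE() is re-evaluated when the Slave replays the statement
INSERT INTO t (created_at) VALUES (SYSDATE());
# Safe: NOW() is reconstructed on the Slave from the timestamp stored in the event
INSERT INTO t (created_at) VALUES (NOW());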

The LOAD DATA INFILE statement

LOAD DATA INFILE is special in that its context includes a file in the file system. To transfer and execute the LOAD DATA INFILE statement correctly, several new event types are needed:

  • Begin_load_query: this event begins the transfer of the file's data.
  • Append_block: if the file exceeds the maximum packet size allowed on the connection, a series of one or more Append_block events following the Begin_load_query event carries the rest of the file.
  • Execute_load_query: a special variant of the Query event that contains the LOAD DATA INFILE statement executed on the Master.

For each LOAD DATA INFILE statement executed on the Master, the file being read is mapped to an internal file-backed buffer, which is used in the subsequent processing. In addition, a unique file ID is assigned to the statement and is used to refer to the file the statement reads.

When the statement is executed, the contents of the file are written to the binary log as a sequence of events starting with a Begin_load_query event, which marks the beginning of a new file. The sequence continues with zero or more Append_block events. No event written to the binary log exceeds the maximum packet size, which is given by the max_allowed_packet option.

After the entire file has been read into the table, the statement's execution is finished off by writing an Execute_load_query event to the binary log. This event contains the executed statement together with the file ID assigned to it. Note that this is not the original statement as the user wrote it, but a re-created version.
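
A minimal sketch of a statement that produces this event sequence; the file and table names are hypothetical:

# Logged as one Begin_load_query event, zero or more Append_block
# events, and a final Execute_load_query event
LOAD DATA INFILE '/tmp/employees.csv' INTO TABLE employee
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';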

*Versions earlier than MySQL 5.0.3 use slightly different event names: Load_log_event, Execute_log_event, and Create_file_log_event.

Binary log filtering

my.cnf has two options for filtering what goes into the binary log: binlog-do-db and binlog-ignore-db. Both options can be given multiple times.
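
A minimal my.cnf sketch, assuming hypothetical database names, showing that the option can be repeated:

# Statements whose current database is bad or ugly are not logged
[mysqld]
binlog-ignore-db = bad
binlog-ignore-db = ugly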

The way MySQL filters events may seem a bit odd to people unfamiliar with it. MySQL performs filtering at the statement level, and binlog-*-db uses the current database to decide whether to filter a statement, not the database of the tables the statement affects. In the following examples, with binlog-ignore-db = bad set (as in the sketch above), none of the statements are written to the log.

USE bad; INSERT INTO t1 VALUES (1),(2);
USE bad; INSERT INTO good.t2 VALUES (1),(2);
USE bad; UPDATE good.t1, ugly.t2 SET a = b;

As for why filtering is not decided by the databases of the tables the statement affects, consider this: if that logic were used, should the third statement be logged when filtering with binlog-ignore-db = ugly?

To avoid having statements filtered out unexpectedly, do not qualify table names, function names, or stored procedure names with a database name in your statements; instead, use USE to change the current database, as shown below.
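
A minimal sketch of the rewrite, reusing the tables from the examples above:

# Instead of qualifying the table with a database name:
USE bad; INSERT INTO good.t2 VALUES (1),(2);
# change the current database explicitly:
USE good; INSERT INTO t2 VALUES (1),(2);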

Note also that if binlog-do-db is set, the filter ignores any binlog-ignore-db settings.

For MySQL replication, filtering is not recommended, because it leaves the binary log incomplete.

Binary log and security

Generally, a user with the REPLICATION SLAVE privilege can read all events on the Master, so for security this account must not be compromised. Specific precautions include:

  • Make it impossible, as far as practical, to log in to this account from outside the firewall.
  • Log all attempts to log in to this account, and keep the logs on a separate, secured server.
  • Encrypt the connection between the Master and the Slave, for example with MySQL's built-in SSL support.
  • Do not put sensitive information, such as passwords, in the log files.
# The first statement writes the plaintext password into the binary log;
# the second form is safer: only the hashed value reaches the log
UPDATE employee SET pass = PASSWORD('foobar');

SET @pass = PASSWORD('foobar');
UPDATE employee SET pass = @pass;
Triggers

To replay the binary log on a server without running into permission problems on the various tables, all statements have to be executed with SUPER privileges. Triggers, however, should not be defined to run with SUPER privileges, so it is important to re-create each trigger with the correct user as its definer. CREATE TRIGGER therefore provides a DEFINER clause: if a statement does not specify a DEFINER, a DEFINER clause naming the current user is added to the statement when it is written to the binary log.

master> SHOW BINLOG EVENTS FROM 92236 LIMIT 1\G
******************** 1. row ********************
Log_name: master-bin.000038
Pos: 92236
Event_type: Query
Server_id: 1
End_log_pos: 92491
Info: use `test`; CREATE DEFINER=`root`@`localhost` TRIGGER ...

The statement that invokes the trigger is recorded in the binary log, but it is not linked to a specific trigger. Instead, when the Slave executes the statement, it automatically executes all triggers associated with the tables the statement affects. This means that different triggers can be created on the Master and the Slave.
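
A minimal sketch of a trigger with an explicit DEFINER; the user, tables, and trigger body are hypothetical:

# Hypothetical trigger: the explicit definer makes the Slave execute it
# with this user's privileges when the statement is replayed
CREATE DEFINER = 'app'@'localhost' TRIGGER tr_employee_log
AFTER INSERT ON employee FOR EACH ROW
INSERT INTO log (mail, status) VALUES (NEW.mail, 'OK');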

Stored procedures

The definition statement of a stored procedure is handled much like a trigger's: the CREATE PROCEDURE statement also has an optional DEFINER clause, which is forcibly added when the statement is written to the binary log. Calling a stored procedure, however, is handled differently from triggers.

# Define a stored procedure
DELIMITER $$
CREATE PROCEDURE employee_add(p_name CHAR(64), p_email CHAR(64), p_pass CHAR(64))
MODIFIES SQL DATA
BEGIN
  DECLARE pass CHAR(64);
  SET pass = PASSWORD(p_pass);
  INSERT INTO employee (name, email, password) VALUES (p_name, p_email, pass);
END $$
DELIMITER ;

# Call the stored procedure
master> CALL employee_add('chuck', 'chuck@example.com', 'abrakadaba');
master> SHOW BINLOG EVENTS FROM 104033\G
******************** 1. row ********************
Log_name: master-bin.000038
Pos: 104033
Event_type: Intvar
Server_id: 1
End_log_pos: 104061
Info: INSERT_ID = 1
******************** 2. row ********************
Log_name: master-bin.000038
Pos: 104061
Event_type: Query
Server_id: 1
End_log_pos: 104416
Info: use `test`; INSERT INTO employee (name, email, password) VALUES (
NAME_CONST('p_name', _latin1'chuck' COLLATE 'latin1_swedish_ci'),
NAME_CONST('p_email', _latin1'chuck@example.com' COLLATE 'latin1_swedish_ci'),
NAME_CONST('pass', _latin1'*FEB778934FDSFQOPL7...' COLLATE 'latin1_swedish_ci'))

Note the following four points:

  • The CALL statement itself is not written to the binary log. Instead, the statements executed as a result of the call are written to the binary log.
  • The statement is rewritten so that it contains no references to the stored procedure's parameters. Instead, the NAME_CONST function is used to create a single-value result for each parameter.
  • The locally declared variable pass is likewise replaced with a NAME_CONST expression.
  • The context information is written to the binary log before the rewritten statement; that is what the Intvar event is.
Stored functions

The definition statement of a stored function is handled much like a trigger's: the CREATE FUNCTION statement also has an optional DEFINER clause, which is forcibly added when the statement is written to the binary log. When called, a stored function is replicated in the same way as a trigger. Note that although SELECT statements are normally not written to the binary log, a SELECT statement that contains a stored function is an exception.

There is also a privilege issue worth mentioning for stored functions. Defining a stored procedure or function requires the CREATE ROUTINE privilege. Strictly speaking, no other privileges are needed to define a stored routine, but it normally executes with the privileges of its creator. The replication thread on the Slave, however, executes without privilege checks, which would leave a serious security hole. Stored procedures in MySQL 5.0 and earlier caused no problems here, because statements that violated privileges on the Master were never written to the binary log. Since stored procedures are unrolled, only the statements actually executed successfully on the Master are written to the binary log, so again there is no problem. Stored functions are a little different: they are not unrolled, which means different branches of the function body could be executed on the Master and on the Slave, creating a potential security hole. This can be prevented by defining stored functions with SQL SECURITY DEFINER instead of SQL SECURITY INVOKER. For this reason, MySQL by default requires the SUPER privilege to define stored functions.
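
A minimal sketch of a stored function defined with SQL SECURITY DEFINER; the function and tables are hypothetical:

# Hypothetical function: SQL SECURITY DEFINER makes it run with the
# definer's privileges on both Master and Slave, avoiding branch
# divergence caused by the invoker's privileges
DELIMITER $$
CREATE FUNCTION employee_email(p_name CHAR(64)) RETURNS CHAR(64)
READS SQL DATA
SQL SECURITY DEFINER
BEGIN
  RETURN (SELECT email FROM employee WHERE name = p_name);
END $$
DELIMITER ;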

Events

Like other stored programs, event definitions have a DEFINER clause. Because events are invoked by the event scheduler, they are always executed as the definer, so they do not suffer from the security hole that affects stored functions. When an event is executed, the statement is written directly to the binary log. Since events are executed on the Master, they are automatically disabled on the Slave. If the Slave is promoted to Master, however, these events must be allowed to execute on the Slave:

UPDATE mysql.event SET status = 'ENABLED' WHERE status = 'SLAVESIDE_DISABLED';
Special constructs

Although statement-based replication is usually simple, some special constructs must be handled carefully to ensure that the context in which the Slave executes a statement is the same as it was on the Master.

LOAD_FILE function

The LOAD_FILE function lets you fetch the contents of a file. Because the file is not transferred during replication, statements that use it must be rewritten.

INSERT INTO document (author, body) VALUES ('fox', LOAD_FILE('index.html'));

# Rewrite with LOAD DATA INFILE
LOAD DATA INFILE 'index.html' INTO TABLE document
FIELDS TERMINATED BY '@*@' LINES TERMINATED BY '&%&'
(body) SET author = 'fox';

# Or rewrite using a user-defined variable
SET @document = LOAD_FILE('index.html');
INSERT INTO document (author, body) VALUES ('fox', @document);
Non-transactional changes and error handling

Suppose an employee table uses the transactional InnoDB storage engine (with mail as the primary key), while a log table that tracks changes to employee uses the non-transactional MyISAM engine, and two triggers are defined on employee. One, tr_insert_before, fires before INSERT and inserts a record with status FAIL into the log table; the other, tr_insert_after, fires after INSERT and changes that record's status to OK. If two identical records are inserted in a row, the second insert fires tr_insert_before but never reaches tr_insert_after, because the duplicate key makes the INSERT fail. The failed insert into employee is rolled back, but the row inserted into the non-transactional log table cannot be rolled back; this is the problem. A sketch of the two triggers appears below, followed by the binary log produced after executing the statements.
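
A minimal sketch of the two triggers described above, with assumed table layouts (the original does not give them):

# Hypothetical definitions: employee is InnoDB, log is MyISAM
CREATE TRIGGER tr_insert_before BEFORE INSERT ON employee FOR EACH ROW
INSERT INTO log (mail, status) VALUES (NEW.mail, 'FAIL');

CREATE TRIGGER tr_insert_after AFTER INSERT ON employee FOR EACH ROW
UPDATE log SET status = 'OK' WHERE mail = NEW.mail;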

master> SET @pass = PASSWORD('xyz');
master> INSERT INTO employee (name, mail, password) VALUES ('hu', 'hu@fox.com', @pass);
master> INSERT INTO employee (name, mail, password) VALUES ('hu', 'hu@fox.com', @pass);
master> SHOW BINLOG EVENTS IN 'master-bin.000023' FROM 1252\G
******************** 1. row ********************
Log_name: master-bin.000023
Pos: 1252
Event_type: Query
Server_id: 1
End_log_pos: 1320
Info: use `test`; BEGIN
******************** 2. row ********************
Log_name: master-bin.000023
Pos: 1320
Event_type: Intvar
Server_id: 1
End_log_pos: 1348
Info: LAST_INSERT_ID=1
******************** 3. row ********************
Log_name: master-bin.000023
Pos: 1348
Event_type: User var
Server_id: 1
End_log_pos: 1426
Info: @`pass`=_utf8 0x432423jklfslagklr... COLLATE utf8_general_ci
******************** 4. row ********************
Log_name: master-bin.000023
Pos: 1426
Event_type: Query
Server_id: 1
End_log_pos: 1567
Info: use `test`; INSERT INTO employee ...
******************** 5. row ********************
Log_name: master-bin.000023
Pos: 1567
Event_type: Xid
Server_id: 1
End_log_pos: 1594
Info: COMMIT /* xid=60 */
******************** 6. row ********************
Log_name: master-bin.000023
Pos: 1594
Event_type: Query
Server_id: 1
End_log_pos: 1662
Info: use `test`; BEGIN
******************** 7. row ********************
Log_name: master-bin.000023
Pos: 1662
Event_type: Intvar
Server_id: 1
End_log_pos: 1690
Info: LAST_INSERT_ID=1
******************** 8. row ********************
Log_name: master-bin.000023
Pos: 1690
Event_type: User var
Server_id: 1
End_log_pos: 1768
Info: @`pass`=_utf8 0x432423jklfslagklr... COLLATE utf8_general_ci
******************** 9. row ********************
Log_name: master-bin.000023
Pos: 1768
Event_type: Query
Server_id: 1
End_log_pos: 1909
Info: use `test`; INSERT INTO employee ...
******************** 10. row ********************
Log_name: master-bin.000023
Pos: 1909
Event_type: Query
Server_id: 1
End_log_pos: 1980
Info: use `test`; ROLLBACK

Transactions

The preceding binary log shows that transactions require extra handling. To keep all the statements of each transaction together, transactions are written to the binary log in commit order, not in the order in which they started. To guarantee that each transaction is written to the binary log as a unit, the server separates the statements executed in different threads and stores them in per-thread transaction caches; when a transaction commits, the contents of its cache are copied to the binary log and the cache is emptied.

So how are non-transactional statements recorded? Three rules apply:

  1. If a statement is marked as transactional, it is written to the transaction cache.
  2. If a statement is not marked as transactional and the transaction cache is empty, the statement is written directly to the binary log.
  3. If a statement is not marked as transactional but the transaction cache is not empty, the statement is written to the transaction cache.
If a transaction includes non-transactional statements, make sure the statements that affect non-transactional tables come first in the transaction; by rule 2 they are then written directly to the binary log. If later parts of the transaction need values from those statements, the values can be saved in a temporary table or in variables.

Using XA for distributed transaction processing

XA includes a transaction manager that coordinates a set of resource managers so that they commit a global transaction as an atomic unit. Each transaction is assigned a unique XID, which is used by the transaction manager and the resource managers. When XA is used inside the MySQL server, the transaction manager is the binary log and the resource managers are the storage engines. Committing an XA transaction proceeds as follows:
  1. In the first phase, each storage engine is asked to prepare for the commit. While preparing, a storage engine writes everything it needs to commit correctly to safe storage and then returns OK. If any storage engine answers no, meaning it cannot commit the transaction, the commit is aborted and all engines are told to roll the transaction back.
  2. Once all storage engines have returned OK, and before the second phase starts, the transaction cache is written to the binary log. An ordinary transaction ends with a normal Query event containing COMMIT; an XA transaction ends with an Xid event containing the XID.
In the second phase, all the storage engines that prepared in the first phase are asked to commit the transaction. When it commits, each storage engine reports that the transaction is in persistent storage. It is important to understand that the commit cannot fail: once phase 1 has passed, the storage engine must guarantee that the transaction can be committed, and therefore must not report a failure in phase 2. Even if a hardware fault crashes the system, the storage engine can recover correctly after the server restarts, because the necessary information is already in persistent storage.

The recovery procedure is roughly as follows. At startup, the server opens the last binary log and checks its Format_description event. If the binlog-in-use flag is set, the server crashed and XA recovery is needed. The server reads the Xid events from the active binary log and collects the XIDs of all transactions in it. Each storage engine, as soon as it starts, is asked to commit the transactions in this list. For each XID in the list, the storage engine determines whether it corresponds to a transaction that was prepared but not committed; if so, the transaction is committed. If a storage engine has a prepared transaction whose XID is not in the list, the XID was evidently never written to the binary log before the crash, so the transaction must be rolled back.

Binary log management

The events discussed so far all represent data changes on the Master. Some events do not represent data changes but still affect replication, for example when the server is stopped and data files have to be modified. To handle such situations, additional event types are needed.

Binary logs and crash safety

When the server crashes, it is important that the database and the binary log stay consistent with each other. In other words, no change should be committed to a storage engine unless it was written to the binary log, and vice versa.

Non-transactional engines are a problem here. It is impossible, for example, to keep the binary log consistent with MyISAM tables, because MyISAM is non-transactional: the table has already been modified before any attempt is made to log the statement. Transactional storage engines are different. As mentioned above, events are written to the binary log after all changes have been handed to the storage engines but before any table locks are released. If the system crashes before the storage engine releases the locks, the server must make sure the changes written to the binary log are actually in the tables before letting the transaction commit; this requires coordination with the underlying file system.

Recall from the XA discussion that, to survive a crash safely, all data must have been written to disk by the time the first phase completes. This means the system's page cache must be flushed to disk every time a transaction completes, which is expensive and not required by every application. The sync-binlog option controls how often the binary log is synced to disk. The default value is 0, meaning the decision of when to write to disk is left entirely to the operating system; a value of n means the binary log is synced to disk after every n transaction commits.
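
A minimal sketch of the safest setting, assuming the extra disk syncs are acceptable (the option can also be set in my.cnf):

# Sync the binary log to disk after every transaction commit
SET GLOBAL sync_binlog = 1;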

Binlog file rotation

At intervals, MySQL starts a new file to hold binary log events. Switching files is called binlog file rotation.

There are four main operations that cause file rotation:

  1. The server stops: a new binary log file is opened each time the server starts.
  2. The binlog file reaches its maximum size: this size is controlled by the max-binlog-size option.
  3. An explicit flush: FLUSH LOGS.
  4. A server incident: some incidents require special manual intervention, which creates a "gap" in the replication stream.
Each binary log starts with a Format_description event, which describes the server and the file's status and contents. A few of its fields deserve attention:
  • Binlog-in-use flag: the server may crash while writing a binary log, so it must be possible to tell whether a file was closed properly; moreover, using a corrupted file for recovery would cause further problems. This flag identifies the file's integrity: it is set when the file is created and cleared after the Rotate event is written to the file.
  • Binary log file format version: as described for the Format_description event earlier.
  • Server version: as described earlier.
To ensure that log files can be switched safely even if the server crashes, the server uses a write-ahead strategy and records its intention in a temporary file called the purge index file (so named because it is also used when binary logs are purged); its name is derived from the binary log index file name. After creating the new binary log file and updating the index file, the server deletes the temporary file.

Incidents

So-called incident events are events that do not represent data changes on the server but must still be written to the binary log because they may affect replication. Most of them do not require DBA intervention; a database restart is one example.

  • Stop: indicates that the server was shut down normally. If the server crashes, no Stop event is written. The event appears in the old binary log file, since a new file is started after restart. A Stop event contains only a common header. When the binary log is replayed on a Slave, Stop events are ignored. What, then, is the event for? A backup may be restored, or files manually modified, before replication is restarted; when the DBA replays the log files, the Stop event shows where replay should start or stop.
  • Incident: introduced in MySQL 5.1. Unlike the Stop event, this event contains an identifier specifying what kind of incident occurred. It is generally used to indicate that the server was forced to perform changes that are not recorded in the binary log; for example, the database was reloaded, or a non-transactional event was too large to write to the binary log. MySQL Cluster generates this event when reloading the database on one of its nodes. When the binary log is replayed on a Slave, replication stops when an Incident event is encountered.
Deleting binlog files

There are several ways to delete binlog files:

1. Set the expire-logs-days option in my.cnf.
2. PURGE BINARY LOGS BEFORE datetime;
3. PURGE BINARY LOGS TO 'filename';
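
Two hedged examples of the PURGE forms above; the retention window and the file name are illustrative:

# Delete binlog files older than three days
PURGE BINARY LOGS BEFORE NOW() - INTERVAL 3 DAY;
# Delete all binlog files up to, but not including, the named file
PURGE BINARY LOGS TO 'master-bin.000038';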

Mechanism for deleting binlog files:
Before deleting, the server writes the list of files to be deleted to a temporary file (the purge index file), then deletes the binlog files, and finally deletes the temporary file. This way, even if the system crashes in the middle of the deletion, the files that were not yet deleted can be removed when the server starts again. As mentioned earlier, the purge index file is also used during file rotation.

The mysqlbinlog utility

mysqlbinlog is a small program for viewing binlog files and relay log files. It can be used to inspect the contents of binary logs directly on the server, and it is a powerful tool for analyzing them: it can show the statements and the event contents of all the logs, so it is often used for troubleshooting. For details on its usage, see the official documentation. Note that you can view binary logs with the --hexdump option, but you need to know the log's data format. For example, integer fields in the binary log are printed in little-endian byte order, so they must be read from right to left: the 32-bit block 03 01 00 00 represents the hexadecimal value 0x103.
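
A minimal invocation sketch; the file name is hypothetical:

# Print the events of a binlog file together with a hex dump of each event
$ mysqlbinlog --hexdump master-bin.000038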

--- To Be Continued
