High Availability of MySQL Cluster-1

Source: Internet
Author: User

MySQL Cluster can achieve 99.999% availability. This chapter introduces some basic concepts.

Network Partitioning: If a network problem separates the nodes of a MySQL Cluster into sets that cannot reach each other, the cluster must determine which nodes continue to provide service. For example, suppose two node groups hold four data nodes: SN1, SN2, SN3, and SN4. SN1 and SN2 are in Node Group 1; SN3 and SN4 are in Node Group 2. If {SN1, SN3} and {SN2, SN4} lose contact, each half holds a complete copy of the data and could serve clients on its own. To preserve data integrity, one half must be shut down. Depending on the configuration, one or more management nodes can act as arbitrator and decide which half survives.

If two management nodes (MGM1 and MGM2) are configured to provide arbitration, avoid the following layout:

Router 1: MGM1, SN1, SN3
Router 2: MGM2, SN2, SN4

With this layout, if the link between Router 1 and Router 2 fails and both MGM1 and MGM2 can arbitrate, both halves may keep serving clients, and their data will then diverge.

Synchronous Replication: Within a node group, transactions are committed with a two-phase commit, so the data on each data node is synchronously replicated to the other node in its group. A MySQL Cluster configuration usually keeps at least two copies of all data, stored on different hosts. To guard against failure of the whole system, the cluster also writes transaction logs and checkpoint files to disk at configurable intervals. If one node fails, at least one other data node holds the same information.

Note: Synchronous replication covers memory only. Committed data may not yet have been written to disk (this depends on checkpoints and on the redo and undo logs). If all nodes of a node group lose power at the same time, data may be lost.
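As a rough illustration of the two situations above (a partition in which both halves hold complete data, and the loss of a whole node group), the following sketch checks whether a set of live nodes still covers every node group. The helper `is_viable` and the dictionary layout are hypothetical and only model the idea; they are not part of MySQL Cluster.

```python
def is_viable(node_groups, alive):
    """Return True if every node group still has at least one live node.

    Hypothetical helper: a set of nodes holds a complete copy of the data
    only if it contains at least one member of each node group.
    """
    return all(
        any(node in alive for node in members)
        for members in node_groups.values()
    )


# Node layout taken from the example in the text.
node_groups = {
    "Node Group 1": {"SN1", "SN2"},
    "Node Group 2": {"SN3", "SN4"},
}

# Network partition: the two halves can no longer reach each other.
half_a = {"SN1", "SN3"}
half_b = {"SN2", "SN4"}

# Both halves are viable on their own, so an arbitrator must pick one.
needs_arbitration = is_viable(node_groups, half_a) and is_viable(node_groups, half_b)
print("arbitration needed:", needs_arbitration)

# Node Group 1 loses power entirely: the survivors lack part of the data.
print("viable without Node Group 1:", is_viable(node_groups, {"SN3", "SN4"}))
```

In this model a partition only needs arbitration when more than one side passes the check, which is the case the router layout above makes dangerous.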
The data consistency of MySQL Cluster rests on the assumption that two nodes of the same node group almost never fail at the same time, so during deployment it is best not to place the data nodes of one node group in the same rack.

Failure Detection: A failure is usually detected in one of two ways: communication loss or heartbeat failure.

Communication loss: Data nodes communicate with each other over TCP or shared memory, and a broken connection shows that a node is no longer healthy. This is the fastest detection method.

Heartbeat failure: Communication-loss detection does not work in some special circumstances, such as disk faults. For these cases, all data nodes form a ring and each node sends a heartbeat signal to the next node. If a node stops receiving heartbeats, the previous node in the ring is considered faulty, and the failure information is then broadcast to the other nodes.

Logging: As in any database, there are logs that record every insert, update, and delete. After a system failure, these logs are used to restore the database.

Local Checkpoints: The redo log is a circular log file that grows as the database is used. If it is not trimmed in time, the log space is soon used up, so the cluster periodically writes a snapshot image of the data to disk and then moves the tail pointer of the redo log forward. NoOfFragmentLogFiles controls the size of the redo log. TimeBetweenLocalCheckpoints controls how often the cluster takes a local checkpoint. Despite its name, this parameter is not a time. It is the base-2 logarithm of the amount of write activity, counted in 4-byte words, so each increment doubles the threshold. For example, 20 means 4 MB of write operations and 21 means 8 MB.

Global Checkpoints: MySQL Cluster is an in-memory database. For performance, transactions are first committed synchronously in memory and only later written to the log on disk. At intervals of TimeBetweenGlobalCheckpoints, a global checkpoint writes a group of committed transactions to disk. In other words, a global checkpoint regularly flushes the redo log to disk.
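The exponential scale of TimeBetweenLocalCheckpoints can be checked with a little arithmetic. This sketch assumes the setting is the base-2 logarithm of a count of 4-byte words, which is consistent with the 4 MB and 8 MB figures given above; the function name is hypothetical.

```python
WORD_SIZE_BYTES = 4  # assumption: the setting counts 4-byte words


def lcp_write_threshold_bytes(setting):
    """Bytes of write activity that trigger a local checkpoint (sketch)."""
    return WORD_SIZE_BYTES * 2 ** setting


for setting in (20, 21):
    mib = lcp_write_threshold_bytes(setting) / 2 ** 20
    print(f"TimeBetweenLocalCheckpoints={setting}: {mib:g} MB of writes")
```

Each step of 1 in the setting doubles the amount of write activity between local checkpoints, which is why small changes to this parameter have a large effect.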
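In short, a global checkpoint flushes the redo log to disk, while a local checkpoint (LCP) allows the tail of the redo log to be truncated. A local checkpoint takes time to complete, so the image it writes can include changes made after the checkpoint started. To restore the LCP image to exactly the state at the LCP start time, MySQL Cluster uses an undo log.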
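The role of the undo log can be shown with a toy model. This is only a sketch of the idea, not MySQL Cluster's actual log format: the image is a plain dictionary, each hypothetical undo record stores a key and the value it had before a write made during the checkpoint, and `None` marks a row that did not exist when the checkpoint started.

```python
def rollback_to_lcp_start(image, undo_log):
    """Apply undo records newest-first to rebuild the state at LCP start."""
    restored = dict(image)
    for key, old_value in reversed(undo_log):
        if old_value is None:
            restored.pop(key, None)  # row was inserted during the LCP
        else:
            restored[key] = old_value  # put back the pre-LCP value
    return restored


# State of the data at the moment the local checkpoint starts.
state_at_lcp_start = {"a": 1, "b": 2}

# While the checkpoint runs, "a" is updated to 10 and "c" is inserted.
# The undo log records the old value for each of those writes.
undo_log = [("a", 1), ("c", None)]

# The image written to disk ends up containing the later changes.
lcp_image = {"a": 10, "b": 2, "c": 3}

restored = rollback_to_lcp_start(lcp_image, undo_log)
print(restored == state_at_lcp_start)
```

Replaying the records in reverse order matters when the same key is written more than once during the checkpoint, because the oldest record holds the value from the LCP start time.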
System Recovery: When the system recovers, the LCP image on disk reflects the data at the point where the LCP finished. The undo log is executed first to roll the system back to the LCP start point, and the redo log is then executed up to the last global checkpoint.

Cross-region Replication: Cross-region replication mirrors nodes to a remote data center for disaster recovery.

Restrictions: Durable commit to disk is not supported. A commit is replicated, but there is no guarantee that the logs have been written to disk at commit time. Nodes cannot be added or removed online; the cluster must be restarted to do so.
 
