Understanding of MySQL Master/Slave Databases

The entire data layer consists of three clusters: Group1, Group2, and Group3. These clusters are the result of horizontal data splitting, and together they form a database containing the complete data set. Each Group includes one Master (there can also be more than one) and N Slaves, and the data on the Master and its Slaves is kept consistent. If, say, one Slave in Group1 goes down, the other two Slaves are still available, so this model never makes part of the data unreachable unless every machine in a Group is down at the same time, and the probability of that is very small (short of something like a power outage).
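A minimal sketch, assuming nothing beyond what the paragraph above describes, of how one such Group could be modeled in Java: one writable Master plus N Slaves holding the same data. The class and field names (Group, DbNode, weight) are illustrative, not the article's actual code; the weight field anticipates the weighted load balancing discussed below.

```java
import java.util.List;

public class Group {
    /** One physical database server inside a Group. */
    public static class DbNode {
        final String jdbcUrl;   // e.g. "jdbc:mysql://host:3306/mydb"
        final int weight;       // relative capacity, used by weighted load balancing

        DbNode(String jdbcUrl, int weight) {
            this.jdbcUrl = jdbcUrl;
            this.weight = weight;
        }
    }

    final DbNode master;        // all writes go to this node
    final List<DbNode> slaves;  // reads are spread across these replicas

    Group(DbNode master, List<DbNode> slaves) {
        this.master = master;
        this.slaves = slaves;
    }
}
```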

Before clusters are introduced, the query flow is roughly: the request reaches the data layer carrying the shard key (usually user_id); the data layer uses that field to route to a specific DB; the operation is then executed on that DB. What changes once a cluster is introduced? The routing rules and policies can now only route to a specific Group, that is, to a virtual group rather than a particular physical server. The next step is to find a concrete physical DB server to carry out the actual data operation, and for that we introduce the concept of a load balancer (LB).

The load balancer is responsible for locating a specific DB server. The rules are as follows: the LB analyzes whether the current SQL statement is a read or a write. Writes, and reads that require high real-time consistency, go directly to the Master; ordinary reads are distributed to a Slave according to a load-balancing policy.

The main question for the LB is the load-distribution policy, which is usually either random or weighted. Random balancing is easy to understand: one of the N Slaves is picked at random. It does not consider machine performance and implicitly assumes every machine is equal. If that assumption holds, fine; but what if it does not? When the Slaves differ in physical performance and configuration, ignoring performance and choosing at random is a poor approach: it puts unnecessary load on the weaker machines, even risking downtime, while the high-performance database servers never get to use their full capacity. For that reason we introduce weighted load balancing: through interfaces inside our system, each DB server is assigned a weight, and at runtime the LB distributes load to each server in proportion to its weight within the cluster. This undoubtedly adds complexity and maintenance cost to the system, but there is no way around it.
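The following self-contained Java sketch illustrates the two read-distribution policies just described; it is an illustration under stated assumptions, not the article's implementation. Each entry in the weight list corresponds to one Slave in a Group, and the methods return the index of the Slave that should receive the next read. The modulo rule for routing a shard key to a Group is also an assumption; the article does not specify the actual routing rule.

```java
import java.util.List;
import java.util.Random;

public class SlaveSelector {
    private final Random random = new Random();

    // Routing a shard key (e.g. user_id) to a Group; plain modulo is assumed
    // here purely for illustration.
    public int routeToGroup(long userId, int groupCount) {
        return (int) (userId % groupCount);
    }

    // Random policy: every Slave is treated as equally capable.
    public int pickRandom(int slaveCount) {
        return random.nextInt(slaveCount);
    }

    // Weighted policy: a Slave with weight 3 receives roughly three times
    // as many reads as a Slave with weight 1.
    public int pickWeighted(List<Integer> weights) {
        int total = 0;
        for (int w : weights) {
            total += w;
        }
        int r = random.nextInt(total);           // 0 <= r < total
        for (int i = 0; i < weights.size(); i++) {
            r -= weights.get(i);
            if (r < 0) {
                return i;                        // heavier Slaves are chosen more often
            }
        }
        return weights.size() - 1;               // safety fallback, not normally reached
    }
}
```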

With sharding, clusters, and load balancing in place, is everything fine? Things are far less simple than we might think. These mechanisms let the data layer withstand a great deal of pressure, but the design still cannot fully avoid the danger of a database going down. If slave2 in Group1 crashes, the system's LB has no way of knowing. This is actually very dangerous: because the LB does not know slave2 is unavailable, it keeps assigning load to it, and the client naturally gets data-operation failures or exceptions. That is unfriendly! How can this be solved? We introduce either an availability-probing mechanism for cluster nodes or an availability push mechanism.

What is the difference between the two? First, the probing mechanism: as the name suggests, the data-layer client itself periodically tests the availability of every database in the cluster. This can be implemented by retrying the connection or by attempting to reach the database port; you can also retry a JDBC connection and use Java's exception mechanism to judge availability, with details covered later in the text. What about the push mechanism? That depends on the actual operating scenario: when a database goes down, the DBA will certainly know about it, and at that point the DBA uses a program to push the database's current status to the client, that is, to the application side of the distributed data layer. The client then updates a local list of DB statuses and informs the LB that this node is unusable and should receive no load. One is an active polling mechanism, the other a passive notification mechanism; each has its strengths, but both achieve the same effect. With either in place, the hypothetical problem above will not happen, and even if it does, the probability is minimized.
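The probing side could look roughly like the sketch below; it is a hedged illustration rather than the article's implementation. A background task retries a JDBC connection to each node and uses the resulting SQLException to decide availability, and a separate method lets an external push program (for example one run by the DBA) mark a node down directly. The node-status map, the 10-second interval, and the method names are all assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AvailabilityProbe {
    // jdbcUrl -> is the node currently considered usable by the LB
    private final Map<String, Boolean> nodeStatus = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Active probing: periodically try to open a connection to the node. */
    public void watch(String jdbcUrl, String user, String password) {
        nodeStatus.put(jdbcUrl, Boolean.TRUE);
        scheduler.scheduleAtFixedRate(() -> {
            try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password)) {
                // isValid() sends a lightweight check with a 2-second timeout.
                nodeStatus.put(jdbcUrl, conn.isValid(2));
            } catch (SQLException e) {
                // The exception is the signal: treat the node as down.
                nodeStatus.put(jdbcUrl, Boolean.FALSE);
            }
        }, 0, 10, TimeUnit.SECONDS);
    }

    /** Passive push: an external tool can mark a node down without waiting for a probe. */
    public void markDown(String jdbcUrl) {
        nodeStatus.put(jdbcUrl, Boolean.FALSE);
    }

    /** The LB consults this before assigning load to a node. */
    public boolean isAvailable(String jdbcUrl) {
        return nodeStatus.getOrDefault(jdbcUrl, Boolean.FALSE);
    }
}
```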

The Master and Slave mentioned above have not yet been explained in depth. A Group consists of one Master and N Slaves. Why? The Master carries the write load, that is, all write operations are executed on the Master, while read operations are distributed across the Slaves, which greatly improves read efficiency. In typical Internet applications, surveys of real workloads show that reads far outnumber writes; since the bulk of data operations are reads, this is why we run multiple Slaves. But why separate reads from writes at all? Developers familiar with databases know that write operations involve locking, whether row locks, table locks, or page locks, all of which reduce execution efficiency. Concentrating the writes on one node while the reads run on the other N nodes effectively improves read efficiency and helps keep the system highly available. Read/write splitting does introduce a new problem: how is the data on the Master synchronized and kept consistent with the other Slave machines in the cluster? That is not something we need to pay much attention to; MySQL's replication mechanism does it for us.
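As a rough illustration of the read/write split (again an assumption-laden sketch, not the article's code), a router can send writes and real-time-sensitive reads to the Master and hand everything else to a Slave chosen by the load balancer. The naive prefix check on the SQL text stands in for a real SQL parser.

```java
public class ReadWriteRouter {

    public enum Target { MASTER, SLAVE }

    public Target route(String sql, boolean needsLatestData) {
        String trimmed = sql.trim().toLowerCase();
        boolean isRead = trimmed.startsWith("select");
        if (!isRead || needsLatestData) {
            // INSERT/UPDATE/DELETE/DDL and real-time-sensitive reads hit the Master.
            return Target.MASTER;
        }
        // Ordinary reads are spread across the Slaves; replication keeps the
        // Slaves in sync with the Master (with some lag).
        return Target.SLAVE;
    }
}
```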
