MySQL Performance Tuning and Architecture Design -- Chapter 12: Basic Principles of Scalable Design


Chapter 12: Basic Principles of Scalable Design

Preface:

As information volumes grow rapidly, hardware alone can no longer keep up with the processing capacity that application systems demand. How, then, do we meet the system's performance requirements? There is only one way: improve the system's scalability by reworking its architecture, combining multiple machines of modest capacity into a system with high overall processing power. In other words, we must do scalable design. Scalable design is a complex piece of systems engineering: it touches many areas, involves intricate technology, and raises plenty of additional problems along the way. But no matter how we design, and no matter what problems we run into, there are certain principles we must uphold. This chapter gives a brief introduction to the principles that need to be guaranteed during the design process.

12.1 What Is Scalability

Before discussing scalability, many readers may ask: what does it mean for a system's architecture to scale well, or to have good scalability? In fact, we constantly hear three related words: scale, scalable, and scalability.
From a database point of view, to scale means to let the database provide stronger service capability, that is, more processing power. Scalable means the database system can deliver more processing power after an appropriate upgrade (adding capacity to a single machine, or adding more servers). In theory any database system is scalable; the difference lies in how it has to be done. Finally, scalability refers to how readily, and at what cost, a database system's processing power improves after such an upgrade. Although in theory any system can be upgraded to gain processing power, the cost (money and manpower) of gaining the same amount of processing power differs greatly from system to system, and that is what we mean when we say different database application systems differ greatly in scalability.
Here, "different database application systems" does not primarily mean different database software (although database software itself also differs in scalability), but different application architectures built on the same database software. That is the focus of this chapter and the next few chapters.
First, we need to understand that the scalability of a database system shows up in two main directions: horizontal expansion and vertical expansion, or, as we usually say, scale out and scale up.
Scale out means horizontal expansion: adding more processing nodes to raise overall processing power. Put plainly, it means adding machines.
Scale up means vertical expansion: increasing the processing power of the current node to raise overall processing power. Put plainly, it means upgrading the existing server's configuration, such as adding memory, CPUs, or storage hardware, or replacing it outright with a more capable server and a higher-end storage system.
Comparing the two approaches, the pros and cons of each are easy to see.

Scale Out Advantages:

1. Low cost: a powerful computing cluster can easily be built from inexpensive PC servers;

2. It is hard to hit a hard ceiling, because processing power can always be increased by adding hosts;

3. A single node failure has only a small impact on the overall system.

Scale Out Disadvantages:

1. With many processing nodes (most of them server hosts), the overall system architecture and the application itself both become more complex; the architecture also demands more of the application than scale up does and usually requires cluster-management software to coordinate the nodes;

2. The cluster is harder to maintain, so maintenance costs are higher;

Scale Up Advantages:

1. Fewer processing nodes, so maintenance is relatively simple;

2. All data is kept in one place, so the application architecture is simple and development is relatively easy;

Scale Up Disadvantages:

1. High-end equipment is expensive, and with little competition among vendors it is easy to become locked in to a particular manufacturer;

2. Constrained by the pace of hardware development, a single host's processing power always has an upper limit, so eventually a performance bottleneck is reached that cannot be solved;

3. Equipment and data are concentrated in one place, so the impact of a failure is large.
In the short term, scale up has the greater advantage: it keeps operational costs down, simplifies both the system architecture and application development, and places lower demands on the technology.
In the long run, however, scale out has the greater advantage, and it becomes the inevitable choice once a system reaches a certain size. No matter what, a single machine's processing power is always bounded by hardware technology, hardware progress is itself limited, and it often cannot keep pace with business growth. Moreover, the higher the processing power of high-end equipment, the worse its price/performance ratio tends to be. Building a distributed cluster with high processing power out of many inexpensive PC servers will therefore always be a goal for companies that want to save costs while raising overall processing power. Reaching that goal runs into all sorts of technical problems, but it is always worth studying and practicing.
In what follows we focus our analysis and design on scale out. A well-designed distributed system is the prerequisite for good scale out. For a database, there are only two directions in which to scale out: one is to keep replicating the data so that it grows into many identical data sources; the other is to split one centralized data source into many data sources.
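As a rough illustration of these two directions, here is a minimal sketch, assuming hypothetical host names and a made-up `user_id` shard key (none of which come from the book): reads are spread across identical replicated copies of the data, while split (sharded) data is located by key.

```python
import hashlib

# Direction 1: many identical copies of the data (replication).
# Reads can go to any replica; writes still go to the single primary.
PRIMARY = "db-primary:3306"
REPLICAS = ["db-replica-1:3306", "db-replica-2:3306", "db-replica-3:3306"]

def pick_read_host(request_id: int) -> str:
    """Spread read-only queries across the identical replicas."""
    return REPLICAS[request_id % len(REPLICAS)]

# Direction 2: one centralized data source split into many (sharding).
# Each shard holds a disjoint slice of the data, chosen by a shard key.
SHARDS = ["shard-0:3306", "shard-1:3306", "shard-2:3306", "shard-3:3306"]

def pick_shard(user_id: int) -> str:
    """All rows belonging to one user live on exactly one shard."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

if __name__ == "__main__":
    print(pick_read_host(42))   # one of the identical replicas
    print(pick_shard(10001))    # the single shard that owns user 10001
```

Both routing decisions live in the application (or in a middleware layer), which is exactly why the application architecture matters so much for scale out.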
Let us now look at the principles that need to be followed when designing a database application architecture with good scalability.

12.2 The Principle of Minimizing Transaction Dependency

When building a distributed database cluster, many people are quite concerned about transactions; after all, transactions are a very central feature of a database.
In a traditional centralized database architecture, the transaction problem is solved very well and can be guaranteed entirely by the database's own mature transaction mechanism. But once the database becomes distributed, many transactions that used to complete inside a single database may now need to span multiple database hosts, so what were single-node transactions may have to become distributed transactions.
We must understand, though, that distributed transactions are themselves a very complex mechanism. Whether in large commercial database systems or in open source databases, most vendors have implemented the feature to some degree, but all of the implementations come with restrictions of one kind or another, and there are also bugs that may leave some transactions poorly guaranteed or unable to complete successfully.
At that point we may need to look for alternative ways to solve the problem; after all, transactions cannot simply be ignored. However we implement them, they must be implemented.
For now, there are three main solutions:
First, when designing the partitioning (sharding) rules for scale out, ensure as far as possible that all the data a transaction needs lives on the same MySQL Server, so that distributed transactions are avoided.
If, when designing the data-partitioning rules, we can make every transaction complete on a single MySQL Server, the business requirements become easy to implement and the application needs only minimal changes to adapt to the new architecture, which greatly reduces the overall cost. After all, transforming a database architecture is never just the DBA's job; it needs a great deal of coordination and support from the surrounding teams. Even when designing a brand-new system we have to weigh the total investment in every part of the environment, considering both the cost of the database itself and the corresponding development costs. If the interests of the different parties conflict, we must make trade-offs based on future expansion needs and total cost, and find the balance point best suited to the current stage.
However, even with very well-designed partitioning rules, it is hard to guarantee that all the data needed by every transaction sits on the same MySQL Server. So although this solution has the lowest cost, most of the time it can only cover the majority of core cases; it is not a perfect solution.
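As a minimal sketch of this first solution, assume a hypothetical schema in which `orders`, `order_items`, and `account_balance` are all partitioned by `user_id` (the table names, credentials, and the `pick_shard` router reused from the earlier sketch are illustrative, not from the book). Because every table the transaction touches shares the same shard key, the whole operation runs as an ordinary local transaction on one MySQL Server:

```python
import mysql.connector  # assumes the mysql-connector-python package is installed

def place_order(user_id, order_id, items, total):
    """All tables touched here are partitioned by user_id, so the whole
    operation is one ordinary local transaction on one MySQL Server --
    no distributed commit is needed."""
    host, port = pick_shard(user_id).split(":")   # routing rule from the earlier sketch
    conn = mysql.connector.connect(host=host, port=int(port),
                                   user="app", password="secret",
                                   database="shop")
    try:
        conn.start_transaction()
        cur = conn.cursor()
        cur.execute("INSERT INTO orders (order_id, user_id, total) VALUES (%s, %s, %s)",
                    (order_id, user_id, total))
        for sku, qty in items:
            cur.execute("INSERT INTO order_items (order_id, sku, qty) VALUES (%s, %s, %s)",
                        (order_id, sku, qty))
        cur.execute("UPDATE account_balance SET balance = balance - %s WHERE user_id = %s",
                    (total, user_id))
        conn.commit()        # single-node ACID commit
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
```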
Second, split each large transaction into several small transactions; the database guarantees the integrity of each small transaction, and the application controls the overall integrity across them.
Compared with the previous scheme, this one requires more changes to the application and places stricter demands on it.

The application must not only break up many of the original large transactions, it must also ensure overall integrity across each group of small transactions. In other words, the application itself needs a degree of transactional capability, which undoubtedly raises its technical difficulty.

However, this scheme also has quite a few advantages of its own. First, the data-partitioning rules can be simpler, so it is hard to run into a restriction they cannot satisfy, and simpler rules mean lower maintenance costs. Second, because the partitioning rules carry fewer constraints, the database's scalability is higher; when a performance bottleneck appears, the existing databases can quickly be split further. Finally, the database moves further away from the actual business logic, which is a big advantage for later architectural expansion.
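To make this second solution concrete, here is a hedged sketch of a cross-shard transfer broken into two small local transactions, with the application issuing a compensating update if the second half fails. The table, the helper functions, and the error handling are assumptions for illustration, not the book's implementation; real systems usually also record the transfer's state so it can be retried or reconciled later.

```python
def debit(conn, user_id, amount):
    """Small transaction #1: runs entirely on the payer's shard."""
    cur = conn.cursor()
    cur.execute("UPDATE account_balance SET balance = balance - %s "
                "WHERE user_id = %s AND balance >= %s",
                (amount, user_id, amount))
    if cur.rowcount != 1:
        conn.rollback()
        raise RuntimeError("insufficient balance")
    conn.commit()

def credit(conn, user_id, amount):
    """Small transaction #2: runs entirely on the payee's shard."""
    cur = conn.cursor()
    cur.execute("UPDATE account_balance SET balance = balance + %s WHERE user_id = %s",
                (amount, user_id))
    conn.commit()

def transfer(payer_conn, payee_conn, payer_id, payee_id, amount):
    """The application, not the database, guarantees overall integrity:
    if the credit half fails, it compensates by re-crediting the payer."""
    debit(payer_conn, payer_id, amount)
    try:
        credit(payee_conn, payee_id, amount)
    except Exception:
        credit(payer_conn, payer_id, amount)   # compensating small transaction
        raise
```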
Third, combine the two solutions above, integrating their advantages and avoiding their disadvantages.
The two previous solutions each have their own strengths and weaknesses, and these are largely opposites of each other. We can take the best of both, adjust their design principles, and strike a balance in the overall architecture. For example, we can ensure that the data needed by the core, most important transactions stays on the same MySQL Server, while other, less important transactions are split into small transactions whose overall integrity the application guarantees. In addition, for some of those less important operations we can analyze whether using a transaction is unavoidable at all.
With this balanced design principle we avoid forcing the application to maintain the overall integrity of too many small transactions, while also avoiding overly complex partitioning rules that would make later maintenance difficult and hurt scalability.
Of course, not every application scenario needs to combine the two solutions. For applications whose transactional requirements are not particularly strict, or whose transactions are themselves very simple, a slightly careful partitioning design may be enough, and we can simply use the first solution alone. That avoids requiring the application to maintain the overall integrity of certain small transactions and greatly reduces its complexity.
Conversely, for applications whose transactional relationships are very complex and whose data is highly interdependent, there is no point in straining to keep all transactional data together: no matter how hard we try it is difficult to satisfy the requirement, and we usually end up fixing one problem only to create another. In such cases it is better to keep the database side as clean and simple as possible and let the application make some sacrifices.
Among today's large Internet applications, there are cases of every one of the solutions above. eBay, for example, to a large extent uses the third, combined approach: the second solution is primary and the first is supplementary. Besides the demands of their application scenario, the strong technical capability behind such architectures is also what makes robust application systems possible. Another example, a large domestic BBS system (whose real name I cannot disclose), has transaction relationships that are not particularly complex and relatively little data coupling between its functional modules; it uses the first solution alone, entirely avoiding transactions whose data spans multiple MySQL Servers by designing its data-partitioning rules sensibly.
Finally, we need to keep one idea in mind: with transactions, more is not better; the fewer and the smaller, the better.

Whatever solution we use, when designing the application we should make the transactional dependencies among data as small as possible, or even remove them entirely. Of course this is relative, and probably only some of the data can achieve it. But once some of the data is free of transactional dependencies, the system's overall complexity may drop by a large margin, and both the application and the database system may pay a much smaller price.

12.3 Data Consistency Principles

No matter how we design the architecture, guaranteeing that the data is ultimately consistent is a principle that can never be violated. I am sure all readers are very clear about how important this principle is.
Moreover, guaranteeing data consistency, like guaranteeing transactional integrity, is something we can run into trouble with when designing a system for scale out. With scale up, of course, you will rarely meet this kind of problem.

Of course, in many people's eyes data consistency is, to some extent, part of transactional integrity. But to highlight its importance and its particular characteristics, I analyze it separately here.
So how do we keep data consistent while scaling out? Much of the time this is just as vexing as guaranteeing transactional integrity, and it has attracted the attention of many architects. After much practice in the field, the BASE model was eventually summarized: Basically Available, Soft state, Eventually consistent. These words may look complicated and abstruse, but we can simply understand them as the principle of non-real-time consistency.
That is, through the right techniques the application lets the whole system, while still serving its users, allow data to sit in a non-real-time (inconsistent) state for a short period, and then uses follow-up techniques to make sure the data finally ends up in a consistent state. The theoretical model sounds simple, but in practice we run into many difficulties.
The first question is: does all the data need to be consistent in real time? I suspect most readers would vote no. If not all data must be consistent in real time, how do we decide which data needs real-time consistency and which only needs eventual consistency? In essence this is a division by the business priority of each module: data belonging to higher-priority functions naturally goes into the real-time-consistency camp, while data for lower-priority functions can be considered for the camp that allows short-term inconsistency and achieves eventual consistency. This is a delicate question; we cannot decide it on a whim, but only after very detailed analysis and careful evaluation. Not all data can be allowed to be inconsistent in the system even for a short time, and not all data can be brought to a consistent state by later processing; data in either of those categories still needs real-time consistency.

How to distinguish these two kinds of data can only be decided after detailed analysis of the business scenarios and a full evaluation of the business requirements.
Second, how do we bring the inconsistent data in the system to its final consistent state? In general, we need to clearly separate the business modules that handle such data from those that require real-time consistent data. Then, through asynchronous mechanisms and the corresponding background processes, we use the system's data, logs, and other information to further process the currently inconsistent data, so that it finally reaches a fully consistent state. Using different background processes for different modules both avoids data interference and allows concurrent execution, improving processing efficiency. Information such as a user's message notifications, for example, does not need strict real-time consistency; we only need to record which messages still need processing and let a background process handle them one by one, avoiding congestion in the foreground business.
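Here is a hedged sketch of that notification example (the `pending_notifications` table, its columns, and the polling interval are all assumptions, not from the book): the foreground transaction only records that work remains to be done, and a separate background process drains the queue later, bringing the data to its final consistent state without blocking the foreground business.

```python
import time

def enqueue_notification(conn, user_id, message):
    """Foreground: just record that a notification still has to be sent.
    Cheap, and it stays inside the user's own local transaction."""
    cur = conn.cursor()
    cur.execute("INSERT INTO pending_notifications (user_id, message, done) "
                "VALUES (%s, %s, 0)", (user_id, message))
    conn.commit()

def notification_worker(conn, deliver, poll_seconds=5):
    """Background: repeatedly pick up undelivered rows and process them,
    so the data eventually reaches its consistent (fully delivered) state."""
    while True:
        cur = conn.cursor()
        cur.execute("SELECT id, user_id, message FROM pending_notifications "
                    "WHERE done = 0 ORDER BY id LIMIT 100")
        rows = cur.fetchall()
        for row_id, user_id, message in rows:
            deliver(user_id, message)          # e.g. push, e-mail, or in-site message
            cur.execute("UPDATE pending_notifications SET done = 1 WHERE id = %s",
                        (row_id,))
        conn.commit()
        if not rows:
            time.sleep(poll_seconds)
```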
Finally, avoid online interaction between real-time-consistent data and eventually-consistent data. Because the states of the two kinds of data can diverge, mixing them during interaction is likely to cause disorder, so all non-real-time-consistent data should be effectively isolated from real-time-consistent data in the application. In some special scenarios it may even be necessary to record them on physically separate MySQL Servers.

12.4 High Availability and Data Security Principles

Beyond the two principles above, I also want to mention system high availability and data security. After a scale out redesign, the system's overall scalability does improve greatly, and overall performance naturally improves as well. But the system's overall availability becomes harder to maintain than before: the overall architecture is more complex, and both the application and the database environment are larger and more complex than they were. The most direct consequence is that maintenance is more difficult and so is monitoring.
If the result of such a redesign were a system that crashes regularly and suffers frequent outages, nobody would accept it. So we must use every technical means available to make sure the system's availability does not drop, and ideally improves overall.
This naturally leads to another principle of scale out design, the principle of high availability: no matter how the architecture is adjusted, the system's overall availability must not decrease.
In fact, discussing availability naturally brings up another closely related principle: data security. To be highly available, the data in the database must be safe enough. The safety meant here is not protection against malicious attack or theft but against abnormal loss: we must ensure that data is not lost when software or hardware fails. Once data is lost, there is no availability to speak of at all. Moreover, data is the core resource of a database application system, and the principle that it must never be lost is beyond question.
To guarantee both high availability and data security, the best approach is redundancy: eliminate every single point of failure in hardware and software, and keep more than one copy of all data. On the technical side, this can be achieved with MySQL Replication, MySQL Cluster, and similar technologies.
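As one small, hedged example of what "multiple copies plus monitoring" might look like in practice, the sketch below checks the health of a MySQL replica using the classic `SHOW SLAVE STATUS` output (`SHOW REPLICA STATUS` on newer versions); the lag threshold is an arbitrary assumption, not a recommendation from the book.

```python
def replica_is_healthy(conn, max_lag_seconds=30):
    """Check that a MySQL replica is still applying changes from the primary,
    i.e. that a redundant and reasonably fresh copy of the data exists."""
    cur = conn.cursor(dictionary=True)
    cur.execute("SHOW SLAVE STATUS")    # SHOW REPLICA STATUS on newer MySQL versions
    status = cur.fetchone()
    if status is None:
        return False                    # replication is not configured at all
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Master")
    return io_ok and sql_ok and lag is not None and lag <= max_lag_seconds
```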

12.5 Summary

No matter how we design the architecture and no matter how our scalability requirements change, the principles mentioned in this chapter remain very important: the principles for solving particular problems, the principle of guaranteeing availability, and the principle of guaranteeing data security. Keep them in mind throughout the design. The MySQL database is so popular in the Internet industry not only because it is open source and easy to use, but also, to a large extent, because of its advantage in scalability. Its different storage engines each have their own characteristics that suit many different scenarios, and its Replication and Cluster features are very effective means of improving scalability.

Excerpt from: MySQL Performance Tuning and Architecture Design, by Jian Chaoyang

Please credit the source when reprinting:

Jesselzj
Source: http://jesselzj.cnblogs.com
