I recently read many articles about how companies' system architectures evolved and found that the basic ideas and the paths of evolution are very similar. This post summarizes the evolution of database architecture and the thinking behind each step.
Single host
A website usually starts from the typical LAMP architecture: one Linux host running an Apache server, a PHP execution environment, and a MySQL server. All of these generally sit on a single virtual host, which is referred to here as single host mode.
Disadvantages of single host mode:
1 The web server and the MySQL server share one host and its hardware resources. Either side may consume too many resources and become a bottleneck for the whole application.
2 When the business grows, there is no way to scale out.
3 Fault tolerance is poor: once the host has a problem, the entire application is unavailable.
Standalone host
As the business grows, the MySQL server and the web server can be separated and deployed on different hosts. This is standalone host mode.
In standalone host mode, the web server and MySQL no longer share hardware resources. Not putting all the eggs in one basket improves fault tolerance: if the MySQL server fails, the parts of the web application that do not access the database are unaffected. The web tier can also scale out: if web server performance is insufficient, more web servers can be added behind a load balancer to spread the load.
Disadvantages of standalone host mode:
1 Scalability: web servers can scale horizontally, but the MySQL server cannot.
2 Availability: the MySQL server is a single point of failure, and the impact is large once it goes down.
3 Performance: a single MySQL server can only support a limited load.
Read/write separation
As the business keeps growing, the pressure on the database increases until a single database can no longer meet demand. Sites whose data does not need to be strictly real-time gradually move to read/write separation: ordinary queries are sent to a read database (also called a replica or standby), and modification requests are sent to the primary database. Because a read database is effectively stateless (it only serves copies of the data and accepts no writes), read databases can be scaled out. The write database can only be a single host.
This mode has limits, and whether it fits depends on the type of business. The data in the primary database is always up to date, but synchronizing it to the read databases takes time, so the application must tolerate short periods of inconsistency. It is not suitable for scenarios with very strict consistency requirements.
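The routing rule described above can be sketched roughly as follows. This is only an illustration, not a real database driver: the class name `ReadWriteRouter`, the host names, the `need_fresh` flag, and the rule of treating every statement that starts with SELECT as a read are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Choose a database for a SQL statement (hypothetical sketch).

    Writes go to the primary; plain queries are spread over the
    read databases in round-robin order.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no read databases configured, everything goes to the primary.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, need_fresh=False):
        is_read = sql.lstrip().upper().startswith("SELECT")
        # Reads that cannot tolerate replication delay stay on the primary.
        if is_read and not need_fresh and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("db-primary", ["db-read-1", "db-read-2"])
print(router.route("SELECT * FROM orders WHERE id = 1"))
print(router.route("UPDATE orders SET status = 'paid' WHERE id = 1"))
```

The `need_fresh` flag reflects the consistency limit mentioned above: a query that must see the latest data is sent to the primary instead of a possibly delayed read database.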
Problems with this mode:
1 Scalability: the read databases can scale horizontally, but the write database cannot.
2 Availability: the write database becomes a single point of failure. Once it fails, every write operation of the business is affected.
Vertical split of business
As the business grows further, a single write database clearly cannot handle high concurrency. Because the write database is stateful, it cannot simply be scaled out: if there were two write databases and an update went to one of them at random, the other would hold stale data, and two different versions of the same data are obviously unacceptable. Instead, the write database can be split vertically by business, with each business getting its own database. This article is about the database architecture, but the web layer can be split vertically by business in the same way.
After a vertical split by business, the system scales much better: the finer the division, the greater the overall capacity for expansion.
This mode has several problems:
1 Availability: suppose a complete business process P accesses databases that have been split into five, A, B, C, D, and E, and each write database has an availability of 99%. The availability of process P is then 99% × 99% × 99% × 99% × 99% ≈ 95%. The more the databases are split, the greater the challenge to overall availability.
2 Performance: the load on each business database may differ. If the trading database is heavily loaded, a single trading database cannot meet demand and becomes the bottleneck of the whole system.
3 Scalability: the scalability of a single node is not improved, and the trading database cannot be scaled on its own.
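The availability arithmetic in the first item can be checked with a few lines of Python. The function name `chain_availability` is made up for this sketch, and the calculation assumes that the databases fail independently of one another.

```python
def chain_availability(availabilities):
    """Availability of a process that needs every listed database to be up.

    Assumes the databases fail independently, so availabilities multiply.
    """
    result = 1.0
    for a in availabilities:
        result *= a
    return result


# Process P touches five databases (A to E), each assumed to be 99% available.
p = chain_availability([0.99] * 5)
print(f"{p:.4f}")  # 0.9510
```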
Horizontal and vertical split of a single business database
In the previous case, the trading database is the bottleneck of the entire system and needs to be expanded on its own. It can be split horizontally, split vertically, or split both ways at the same time.
Horizontal splits are typically based on a business-agnostic key. This is better for scale-out but makes queries harder.
Vertical splits are typically based on the business. They are relatively friendly to queries, but may distribute data unevenly and are less flexible to split further.
Taking the trading database as an example, it can be split vertically by trade type and horizontally by order number.
Suppose this yields m×n databases. The failure of a single database then affects only 1/(m×n) of the transactions. However, if the availability of each database is 99%, the probability that all trading databases are available at the same time is (99%)^(m×n). The more the database is split, the higher the probability that some database is down at any given time.
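A minimal sketch of such a two-way split is shown below. The trade types, the value of `N`, the database naming pattern, and the use of a numeric order number with a modulo rule are all assumptions chosen for illustration; a real system would pick its own keys.

```python
TRADE_TYPES = ["payment", "refund", "transfer"]  # assumed trade types, so m = 3
N = 4                                            # assumed horizontal factor, n = 4


def locate(trade_type, order_no):
    """Return the (hypothetical) name of the database that stores this order."""
    row = TRADE_TYPES.index(trade_type)  # vertical split: by trade type
    col = order_no % N                   # horizontal split: by order number
    return f"trade_{row}_{col}"


print(locate("refund", 10007))  # trade_1_3
```

With three trade types and four horizontal slices, every order lands in exactly one of 12 databases.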
This approach has the following problems:
1 A single node failure affects few users, but overall availability is reduced.
2 Database management becomes more complex: if the table structure of the trading database changes, the change script has to be run m×n times.
3 Because the probability that some database fails is high, the DBAs will have a hard time and will probably be firefighting regularly.
4 Development and testing become painful and costly, and queries become very complex.
5 There is no failure detection or switching mechanism when a single node fails.
6 The databases cannot be extended horizontally without limit. The algorithm allocates M databases in advance, and adding a database afterwards is basically not feasible.
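The sixth problem can be made concrete with a small experiment. The sketch below assumes, purely for illustration, that orders are placed by taking a numeric order number modulo the number of databases, and it counts how many existing orders would land in a different database if one more database were added.

```python
def moved_fraction(old_count, new_count, order_numbers=range(100_000)):
    """Fraction of orders whose database changes when the count changes.

    Assumes the (hypothetical) placement rule: order_no % database_count.
    """
    moved = sum(1 for k in order_numbers if k % old_count != k % new_count)
    return moved / len(order_numbers)


print(moved_fraction(4, 5))  # 0.8
```

Under this rule, growing from 4 to 5 databases would require relocating 80% of the existing orders, which is why the number of databases is effectively fixed once it has been chosen.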
Random sub-database
To address the sixth problem and allow unlimited horizontal expansion, consider this mechanism: when inserting data, the application first requests a database number, then either saves that number as a separate field or prepends it to an existing field.
For example, if the application requests a database for an insert and gets database number 1000, it can construct the order number 1000_tradeno: the part before the underscore is the database number and the part after it is the actual tradeno. This solves the problem of unlimited horizontal expansion and is called the random sub-database mode. However, its limitations are significant.
Disadvantages of the random sub-database mode:
1 The splitting algorithm is coupled to the business, so it suits only particular scenarios and its range of application is narrow.
2 Inserts are easy, but an update must carry the database number, so data can be updated only through specific fields.
3 It is not suitable for batch queries. Query capability is quite limited, which is a general problem of splitting databases.
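The order number scheme from this section could look roughly like the sketch below. The list `DATABASE_IDS`, the function names, and the choice of picking a database uniformly at random are assumptions for illustration; the text only says that the application requests a database number at insert time.

```python
import random

# Hypothetical list of database numbers; new ones can be appended at any time
# because existing order numbers already record where they were stored.
DATABASE_IDS = [1000, 1001, 1002]


def new_order_no(trade_no, db_ids=DATABASE_IDS):
    """Pick a database for a new order and embed its number in the order number."""
    db_id = random.choice(db_ids)
    return f"{db_id}_{trade_no}"


def database_of(order_no):
    """Recover the database number from the prefix before the first underscore."""
    db_id, _, _trade_no = order_no.partition("_")
    return int(db_id)


print(database_of("1000_tradeno"))  # 1000
```

The second disadvantage is visible here: `database_of` only works when the caller has the full order number, so an update keyed on any other field has no way to find the right database.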
Single-database backup and failover
If a single database fails, the business is affected, but traffic can be switched to a backup when the failure occurs. This is achievable, but it has its own problems, which need to be analyzed for each specific scenario. The topic is complex enough to fill an article of its own, so it is only mentioned briefly here.
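As a rough idea of what switching on failure involves, the sketch below swaps a primary and a standby after several consecutive failed health checks. Everything in it is hypothetical: the class `FailoverPair`, the `is_alive` callback, and the threshold of three failures are illustrative choices, and the sketch ignores the hard part, which is keeping the standby's data in sync.

```python
class FailoverPair:
    """Primary/standby pair that switches after repeated failed health checks."""

    def __init__(self, primary, standby, is_alive, max_failures=3):
        self.active = primary
        self.standby = standby
        self.is_alive = is_alive          # hypothetical health-check callback
        self.max_failures = max_failures
        self.failures = 0

    def check(self):
        """Run one health check and return the host that should serve traffic."""
        if self.is_alive(self.active):
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                # Promote the standby; the failed host becomes the new standby.
                self.active, self.standby = self.standby, self.active
                self.failures = 0
        return self.active


down_hosts = {"db-a"}
pair = FailoverPair("db-a", "db-b", lambda host: host not in down_hosts)
for _ in range(3):
    current = pair.check()
print(current)  # db-b
```

Requiring several consecutive failures before switching is one simple way to avoid flapping on a single lost health check.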
That concludes the summary of how database architecture evolves. This evolution needs a lot of underlying technology and capability to support it, mainly:
1 Powerful distributed database middleware that hides the underlying database routing and data management from applications
2 A strong data operations team and a monitoring system that tracks the status of each database node
3 A strong database administration team able to maintain such a database cluster
4 Strong business and technical architecture capabilities to keep such complex business scenarios under control
Evolution of the database architecture