Recently I have read many articles about how corporate architectures evolved, and found that the basic ideas and the paths of evolution are similar. Here I summarize the ideas behind the evolution of the database architecture.
Single host
In the beginning, a website usually grows out of a typical LAMP architecture: a Linux host, an Apache server, a PHP execution environment, and a MySQL server. All of these typically run in single host mode on one virtual host.
Disadvantages of single host mode:
1. The web server and the MySQL server share one host and its hardware resources. If one of them consumes too much of a resource, it becomes a bottleneck for the entire application.
2. When the business grows, there is no way to scale horizontally.
3. Fault tolerance is poor. Once the host has a problem, the entire application is unavailable.
Independent host
As the business grows, you can move the MySQL server off the web server host and deploy the two separately. This is the independent host mode.
In independent host mode, the web server and MySQL are deployed separately and do not share hardware resources. Not putting all the eggs in one basket improves fault tolerance: if the MySQL server fails, the parts of the web application that do not access the database are not affected. In addition, the web servers can be scaled horizontally. If web server performance is insufficient, more web servers can be added behind a load balancer to relieve the pressure.
Disadvantages of independent host mode:
1. Scalability: although the web servers can be scaled horizontally, the MySQL server cannot.
2. Availability: the MySQL server is a single point of failure. Once it goes down, the impact is severe.
3. Performance: the load that a single MySQL server can support is limited.
Read/write splitting
As the business keeps growing, the pressure on the database increases until a single database can no longer meet demand. For websites that do not require strictly real-time data, the architecture gradually moves to read/write splitting: ordinary query requests are sent to a read database (a replica or standby database), and modification requests are completed on the master database. Reads are stateless, so read databases can be scaled horizontally. Writes can only go to a single host.
This mode has limits and should be chosen based on the business type. The data in the master database is always up to date, but synchronization to the read databases has latency, so applications must be able to tolerate transient inconsistency. It is not suitable for scenarios with strict consistency requirements.
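The routing rule described above can be illustrated with a minimal sketch. It assumes statements are classified by their leading keyword and that replicas are picked in round-robin order. The class name `ReadWriteRouter`, the `require_fresh` flag, and the database names are hypothetical and only serve the illustration.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and reads to replicas in round-robin order."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        # Cycle through the replicas; fall back to the master if there are none.
        self._next_replica = itertools.cycle(self.replicas or [master])

    def route(self, sql, require_fresh=False):
        """Return the name of the database that should run this statement."""
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and not require_fresh:
            return next(self._next_replica)
        # Writes, and reads that cannot tolerate replication lag, use the master.
        return self.master


router = ReadWriteRouter("master", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM orders WHERE id = 1"))
print(router.route("UPDATE orders SET status = 'paid' WHERE id = 1"))
print(router.route("SELECT status FROM orders WHERE id = 1", require_fresh=True))
```

The `require_fresh` flag is one possible way to handle the consistency limitation: a read that must see the latest write is sent to the master instead of a lagging replica.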
Problems with this mode:
1. Scalability: although the read database can be scaled horizontally, the write database cannot.
2. Availability: the write database is a single point of failure. Once it fails, all write operations are affected.
Vertical business split
As the business grows further, a single write database clearly cannot handle the high concurrency. Because the write database is stateful, it cannot simply be scaled horizontally. Suppose there were two write databases: if a record is updated on one instance, the other instance holds a stale copy, and two different versions of the same data are clearly unacceptable. Instead, the write database can be split vertically by business, with each business getting its own database. This article is about the database architecture, but the web layer can also be split vertically by business.
After the vertical split by business, system performance improves greatly. The business only needs to be divided along vertical lines, and the finer the split, the higher the overall scalability of the system.
In this mode, the following problems exist:
1. Availability: assume that a complete business process P accesses data that has been split into five databases: A, B, C, D, and E. If the availability of each write database is 99%, the availability of business process P is 99% * 99% * 99% * 99% * 99% ≈ 95%. The more databases the data is split into, the harder it is to keep the overall system available.
2. Performance: the load on each vertically split database may differ. If the transaction database carries a high load and a single transaction write database cannot keep up, the transaction database becomes the bottleneck of the entire system.
3. Scalability: the scalability of a single node is not improved, and the transaction database cannot be scaled out on its own.
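The availability arithmetic in the first point can be checked with a small sketch. It assumes that the databases fail independently and that the business process needs all of them at once. The function name `chain_availability` is made up for this example.

```python
def chain_availability(per_database, count):
    """Availability of a process that needs `count` independent databases."""
    return per_database ** count


# Availability of a process spanning 1, 2, 5, and 10 databases at 99% each.
for count in (1, 2, 5, 10):
    print(count, f"{chain_availability(0.99, count):.2%}")
```

With five databases the result is roughly 95%, matching the figure above, and it keeps falling as more databases join the same business process.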
Horizontal and vertical split of a single business database
In the previous case, assuming that the transaction database is the bottleneck of the entire system, you need to scale the transaction database on its own. You can split it horizontally or vertically, and the two methods can be combined.
Horizontal splitting is generally based on a business-independent key. It gives better horizontal scalability, but makes queries much harder.
Vertical splitting is generally based on the business. It is relatively friendly to queries, but may lead to unevenly distributed data and inflexible splits.
Take the transaction database as an example. You can shard it vertically by transaction type and horizontally by order number.
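As a rough illustration of combining both splits, the sketch below picks a database from the transaction type (vertical) and the order number (horizontal). The transaction types, the shard count, the modulo rule, and the naming scheme `trade_<type>_<shard>` are all assumptions made for the example, not something the scheme prescribes.

```python
TRADE_TYPES = ["payment", "refund", "transfer"]  # hypothetical vertical split (m = 3)
HORIZONTAL_SHARDS = 4                            # hypothetical horizontal split (n = 4)


def locate_database(trade_type, order_no):
    """Return the database name for an order: vertical part, then horizontal part."""
    if trade_type not in TRADE_TYPES:
        raise ValueError(f"unknown trade type: {trade_type}")
    # Horizontal sharding: the order number decides the shard within the type.
    shard = order_no % HORIZONTAL_SHARDS
    return f"trade_{trade_type}_{shard}"


print(locate_database("payment", 10007))  # trade_payment_3
```

With these assumed values there are 3 × 4 = 12 databases, and every order maps to exactly one of them.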
Assume the transaction database is split into m × n databases. The failure of a single database then affects only 1/(m × n) of the transactions. However, if the availability of each database is 99% and failures are independent, the probability that all of the transaction databases are available at the same time is 99% to the power of (m × n). The more databases the data is split into, the higher the probability that some database is down.
Problems with this method:
1. Although a single node failure affects only a few users, the overall availability is reduced.
2. Database management becomes complex. If the structure of a table in the transaction database changes, the change script must be executed m × n times.
3. Because the probability that some database fails is relatively high, the DBAs will suffer and will probably spend much of their time fighting fires.
4. Development and testing become very difficult and more expensive, and queries become very complex.
5. There is no failure detection or failover mechanism when a single node fails.
6. The sharded databases cannot be scaled horizontally. The sharding algorithm assumes that the M databases are allocated in advance, so adding a database later is basically not feasible.
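The last point can be made concrete with a small experiment. It assumes the sharding algorithm is a plain modulo on the order number, which the text does not specify, and counts how many orders would land on a different database if the number of databases grew from 4 to 5. The function name `moved_fraction` is made up for this example.

```python
def moved_fraction(old_count, new_count, sample=10_000):
    """Fraction of order numbers whose shard changes when the shard count changes."""
    moved = sum(
        1
        for order_no in range(sample)
        if order_no % old_count != order_no % new_count
    )
    return moved / sample


print(f"{moved_fraction(4, 5):.0%} of orders would have to move")
```

Under this assumption, adding one database forces most of the existing rows to be migrated, which is why the number of databases is effectively fixed up front.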
Random database sharding
For the sixth problem, you can consider a mechanism that allows unlimited horizontal scaling: request a database number when inserting data, and save the database number as a separate field or embed it in an existing field.
For example, if an insert requests a database and obtains database number 1000, we can construct the order number 1000_tradeno. The part before the underscore is the database shard number, and the part after it is the actual tradeno. This solves the problem of unlimited horizontal scaling, and is the random database sharding mode. However, this method has many limitations.
Disadvantages of random database sharding:
1. The sharding algorithm is coupled with the business, so it suits only specific scenarios and has a narrow range of application.
2. Inserts are relatively easy, but an update must know the database shard number. In other words, updates can only be performed through the specific fields that carry it.
3. It is not suitable for batch queries. Query capabilities are heavily restricted, which is a general problem of database sharding.
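The random sharding scheme described above can be sketched as follows. The pool `DATABASE_NUMBERS`, the random choice, and the helper names are hypothetical; the source only fixes the `1000_tradeno` format, where the prefix is the database number.

```python
import random

DATABASE_NUMBERS = [1000, 1001, 1002]  # hypothetical pool; new numbers can be appended


def new_order_no(trade_no, rng=random):
    """Pick a database at insert time and embed its number in the order number."""
    db_no = rng.choice(DATABASE_NUMBERS)
    return f"{db_no}_{trade_no}"


def database_for(order_no):
    """Recover the database number from the prefix of an order number."""
    db_no, _, _trade_no = order_no.partition("_")
    return int(db_no)


order_no = new_order_no("tradeno")
print(order_no, "->", database_for(order_no))
```

Because the database number travels with the order number, adding a database only means appending to the pool and existing orders never move. The same property explains the second disadvantage: a lookup that does not have the order number cannot tell which database to ask.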
Single database backup and failover
If a single database fails, the service is affected. Can the system switch over when a fault occurs? It can be done, but there may be problems, and each scenario needs its own analysis. This topic is complex and deserves a separate article.
The above is a summary of the evolution of the database architecture. This evolution requires the support of many basic capabilities, including:
1. Powerful distributed database management middleware, which mainly hides the underlying database routing and data management from applications.
2. A strong data operations and maintenance (O&M) team and a monitoring system that can detect the database status of every node.
3. A strong database management team that can maintain such a database cluster.
4. Strong business architecture and technical architecture capabilities to keep such complex business scenarios under control.