Large-Scale, High-Concurrency, High-Load Web Application System Architecture: Database Architecture Strategy

Last Update: 2018-12-05 Source: Internet

Author: User


As a website grows from small to large, database access pressure keeps increasing and the database architecture has to be expanded along with it. The expansion generally passes through the following stages. Each stage can improve performance by roughly an order of magnitude over the deployment method of the previous stage.

1. Web applications and databases are deployed on the same server.

Small websites generally use this method, where the number of users, the data volume, and the concurrent access volume are all relatively small; otherwise a single server cannot cope. In addition, once a performance bottleneck is reached, upgrading the hardware is very expensive. As traffic increases, the application and the database compete for the same limited system resources, and performance problems soon appear.

2. Web applications and databases are deployed on independent servers.

The web application and the database are deployed separately, and the web application server and the database server each do their own job. When traffic increases, the application server and the database server can be upgraded independently. This is the typical deployment method for small websites. With application performance tuning and a database object cache, it can carry a considerable amount of traffic, for example 2,000 users, 200 concurrent users, and millions of records.

3. Database servers are deployed in clusters (for example, multiple instances of one database in Oracle).

A database cluster can carry a larger load. The database's physical storage is a disk array, and multiple database instances serve connections from the application servers through a virtual IP address. This deployment method meets the needs of the vast majority of ordinary web applications, but it still cannot satisfy applications with a very large user base, high load, and frequent database reads and writes.

4. Databases are deployed in master/slave mode.

Blog, forum, dating, and CMS systems aimed at the mass market have millions of users and tens of millions of records. They perform a large number of database queries and also many database writes, and in most cases reads far outnumber writes. If the database's read and write operations can be separated at this point, system performance improves greatly. This is where the master-slave deployment method comes in.

Master-slave replication:

Almost all mainstream databases support replication, which is the basic means of simple database scaling. Take MySQL as an example: it supports master-slave replication and the configuration is not complex. You only need to enable the binary log on the master server and then perform some simple configuration and authorization on the master and slave servers. MySQL master-slave replication is driven by the master server's binary log: the operations recorded in the master's log are replayed on the slave server, which is why the master must have the binary log enabled so that every update to the master database is recorded automatically. The slave server fetches the binary log from the master server and replays it. Master-slave replication is also used for automatic backup.
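
The log-and-replay idea described above can be sketched as a toy model in Python. This is not MySQL's actual replication protocol. The class names `Master` and `Slave`, the list used as a "binary log", and the `replicate_from` method are all assumptions made for illustration.

```python
class Master:
    """Toy master: applies writes and appends each one to a binary-log-like list."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered (key, value) update events

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


class Slave:
    """Toy slave: remembers how far into the master's log it has replayed."""

    def __init__(self):
        self.data = {}
        self.position = 0  # index of the next log event to replay

    def replicate_from(self, master):
        # Fetch the events recorded since the last replay and apply them in order.
        events = master.binlog[self.position:]
        for key, value in events:
            self.data[key] = value
        self.position += len(events)
        return len(events)


master = Master()
slave = Slave()
master.write("post:1", "hello")
master.write("post:1", "hello, world")

lagging_view = dict(slave.data)           # nothing replayed yet, so the slave is behind
replayed = slave.replicate_from(master)   # the slave catches up by replaying the log
```

Until `replicate_from` runs, the slave's copy is stale, which is the replication delay that a read/write splitting design has to account for.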

Read/write splitting:

To keep the data consistent, all database updates must be performed on the master database, while reads can be served by the slave databases. On most sites reads are far more frequent than writes, and query conditions are relatively complex, so most of the database's capacity is consumed by queries.
Master-slave replication is asynchronous, so the data on the slave lags behind the master by some delay. Read/write splitting must be designed with this in mind. Take a blog as an example: a user logs in and publishes an article and needs to see that article immediately, whereas it is acceptable for other users to see it after a delay (1 minute, 5 minutes, or 30 minutes). In this case, the current user's reads should go to the master database, while other visitors, who account for most of the traffic, can read from the slave databases.
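
One possible way to express the blog example in code is to pin a user's reads to the master for a short window after that user writes. The following is a minimal sketch, not a prescribed implementation. The `Router` class, the 60-second window in `STICKY_SECONDS`, and the string return values are assumptions chosen for illustration.

```python
import time

# Assumed upper bound on replication delay; tune to the delay the site tolerates.
STICKY_SECONDS = 60


class Router:
    """Decides whether a request should go to the master or to a slave."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_write = {}  # user_id -> time of that user's most recent write

    def route_write(self, user_id):
        # All updates go to the master; remember when this user last wrote.
        self._last_write[user_id] = self._clock()
        return "master"

    def route_read(self, user_id):
        # A user who wrote recently must see their own data, so read from the master.
        last = self._last_write.get(user_id)
        if last is not None and self._clock() - last < STICKY_SECONDS:
            return "master"
        # Everyone else can tolerate the delay and reads from a slave.
        return "slave"
```

In a real application the last-write timestamp would more likely live in the user's session or a shared cache than in process memory, so that it survives across application servers.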

Database reverse proxy:

When a master-slave database is deployed with read/write splitting, one master database corresponds to multiple slave servers. Writes go to the master database, of which there is only one, so no choice is needed. Reads, however, must be allocated across the slave servers with an appropriate algorithm, and when the slave servers have different hardware configurations the reads may even need to be allocated by weight.
A database proxy can solve these problems. Like a web proxy server, MySQL Proxy can also modify SQL statements before forwarding them to the backend MySQL server.
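
The weighted allocation of reads can be sketched in a few lines of Python. This is not how MySQL Proxy itself is configured. The server names, the weights in `SLAVES`, and the helper functions are hypothetical, and the statement classification is deliberately naive.

```python
import random

# Hypothetical slave servers and weights: "slave-a" is assumed to be three
# times as powerful as "slave-b", so it should receive about 3/4 of the reads.
SLAVES = {"slave-a": 3, "slave-b": 1}


def pick_slave(rng=random):
    """Choose a slave at random, in proportion to its weight."""
    names = list(SLAVES)
    weights = [SLAVES[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def route(sql, rng=random):
    """Send SELECT statements to a weighted slave and everything else to the master."""
    is_read = sql.lstrip().lower().startswith("select")
    return pick_slave(rng) if is_read else "master"
```

Treating every statement that starts with SELECT as safe for a slave is a simplification. Statements such as SELECT ... FOR UPDATE, or reads inside a transaction that has already written, would still need to go to the master.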

5. Vertical database partitioning

In a master-slave deployment, once write operations account for more than 50% of the master database's CPU consumption, adding more slave servers brings little benefit: every slave must replay all of those writes, so writes also consume more than 50% of each slave's CPU, and the resources a slave has left for queries are very limited. The database then needs to be restructured, and we need to adopt vertical partitioning.
The simplest vertical partitioning method is to split independent services out of the original database (the parts that are split out must not need joins against the rest). For example, a website's blog and forum are relatively independent and only loosely related to other data. The original database can then be split into a blog database, a forum database, and a database made up of the remaining tables. Each of the three databases is deployed in master-slave mode, so the load of the whole database is shared among them.
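
At the application level, the blog/forum split amounts to a lookup from table name to database. The sketch below assumes hypothetical table and database names. The `can_join` helper only illustrates the constraint that tables in different databases can no longer be joined in a single query.

```python
# Hypothetical table-to-database assignment after the split.
TABLE_TO_DATABASE = {
    "blog_post": "blog_db",
    "blog_comment": "blog_db",
    "forum_thread": "forum_db",
    "forum_reply": "forum_db",
}
# Every table that was not split out stays in the remaining database.
DEFAULT_DATABASE = "main_db"


def database_for(table):
    """Return the database that holds the given table."""
    return TABLE_TO_DATABASE.get(table, DEFAULT_DATABASE)


def can_join(*tables):
    """A single SQL join is only possible when all tables live in one database."""
    return len({database_for(table) for table in tables}) == 1
```

For example, `can_join("blog_post", "blog_comment")` is true, while `can_join("blog_post", "forum_thread")` is false, which is why only services that do not need joins against the rest should be split out.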
In addition, query scalability is one of the main reasons for partitioning a database. Dividing a large database into several smaller ones can improve query performance because each partition holds only a small part of the data. Suppose you need to scan 100 million records. In a single-partition database, one database manager has to scan all 100 million records by itself. If the database is split into 50 partitions and the 100 million records are distributed evenly among them, the database manager of each partition scans only 2 million records.
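
The arithmetic, and the scatter-gather shape of such a scan, can be sketched as follows. The function names are assumptions, and the in-memory lists stand in for database partitions. Python threads are used here only to show the structure; the real speedup comes from each partition being scanned by its own database server.

```python
from concurrent.futures import ThreadPoolExecutor

TOTAL_RECORDS = 100_000_000
PARTITION_COUNT = 50
# With an even distribution, each partition scans 1/50 of the records.
rows_per_partition = TOTAL_RECORDS // PARTITION_COUNT


def scan_partition(rows, predicate):
    """Count the matching rows in one partition."""
    return sum(1 for row in rows if predicate(row))


def scatter_gather_count(partitions, predicate):
    """Scan every partition concurrently and add up the partial counts."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partial_counts = pool.map(
            lambda rows: scan_partition(rows, predicate), partitions
        )
        return sum(partial_counts)


# Small-scale demonstration: 1,000 records spread evenly over 5 partitions.
records = list(range(1000))
partitions = [records[i::5] for i in range(5)]
matches = scatter_gather_count(partitions, lambda r: r % 7 == 0)
```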

6. Horizontal database partitioning

After vertical partitioning, what should we do if the blog database once again cannot keep up with the write load? Vertical partitioning cannot help any further here. What we need is horizontal partitioning.
Horizontal partitioning means splitting the records of a single table, according to a specific algorithm, into different tables so that they can be deployed on different database servers. Many large websites adopt a combined architecture of master-slave replication, vertical partitioning, and horizontal partitioning. Horizontal partitioning does not depend on any particular technology. It is purely a matter of logical planning, and it requires experience and a careful breakdown of the business.
How should we partition? For a large website, partitioning is unavoidable, and hotspot data that is accessed so frequently that it pushes the site close to collapse must be partitioned.
Partitioning requires a partition index field, such as USER_ID. This field must be related to every record and is the primary key of the core table in the partitioned database. When a primary key is used as a foreign key in other tables, it cannot be auto-incrementing; it must be a business primary key.

Remainder partition:

We can store records in different partition databases according to the value of User_ID % 10. This algorithm is simple and efficient, but when the number of partition databases changes, the data of the entire system has to be redistributed.
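
A minimal sketch of the remainder rule follows, together with a measurement of how much data moves when the number of partitions changes. The function names and the sample range of user IDs are assumptions for illustration.

```python
def partition_for(user_id, partition_count):
    """Remainder partitioning: the partition number is user_id modulo the count."""
    return user_id % partition_count


def moved_fraction(user_ids, old_count, new_count):
    """Fraction of users whose partition changes when the partition count changes."""
    moved = sum(
        1
        for user_id in user_ids
        if partition_for(user_id, old_count) != partition_for(user_id, new_count)
    )
    return moved / len(user_ids)


# Growing from 10 to 11 partitions over a sample of 11,000 user IDs.
user_ids = range(1, 11001)
fraction = moved_fraction(user_ids, 10, 11)
```

For this sample, more than 90% of the users land in a different partition after going from 10 to 11 databases, which is why a change in the partition count forces nearly the whole data set to be redistributed.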

Range partition:

We can partition by ranges of User_ID. For example, one range of User_ID values forms one partition database and the next range forms another. When the number of partition databases changes, this scheme expands easily, but the load on different partitions may be uneven. For example, the partition database holding old users may be under heavy load while the partition database holding new users is lightly loaded.
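
Range partitioning can be sketched with a sorted list of range boundaries and a binary search. The boundaries of one million IDs per partition and the database names below are assumptions chosen for illustration; they are not taken from the article.

```python
import bisect

# Hypothetical inclusive upper bounds: IDs 1..1,000,000 go to user_db_0,
# 1,000,001..2,000,000 to user_db_1, and so on.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
PARTITION_NAMES = ["user_db_0", "user_db_1", "user_db_2"]


def partition_for(user_id):
    """Find the partition whose range contains user_id."""
    if user_id < 1:
        raise ValueError("user_id must be positive")
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, user_id)
    if index == len(RANGE_UPPER_BOUNDS):
        raise LookupError(f"no partition covers user_id {user_id}")
    return PARTITION_NAMES[index]


def add_partition(upper_bound, name):
    """Expand by appending a new range; existing users stay where they are."""
    if upper_bound <= RANGE_UPPER_BOUNDS[-1]:
        raise ValueError("new upper bound must extend the covered range")
    RANGE_UPPER_BOUNDS.append(upper_bound)
    PARTITION_NAMES.append(name)
```

Adding a partition only appends a new range, so no existing record has to move. The trade-off is that the load follows the ID ranges rather than being spread evenly.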

Mapping partition:

Create a partition mapping for every possible value of the partition index field. This mapping is very large and needs to be stored in a database. For example, when the application needs to know which partition holds the blog content of the user whose User_ID is 10, it must query the database to find out. Of course, a cache can be used to improve performance.
Because this method records the partition mapping of every record in detail, each partition is highly scalable and can be controlled flexibly. Migrating data from one partition to another is also easy, and partitions can be adjusted flexibly and dynamically to keep the load evenly distributed.
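
The mapping lookup with a cache in front of it can be sketched as below. The `PartitionDirectory` class is hypothetical, and a plain dictionary stands in for the mapping table that would be stored in a database.

```python
class PartitionDirectory:
    """Looks up a user's partition in a mapping store, with a cache in front."""

    def __init__(self, mapping_store):
        self._store = mapping_store  # stands in for the mapping table in a database
        self._cache = {}
        self.store_lookups = 0       # counts how often the store had to be queried

    def partition_for(self, user_id):
        if user_id not in self._cache:
            self.store_lookups += 1
            self._cache[user_id] = self._store[user_id]
        return self._cache[user_id]

    def migrate(self, user_id, new_partition):
        # Moving one user is a single mapping update plus a cache invalidation.
        # Copying the user's actual rows is outside the scope of this sketch.
        self._store[user_id] = new_partition
        self._cache.pop(user_id, None)


directory = PartitionDirectory({10: "blog_db_2"})
first = directory.partition_for(10)   # queries the mapping store
second = directory.partition_for(10)  # served from the cache
```

Because each user has an individual mapping entry, a single user can be moved to a less loaded partition without touching anyone else's data.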

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
