Large-Scale, High-Concurrency, High-Load Web Application Architecture: Database Architecture Strategy (Repost)

Source: Internet
Author: User

Original article: http://blog.csdn.net/zhangzhaokun/article/details/4711693



As a web site grows from small to large, the access pressure on its database grows too, and the database architecture has to be expanded step by step. The expansion generally goes through the following stages, and each stage can improve performance by orders of magnitude over the deployment used in the previous one.

1. Web applications and databases are deployed on the same server

Small sites generally use this approach. It only works while the number of users, the data volume, and the concurrent traffic are all small; otherwise a single server cannot cope. When a performance bottleneck appears, upgrading the hardware is very expensive, and as traffic increases the application and the database compete for the same limited system resources, so performance problems show up quickly.

2. Web applications and databases are deployed on separate servers

With the web application and the database deployed separately, the web application server and the database server each do their own job, and each can be upgraded independently when system traffic increases. This deployment is typical for small web sites. Once the application has been tuned for performance and a database object caching strategy is in place, it can carry a considerable load, for example 2,000 users, 200 concurrent requests, and data volumes in the millions of records.

3. The database server is deployed in a clustered manner (for example, multiple instances of Oracle's database)

A database cluster can bear a relatively large load. The physical storage is a disk array, and multiple database instances provide database connection services to the application servers through a virtual IP. This deployment satisfies most ordinary web applications, but it still cannot meet the needs of applications with very many users, high load, and very frequent database reads and writes.

4. The database is deployed in master-slave mode

Public-facing systems such as blogs, forums, social networking sites, and CMSs have millions of users and large volumes of data. They issue many database queries and also many database writes, and in most cases reads outnumber writes. If reads and writes can be separated at the database level, the system improves greatly. This is where the master-slave database deployment comes in.

Master-slave replication:

Almost all mainstream databases support replication, which is the basic tool for simple database expansion. Take MySQL as an example: it supports master-slave replication, and the configuration is not complex. You only need to enable the binary log on the primary server and then do some simple configuration and authorization on the primary and slave servers. MySQL master-slave replication is based on the primary server's binary log: the operations recorded in the primary's log are replayed on the slave server, which is how the data gets copied. The primary must therefore have binary logging enabled so that every update to the primary database is recorded automatically, and the slave copies the data by fetching the binary log from the primary and replaying it. Master-slave replication is also commonly used for automatic backups.
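The log-and-replay idea can be sketched in a few lines of Python. This is only a toy model, not how MySQL is implemented: the `Primary` and `Replica` classes, the in-memory `binlog` list, and the `max_events` parameter are all assumptions made for illustration.

```python
class Primary:
    """Toy primary: applies each write and appends it to a binary log."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered (key, value) update events

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


class Replica:
    """Toy slave: copies data only by replaying the primary's log."""

    def __init__(self):
        self.data = {}
        self.position = 0  # index of the next log event to replay

    def replay(self, primary, max_events=None):
        # Fetch the events this replica has not seen yet, in log order.
        events = primary.binlog[self.position:]
        if max_events is not None:
            events = events[:max_events]
        for key, value in events:
            self.data[key] = value
        self.position += len(events)
        return len(events)


primary = Primary()
replica = Replica()
primary.write("post:1", "hello")
primary.write("post:2", "world")

replica.replay(primary, max_events=1)  # replica has only caught up partway
lagging_view = dict(replica.data)      # still missing "post:2"

replica.replay(primary)                # replay the rest of the log
```

After the second `replay` call the replica holds the same data as the primary; between the two calls it serves a slightly stale view.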

Read and write separation:

To keep the data consistent, all updates must be made against the primary database, while reads may be served from the slave databases. On most sites database reads are far more frequent than writes, and the query conditions are relatively complex, so most of the database's resources are consumed by queries.
Master-slave replication is asynchronous, so the data in the slave databases lags behind the primary by some amount. The design of read/write separation has to take this delay into account. For example, a user who logs in and publishes an article needs to see that article right away, whereas for other users a delay of some period (1 minute, 5 minutes, or 30 minutes) causes no problem. So the current user's reads should go to the primary database, while reads from other users, who account for far more of the traffic, can go to the slave databases.
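One way to express this rule in application code is a small router that remembers who wrote recently. The sketch below is hypothetical: the class name `ReadWriteRouter`, the `sticky_seconds` window, and the string server names are assumptions, and a real system would hold connections rather than names.

```python
import time


class ReadWriteRouter:
    """Send writes to the primary; send reads to slaves unless the
    same user wrote recently and might hit replication lag."""

    def __init__(self, primary, replicas, sticky_seconds=60.0,
                 clock=time.monotonic):
        self.primary = primary
        self.replicas = list(replicas)
        self.sticky_seconds = sticky_seconds
        self.clock = clock
        self._last_write = {}  # user_id -> time of that user's last write
        self._next = 0         # round-robin counter over the replicas

    def for_write(self, user_id):
        self._last_write[user_id] = self.clock()
        return self.primary

    def for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.sticky_seconds:
            return self.primary  # the author must see their own update
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica


# A fake clock keeps the example deterministic.
now = [0.0]
router = ReadWriteRouter("primary", ["replica-1", "replica-2"],
                         sticky_seconds=60.0, clock=lambda: now[0])

router.for_write(user_id=7)               # user 7 publishes an article
author_read = router.for_read(user_id=7)  # read goes to the primary
other_read = router.for_read(user_id=8)   # another user reads a slave
now[0] = 61.0
later_read = router.for_read(user_id=7)   # window expired, slave is fine
```

The sticky window should be at least as long as the replication delay the site is prepared to tolerate.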

Database Reverse Proxy:

A master-slave deployment with read/write separation raises a new problem: one primary database corresponds to several slave servers. Writes always go to the primary, and since there is only one primary the target is unambiguous. Reads, however, have to be distributed across the slave servers by a suitable algorithm, and when the slaves have different hardware configurations the reads may even need to be distributed according to weights.
A database reverse proxy can solve this problem. Like a web reverse proxy server, MySQL Proxy can also modify an SQL statement before forwarding it to the backend MySQL server.
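The weighted distribution of reads can be sketched as follows. This is not MySQL Proxy's own logic or API: the functions `weighted_cycle` and `route`, the server names, and the weights are assumptions, and the read detection is deliberately naive (it treats any statement starting with SELECT as a read).

```python
import itertools


def weighted_cycle(weights):
    """Yield server names forever, each one in proportion to its weight."""
    expanded = [name for name, weight in weights.items()
                for _ in range(weight)]
    return itertools.cycle(expanded)


def route(sql, primary, picker):
    """Send SELECT statements to a slave, everything else to the primary."""
    is_read = sql.lstrip().upper().startswith("SELECT")
    return next(picker) if is_read else primary


# Assume replica-a has roughly three times the capacity of replica-b.
picker = weighted_cycle({"replica-a": 3, "replica-b": 1})
first_eight = [next(picker) for _ in range(8)]

read_target = route("SELECT * FROM posts WHERE id = 1", "primary", picker)
write_target = route("UPDATE posts SET title = 'x' WHERE id = 1",
                     "primary", picker)
```

A production proxy would also need health checks and would have to keep transactions and locking reads on the primary.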

5. Database vertical partitioning

In a master-slave deployment, once writes account for more than 50% of the primary database's CPU, adding more slave servers does little good: every slave has to replay all of the writes, so writes consume more than 50% of each slave's CPU as well, and each slave has very limited resources left for queries. At this point the database needs to be redesigned, and we need database vertical partitioning.
The simplest form of vertical partitioning is to split independent business areas out of the original database (the parts that are split out must not need JOIN queries against the rest). A site's blog and forum, for example, are relatively independent and only weakly related to other data. The original database can then be split into a blog database, a forum database, and a database made up of the remaining tables. Each of the three is deployed as its own master-slave database, so the load on the database as a whole is shared.
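In application code, vertical partitioning usually comes down to a table-to-database routing map. The sketch below is illustrative only; the table names, the database names, and the helpers `database_for` and `can_join` are assumptions.

```python
# Hypothetical mapping from table name to the database that now owns it.
TABLE_TO_DATABASE = {
    "blog_posts": "blog_db",
    "blog_comments": "blog_db",
    "forum_threads": "forum_db",
    "forum_replies": "forum_db",
}
DEFAULT_DATABASE = "main_db"  # holds all remaining tables


def database_for(table):
    """Return the database that holds the given table."""
    return TABLE_TO_DATABASE.get(table, DEFAULT_DATABASE)


def can_join(*tables):
    """A JOIN is only possible when every table lives in one database."""
    return len({database_for(table) for table in tables}) == 1


same_db = can_join("blog_posts", "blog_comments")  # both in blog_db
cross_db = can_join("blog_posts", "forum_threads")  # split across databases
```

Checking `can_join` over the application's existing queries is one way to confirm that a business area is independent enough to split out.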
In addition, query scalability is one of the main reasons for partitioning a database. Splitting a large database into smaller ones can improve query performance because each database partition holds only a small part of the data. Suppose you need to scan 100 million records. In a single-partition database, one database manager has to scan all 100 million records by itself. If the database system is split into 50 partitions and the 100 million records are spread evenly across them, the database manager of each partition scans only 2 million records.
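The arithmetic can be checked with a scaled-down sketch: 100,000 records over 50 partitions instead of 100 million. The function names are assumptions, and the threads here merely stand in for separate database servers; they do not give real CPU parallelism in CPython.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(rows, predicate):
    """Full scan of a single partition."""
    return [row for row in rows if predicate(row)]


def partitioned_scan(partitions, predicate):
    """Scan every partition independently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = list(pool.map(
            lambda rows: scan_partition(rows, predicate), partitions))
    return [row for part in results for row in part]


records = list(range(100_000))
partition_count = 50

# Spread the records evenly: record r goes to partition r % 50.
partitions = [records[i::partition_count] for i in range(partition_count)]
rows_per_partition = len(partitions[0])  # 100_000 / 50

matches = partitioned_scan(partitions, lambda r: r % 25_000 == 0)
```

Each partition scans only one fiftieth of the records, which matches the 100 million to 2 million reduction described above.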

6. Database Horizontal Partition

After vertical partitioning, what should we do if the blog database again cannot keep up with the write load? Vertical partitioning cannot be extended any further at this point, so we need horizontal partitioning.
Horizontal partitioning means splitting the records of a single database table into separate tables according to a specific algorithm, so that those tables can be deployed on different database servers. Many large sites basically use an architecture of master-slave replication + vertical partitioning + horizontal partitioning. Horizontal partitioning does not depend on any particular technology. It is purely planning at the logical level, and it requires experience and a thorough analysis of the business.
How should we partition? A large web site has to partition; there is no choice about that. What must be partitioned is the hot data whose frequent access is pushing the site toward collapse.
To partition the data we need a partitioning index field, for example user_id. This field must be related to every record: it is the primary key of the core table in the partitioned database and a foreign key in the other tables. The primary key used this way cannot be auto-incrementing; it must be a business primary key.

Remainder partitions:

We can use the value of user_id % 10 to decide which partitioned database a record is stored in. The algorithm is simple and efficient, but when the number of partitioned databases changes, the data of the whole system has to be redistributed.
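A minimal sketch of remainder partitioning, which also shows why changing the partition count is expensive. The function name and the sample of 10,000 user IDs are assumptions for illustration.

```python
def remainder_partition(user_id, partition_count):
    """Pick a partition from the remainder of user_id."""
    return user_id % partition_count


user_ids = range(1, 10_001)

# Count how many users land in a different partition when an
# eleventh partitioned database is added to the original ten.
moved = sum(
    1 for uid in user_ids
    if remainder_partition(uid, 10) != remainder_partition(uid, 11)
)
moved_fraction = moved / len(user_ids)
```

For this sample roughly nine users in ten change partition, which is why the whole data set has to be redistributed.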

Range Partition:

We can partition by ranges of user_id, for example 1-100000 in one partitioned database and 100001-200000 in another. With this algorithm, changing the number of partitioned databases is easy, which makes the system very convenient to expand. However, it easily leads to uneven pressure across partitions: for example, the partitioned database holding the old users is under heavy pressure, while the one holding the new users is under little.
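Range lookup can be sketched with a sorted list of upper bounds and a binary search. The bounds below follow the example ranges in the text; the function name and the third and fourth ranges are assumptions.

```python
import bisect


def range_partition(user_id, upper_bounds):
    """Return the index of the range that contains user_id.

    upper_bounds is a sorted list of inclusive upper limits, so
    [100_000, 200_000] means 1-100000 -> 0 and 100001-200000 -> 1.
    """
    if user_id < 1:
        raise ValueError("user_id must be positive")
    index = bisect.bisect_left(upper_bounds, user_id)
    if index == len(upper_bounds):
        raise ValueError("no partition covers this user_id yet")
    return index


bounds = [100_000, 200_000, 300_000]
before = [range_partition(uid, bounds) for uid in (1, 100_000, 100_001)]

# Expanding only appends a new range; existing users do not move.
bounds.append(400_000)
after = [range_partition(uid, bounds) for uid in (1, 100_000, 100_001)]
newest = range_partition(350_000, bounds)
```

Adding a range leaves every existing assignment unchanged, in contrast to the remainder scheme.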

Mapping Relationship partitions:

A partition mapping is created for every possible value of the partitioning index field. The mapping is very large and has to be stored in a database. For example, when the application needs to know which partition holds the blog content of the user with user_id 10, it must query the database for the answer; of course, we can use caching to improve performance.
Because this approach records in detail which partition each record belongs to, every partition is highly scalable and can be controlled flexibly. Migrating data from one partition to another is simple, and the load can be kept balanced across partitions through flexible, dynamic adjustment.
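A mapping lookup with a cache in front of it might look like the sketch below. The class `PartitionDirectory` is hypothetical, and a plain dict stands in for the mapping table that would live in a database.

```python
class PartitionDirectory:
    """Look up a user's partition in a mapping table, with a cache."""

    def __init__(self, mapping_table):
        self.mapping_table = mapping_table  # stand-in for the database table
        self.cache = {}
        self.database_lookups = 0

    def partition_for(self, user_id):
        if user_id not in self.cache:
            self.database_lookups += 1  # cache miss: query the mapping table
            self.cache[user_id] = self.mapping_table[user_id]
        return self.cache[user_id]

    def move(self, user_id, new_partition):
        """Reassign one user, e.g. to relieve an overloaded partition."""
        self.mapping_table[user_id] = new_partition
        self.cache.pop(user_id, None)  # drop the stale cache entry


directory = PartitionDirectory({10: "blog_db_1", 11: "blog_db_2"})
first = directory.partition_for(10)
second = directory.partition_for(10)    # served from the cache
directory.move(10, "blog_db_3")
after_move = directory.partition_for(10)
```

In a real migration the user's rows would have to be copied to the new partition before the mapping is switched over.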

