Database Architecture for optimizing website performance

Last Update:2015-02-26 Source: Internet

Author: User

Tags website performance

It is understandable that many small web developers focus at first on product requirements and design. But if overall performance, scalability, and similar concerns are left out of consideration while traffic keeps climbing, the site may one day collapse under the load, and by then it is too late for regrets.

As I mentioned in the backend design, database access bottlenecks have always been a headache for high-concurrency, high-traffic web applications. This is especially true when the program is built around a single database: once the peak number of connections in the connection pool reaches 500, the program is not far from crashing. As a website grows from small to large, its database architecture also needs to grow step by step, and each stage of scaling can improve performance by an order of magnitude.

1. Web applications and databases are deployed on the same server

This pattern is suitable when the number of users, the amount of data, and the concurrent traffic are all small. As traffic increases, however, the application and the database compete for the same limited system resources and quickly run into performance problems.

2. Web applications and databases are deployed on their own separate servers

With service separation, the web application and the database are deployed on separate servers, each doing its own job, and the application server and the database server can each be upgraded as needed when traffic increases. This is a typical deployment for small websites. With the application tuned for performance and a database object caching strategy in place, it can carry substantial traffic, for example 2,000 users, a concurrency of 200, and millions of records.
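The object caching strategy mentioned above can be sketched roughly as follows. This is a minimal in-process illustration, not a description of any particular caching product: the class name `ObjectCache`, the `ttl_seconds` parameter, and the `loader` callback are all assumptions made for the example.

```python
import time


class ObjectCache:
    """Tiny in-process cache with per-entry expiry (hypothetical stand-in for a real cache)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value for key; call loader() only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()  # in a real application this would be the database query
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example right after the underlying row has been updated."""
        self._entries.pop(key, None)
```

Repeated reads of the same object then reach the database only once per expiry window, which is where the relief for the database server comes from.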

3. The database server is deployed in a clustered manner

One example is Oracle running multiple instances of a single database. The physical storage is a disk array, and multiple database instances serve connections from the application servers through a virtual IP. This type of deployment meets the needs of most web applications.

4. Master-slave replication mode for the database

Database access consists largely of queries. In most cases reads far outnumber writes and the query conditions are relatively complex, so most of the database's capacity is actually spent on queries. If the read and write operations of the database can be separated, there is a lot of room to improve system performance.

Almost all major databases support replication. In MySQL, for example, you only need to enable the binary log on the primary server and then configure and authorize the primary and slave servers. Replication is driven by the primary server's log: the operations recorded there are replayed on the slave servers, so the primary server must log every update to the primary database. Master-slave replication is often used for automatic backup, but here our goal is read/write separation. To keep the data consistent, all updates must be made on the primary database, while reads can be served from the slave databases.

Master-slave replication is asynchronous, so the data on the slave databases lags the primary by some amount, and the design of read/write separation must take this into account. Take a blog as an example: a blogger who has just published an article needs to see it immediately, but it is acceptable for other readers to see it after some delay. So other external users, who account for most of the traffic, can read from the slave databases.
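One way to handle the replication delay in the blog example is to send a user's reads to the primary for a short time after that user has written something, and to spread everyone else's reads across the slave servers. The sketch below is only an illustration under assumptions: the `ReadWriteRouter` class, the `lag_window_seconds` value, and the use of plain strings in place of real database connections are all hypothetical.

```python
import itertools
import time


class ReadWriteRouter:
    """Chooses the primary or a slave (replica) server for each operation."""

    def __init__(self, primary, replicas, lag_window_seconds=5.0, clock=time.monotonic):
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin over the replicas; None when there are no replicas at all.
        self._cycle = itertools.cycle(self.replicas) if self.replicas else None
        self._lag_window = lag_window_seconds  # assumed upper bound on replication delay
        self._clock = clock
        self._last_write = {}  # user_id -> time of that user's most recent write

    def for_write(self, user_id):
        """All updates go to the primary; remember when this user last wrote."""
        self._last_write[user_id] = self._clock()
        return self.primary

    def for_read(self, user_id):
        """Recent writers read from the primary; everyone else reads from a replica."""
        last = self._last_write.get(user_id)
        recently_wrote = last is not None and self._clock() - last < self._lag_window
        if recently_wrote or self._cycle is None:
            return self.primary
        return next(self._cycle)
```

With this routing, the blogger who has just published sees the new article at once, while ordinary readers are served by the slave servers and may see it slightly later.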

5. Database Vertical Segmentation

Master-slave deployment works on the premise that reads are much heavier than writes. If writes account for more than 50% of the primary database's CPU consumption, adding slave servers is pointless: every slave server must replay all of the writes, so writes consume more than 50% of each slave server's CPU as well, leaving very limited resources for queries. At that point the database needs to be re-architected, and we need to adopt vertical partitioning.
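The argument above can be checked with a little arithmetic. The helper below is a rough model, assuming that every slave server replays all writes at the same CPU cost as the primary; the function name `read_capacity` is made up for this illustration.

```python
def read_capacity(replicas, write_share):
    """Total CPU left for queries across all replicas, in units of one whole server.

    write_share is the fraction of one server's CPU consumed by writes (0.0 to 1.0).
    Assumes each replica replays every write at the same cost as the primary.
    """
    if not 0.0 <= write_share <= 1.0:
        raise ValueError("write_share must be between 0.0 and 1.0")
    return replicas * (1.0 - write_share)


# Four replicas with writes taking half of the CPU offer only two servers' worth of reads.
print(read_capacity(4, 0.5))
# If the write load per database is halved, the same four replicas offer three.
print(read_capacity(4, 0.25))
```

Under this model, reducing the write share on each database does more for query capacity than adding more slave servers, which is the motivation for splitting the database.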

The simplest form of vertical partitioning splits the original database along business lines, so that the part being split off never needs JOIN queries against the rest. For example, a website's blog and forum are relatively independent and their data is not strongly related, so the original database can be split into a blog database, a forum database, and a database made up of the remaining tables. Each of the three databases is then deployed in master-slave mode, which spreads the load of the original database.
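In application code, a vertical split like this usually comes down to a lookup from table to database. The following is a minimal sketch; the table names and database names are hypothetical and chosen only to mirror the blog and forum example.

```python
# Hypothetical table-to-database assignment for the blog/forum split.
TABLE_TO_DATABASE = {
    "blog_posts": "blog_db",
    "blog_comments": "blog_db",
    "forum_threads": "forum_db",
    "forum_replies": "forum_db",
}
DEFAULT_DATABASE = "main_db"  # holds all remaining tables


def database_for(table):
    """Return the name of the database that holds the given table."""
    return TABLE_TO_DATABASE.get(table, DEFAULT_DATABASE)


def can_join(*tables):
    """A JOIN is only possible when every table lives in the same database."""
    return len({database_for(table) for table in tables}) == 1
```

A check such as `can_join("blog_posts", "forum_threads")` returning `False` is a quick way to spot queries that would have to be rewritten before the split.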

Query scalability is one of the most important reasons for partitioning a database: dividing one large database into several smaller ones improves query performance. In fact, the currently popular NoSQL technology trades space for time. It uses a key-value storage structure and accepts a lot of data redundancy, but because it avoids JOIN operations it performs better, and NoSQL is widely used on high-concurrency, high-load sites.

6. Database Horizontal Segmentation

If the database still cannot handle the volume of writes after vertical partitioning, we need horizontal partitioning.

Horizontal partitioning means splitting the records of a single database table across several tables according to a specific algorithm, so that those tables can be deployed on different database servers. Many large sites use essentially this architecture: master-slave replication + vertical partitioning + horizontal partitioning. Horizontal partitioning does not depend on any particular technology; it is purely a matter of logical planning. The data to partition is the frequently accessed hot data that threatens to bring the site down.

To partition data horizontally, we need an indexed field to partition on, such as user_id. It must be related to all records: it is the primary key of the core table in the partitioned database and a foreign key in the other tables. When it is used as a primary key, it cannot be auto-incrementing and must be a business primary key. The specific partitioning methods are:

Remainder partitioning: records are placed into different partitioned databases according to the value of user_id % 10. The algorithm is simple and efficient, but when the number of partitioned databases changes, the data of the whole system must be redistributed.
Range partitioning: records are partitioned by ranges of user_id, for example 1-100000 in one partitioned database and 100001-200000 in another. When the number of partitioned databases changes, this scheme makes the system easy to expand. However, it tends to put uneven load on the partitions: for example, the partitioned database holding older users may be heavily loaded while the one holding new users is lightly loaded.
Mapping partitioning: a partition mapping is recorded for every possible value of the partitioning field. The mapping is very large and must be stored in a database. For example, when the application needs to know which partition holds the blog content of the user with user_id 10, it must query the database for the answer; caching can of course be used to improve performance. Because this method records the partition of each record individually, each partition is highly scalable and can be controlled flexibly, migrating data from one partition to another is simple, and the load across partitions can be kept balanced through dynamic adjustment.
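The three methods above can be sketched side by side. This is an illustration under assumptions rather than a production design: the function and class names are invented for the example, the in-memory dictionary stands in for the mapping table that would really live in a database, and `moved_fraction` is an extra helper that estimates how much data remainder partitioning has to move when the partition count changes.

```python
import bisect


def modulo_partition(user_id, partition_count):
    """Remainder partitioning: the partition is user_id modulo the partition count."""
    return user_id % partition_count


class RangePartitioner:
    """Range partitioning: upper_bounds[i] is the largest user_id stored in partition i."""

    def __init__(self, upper_bounds):
        self._upper_bounds = list(upper_bounds)  # must be sorted in ascending order

    def partition_for(self, user_id):
        index = bisect.bisect_left(self._upper_bounds, user_id)
        if index == len(self._upper_bounds):
            raise KeyError(f"user_id {user_id} is beyond the last configured range")
        return index


class MappingPartitioner:
    """Mapping partitioning: an explicit user_id -> partition lookup table."""

    def __init__(self):
        self._mapping = {}  # stands in for the mapping table kept in a database

    def assign(self, user_id, partition):
        """Place a user on a partition, or move the user to a different one."""
        self._mapping[user_id] = partition

    def partition_for(self, user_id):
        return self._mapping[user_id]


def moved_fraction(user_ids, old_count, new_count):
    """Fraction of ids whose remainder partition changes when the count changes."""
    ids = list(user_ids)
    moved = sum(1 for u in ids if u % old_count != u % new_count)
    return moved / len(ids)
```

For user ids 1 to 110, `moved_fraction(range(1, 111), 10, 11)` comes to about 0.91, which shows why growing from 10 to 11 remainder partitions forces nearly all data to be redistributed. With the mapping approach, moving a user is a single `assign` call plus copying that user's rows.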

Reference Link: http://blog.csdn.net/zhangzhaokun/archive/2009/10/22/4711693.aspx

Http://www.lovelucy.info/website-database-optimization.html?variant=zh-hans
