Database-Capacity (RPM)

Last Update: 2018-04-21 Source: Internet

Author: User

Tags db2


Design and practice of typical database architecture

1). Single-library architecture

2). Grouping architecture

What is a group?

A: The grouping architecture is the most common one-master, multiple-slave setup, in which the master and slaves are kept in sync and reads are separated from writes:

User-service: the User Center service, as before
User-db-m (Master): the master library, which provides the database write service
User-db-s (Slave): the slave library, which provides the database read service

The database cluster formed by the master and its slaves is called a "group".
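
As an illustration of the read/write separation described above, here is a minimal sketch of a router that sends writes to the master and spreads reads over the slaves. The class name `GroupRouter`, the slave names `user-db-s1` and `user-db-s2`, and the rule "a statement starting with SELECT is a read" are all assumptions made for this example; they do not come from the original article.

```python
import itertools


class GroupRouter:
    """Routes writes to the master and spreads reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._read_cycle = itertools.cycle(slaves or [master])

    def route(self, sql):
        # Assumption: SELECT statements are reads, everything else is a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.master


router = GroupRouter("user-db-m", ["user-db-s1", "user-db-s2"])
print(router.route("SELECT * FROM user WHERE uid = 1"))        # one of the slaves
print(router.route("UPDATE user SET age = 30 WHERE uid = 1"))  # the master
```

A real deployment would hold connections rather than names, and would also have to decide what to do when a slave has not yet caught up with the master.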

What are the characteristics of grouping?

A: Within the DB cluster of a single group:

Data is synchronized between master and slave via the binlog
All instances have identical database structures
All instances store exactly the same data; in essence, the data is replicated

What exactly does the grouping architecture solve?

A: Most Internet businesses read far more than they write, so database reads are often the first performance bottleneck. If you want to:

Scale database read performance linearly
Improve database write performance by eliminating read-write lock conflicts
Make reads highly available through redundant slave libraries

then the grouping architecture is a good fit. Note that in the grouping architecture the master library is still a single write point.

In a word, grouping is the architecture design that addresses "high read/write concurrency on the database".

3). Sharding architecture

What is a shard?

A: The sharding architecture is what people usually mean by a horizontally partitioned (sharded) database architecture:

User-service: the User Center service, as before
USER-DB1: the first of the 2 horizontal partitions
USER-DB2: the second of the 2 horizontal partitions

After sharding, the multiple DB instances also form a DB cluster.

For horizontal partitioning, should you split into separate libraries or separate tables?

A: Splitting into separate libraries is strongly recommended over splitting into tables, because:

Split tables still share one database file, so they still compete for disk IO
Split libraries can easily be migrated to different DB instances, or even to different database machines, for better scalability

What algorithms are used for horizontal partitioning?

A: The common horizontal partitioning algorithms are the "range method" and the "hash method".

The range method splits the data horizontally across two DB instances by ranges of the User Center's business primary key, the UID:

USER-DB1: stores data for UIDs from 0 to 10 million
USER-DB2: stores data for UIDs from 10 million to 20 million

The hash method also splits the data horizontally across two DB instances by the User Center's business primary key, the UID:

USER-DB1: stores data for UIDs where UID modulo 2 is 1
USER-DB2: stores data for UIDs where UID modulo 2 is 0

Both methods are used in Internet businesses, with the hash method being the more widely used.
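
The two routing rules can be sketched as follows. This is only an illustration under stated assumptions: the ranges are taken as half-open (`[0, 10 million)` for USER-DB1 and `[10 million, 20 million)` for USER-DB2), the hash is a plain modulo 2 because there are two instances, and the function names `route_by_range` and `route_by_hash` are made up for the example.

```python
RANGE_SIZE = 10_000_000                 # assumed: 10 million UIDs per instance
INSTANCES = ["USER-DB1", "USER-DB2"]


def route_by_range(uid):
    """Range method: each instance owns one contiguous block of UIDs."""
    index = uid // RANGE_SIZE
    if not 0 <= index < len(INSTANCES):
        raise ValueError(f"uid {uid} is outside the configured ranges")
    return INSTANCES[index]


def route_by_hash(uid):
    """Hash method: odd UIDs go to USER-DB1, even UIDs go to USER-DB2."""
    return "USER-DB1" if uid % 2 == 1 else "USER-DB2"


print(route_by_range(9_999_999), route_by_range(10_000_000))
print(route_by_hash(7), route_by_hash(8))
```

With the range method, new UIDs eventually fall outside every configured range and a new instance has to be added; with the modulo rule, changing the number of instances changes where existing UIDs map, so data has to be moved.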

What are the characteristics of sharding?

A: Within the DB cluster formed by sharding:

There is no direct connection between the instances, unlike the binlog synchronization between master and slave
All instances also have identical database structures
The data stored by different instances does not intersect, and the union of the data across all instances forms the global data
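
The last property can be checked on a toy data set. The sketch below assumes ten made-up UIDs and an odd/even split across the two instances; it only demonstrates that the shards are disjoint and that their union is the full data set.

```python
uids = range(1, 11)  # a tiny stand-in for the global user data
shards = {"USER-DB1": set(), "USER-DB2": set()}

for uid in uids:
    # Assumed split rule: odd UIDs to USER-DB1, even UIDs to USER-DB2.
    name = "USER-DB1" if uid % 2 == 1 else "USER-DB2"
    shards[name].add(uid)

no_overlap = shards["USER-DB1"].isdisjoint(shards["USER-DB2"])
covers_everything = (shards["USER-DB1"] | shards["USER-DB2"]) == set(uids)
print(no_overlap, covers_everything)
```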

What exactly does the sharding architecture solve?

A: Most Internet businesses have very large data volumes, so single-library capacity easily becomes a bottleneck. Sharding can then:

Scale database write performance linearly; note that the grouping architecture cannot scale write performance linearly
Reduce the data volume of each single library

In a word, sharding is the architecture design that addresses "large database data volume".

4). Grouping + sharding architecture

If the business has both high read/write concurrency and a large data volume, a combined grouping + sharding database architecture is usually needed:

Sharding reduces the data volume of each single library and scales write performance linearly
Grouping scales read performance linearly and keeps the read libraries highly available
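
Combining the two ideas means routing in two steps: first pick the shard from the UID, then pick the master or a slave inside that shard's group. The sketch below assumes a hypothetical topology of two shards with one master and two slaves each; the instance names, the modulo rule, and the random choice of slave are illustrative assumptions only.

```python
import random

# Hypothetical topology: two shards, each a group of one master and two slaves.
TOPOLOGY = [
    {"master": "user-db1-m", "slaves": ["user-db1-s1", "user-db1-s2"]},
    {"master": "user-db2-m", "slaves": ["user-db2-s1", "user-db2-s2"]},
]


def route(uid, is_write):
    group = TOPOLOGY[uid % len(TOPOLOGY)]   # sharding: pick the group by UID
    if is_write:
        return group["master"]              # grouping: writes go to the master
    return random.choice(group["slaves"])   # grouping: reads go to any slave


print(route(4, is_write=True))    # master of the first shard
print(route(5, is_write=False))   # some slave of the second shard
```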

5). Vertical partitioning

Besides horizontal partitioning, vertical partitioning is another common database architecture design. It is closely tied to the business.

In the case of the User Center, a vertical split could look like this:

User (UID, uname, passwd, sex, age, ...)

USER_EX (UID, intro, sign, ...)

The vertically split tables are separate, and each has UID as its primary key
Login name, password, gender, age and similar attributes go in one vertical table (library)
Self-introduction, personal signature and similar attributes go in another vertical table (library)
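
A small sketch of this split, using two in-memory SQLite databases as stand-ins for the two vertical libraries. The column types, the sample row, and the helper names `get_login_info` and `get_full_profile` are assumptions for illustration; only the table and column names come from the example above.

```python
import sqlite3

# Two separate in-memory databases stand in for the two vertical libraries.
user_db = sqlite3.connect(":memory:")
user_ex_db = sqlite3.connect(":memory:")
user_db.execute(
    "CREATE TABLE user (uid INTEGER PRIMARY KEY, uname TEXT, passwd TEXT, sex TEXT, age INTEGER)"
)
user_ex_db.execute("CREATE TABLE user_ex (uid INTEGER PRIMARY KEY, intro TEXT, sign TEXT)")
user_db.execute("INSERT INTO user VALUES (1, 'alice', 'hashed-pw', 'F', 30)")
user_ex_db.execute("INSERT INTO user_ex VALUES (1, 'Hi, I am Alice.', 'Carpe diem')")


def get_login_info(uid):
    # Hot path: only the short, frequently read table is touched.
    return user_db.execute(
        "SELECT uname, passwd FROM user WHERE uid = ?", (uid,)
    ).fetchone()


def get_full_profile(uid):
    # The service reassembles one logical row from both libraries via the shared UID.
    base = user_db.execute(
        "SELECT uname, sex, age FROM user WHERE uid = ?", (uid,)
    ).fetchone()
    if base is None:
        return None
    extra = user_ex_db.execute(
        "SELECT intro, sign FROM user_ex WHERE uid = ?", (uid,)
    ).fetchone()
    return base + (extra or (None, None))


print(get_login_info(1))
print(get_full_profile(1))
```

Because the two tables live in different libraries, the join on UID happens in the service rather than in SQL.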

How do I split vertically?

A: Splitting data vertically depends on the business. The two factors generally considered are the "length" of an attribute and its "access frequency":

Attributes that are short and frequently accessed go together
Attributes that are long and infrequently accessed go together

This is because the database loads data into memory (the buffer) in units of rows. With limited memory, keeping the short, frequently accessed attributes together lets memory hold more rows, so the hit rate is higher, disk IO is lower, and database performance improves.
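
A rough back-of-the-envelope calculation shows the size of the effect. All numbers below are assumptions chosen for illustration (a 1 GiB buffer, 64 bytes of short attributes, 2048 bytes of long text per user), and the model ignores page layout and per-row overhead.

```python
def rows_in_buffer(buffer_bytes, row_bytes):
    """How many whole rows of a given size fit in the buffer (simplified model)."""
    return buffer_bytes // row_bytes


BUFFER = 1 * 1024**3      # assumed 1 GiB of buffer memory
NARROW_ROW = 64           # assumed size of the short, hot attributes
WIDE_ROW = 64 + 2048      # the same attributes stored together with long text

wide = rows_in_buffer(BUFFER, WIDE_ROW)
narrow = rows_in_buffer(BUFFER, NARROW_ROW)
print(f"unsplit rows cached: {wide}")
print(f"split rows cached:   {narrow}")
print(f"ratio: {narrow / wide:.1f}x")
```

Under these assumed sizes the split table keeps roughly 33 times as many users' hot attributes in memory, which is where the higher hit rate comes from.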

What are the characteristics of vertical partitioning?

A: Vertical and horizontal partitioning are similar in some ways, but not the same:

There is no direct connection between the instances, i.e. no binlog synchronization
The instances do not have the same database structure
The data stored by the instances shares at least one column, typically the business primary key, and the union of the data across all instances forms the global data

What problem does vertical partitioning solve?

A: Vertical partitioning reduces the data volume of each library and cuts disk IO to increase throughput, but it is tightly coupled to the business, and not every business can be split vertically.

6). Summary

The article is fairly long; the points to remember are at least these:

Start the business with a single library
For heavy read load and highly available reads, use grouping
For large data volume and linear write scaling, use sharding
For vertical partitioning, group the short, frequently accessed attributes together

Reposted from the public account: Architect's Path


