Design of MySQL Web database-2 table design principles

Last Update:2014-07-21 Source: Internet

Author: User

Tags web database

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

[Principle of separation of duties]

Separation of duties principle refers to the design should take into account the production of data, the use of aggregation principles, each system to do their own things, each system only to do their own things. Which system a data table should be placed on, usually depends on several points:

1. Who produces this information: usually who produces this data should be responsible for this data, that is, to consider the creation of the data, development, destruction and other life-cycle definition, and to maintain this definition to provide consumers with the principle of consumption;

2. Who uses this information most often: if a system most often uses this data, the most often to modify a certain data, it should be responsible for the maintenance of the system to maintain the data;

3. Adhere to high cohesion, low coupling considerations: when storing data when taking into account the principle of data usage has led to a very high degree of data storage in many places, the need for multiple systems to maintain this data may lead to the coupling between the system, should be avoided as far as possible.

We should also abide by the same principle when we design the relationship between database tables, the separation of duties reduces coupling, but at the same time take into account the performance situation, so that proper redundancy does not lead to the complexity of the modification logic.

Here are some examples of the most common posts and comments:

create table  ' Wanted_post '   (   ' id '  int (TEN)  not null auto_ increment,   ' Puid '  int (Ten)  unsigned NOT NULL,   ' user_id '  int (10)  NOT NULL COMMENT  ' Post User ID ',   ' username '  varchar (a)  not null  COMMENT  ' Post user's username ',   ' city '  smallint (4)  NOT NULL COMMENT  ' location ',    ' IP '  bigint  NOT NULL COMMENT  ' post ip ',   ' district_id '   tinyint (2)  NOT NULL COMMENT  ' area ID ',   ' district_name '  varchar (  not null comment  ' administrative name ',   ' street_id '  tinyint (2)  NOT NULL  ID of comment  ' street (landmark) ',   ' Street_name '  varchar ( NOT NULL COMMENT  ') Cell name ',   ' title '  varchar (255)  NOT NULL COMMENT  ' post title ',   ' Description '  text not null COMMENT  ' Post details description ',   ' Post_at '  int (one)  NOT NULL COMMENT  ' User posting time, Data creation time, using integer storage ',   ' Refresh_at '  int (one)  NOT NULL COMMENT  ' Post modified time, integer storage ',    ' Show_time '  int (one)  NOT NULL COMMENT  ' posts Show time ',   ' Age_max '   Int (one)  NOT NULL DEFAULT  ' 0 '  COMMENT  ' recruiting minimum age ',   ' age_min '  int (11)  NOT NULL DEFAULT  ' 0 '  COMMENT  ' recruiting maximum age ',   ' Post_refresh_at '  int (11 )  NOT NULL COMMENT  ' Refresh Time ',  primary key  (' id '),   unique  KEY  ' Idx_puid '   (' puid '),  key  ' User_id_index '   (' user_id '),   key   ' Post_at_index '   (' post_at '),  key  ' Refresh_at_index '   (' refresh_at '),   KEY  ' Show_time_index '   (' show_time '))  engine=innodb auto_increment=55295 default  charset=utf8 coMment= ' Recruitment posts table ' create table  ' wanted_post_comment_99 '   (   ' id '  int (one)  not  NULL AUTO_INCREMENT,   ' Puid '  int (Ten)  unsigned NOT NULL,   ' user_id '  int  NOT NULL COMMENT  ' review user ID ',   ' post_at '  int (one)   not null comment  ' comment time ',   ' detail '  text NOT NULL COMMENT  ' Comment Details ',  primary key  (' id '),  key  ' User_id_index '   (' user_id '),   KEY  ' Puidid_index '   (' puid ')  ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT  Charset=utf8 comment= ' Recruitment review sub-table 99 '

Due to the large amount of Comment table data, in advance to make a table, according to the post Puid divided into 100 sub-tables, then the current detail page involves SQL as follows:

SELECT * FROM Wanted_post where Puid=xxxx;select * from wanted_post_comment_99 where puid=xxxx;

This is a simplified model, a lot of comments, but also related to paging, it is not possible to take out all at once. For the above scene, strictly observe the cohesion, low coupling principle, will not store redundant data. Compared with a document database, such as MONGO, you can store comments and posts together and access them with a single sequential IO operation. Overall table design, according to the principle of division of responsibilities.

[on-line processing and analysis separation]

1. In order to ensure the performance of online data processing, some analysis-related data and analysis results should be stored using a separate library to avoid the loss of business data throughput during data analysis, causing system problems.

2. Dedicated to the storage of offline report data, and provide online data query method, it is recommended that the statistical results, aggregated data from the online processing database to remove.

For the above Wanted_post Job Posting table, online processing can only be user in action: Browse, modify, delete, respectively, corresponding to the following SQL:

SELECT * from Wanted_post where puid=xxxxx;update wanted_post set xxx=xxx where Puid=xxxx;delete from Wanted_post where PU id=xxxx;

Similarly, for background statistics, there are aggregation operations that are very performance-intensive, such as viewing a user's posting volume:

Select COUNT (*) from wanted_post where user_id=xxxx;

As a general example, it is in principle to separate the online user request from the background statistics request. To put it simply, this requirement is handled as follows:

It is simple and efficient to point a request to a different slave, and the disadvantage is that the amount of data is increased to play.
Set up an off-line report library, dedicated to the statistical results, so that the calculation and display of asynchronous processing, the disadvantage is that the real-time business response is not good.
Real-time pull MySQL row binlog, do the heterogeneous processing of data (tungsten, canal), will increment the result after processing (storm), save in the database, basic real-time.

[Transaction and log separation]

It's good to understand that user generated content and user behavior logs are separated, for example: two

The game DB contains basic information about the player, equipment, attributes, friends list, etc., which are put into the database. However, the player's behavior log, such as the consumption of gold coins, which copies of today, what to buy the top equipment, these belong to the behavior log, should be stored separately and analyzed processing.
For the web to remember, there are a lot of users pinned, refresh, bidding, display and other behaviors, demand real-time and the volume is very large, and must be separated from the posts.

The behavior log, needs to do the analysis processing, and because the timeliness is not suitable to store in MySQL, the later maintenance is the mine.

[Historical traceability]

In the design of the database in order to ensure that the data is traceable, should follow a few simple conventions, after the convenience of data query and statistics:

1. For the status data, the corresponding state of the field should be designed to hold the final state of the data, while recording the initial creator of the data, time and the date of the last modification of the data and modification time, so in the transaction data (such as order contract), advertising data, account table, etc. should be the default status (status), Creator (Creator/creator_name), creation time (CREATED_AT), Last modified by (Modifier/modifier_name), last Modified (MODIFIED_AT) fields are used to indicate the current state of the data, Create information and modify information.

2. For data that needs to be tracked for each modification, a log table is recorded at the time the data is changed to record the full life cycle of the data change. For situations where the key field changes only need to be concerned, the log table only needs to record key field changes, but the operator, operation type, time should be accurately recorded, log table data once generated is not allowed to modify. such as user account recharge water, consumption of water are some business closely related to the log. Logs such as audit logs, operations records, and so on, are small logs that are associated with the business.

3. Data to be retained for all history needs to generate a new version of each change, such as the class information, the original data is always only insert operation, do not do delete and update operation. However, this is only used when extreme data history is required to be extremely high.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More