Source: Internet
Author: User
Design method of database table structure

When we design a database storage model, we need to analyze the data patterns carefully rather than lumping all the data together without thought. Lumping everything together seriously hurts the availability, efficiency, and scalability of the system. Of course, if the system you are designing is very small, the simplest approach is good enough.

Based on a solid understanding of the business, analyze the data along several dimensions. The analysis can generally proceed from the following directions:

1. Data flow

2. Data access characteristics

3. Data volume

4. Data growth rate

5. Data life cycle

According to these characteristics, tables can be classified into the following types:

1. Constant table

2. Increment table

3. Flow table

4. State table

5. Core table

6. Process table

When designing the data model of a system with a large data volume, follow the guidelines below for each type of table.

Core table:

The core table is the most frequently accessed table in the system, so its design must take the cost of access into account. It must follow the normal forms, and you should pay attention to the number of fields, the field lengths, and the range of each query. If the data volume of the core table is very large, partition the table or split it with table routing, and archive old data, so that the performance of the core table is preserved.
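
As a minimal sketch of table routing, assuming a simple modulo scheme: the function name route_table, the shard count, and the "user_N" naming are illustrative assumptions and do not come from the original text.

```python
def route_table(base_name: str, user_id: int, shard_count: int = 4) -> str:
    """Return the name of the physical table that holds the row for user_id.

    Hypothetical scheme: rows are spread over shard_count tables by
    user_id modulo shard_count, so each table stays small.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return f"{base_name}_{user_id % shard_count}"


# With four shards, user 10 is stored in table "user_2".
table_for_user_10 = route_table("user", 10)
```

Every query for a given user then touches only one of the smaller tables. Changing the shard count later moves rows between tables, so the count is best chosen with growth in mind.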

Process table:

A process table, as its name suggests, records a process, which generally means the life cycle of the data. When designing a process table, include a field that clearly represents the life cycle stage of each record. A data warehouse system can use this life cycle field to compute statistics efficiently for data in different stages. Also consider the cost of inserts, updates, and deletes: an insert is the cheapest; an update must first retrieve the row so that the previous field values can be retained; a delete must preserve the entire record and is therefore the most expensive.
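
The following sketch illustrates these costs with Python's built-in sqlite3 module. The table names orders and orders_history, the lifecycle column, and the helper functions are all hypothetical; the original does not prescribe a schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER, lifecycle TEXT);
CREATE TABLE orders_history (id INTEGER, amount INTEGER, lifecycle TEXT, action TEXT);
""")


def insert_order(order_id, amount):
    # Cheapest operation: a single write, nothing to preserve.
    with db:
        db.execute("INSERT INTO orders VALUES (?, ?, 'created')", (order_id, amount))


def advance(order_id, new_stage):
    # An update first copies the current row so the old values are retained.
    with db:
        db.execute("INSERT INTO orders_history "
                   "SELECT id, amount, lifecycle, 'update' FROM orders WHERE id = ?",
                   (order_id,))
        db.execute("UPDATE orders SET lifecycle = ? WHERE id = ?", (new_stage, order_id))


def delete_order(order_id):
    # A delete preserves the entire record before removing it.
    with db:
        db.execute("INSERT INTO orders_history "
                   "SELECT id, amount, lifecycle, 'delete' FROM orders WHERE id = ?",
                   (order_id,))
        db.execute("DELETE FROM orders WHERE id = ?", (order_id,))


def count_by_lifecycle():
    # The life cycle field makes per-stage statistics a simple GROUP BY.
    return dict(db.execute("SELECT lifecycle, COUNT(*) FROM orders GROUP BY lifecycle"))


insert_order(1, 30)
insert_order(2, 45)
advance(1, "paid")
delete_order(2)
stats = count_by_lifecycle()
```

Each helper runs in one local transaction, so the history row and the change to orders are committed together or not at all.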

Constant table:

A constant table rarely changes and is used much like a dictionary table. When designing such tables, indexing is not recommended for the smaller ones.

Increment table:

An increment table grows very fast, and not all of its data is accessed frequently. Keep the partitions as evenly sized as possible, and strictly separate core data from process data. The key values of an index should be as selective as possible, and composite indexes should be used with care. Design the partitions and indexes according to how the data is related and queried.
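
Selectivity can be estimated as the number of distinct key values divided by the number of rows. The sketch below shows that calculation on sample values; the function name and the sample data are illustrative assumptions.

```python
def selectivity(values):
    """Return distinct values / total values, between 0.0 and 1.0.

    A result close to 1.0 means almost every row has its own key value,
    which is what makes an index on that column effective.
    """
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


order_numbers = [1001, 1002, 1003, 1004]   # unique per row
genders = ["M", "F", "M", "F"]             # only two distinct values

high = selectivity(order_numbers)
low = selectivity(genders)
```

Here the order number column is a far better index candidate than the gender column, because an index lookup on it narrows the result to a single row.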

Flow table:

A flow table is similar to a log: it records a running journal of events. Flow tables are generally very large, and their data almost never changes once written. In the design, pay attention to the choice and granularity of partitions, and avoid building too many indexes.
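
One way to think about partition granularity is to map each record's timestamp to a partition name. This is a minimal sketch assuming time-based partitions; the "flow_" prefix and the function name are hypothetical.

```python
from datetime import datetime


def partition_for(ts: datetime, granularity: str = "month") -> str:
    """Return the partition that a flow record with timestamp ts belongs to."""
    if granularity == "day":
        return ts.strftime("flow_%Y%m%d")
    if granularity == "month":
        return ts.strftime("flow_%Y%m")
    raise ValueError(f"unsupported granularity: {granularity}")


ts = datetime(2024, 3, 15, 9, 30)
monthly = partition_for(ts)         # "flow_202403"
daily = partition_for(ts, "day")    # "flow_20240315"
```

Finer granularity keeps each partition small and makes old data easy to drop, at the price of more partitions to manage.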

State table:

A state table generally records the state of an action as it progresses. Its records have a very short life cycle, and it is easily confused with the process table. The two can be distinguished simply: the state table records the trajectory of an action, while the process table records the life cycle of the data.

An example application of the state table:

An efficient solution for distributed operations

Why replace distributed transactions?
  When the data volume of a system becomes very large, the database has to be split and deployed as multiple database instances. Some operations then inevitably need to modify data in several database instances at the same time. To keep the data accurate and consistent, most of us implement this with distributed transactions (the classic two-phase commit protocol).
  The most significant advantage of distributed transactions is that they simplify application development, which greatly improves development efficiency for systems with tight schedules and low performance requirements. This is one of the main reasons most developers are so fond of them. But nothing is perfect: although they are convenient to develop with, they also seriously impair the availability, performance, and scalability of the system, and this is especially evident in complex systems with massive data.
Availability of the system
The availability of the system equals the product of the availabilities of the individual database instances participating in the distributed transaction, so the more database instances there are, the more noticeably availability declines. A distributed transaction completes only if all participating database instances are working properly; if any one database instance fails, the distributed transaction fails.
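As a rough illustration, assuming the instances fail independently of each other, the availability of a transaction that needs all of them is the product of their individual availabilities. The function name and the 99.9% figure are illustrative assumptions.

```python
from math import prod


def combined_availability(availabilities):
    """Availability of an operation that needs every listed instance to be up.

    Assumes the instances fail independently of each other.
    """
    return prod(availabilities)


one_instance = combined_availability([0.999])
three_instances = combined_availability([0.999, 0.999, 0.999])
```

With three instances at 99.9% each, the combined figure falls to roughly 99.7%, and it keeps falling with every instance added to the transaction.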
Performance and scalability
The total duration of a distributed transaction is the sum of the time spent operating on each database instance, because the operations in a distributed transaction are executed sequentially, so the response time of each transaction becomes long. In an OLTP system transactions are small, typically taking milliseconds, so once a distributed transaction is involved, the network communication time between nodes makes up a non-negligible share of the total response time. Also, because the transaction takes longer, resources stay locked longer. This can seriously affect the concurrency, throughput, and scalability of the system.
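A back-of-the-envelope sketch of this effect follows. It assumes a deliberately simplified model with one network round trip per instance (a real two-phase commit needs more), and the function name and timings are made up for illustration.

```python
def distributed_duration_ms(op_times_ms, network_rtt_ms):
    """Return (total duration, network share) for sequential operations.

    Simplified model: each instance costs its own operation time plus
    one network round trip, and the operations run one after another.
    """
    network_ms = network_rtt_ms * len(op_times_ms)
    total_ms = sum(op_times_ms) + network_ms
    return total_ms, network_ms / total_ms


# Three 2 ms updates with a 1 ms round trip each.
total_ms, network_share = distributed_duration_ms([2.0, 2.0, 2.0], 1.0)
```

In this model a 2 ms local update turns into a 9 ms distributed transaction, a third of which is network time, and locks are held for the whole duration.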
The description above shows the drawbacks of distributed transactions. So how can they be avoided?
Suppose there are three database instances, each holding a user (id, username, account, routedb) table. One of them is read-write and the other two are read-only (read-write separation).
Primary DB: user1
Standby DB: user2
Standby DB: user3
If we update the user table with a distributed transaction, the pseudocode is as follows:
    begin
      update user1 set account = account + $b;
      update user2 set account = account + $b;
      update user3 set account = account + $b;
      commit;
    end;
To eliminate the distributed transaction, introduce a message queue and a state table.
Transaction 1 (on the primary instance only):
    begin
      update user1 set account = account + $b;
      put_queue user2;    -- enqueue a message destined for db2
      put_queue user3;    -- enqueue a message destined for db3
      commit;
    end;
Transaction 2 (run for each message in the queue):
    for each message in queue
    begin
      if (routedb = 'db2') then
        select count(1) cnt from message_state where meg_id = $messageid;
        if (cnt = 0) then
        begin
          -- not applied yet: apply the update and record it in the same local transaction
          update user2 set account = account + $b;
          insert into message_state values ($messageid);
        end;
      elseif (routedb = 'db3') then
        select count(1) cnt from message_state where meg_id = $messageid;
        if (cnt = 0) then
        begin
          update user3 set account = account + $b;
          insert into message_state values ($messageid);
        end;
      end if;
      commit;
      if transaction 2 succeeded then
        dequeue message;
        delete from message_state where meg_id = $messageid;
      end if;
    end;
The procedure can be represented by a diagram (not reproduced here) and is described as follows:
1. In the first transaction, the message queue and table user1 are in the same database instance, so no distributed operation is involved. In this step, the operation destined for each of the other database instances is enqueued as a separate message.
2. Each standby database instance has a state table, message_state, which makes message processing idempotent: it records whether a message has already been applied, so an update is never applied more than once. The update and the message_state insert happen on the same instance, so this step involves no distributed operation either, and data consistency is preserved.
3. After the second transaction succeeds, the message is removed from the message queue and its record is deleted from the message_state table, which keeps message_state very small (skipping this cleanup does not affect correctness). Because the message queue and message_state are on different instances (servers), a failure may occur after the message is dequeued but before the corresponding message_state record is deleted. In that case message_state is left with some garbage rows, but the correctness of the system is unaffected.
    If instead a failure occurs between the end of the second transaction and the dequeue, the system will take the same message from the queue again. The message_state table shows that this message has already been applied, so it is skipped and the behavior remains correct.
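The steps above can be sketched in runnable form. This is only an illustration under stated assumptions: three in-memory SQLite databases stand in for the three instances, the message queue is modeled as a message_queue table on the primary, and all function names are hypothetical. The meg_id column name follows the pseudocode.

```python
import sqlite3


def make_primary():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, account INTEGER)")
    db.execute("CREATE TABLE message_queue (msg_id INTEGER PRIMARY KEY AUTOINCREMENT, "
               "routedb TEXT, user_id INTEGER, delta INTEGER)")
    db.execute("INSERT INTO user VALUES (1, 100)")
    db.commit()
    return db


def make_standby():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, account INTEGER)")
    db.execute("CREATE TABLE message_state (meg_id INTEGER PRIMARY KEY)")
    db.execute("INSERT INTO user VALUES (1, 100)")
    db.commit()
    return db


def transaction_1(primary, user_id, delta, targets):
    """Update the primary and enqueue one message per standby, atomically."""
    with primary:  # one local transaction: commit on success, rollback on error
        primary.execute("UPDATE user SET account = account + ? WHERE id = ?",
                        (delta, user_id))
        for routedb in targets:
            primary.execute("INSERT INTO message_queue (routedb, user_id, delta) "
                            "VALUES (?, ?, ?)", (routedb, user_id, delta))


def transaction_2(standby, msg_id, user_id, delta):
    """Apply a message to a standby at most once; return True if it was applied."""
    with standby:
        seen = standby.execute("SELECT COUNT(1) FROM message_state WHERE meg_id = ?",
                               (msg_id,)).fetchone()[0]
        if seen:
            return False  # already applied earlier, so skip it
        standby.execute("UPDATE user SET account = account + ? WHERE id = ?",
                        (delta, user_id))
        standby.execute("INSERT INTO message_state VALUES (?)", (msg_id,))
        return True


def drain(primary, standbys, crash_before_dequeue=False):
    """Deliver every queued message, then dequeue it and clean up its state row."""
    rows = primary.execute("SELECT msg_id, routedb, user_id, delta "
                           "FROM message_queue ORDER BY msg_id").fetchall()
    for msg_id, routedb, user_id, delta in rows:
        transaction_2(standbys[routedb], msg_id, user_id, delta)
        if crash_before_dequeue:
            continue  # simulated failure: the message stays in the queue
        with primary:
            primary.execute("DELETE FROM message_queue WHERE msg_id = ?", (msg_id,))
        with standbys[routedb]:
            standbys[routedb].execute("DELETE FROM message_state WHERE meg_id = ?",
                                      (msg_id,))


primary = make_primary()
standbys = {"db2": make_standby(), "db3": make_standby()}
transaction_1(primary, 1, 50, ["db2", "db3"])
drain(primary, standbys, crash_before_dequeue=True)  # applied but not dequeued
drain(primary, standbys)                             # redelivered and skipped
balances = [db.execute("SELECT account FROM user WHERE id = 1").fetchone()[0]
            for db in (primary, standbys["db2"], standbys["db3"])]
```

The first drain simulates a failure between the end of transaction 2 and the dequeue. The second drain sees the same messages again, finds them in message_state, and skips the updates, so each account is changed exactly once.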
Summary: This scheme does not guarantee data consistency at every moment. After a failure, the data may be inconsistent for a short time, but thanks to the message queue and the state table the system eventually becomes consistent again. The scheme removes the tight coupling between database instances, and it delivers performance and scalability that distributed transactions cannot match.
The message queue and state table scheme compared with distributed transactions
For systems on a tight schedule or with low performance requirements, use distributed transactions to speed up development. For systems whose schedule is less tight and whose performance requirements are high, consider the message queue scheme. Schedule and convenience must be weighed carefully against performance and scalability to find the right balance. A system that originally used distributed transactions, has since stabilized, and has high performance requirements can be refactored to the message queue and state table scheme to optimize performance.