Database Table Structure Design: Methods and Principles (Ali)

Source: Internet
Author: User

The three normal forms of database design: to build a database with little redundancy and a sensible structure, we must follow certain rules when designing it. In relational databases these rules are called normal forms. A normal form summarizes a set of design requirements, and a well-structured relational database must satisfy some normal form.

In practice, three normal forms are the most common. The first normal form is the most basic: if every field value in a table is an indivisible atomic value, the table satisfies the first normal form. The second normal form builds on the first. It requires that every column in a table depend on the whole primary key, not on just one part of it (this matters mainly for composite primary keys). In other words, a table should store only one kind of data; several kinds of data should not be stored in the same table. The third normal form requires that every column depend on the primary key directly, not indirectly. To summarize: the first normal form ensures that every column is atomic; the second ensures that every column depends on the whole primary key; the third ensures that every column depends on the primary key directly rather than transitively.
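To make the second normal form concrete, here is a minimal sketch that checks functional dependencies on sample rows. The `order_items` rows and the helper `fd_holds` are hypothetical and exist only for illustration; a dependency that holds in sample data is a hint, not a proof that it holds in the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, the columns in lhs determine the columns in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key; a different
        # later value means the dependency does not hold.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical rows; the composite primary key is (order_id, product_id).
order_items = [
    {"order_id": 1, "product_id": 10, "product_name": "pen", "quantity": 2},
    {"order_id": 1, "product_id": 11, "product_name": "ink", "quantity": 1},
    {"order_id": 2, "product_id": 10, "product_name": "pen", "quantity": 5},
]

# quantity needs the whole key: product_id alone does not determine it.
quantity_needs_full_key = not fd_holds(order_items, ["product_id"], ["quantity"])
# product_name depends on product_id alone: a partial dependency, so the
# table violates the second normal form and product_name belongs elsewhere.
name_partial_dependency = fd_holds(order_items, ["product_id"], ["product_name"])
```

In this sketch the fix would be to move `product_name` into a separate product table keyed by `product_id`.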

In today's enterprise information systems, the database is still the best way to store data. Many books offer guidance on database design, but which way of designing the table structure is best, which principles the design should follow, and how the four normal forms can be applied smoothly in practice are questions I have been thinking about and summarizing for a long time. What follows is an outline of my conclusions on these questions. Discussion and brainstorming are welcome. The concept of domain modeling is mentioned but not explained in detail; I hope we can find time to discuss that topic in depth.

1. Do not design the database for the system as a whole. Instead, design it per component of the system architecture, based on the business each component handles. Associations between the tables of different components should be minimized. If tables in different components need a foreign key association, try not to create the foreign key; simply record the primary key of the associated table. This keeps each component's tables independent and leaves room for refactoring the system or the table structure.

// Note that what he is saying here is "do not create a foreign key association". The statement that creates a foreign key association is:
// FOREIGN KEY (member_id) REFERENCES member (id);
// We hardly ever use this statement, because this is what we already do: when we need a foreign key, we simply record the primary key of the associated table rather than creating a foreign key at the database level.
// I do not know whether this is luck, or whether our predecessor DBAs were very capable and had already thought it through.
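A minimal sketch of this "record the key, skip the constraint" approach, using Python's built-in `sqlite3` module. The `member` table and `member_id` column come from the statement quoted above; the `orders` table, its `amount` column, and the `place_order` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Owned by a hypothetical "member" component.
    CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Owned by a hypothetical "order" component: member_id is a plain column,
    -- with no FOREIGN KEY (member_id) REFERENCES member (id) clause.
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        member_id INTEGER NOT NULL,
        amount REAL NOT NULL
    );
""")

def place_order(conn, member_id, amount):
    """Application-level check that stands in for the database-level constraint."""
    found = conn.execute("SELECT 1 FROM member WHERE id = ?", (member_id,)).fetchone()
    if found is None:
        raise ValueError(f"unknown member {member_id}")
    cur = conn.execute(
        "INSERT INTO orders (member_id, amount) VALUES (?, ?)", (member_id, amount)
    )
    return cur.lastrowid

conn.execute("INSERT INTO member (id, name) VALUES (1, 'alice')")
order_id = place_order(conn, 1, 9.5)
```

The trade-off is that the check and the insert are separate statements, so under concurrent writes the application has to wrap them in a transaction or accept that some dangling references may appear.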

2. Use a domain-model-driven, top-down approach to database design: first analyze the system's business, then define objects according to their responsibilities. Objects should be properly encapsulated, so that all data items related to a responsibility are defined within one object, that object fully describes the responsibility, and nothing about the responsibility is missing. An object has one and only one responsibility; if an object carries two or more responsibilities, it should be split.

// I am not yet very familiar with the domain-model-driven approach and have not given it enough thought, because the tables in our everyday databases are used only for storage.
// Especially for small requirements: whatever field needs adding, we find the related table and add it, without much thought for the domain model. This is very common in the old business tables of the Chinese site.

3. Map database tables from the established domain model, with reference to the second normal form: all non-key attributes in a table depend on the entire key. A key can be a single attribute or a set of attributes; either way, it must be unique. When choosing a key, make sure it carries no business meaning and cannot cause update anomalies. The best solution is to use an auto-increment numeric attribute or a random string as the table's key.

4. Because the table structure is designed with the domain-model-driven approach described above, each object in the domain model has only one responsibility, so the data items within an object have no transitive dependencies. A table structure designed this way therefore satisfies the third normal form from the start: a table should satisfy the second normal form, and there should be no transitive dependency between its attributes.

// If you do not remember the three normal forms, look them up.
// Personally I think the purpose of the third normal form is to minimize data redundancy and ensure that only one copy of the same data exists.
// In practice we do not follow the third normal form very strictly; the old database tables in particular contain redundant fields. It depends on the circumstances.
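Items 3 and 4 can be sketched together: a surrogate auto-increment key with the business identifier kept as an ordinary unique column, and a transitive dependency removed by giving it its own table. The `department` and `employee` tables and their columns are hypothetical, and the sketch uses SQLite through Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        id INTEGER PRIMARY KEY AUTOINCREMENT,   -- surrogate key, no business meaning
        code TEXT NOT NULL UNIQUE,              -- business identifier, free to change
        location TEXT NOT NULL
    );
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        department_id INTEGER NOT NULL          -- location is NOT copied here
    );
""")
conn.execute("INSERT INTO department (code, location) VALUES ('RND', 'Hangzhou')")
conn.executemany(
    "INSERT INTO employee (name, department_id) VALUES (?, 1)",
    [("alice",), ("bob",)],
)

# Moving the department touches exactly one row, so the employees cannot
# end up with inconsistent copies of the location.
moved = conn.execute(
    "UPDATE department SET location = 'Shanghai' WHERE code = 'RND'"
).rowcount

locations = [
    row[0]
    for row in conn.execute(
        "SELECT d.location FROM employee e JOIN department d ON d.id = e.department_id"
    )
]
```

Had `location` been stored on `employee`, it would depend on the key only through `department_id`, which is the transitive dependency the third normal form rules out.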

5. Similarly, because each object has a single responsibility and the relationships between objects reflect relationships in the business logic, the objects in the domain model divide into master objects and subordinate objects. A subordinate object further refines the business logic of its master object from a 1-n or n-n perspective. Therefore the table associations mapped from these object relationships have no deletion or insertion anomalies.

// I do not understand the last sentence. Perhaps it means "so the associations between tables should not have deletion or insertion anomalies"?

6. The table structure derived from the mapping should then be revised against the fourth normal form to ensure that there are no multivalued dependencies. At this point, feed the findings back into the domain model in the spirit of reverse engineering: if the table structure contains a multivalued dependency, the corresponding object in the domain model has two or more responsibilities, and the design should be corrected according to the single-responsibility rule described above. The fourth normal form: a table that satisfies BCNF should have no multivalued dependencies.
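A small illustration of a multivalued dependency and its fourth-normal-form decomposition, written in plain Python so the sets can be inspected directly. The employee, skill, and language data are invented for this sketch; they stand for one object that carries two independent responsibilities.

```python
from itertools import product

# Hypothetical table: an employee's skills and spoken languages are independent facts.
employee_skill_language = {
    ("alice", "sql", "english"),
    ("alice", "sql", "chinese"),
    ("alice", "java", "english"),
    ("alice", "java", "chinese"),
    ("bob", "python", "english"),
}

def has_independent_multivalued_facts(rows):
    """True if, per employee, the rows are exactly skills x languages (an MVD)."""
    employees = {emp for emp, _, _ in rows}
    for emp in employees:
        skills = {s for e, s, _ in rows if e == emp}
        languages = {lang for e, _, lang in rows if e == emp}
        pairs = {(s, lang) for e, s, lang in rows if e == emp}
        if pairs != set(product(skills, languages)):
            return False
    return True

# Decomposition: one table per independent fact.
employee_skill = {(e, s) for e, s, _ in employee_skill_language}
employee_language = {(e, lang) for e, _, lang in employee_skill_language}

# Joining the two tables on employee reproduces the original rows (lossless).
rejoined = {
    (e1, s, lang)
    for (e1, s) in employee_skill
    for (e2, lang) in employee_language
    if e1 == e2
}
```

Each of the two smaller tables now describes one responsibility, and adding a skill for an employee no longer requires one new row per language.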

// We do not follow the fourth normal form much.
// For example, the values of the vas_wp_config.config_name field include: ADV (advertising theme), glare (colorful scrolling theme), theme_simple (general theme), theme_cartoon (animated theme), theme_none (no background theme shown), cate_background (category background), video (company video), board_cartoon (animated signboard), board_simple (ordinary signboard), and so on.
// To follow the fourth normal form, we would need to add a new vas_wp_config_name table that stores the enumerated configuration names, and change the vas_wp_config.config_name field to vas_wp_config.config_name_id.
// This scales better and prevents messy values from being written into vas_wp_config.config_name because everyone understands the field differently. But it means maintaining more small tables, so the number of tables swells and the DBAs may find the database harder to manage.
// Instead we rely on unwritten conventions, Java enumeration classes, and similar means. The result is sometimes not very good: in the old database tables we often find enumeration fields whose values vary and do not all follow the agreed convention.
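The lookup-table alternative described in the comment might look like the sketch below. The table names `vas_wp_config` and `vas_wp_config_name` and the column `config_name_id` come from the comment; the remaining columns and the `set_config` helper are assumptions, and only three of the listed names are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Lookup table holding the allowed configuration names.
    CREATE TABLE vas_wp_config_name (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- config_value is a hypothetical column added for illustration.
    CREATE TABLE vas_wp_config (
        id INTEGER PRIMARY KEY,
        config_name_id INTEGER NOT NULL,
        config_value TEXT
    );
""")
conn.executemany(
    "INSERT INTO vas_wp_config_name (name) VALUES (?)",
    [("theme_simple",), ("theme_cartoon",), ("video",)],
)

def set_config(conn, name, value):
    """Resolve the name through the lookup table; unknown names are rejected."""
    row = conn.execute(
        "SELECT id FROM vas_wp_config_name WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown config name: {name!r}")
    conn.execute(
        "INSERT INTO vas_wp_config (config_name_id, config_value) VALUES (?, ?)",
        (row[0], value),
    )
    return row[0]

name_id = set_config(conn, "video", "intro.mp4")
```

Because every write goes through the lookup, a misspelled or differently capitalized name fails immediately instead of becoming one more variant value in the column.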

7. Once analysis confirms that all tables satisfy the second, third, and fourth normal forms, keep the associations between tables as weak as possible, so that table fields and table structures are easy to adjust and refactor. In my view, a table only persists the state of an object instance at a specific time and under specific conditions; it is merely a storage medium. Strong associations between tables should therefore not be used to express business rules (consistency between data). That responsibility belongs to the logic layer of the system, which also lets the system tolerate incorrect (dirty) data. Of course, looking at the system as a whole, we should still do our best to keep it from producing dirty data. From another point of view, dirty data is to some extent inevitable, so the system also needs to be fault tolerant when it appears. This is a compromise.
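One possible shape for a logic layer that tolerates dirty data is sketched below: reads use a LEFT JOIN so rows with a dangling reference still appear, and a separate check reports them for repair. The `member` and `orders` tables and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, member_id INTEGER NOT NULL);
    INSERT INTO member (id, name) VALUES (1, 'alice');
    INSERT INTO orders (id, member_id) VALUES (100, 1);
    INSERT INTO orders (id, member_id) VALUES (101, 42);  -- dirty: member 42 is missing
""")

def list_orders(conn):
    """LEFT JOIN keeps orders whose member row is missing instead of dropping them."""
    rows = conn.execute("""
        SELECT o.id, m.name
        FROM orders o LEFT JOIN member m ON m.id = o.member_id
        ORDER BY o.id
    """).fetchall()
    return [
        (order_id, name if name is not None else "(unknown member)")
        for order_id, name in rows
    ]

def find_orphans(conn):
    """Report dirty rows so the logic layer can repair or flag them."""
    return [
        row[0]
        for row in conn.execute("""
            SELECT o.id FROM orders o
            LEFT JOIN member m ON m.id = o.member_id
            WHERE m.id IS NULL
        """)
    ]

orders = list_orders(conn)
orphans = find_orphans(conn)
```

An inner join would silently hide order 101; the LEFT JOIN keeps it visible, which is the fault tolerance this item asks for.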

8. Create indexes on the primary key and foreign keys of every table, and create targeted composite indexes (for tables with large data volumes and for common retrieval patterns) to improve retrieval efficiency. Although indexes consume some system resources, they are still worth advocating when compared with the performance cost of searching an entire table, especially a large one, and the cost of sorting without an index.

// At present our indexes are created by DBAs based on specific SQL, but developers should also give appropriate thought to indexing the fields involved when writing SQL.
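The effect of a composite index can be observed with SQLite's `EXPLAIN QUERY PLAN`, as in this sketch. The `orders` table, its columns, and the index name are assumptions; the exact wording of the plan text differs between SQLite versions, so the code only collects it rather than printing an expected value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, member_id INTEGER, status TEXT, amount REAL)"
)

def plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM orders WHERE member_id = 7 AND status = 'paid'"

before = plan(conn, query)   # no usable index yet: the plan is a table scan
conn.execute("CREATE INDEX idx_orders_member_status ON orders (member_id, status)")
after = plan(conn, query)    # the plan now searches via the composite index
```

Running the same comparison against the real SQL a feature issues is a cheap way for developers to check index coverage before handing the statement to a DBA.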

9. Minimize the use of stored procedures. Many technologies, such as object/relational mapping, can now replace what stored procedures do. Putting the guarantee of data consistency inside the database has a large impact on version control, on development and deployment, and on database migration. However, stored procedures undeniably have a performance advantage, so when the hardware available to the system cannot be upgraded and performance is a critical quality attribute, stored procedures can be considered after weighing the trade-offs.

// At present we are eliminating stored procedures. I personally find them convenient to use, but for us the main reason is the management burden they place on the DBAs:
// as time passes, often nobody understands the logic and usage scenarios of a stored procedure any more, which easily leads to more problems.

10. When the cost of handling association constraints between tables (often a cost of use) exceeds the cost of guaranteeing that no update or deletion anomalies occur, and data redundancy is not the primary concern, the table design may depart from the four normal forms. The four normal forms ensure that no anomalies occur, but they can also lead to an overly pure design that makes the table structure hard to use, so a comprehensive judgment is needed at design time. Even so, first making the design conform to the four normal forms and then refining and correcting it is the best way into database design.

11. Tables should be designed for good usability, which shows mainly in whether queries need to join multiple tables and rely on complex SQL techniques. I feel that the more strictly I follow the normal forms, the more complex the SQL becomes; each situation needs its own analysis. Design tables to minimize data redundancy and ensure data accuracy; controlling redundancy effectively also helps database performance.
