The debate between normalization and denormalization in relational database design

Source: Internet
Author: User
Relational database design is the process of organizing and structuring data, and its core problem is the design of the relational model. The relational model is a mathematical model that describes entities and the relationships between them using two-dimensional tables. It is the collection of all relation schemas, attribute names, and keys, and is the object that relation schemas describe. A relation schema is the list of attribute names of a relation, that is, the frame (header) of a two-dimensional table. Relation schema design is the soul of relational model design, and is therefore the very core of relational database design: the design of the relation schemas directly determines the performance of the relational database.

At present, normalized design dominates the guidance given for relational model design. It emerged and matured over several decades of database development. In recent years, however, a new trend has appeared in this field: an approach to relational model design called denormalization has attracted the attention of the industry and has been applied to a certain extent. Reactions to this new design approach have been mixed, and a modest debate between normalization and denormalization has started among theorists. This article briefly introduces the basic ideas of normalized and denormalized design and summarizes the main arguments for and against each, for the reference of practitioners in the domestic industry.
The basic idea of normalized design is to decompose relation schemas: the original relation schema is replaced by a set of equivalent relation schemas so as to eliminate the unreasonable parts of its data dependencies (both functional dependencies and multivalued dependencies), so that each relation describes only one entity or one relationship between entities. The decomposition must be a lossless-join decomposition and must preserve functional dependencies, that is, the original data must not be damaged and the decomposed relations must be restorable to the original relation through a natural join. Concretely, normalization splits one two-dimensional table into several two-dimensional tables according to the different normal forms and establishes the associations between them, until each table describes only one entity or one relationship between entities.

The main normal forms currently in use are 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF. 3NF and BCNF are the most widely used in engineering practice, and 3NF is generally recommended as the standard. The advantages of normalized design include effectively eliminating data redundancy, straightening out data dependencies, maintaining database integrity, and improving the stability, scalability, and adaptability of the database. Its main problem is generally considered to be that queries need more table joins, which costs time and space and lowers the efficiency of the system. In most cases this problem can be addressed through good index design.

The basic idea of denormalized design, by contrast, is that the real world does not always conform to a perfect mathematical relational model.
Proponents hold that forcing a normalized design onto everything makes the form look simpler while making the content more complicated and, more importantly, reduces the efficiency of database operations. Denormalization calls for relaxing the normalization of relation schemas appropriately, or even abandoning it, and no longer requires that a table describe only one entity or one relationship between entities. Its main purpose is to improve the operating efficiency of the database. The main denormalization techniques are adding redundant or derived columns to a table, and merging tables, splitting tables, or adding duplicate tables.

Denormalization is generally considered worth evaluating in the following situations: (1) a large number of frequent queries need to join the tables involved; (2) the main applications join these tables when they execute; (3) computing the data requires temporary tables or complex queries.

The main advantages of denormalized design are that it reduces the joins a query needs, reduces the number of foreign keys and indexes, and allows aggregates to be computed in advance, all of which improve query response time. Its main problems are that it increases data redundancy, threatens database integrity, slows down data updates, and increases the physical space the tables occupy. The most important of these is database integrity. This problem can generally be handled by creating triggers, applying transaction logic, or running batch commands or stored procedures at appropriate intervals.
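The trade-off described above can be illustrated with a minimal sketch. It assumes SQLite through Python's standard `sqlite3` module, and the `customer` and `orders` tables, their columns, and the two helper functions are hypothetical names chosen for illustration. The normalized query recomputes each customer's total with a join every time, while the denormalized variant reads a derived column that triggers keep consistent with the base data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema. order_total_cents is a derived column:
    -- it duplicates information that can be computed from orders.
    CREATE TABLE customer (
        customer_id       INTEGER PRIMARY KEY,
        name              TEXT NOT NULL,
        order_total_cents INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
        amount_cents INTEGER NOT NULL
    );

    -- Triggers maintain the derived column on insert and delete.
    -- Updates to existing orders are not handled in this sketch.
    CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
    BEGIN
        UPDATE customer
        SET order_total_cents = order_total_cents + NEW.amount_cents
        WHERE customer_id = NEW.customer_id;
    END;
    CREATE TRIGGER orders_after_delete AFTER DELETE ON orders
    BEGIN
        UPDATE customer
        SET order_total_cents = order_total_cents - OLD.amount_cents
        WHERE customer_id = OLD.customer_id;
    END;
""")

conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Ada'), (2, 'Bob')")
conn.executemany(
    "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)",
    [(1, 1000), (1, 250), (2, 700)],
)


def totals_by_join(conn):
    """Normalized access path: join and aggregate on every query."""
    return conn.execute("""
        SELECT c.name, COALESCE(SUM(o.amount_cents), 0)
        FROM customer AS c
        LEFT JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        ORDER BY c.customer_id
    """).fetchall()


def totals_precomputed(conn):
    """Denormalized access path: read the trigger-maintained column."""
    return conn.execute(
        "SELECT name, order_total_cents FROM customer ORDER BY customer_id"
    ).fetchall()


print(totals_by_join(conn))
print(totals_precomputed(conn))
```

Both functions return the same rows as long as every change to `orders` goes through a path the triggers cover. The read becomes cheaper, but each insert or delete now does an extra update, which is the update-speed cost mentioned above.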
The main points of the dispute are as follows. Supporters of denormalized design argue that the degree of normalization is directly related to the number of tables: the more highly normalized a database is, the more tables it has. The more tables there are, the more join operations are needed, and the more join operations there are, the slower the database executes and the worse it performs. Only a denormalized design that significantly reduces the number of tables can reduce the dependence on joins, speed up execution, and ensure acceptable database performance. For example, the denormalized star schema now popular in decision support systems is said to work far better in that application than a normalized design, and is cited as the best example of denormalized design. Denormalized design does not mean chaos or disregard for rules. It also follows the basic principles of software engineering, such as protecting the integrity of information.

Supporters of normalized design argue that normalization and denormalization are purely logical concepts, and that the advocates of denormalization confuse the logical level with the physical level. The performance of a database is determined at the physical level, namely by the hardware, the size and physical design of the database, the data storage and access methods, the degree of optimization of the database management system, the number of concurrent accesses, and so on. Denormalized design does not change the physical level of the database, so in their view it cannot improve database performance. Normalization is not only about avoiding data redundancy. More importantly, it ensures the integrity of the database. The biggest problem with denormalized design is that it is difficult to guarantee the consistency of the data in the database, so there is a danger of corrupting data.
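The consistency concern raised by the normalization side can be made concrete with a small sketch. It again assumes SQLite through Python's `sqlite3` module, and all table, column, and function names are hypothetical. In the flat table the customer's city is stored redundantly on every order row, so an update that reaches only one row leaves the copies in disagreement. In the normalized pair of tables the city is stored once, so the same kind of change cannot produce conflicting values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized (hypothetical): the city is repeated on each order row.
    CREATE TABLE orders_flat (
        order_id      INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        customer_city TEXT NOT NULL,
        amount_cents  INTEGER NOT NULL
    );
    INSERT INTO orders_flat (customer_name, customer_city, amount_cents)
    VALUES ('Ada', 'London', 1000), ('Ada', 'London', 250);

    -- Normalized (hypothetical): the city is stored once per customer.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
        amount_cents INTEGER NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada', 'London');
    INSERT INTO orders (customer_id, amount_cents) VALUES (1, 1000), (1, 250);
""")

# The customer moves. In the flat table the update touches only one of the
# redundant copies, which is an easy mistake to make in application code.
conn.execute("UPDATE orders_flat SET customer_city = 'Paris' WHERE order_id = 1")
# In the normalized design there is exactly one place to change.
conn.execute("UPDATE customer SET city = 'Paris' WHERE customer_id = 1")


def conflicting_customers(conn):
    """Customers whose redundant city values disagree in the flat table."""
    return conn.execute("""
        SELECT customer_name, COUNT(DISTINCT customer_city)
        FROM orders_flat
        GROUP BY customer_name
        HAVING COUNT(DISTINCT customer_city) > 1
    """).fetchall()


def cities_seen_on_orders(conn):
    """Distinct cities reachable from orders in the normalized design."""
    return conn.execute("""
        SELECT DISTINCT c.city
        FROM orders AS o
        JOIN customer AS c ON c.customer_id = o.customer_id
    """).fetchall()


print(conflicting_customers(conn))
print(cities_seen_on_orders(conn))
```

A trigger or a periodic batch check, as mentioned earlier, could detect or prevent the conflict in the flat table. The sketch only shows why such extra machinery becomes necessary once data is stored redundantly.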
In addition, denormalization lets several entities coexist in a single table. Mixing different entities makes the database more complex and harder for users to understand, makes problems harder to describe, and increases the risk of getting incorrect answers. In this view, only normalized design is the fundamental way to solve these problems. If the denormalized design concept is not abandoned, and the risk of destroying database integrity is ignored for the sake of a so-called performance improvement, developers will have no incentive to build a truly fully normalized, high-performance relational database management system, and the consequences will inevitably harm the healthy development of databases.

In a sense, normalized and denormalized design are not opposites in an either-or relationship. Perhaps one side will gradually die out, or perhaps a middle road between the two will be found. Understanding advances in a spiral. The debate is not over, and its final result cannot be predicted. What is certain is that whatever the outcome, it will have a far-reaching impact on the direction of future database development.
