Normal forms (sometimes rendered as "paradigms"): The concept of the normal form was formulated by E. F. Codd in the 1970s, after he proposed the relational database model. Normal forms are the foundation of relational database theory, and they are the rules and guidelines we should follow when designing a database structure. They are the specification that a database design is expected to meet. Only by understanding them can we design an efficient and elegant database; otherwise we may end up with a flawed design. Eight normal forms are currently recognized: 1NF, 2NF, 3NF, BCNF, 4NF, 5NF, DKNF, and 6NF. 1NF imposes the minimum requirements; 2NF adds further requirements on top of 1NF, and so on for the rest. In practice, usually only the first three are used: 1NF, 2NF, and 3NF. These three are described below.
◆ First normal form (1NF): It emphasizes the atomicity of columns, that is, a column cannot be split further into several other columns. In short, 1NF means no repeating columns.
◆ Second normal form (2NF): First, the table must be in 1NF. In addition, two conditions must hold: first, the table must have a primary key; second, every column that is not part of the primary key must depend on the entire primary key, not on just a part of it. In short, 2NF means that non-key attributes are not partially dependent on the primary key.
◆ Third normal form (3NF): First, the table must be in 2NF. In addition, every non-primary-key column must depend on the primary key directly; transitive dependencies are not allowed. That is, this situation must not exist: non-primary-key column A depends on non-primary-key column B, and non-primary-key column B depends on the primary key. In short, 3NF requires that a table not contain non-key information that is already contained in another table. For example, suppose there is a department information table in which each department has a department ID (dept_id), a department name, a department profile, and other information. An employee table that stores dept_id should then not also store the department name, the department profile, or other department details.
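The department example can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the employee table, the dept_name and dept_profile column names, and the sample rows are assumptions made for the sketch; only dept_id comes from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Department facts live in exactly one place.
    CREATE TABLE department (
        dept_id      INTEGER PRIMARY KEY,
        dept_name    TEXT NOT NULL,
        dept_profile TEXT
    );
    -- The employee table (hypothetical) keeps only the department key.
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Research', 'Builds prototypes');
    INSERT INTO employee VALUES (1, 'Ann', 10), (2, 'Bob', 10);
""")

# Renaming the department touches a single row ...
renamed = conn.execute(
    "UPDATE department SET dept_name = 'R&D' WHERE dept_id = 10"
).rowcount

# ... and every employee sees the new name through the join.
rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(renamed, rows)
```

Had dept_name been copied into every employee row, the same rename would have needed one update per employee, and a missed row would leave the data inconsistent.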
Q: What are the differences between the second and third normal forms?
Second normal form: asks whether each non-primary-key column depends on the whole primary key rather than on only a part of it. A column that depends on the primary key indirectly, through another column, still satisfies 2NF;
Third normal form: additionally asks whether each non-primary-key column depends on the primary key directly. Dependence through an intermediate column (a transitive dependency) is not allowed;
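The difference can be made concrete with a small check for functional dependencies. The sketch below is an illustration only: the helper depends_on and both sample tables are hypothetical, and a check on sample rows can only show that a dependency is not contradicted by the data; the real dependencies come from the business rules.

```python
def depends_on(rows, determinant, dependent):
    """Return True if, in these rows, equal `determinant` values always
    come with equal `dependent` values (a functional dependency)."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = row[dependent]
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical enrollment table with composite key (student_id, course_id).
enrollment = [
    {"student_id": 1, "course_id": "C1", "course_name": "Math", "grade": 90},
    {"student_id": 2, "course_id": "C1", "course_name": "Math", "grade": 75},
    {"student_id": 1, "course_id": "C2", "course_name": "Art", "grade": 82},
]
# course_name is fixed by course_id alone: a partial dependency (violates 2NF).
partial = depends_on(enrollment, ["course_id"], "course_name")
# grade is not fixed by course_id alone; it needs the whole key.
grade_partial = depends_on(enrollment, ["course_id"], "grade")

# Hypothetical employee table with the single-column key emp_id.
employee = [
    {"emp_id": 1, "dept_id": 10, "dept_name": "Research"},
    {"emp_id": 2, "dept_id": 10, "dept_name": "Research"},
    {"emp_id": 3, "dept_id": 20, "dept_name": "Sales"},
]
# dept_name is fixed by dept_id, a non-key column: a transitive dependency
# (allowed by 2NF, violates 3NF).
transitive = depends_on(employee, ["dept_id"], "dept_name")
print(partial, grade_partial, transitive)
```

The enrollment table fails 2NF because course_name hangs off part of the key. The employee table has a single-column key, so it cannot have a partial dependency, yet it fails 3NF because dept_name reaches emp_id only through dept_id.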
Q: What are the benefits of normal forms?
Normal forms help avoid data redundancy, reduce the storage space the database needs, and make data integrity easier to maintain.
These benefits come with a drawback: the higher the normal form a design follows, the more tables it produces. For example, a design in first normal form may need only one table. Redesigned in second normal form, it may become two or more tables, and in third normal form or higher there may be more tables still. The more tables there are, the more often a query has to combine data from several tables, and such a query usually takes longer than a query against a single table.
In other words, a higher normal form tends to mean lower performance for data operations. When designing tables we therefore need to weigh, based on the specific requirements, whether a higher normal form is worth it. In typical projects the third normal form is used most: it meets project needs, performs well, and keeps the data easy to manage.
When our business involves many tables that are frequently related to one another, and we need table operations to be as fast as possible, we can consider denormalization (the "anti-paradigm"). As the name suggests, denormalization does the opposite of what the normal forms require. In a denormalized design we allow a suitable amount of data redundancy and use that redundancy to shorten the time needed to retrieve and operate on data. In other words, we trade space for time: data is stored redundantly in multiple tables so that joins between tables can be reduced or avoided at query time.
For example, suppose we are working with a school's course data and have two tables: the student information table student (a_id, a_name, a_adress, b_id) and the course table subject (b_id, b_subject). We want to output, for each course, the course name together with the names of its students:
SQL statement: SELECT b.b_id, b.b_subject, a.a_name FROM student a, subject b WHERE a.b_id = b.b_id;
When there is not much data in the tables, this query is fine. When both tables hold millions of rows, however, the query can take several hundred milliseconds or even longer, which cannot meet the response time we need for a web page (generally no more than 100 milliseconds). What should we do? We can add a redundant field, the student name (a_name), to the subject table, so that the same information can be obtained with the following query:
SQL statement: SELECT b_id, b_subject, a_name FROM subject b;
If you run the two queries together and view the execution plan, you will find that the first query accounts for 92% of the total cost and the second for 8%. In other words, the second query is more than 10 times as efficient as the first, a remarkable improvement.
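The two designs can be tried side by side on a small scale with Python's built-in sqlite3 module. The sketch below makes several assumptions: the join condition a.b_id = b.b_id, the sample rows, and the name subject_denorm for the redundant table are all hypothetical. It only checks that both queries return the same rows and says nothing about timings on large tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subject (b_id INTEGER PRIMARY KEY, b_subject TEXT);
    CREATE TABLE student (
        a_id INTEGER PRIMARY KEY, a_name TEXT, a_adress TEXT,
        b_id INTEGER REFERENCES subject(b_id)
    );
    -- Hypothetical denormalized table: the student name is copied in,
    -- giving one row per (course, student) pair.
    CREATE TABLE subject_denorm (b_id INTEGER, b_subject TEXT, a_name TEXT);

    INSERT INTO subject VALUES (1, 'Math'), (2, 'Art');
    INSERT INTO student VALUES
        (1, 'Ann', 'Oak St', 1), (2, 'Bob', 'Elm St', 1), (3, 'Cy', 'Ash St', 2);
    INSERT INTO subject_denorm
        SELECT b.b_id, b.b_subject, a.a_name
        FROM student AS a JOIN subject AS b ON a.b_id = b.b_id;
""")

# Normalized design: the course name and student name come from a join.
joined = conn.execute("""
    SELECT b.b_id, b.b_subject, a.a_name
    FROM student AS a JOIN subject AS b ON a.b_id = b.b_id
    ORDER BY b.b_id, a.a_name
""").fetchall()

# Denormalized design: the same rows come from a single table.
single = conn.execute("""
    SELECT b_id, b_subject, a_name FROM subject_denorm
    ORDER BY b_id, a_name
""").fetchall()

print(joined == single)
```

The price of the single-table query is that every change to a student's name now has to be written to both student and the redundant copy.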
Over the life of a project, our use of normal forms typically evolves like this:
Third normal form database design -> as the data volume grows, we frequently need to operate on large ranges of data across multiple tables -> denormalized database design -> the website's data volume keeps growing -> a database design that combines normal forms and denormalization