Explanation of database normal forms

Source: Internet
Author: User

Database normal forms: 1NF, 2NF, 3NF, BCNF (with examples)

A normal form (sometimes rendered as "design paradigm") is a set of requirements that a relation must meet at a certain level. Building a database must follow certain rules, and in relational databases these rules are the normal forms: the relations in a relational database must meet certain requirements, and different levels of requirements correspond to different normal forms. Relational databases currently have six normal forms: the first (1NF), second (2NF), third (3NF), fourth (4NF), fifth (5NF), and sixth (6NF). The first normal form (1NF) sets the minimum requirements. The second normal form (2NF) adds further requirements on top of 1NF, and the remaining normal forms build on one another in the same way. In general, a database only needs to satisfy 3NF. The following sections illustrate 1NF, 2NF, and 3NF with examples.
When creating a database, normalization is the process of organizing its data into a set of tables so that the results obtained from the database are clear and unambiguous. Normalization may duplicate some data within the database and often results in additional tables. It is a refinement step carried out after identifying the data elements and relationships in the database and defining the required tables and the columns in each table.
The following is an example of normalization:

Customer   Item purchased   Purchase price
Thomas     Shirt            $40
Maria      Tennis shoes     $35
Evelyn     Shirt            $40
Pajaro     Trousers         $25

If the table above is used to store the price of each item and you delete one of the customers, you may delete a price at the same time. To solve this problem, split the table into two: one stores each customer and the item they bought, and the other stores each item and its price. Adding rows to or deleting rows from one table then does not affect the other.
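
The split described above can be sketched with Python's built-in sqlite3 module. The table names purchase and item_price are assumptions made for this illustration; they do not come from the original example. The sketch deletes the only customer who bought trousers and checks that the price of trousers is still recorded.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_price (item TEXT PRIMARY KEY, price INTEGER);
    CREATE TABLE purchase (customer TEXT, item TEXT REFERENCES item_price(item));
    INSERT INTO item_price VALUES ('Shirt', 40), ('Tennis shoes', 35), ('Trousers', 25);
    INSERT INTO purchase VALUES ('Thomas', 'Shirt'), ('Maria', 'Tennis shoes'),
                                ('Evelyn', 'Shirt'), ('Pajaro', 'Trousers');
""")

# Deleting the only customer who bought trousers no longer deletes the price.
conn.execute("DELETE FROM purchase WHERE customer = 'Pajaro'")

trousers_price = conn.execute(
    "SELECT price FROM item_price WHERE item = 'Trousers'").fetchone()[0]
remaining_customers = [row[0] for row in conn.execute(
    "SELECT customer FROM purchase ORDER BY customer")]
print(trousers_price, remaining_customers)
```

In the single-table design the same DELETE would have removed the only row that recorded the price of trousers.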

Introduction to the normal forms of relational databases

1. First normal form (1NF)

In any relational database, the first normal form (1NF) is the basic requirement of the relational model; a database that does not satisfy 1NF is not a relational database.
The first normal form (1NF) means that every column of a table is an indivisible basic data item. A column cannot hold multiple values; that is, an attribute of an entity cannot have multiple values, and an entity cannot have repeated attributes. If repeated attributes exist, you may need to define a new entity made up of the repeated attributes, with a one-to-many relationship between the original entity and the new one. In 1NF, each row of a table holds the information of exactly one instance. For example, in the employee information table in Figure 3-2, you cannot put all of an employee's information into a single column, or combine two or more columns into one; each row represents one employee, and each employee appears in the table only once. In short, the first normal form means no repeating columns.
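
As a minimal sketch of moving a repeated attribute into a new entity, the following Python splits a multi-valued column into a separate one-to-many table. The columns emp_id, name and phones, and the sample values, are hypothetical; the contents of the employee table in Figure 3-2 are not shown in this article.

```python
# Hypothetical un-normalized rows: the "phones" column holds several values.
raw_employees = [
    {"emp_id": 1, "name": "Ann", "phones": "555-0100,555-0101"},
    {"emp_id": 2, "name": "Bob", "phones": "555-0200"},
]


def to_first_normal_form(rows):
    """Split the multi-valued column into a separate one-to-many table."""
    employees = [{"emp_id": r["emp_id"], "name": r["name"]} for r in rows]
    phones = [
        {"emp_id": r["emp_id"], "phone": p.strip()}
        for r in rows
        for p in r["phones"].split(",")
    ]
    return employees, phones


employees, phones = to_first_normal_form(raw_employees)
print(employees)
print(phones)
```

Each row of the new phones list holds exactly one value, and emp_id links it back to the employee it belongs to.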

2. Second normal form (2NF)

The second normal form (2NF) builds on the first: to satisfy 2NF, a table must first satisfy 1NF. 2NF requires that each instance, or row, in a table can be uniquely distinguished. To achieve this, you usually add a column that stores a unique identifier for each instance. In the employee information table in Figure 3-2, an employee ID (emp_id) column is added; because every employee's ID is unique, each employee can be uniquely distinguished. Such a uniquely identifying column is called the primary key.
2NF also requires that the attributes of an entity depend fully on the primary key. Full dependency means that no attribute may depend on only part of the primary key. If such an attribute exists, it and the part of the primary key it depends on should be split off into a new entity, which has a one-to-many relationship with the original entity. In short, the second normal form means that non-key attributes are not partially dependent on the primary key.

3. Third normal form (3NF)

To satisfy the third normal form (3NF), a table must first satisfy 2NF. 3NF requires that a table not contain non-key information that is already contained in another table. For example, suppose there is a department information table in which each department has a department ID (dept_id), a department name, a department profile, and so on. Once the department ID is listed in the employee information table in Figure 3-2, the department name, department profile, and other department-related information must not be added to the employee information table as well. If the department information table does not exist, it should be created in accordance with 3NF; otherwise there will be a large amount of data redundancy. In short, the third normal form means that non-key attributes do not depend on other non-key attributes.

Applying the three normal forms in database design: worked examples

The normal forms are the specifications that a database design should meet. A database that meets them is concise and clearly structured, and it is free of insertion, deletion, and update anomalies. A database that does not is a mess: it creates trouble for database programmers, looks ugly, and may store a large amount of unnecessary redundant information.
Are the normal forms hard to understand? Not really, although the mathematical definitions given in university textbooks are certainly hard to understand and to remember. As a result, many of us simply do not design databases according to the normal forms.
In essence, the normal forms can be stated clearly in plain, concrete terms. This article gives an informal description of them and, using the database of a simple forum designed by the author as an example, shows how to apply them in practical engineering.

Description of the normal forms

First normal form (1NF): every field in a database table is a single, indivisible attribute. Such an attribute is made up of a basic type, such as integer, real number, character, logical (Boolean), or date.

For example, the following database table conforms to the first normal form:

Field 1    Field 2    Field 3                  Field 4

A database table like the following does not conform to the first normal form:

Field 1    Field 2    Field 3                  Field 4
                      Field 3.1   Field 3.2

Obviously, in any current relational database management system (DBMS), even a careless designer cannot create a table that violates the first normal form in this way, because these systems do not allow a column of a table to be divided into two or more sub-columns. It is therefore impossible to design such a table with an existing DBMS.

Second normal form (2NF): no non-key field in the table has a partial functional dependency on any candidate key (a partial functional dependency is one in which only some of the fields of a composite key determine a non-key field). In other words, every non-key field depends on the whole of every candidate key.

Assume the course selection table is SelectCourse (student ID, name, age, course name, score, credits). Its key is the composite key (student ID, course name), because the following dependency holds:
(Student ID, course name) → (name, age, score, credits)

This table does not satisfy the second normal form, because the following dependencies also hold:
(Course name) → (credits)
(Student ID) → (name, age)
That is, individual fields of the composite key determine non-key fields.
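
A small Python sketch can make these dependencies concrete by testing them against sample rows. The helper name holds and the sample data are assumptions for illustration only. Note that sample data can only refute a functional dependency; it cannot prove that the dependency holds for every possible row.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical SelectCourse rows.
rows = [
    {"student_id": 1, "name": "Li", "age": 20, "course": "Math", "score": 90, "credits": 4},
    {"student_id": 1, "name": "Li", "age": 20, "course": "Physics", "score": 85, "credits": 3},
    {"student_id": 2, "name": "Wang", "age": 21, "course": "Math", "score": 70, "credits": 4},
]

# Partial dependencies: one field of the composite key is enough.
course_determines_credits = holds(rows, ["course"], ["credits"])
student_determines_name_age = holds(rows, ["student_id"], ["name", "age"])
# The score needs the whole key: student 1 has two different scores.
student_determines_score = holds(rows, ["student_id"], ["score"])
full_key_determines_score = holds(rows, ["student_id", "course"], ["score"])
print(course_determines_credits, student_determines_name_age,
      student_determines_score, full_key_determines_score)
```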

Because 2NF is not satisfied, this course selection table has the following problems:
(1) Data redundancy:
If the same course is taken by n students, its credits are repeated n-1 times. If the same student takes m courses, the name and age are repeated m-1 times.
(2) Update anomaly:
If the credits of a course are adjusted, the "credits" value must be updated in every row for that course. Otherwise the same course may end up with different credits.
(3) Insertion anomaly:
Suppose a new course is to be offered and nobody has taken it yet. Because there is no "student ID" value for the key, the course name and credits cannot be recorded in the database.
(4) Deletion anomaly:
Suppose a group of students have completed their elective courses and these enrolment records are to be deleted from the table. The course names and credit information are deleted along with them. Obviously, this also leads to insertion anomalies.

Split the SelectCourse table into the following three tables:
Student: Student (student ID, name, age);
Course: Course (course name, credits);
Course selection: SelectCourse (student ID, course name, score).

These tables conform to the second normal form and eliminate the data redundancy and the update, insertion, and deletion anomalies.
In addition, every table with a single-field key conforms to the second normal form, because a partial dependency on a composite key cannot arise.
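
The three-table design can be tried out with Python's built-in sqlite3 module. This is only a sketch: the SQL table and column names and the sample rows are assumptions chosen for the illustration. It exercises the three situations that were anomalies in the single-table design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE course (course_name TEXT PRIMARY KEY, credits INTEGER);
    CREATE TABLE select_course (
        student_id INTEGER REFERENCES student(student_id),
        course_name TEXT REFERENCES course(course_name),
        score INTEGER,
        PRIMARY KEY (student_id, course_name)
    );
    INSERT INTO student VALUES (1, 'Li', 20), (2, 'Wang', 21);
    INSERT INTO course VALUES ('Math', 4);
    INSERT INTO select_course VALUES (1, 'Math', 90), (2, 'Math', 70);
""")

# Insertion: a new course can be recorded before anyone takes it.
conn.execute("INSERT INTO course VALUES ('Physics', 3)")

# Update: the credits of a course are changed in exactly one row.
updated_rows = conn.execute(
    "UPDATE course SET credits = 5 WHERE course_name = 'Math'").rowcount

# Deletion: removing all enrolments keeps the courses and their credits.
conn.execute("DELETE FROM select_course")
courses = conn.execute(
    "SELECT course_name, credits FROM course ORDER BY course_name").fetchall()
print(updated_rows, courses)
```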

Third normal form (3NF): on top of the second normal form, a table conforms to 3NF if no non-key field has a transitive functional dependency on any candidate key. A transitive functional dependency exists when there is a chain of dependencies "A → B → C"; C is then said to depend transitively on A. A table in 3NF therefore must not contain a dependency of the form:
Key field → non-key field x → non-key field y

Assume the student table is Student (student ID, name, age, school, school location, school phone number). Its key is the single field "student ID", because the following dependency holds:
(Student ID) → (name, age, school, school location, school phone number)

This table conforms to 2NF but not to 3NF, because the following dependency chain exists:
(Student ID) → (school) → (school location, school phone number)
That is, the non-key fields "school location" and "school phone number" depend transitively on the key field "student ID".

This also causes data redundancy and update, insertion, and deletion anomalies; working through them is left to the reader.
Split the student table into the following two tables:
Student: Student (student ID, name, age, school);
School: School (school, location, phone number).

These tables conform to the third normal form and eliminate the data redundancy and the update, insertion, and deletion anomalies.
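
The following Python sketch illustrates the split on a few hypothetical rows (all sample values are assumptions). It projects the original table onto the two new tables, shows that each school's details are now stored once, and checks that joining the two tables on "school" rebuilds the original rows.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (order preserved)."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


# Hypothetical rows of the original Student table.
original = [
    {"student_id": 1, "name": "Li", "age": 20,
     "school": "Science", "location": "North Campus", "phone": "555-0100"},
    {"student_id": 2, "name": "Wang", "age": 21,
     "school": "Science", "location": "North Campus", "phone": "555-0100"},
    {"student_id": 3, "name": "Zhao", "age": 19,
     "school": "Arts", "location": "South Campus", "phone": "555-0200"},
]

students = project(original, ["student_id", "name", "age", "school"])
schools = project(original, ["school", "location", "phone"])

# Joining the two tables on "school" rebuilds the original rows.
school_by_name = {s["school"]: s for s in schools}
rejoined = [{**st, **school_by_name[st["school"]]} for st in students]
print(len(schools), rejoined == original)
```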

Boyce-Codd normal form (BCNF): on top of the third normal form, a table conforms to BCNF if no field, key fields included, has a transitive functional dependency on any candidate key.

Assume the warehouse management table is StorehouseManage (warehouse ID, storage item ID, administrator ID, quantity), where an administrator works in only one warehouse and a warehouse can store multiple items. The following dependencies hold in this table:
(Warehouse ID, storage item ID) → (administrator ID, quantity)
(Administrator ID, storage item ID) → (warehouse ID, quantity)
Therefore both (warehouse ID, storage item ID) and (administrator ID, storage item ID) are candidate keys of StorehouseManage. The only non-key field in the table is quantity, so the table conforms to the third normal form. However, the following dependencies also hold:
(Warehouse ID) → (administrator ID)
(Administrator ID) → (warehouse ID)
That is, a key field determines another key field, so the table does not conform to BCNF. It suffers from the following anomalies:
(1) Deletion anomaly:
When a warehouse is emptied, all of its "storage item ID" and "quantity" information is deleted, and the "warehouse ID" and "administrator ID" information is deleted with it.
(2) Insertion anomaly:
When a warehouse does not store any items, no administrator can be assigned to it.
(3) Update anomaly:
If a warehouse changes its administrator, the administrator ID must be modified in every row for that warehouse.

Split the warehouse management table into two tables:
Warehouse management: StorehouseManage (warehouse ID, administrator ID);
Warehouse: Storehouse (warehouse ID, storage item ID, quantity).
These tables conform to BCNF and eliminate the deletion, insertion, and update anomalies.
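
The BCNF reasoning above can be sketched in Python using the standard attribute-closure computation. The function names and the short attribute names are assumptions for this illustration. As a simplification, the check only tests the left-hand sides of the listed dependencies rather than every subset of attributes, which is enough for this small example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Left-hand sides that determine something inside the relation
    without determining all of it (that is, without being a superkey)."""
    violations = []
    for lhs, _ in fds:
        if not lhs <= relation:
            continue
        determined = closure(lhs, fds) & relation
        if determined != lhs and determined != relation:
            violations.append(lhs)
    return violations


W, I, A, Q = "warehouse_id", "item_id", "admin_id", "quantity"
fds = [
    (frozenset({W, I}), frozenset({A, Q})),
    (frozenset({A, I}), frozenset({W, Q})),
    (frozenset({W}), frozenset({A})),
    (frozenset({A}), frozenset({W})),
]

before = bcnf_violations({W, I, A, Q}, fds)   # original single table
after_manage = bcnf_violations({W, A}, fds)   # StorehouseManage
after_store = bcnf_violations({W, I, Q}, fds) # Storehouse
print(before, after_manage, after_store)
```

For the original table, the warehouse ID and the administrator ID each determine the other without being a key of the whole table, which is exactly the violation described above; neither of the two smaller tables has such a dependency.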

Applying the normal forms

Let's build a forum database step by step. It holds the following information:
(1) User: user name, email, home page, phone number, contact address
(2) Post: post title, post content, reply title, reply content

In the first design, the database consists of a single table:
User name, email, home page, phone number, contact address, post title, post content, reply title, reply content
This table conforms to the first normal form, but no candidate key determines the whole row: the only key field, user name, does not determine the entire tuple. We need to add "post ID" and "reply ID" fields, changing the table to:
User name, email, home page, phone number, contact address, post ID, post title, post content, reply ID, reply title, reply content
Now the key (user name, post ID, reply ID) determines the entire row:
(User name, post ID, reply ID) → (email, home page, phone number, contact address, post title, post content, reply title, reply content)
However, this design does not conform to the second normal form, because the following dependencies hold:
(User name) → (email, home page, phone number, contact address)
(Post ID) → (post title, post content)
(Reply ID) → (reply title, reply content)
That is, non-key fields are partially functionally dependent on the candidate key. Obviously, this design causes a great deal of data redundancy and many operation anomalies.

We split the table into the following tables:
(1) User information: user name, email, home page, phone number, contact address
(2) Post information: post ID, title, content
(3) Reply information: reply ID, title, content
(4) Post: user name, post ID
(5) Reply: post ID, reply ID

This design satisfies 1NF, 2NF, 3NF, and BCNF. But is it the best design?
Not necessarily.

Notice that "user name" and "post ID" in table (4), "Post", are in a 1:N relationship, so "Post" can be merged into table (2), "Post information". Likewise, "post ID" and "reply ID" in table (5), "Reply", are in a 1:N relationship, so "Reply" can be merged into table (3), "Reply information". This reduces data redundancy to some extent. The new design is as follows:
(1) User information: user name, email, home page, phone number, contact address
(2) Post information: user name, post ID, title, content
(3) Reply information: post ID, reply ID, title, content
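
The merge of a 1:N link table into the table on the N side can be sketched in Python as follows. The sample users and posts are hypothetical. The sketch folds the "Post" link rows into "Post information" and shows that each post is still stored exactly once.

```python
# Hypothetical contents of "Post information" and of the "Post" link table.
post_info = [
    {"post_id": 1, "title": "Hello", "content": "First post"},
    {"post_id": 2, "title": "Question", "content": "How do keys work?"},
    {"post_id": 3, "title": "Reply wanted", "content": "Anyone there?"},
]
posted_by = [
    {"user_name": "alice", "post_id": 1},
    {"user_name": "alice", "post_id": 2},
    {"user_name": "bob", "post_id": 3},
]

# Each post has exactly one author (the 1 side), so the link table can be
# folded into the post table (the N side) without repeating any post.
author_of = {link["post_id"]: link["user_name"] for link in posted_by}
merged_posts = [{"user_name": author_of[p["post_id"]], **p} for p in post_info]

posts_by_alice = [p["post_id"] for p in merged_posts if p["user_name"] == "alice"]
print(len(merged_posts), posts_by_alice)
```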

Table (1) clearly satisfies all of the normal forms.

In table (2), the non-key fields "title" and "content" are partially functionally dependent on the "post ID" part of the key, so the table does not satisfy the second normal form. However, this design does not cause data redundancy or operation anomalies.

In table (3), the non-key fields "title" and "content" are partially functionally dependent on the "reply ID" part of the key, so the table does not satisfy the second normal form either. As with table (2), however, this design does not cause data redundancy or operation anomalies.

From this we can see that it is not always necessary to force a design to satisfy the normal forms. For a 1:N relationship, when the 1 side is merged into the N side, the N side no longer satisfies the second normal form, yet the design is better.

For an M:N relationship, neither the M side nor the N side can be merged into the other: doing so would violate the normal forms and cause operation anomalies and data redundancy.

For a 1:1 relationship, either side can be merged into the other. Such a design does not satisfy the normal forms, but it causes neither operation anomalies nor data redundancy.

Conclusion

A database design that satisfies the normal forms is clearly structured and avoids data redundancy and operation anomalies. This does not mean that a design which violates them must be wrong: for tables in a 1:1 or 1:N relationship, merging them is reasonable even though the result no longer satisfies the normal forms.

When designing a database, we should always keep the requirements of the normal forms in mind.
