Six paradigms of the database __ database

Source: Internet
Author: User

The design paradigm (paradigm, Database design paradigm, database design paradigm) is a collection of relational patterns that conform to a certain level. The construction of the database must follow certain rules. In a relational database, this rule is a paradigm. Relationships in relational databases must meet certain requirements, that is, to satisfy different paradigms. There are six types of relational databases: first paradigm (1NF), second normal form (2NF), third normal form (3NF), fourth normal form (4NF), v Paradigm (5NF) and sixth paradigm (6NF). The paradigm that satisfies the minimum requirement is the first paradigm (1NF). Further satisfying more requirements on the basis of the first paradigm is called the second normal form (2NF), and the remaining paradigms are referred to as the second analogy. Generally speaking, the database only needs to satisfy the third normal form (3NF) on the line. Here are examples of first paradigm (1NF), second Normal (2NF), and third Normal (3NF).

In the process of creating a database, Fan Hua is the process of translating it into tables, which makes the results from the database more explicit. This may cause the database to produce duplicate data, resulting in the creation of redundant tables. Fan Hua is a refinement of the initial work of identifying data elements, relationships in a database, and defining the required tables and items in each table.

The following is an example of a paradigm Customer Item purchased Purchase price Thomas shirt $ Maria Tennis shoes $ evely N Shirt $ Pajaro Trousers $

If the above table is used to keep the price of the item, and you want to delete one of the customers, you must delete a price at the same time. Fan Hua To solve this problem, you can turn this table into two tables, one to store information for each customer and the item he buys, and another to store information about each product and its price, so adding or deleting one of the tables does not affect the other table.

Introduction to several design paradigms of relational database

1 First paradigm (1NF)

In any relational database, the first normal form (1NF) is the basic requirement of the relational schema, and the database that does not satisfy the first normal form (1NF) is not a relational database.

The so-called first normal form (1NF) means that each column of a database table is an indivisible basic data item and cannot have multiple values in the same column, that is, an attribute in an entity cannot have more than one value or cannot have duplicate attributes. If duplicate attributes occur, you may need to define a new entity, which consists of duplicate attributes and a one-to-many relationship between the new entity and the original entity. In the first normal form (1NF), each row of the table contains information about only one instance. For example, for the Employee Information table in Figure 3-2, you cannot display employee information in a column, or two or more columns in one column; Each row of the employee Information table represents only one employee's information, and one employee's information appears only once in the table. In short, the first paradigm is a column without duplicates.

2 Second Normal form (2NF)

The second normal form (2NF) is established on the basis of the first normal form (1NF), that is, satisfying the second normal form (2NF) must first satisfy the first normal form (1NF). The second normal form (2NF) requires that each instance or row in a database table must be divided into unique regions. For implementation differentiation it is often necessary to add a column to the table to store the unique identities of individual instances. As shown in Figure 3-2, the Employee Information table adds the employee number (emp_id) column, because each employee's employee number is unique, so each employee can be uniquely differentiated. This unique property column is called the primary key or primary key, and the primary code.

The second normal form (2NF) requires that the attributes of an entity depend entirely on the primary keyword. Total dependency refers to an attribute that cannot exist only on a part of the primary key, and if so, this part of the attribute and the primary key should be separated to form a new entity, a one-to-many relationship between the new entity and the original entity. For implementation differentiation it is often necessary to add a column to the table to store the unique identities of individual instances. In short, the second paradigm is that non-primary attributes are dependent on the primary keyword.

3 Third paradigm (3NF)

Satisfying the third normal form (3NF) must first satisfy the second normal form (2NF). In short, the third normal form (3NF) requires that a database table not contain non-primary key information already contained in other tables. For example, there is a departmental information table in which each department has information such as department number (dept_id), department name, department profile, and so on. Then, after the department number is listed in the Employee Information table in Figure 3-2, the department name, department profile, and other department-related information can no longer be added to the Employee Information table. If the departmental information table is not present, it should also be built according to the third normal form (3NF), otherwise there will be a large amount of data redundancy. In short, the third paradigm is that attributes are not dependent on other non-primary attributes.

Analysis of application examples of three paradigms of database design

The database design paradigm is the specification that the database design needs to satisfy, the database that satisfies these specifications is concise, the structure is clear, at the same time, insert (insert), delete (delete) and update operation exception are not occurred. The reverse is messy, not only causing trouble to programmers in the database, but also being repulsive, potentially storing a lot of unwanted redundancy information.

Design paradigm is not difficult to understand. Not also, the university textbook gives us a bunch of mathematical formulas we certainly can not understand, also can't remember. So many of us do not design databases according to Paradigm.

In essence, the design paradigm can be clearly understood in terms of very vivid and concise words. This article will explain the paradigm in layman's way, and take the example of a simple forum database which I have designed to explain how to apply these paradigms to practical projects.

Paradigm Description

First normal form (1NF): The fields in the database table are single attributes and cannot be divided. This single attribute is composed of basic types, including integer, real, character, logical, date, etc.

For example, the following database tables are consistent with the first paradigm:

Field 1 Field 2 Field 3 Field 4

and such a database table is inconsistent with the first paradigm:

Field 1 Field 2 Field 3 Field 4

Field 3.1 Field 3.2

It is clear that in any current relational database management system (DBMS), a fool is unlikely to make a database that does not conform to the first paradigm, because the DBMS does not allow you to divide one column of a database table into two or more columns. Therefore, it is impossible for you to design a database that does not conform to the first paradigm in the existing DBMS.

Second Normal form (2NF): Non-critical fields do not exist in database tables partial function dependencies on any candidate key fields (partial function dependencies refer to the existence of certain fields in the combination keyword that determine non-critical fields), i.e. all non-critical fields are entirely dependent on any set of candidate keywords.

Suppose the elective relationship table is Selectcourse (school number, name, age, course name, score, credits), the keyword is the combination of keywords (school number, course name), because there are the following decision relationships:

(School number, course name) → (name, age, grade, credits)

This database table does not satisfy the second normal form because the following decision relationships exist:

(course name) → (credits)

(school number) → (name, age)

That is, there are fields in the combination key that determine the non keyword.

Because it does not conform to 2NF, this elective relationship table has the following problems:

(1) Data redundancy:

The same course is selected by n students, and the "credits" are repeated n-1 times; the same student took the M-class course, and the name and age were repeated m-1 times.

(2) Update exception:

If you adjust the credits for a course, the "credits" values for all the rows in the datasheet are updated, otherwise the same course credits will be different.

(3) Insert exception:

Suppose that a new course is to be opened and no one has been selected for the time being. This way, because there is no "learn number" keyword, the course name and credits are not recorded in the database.

(4) Delete exception:

If a group of students have completed elective courses, these elective records should be removed from the database table. At the same time, however, the course name and credit information have been removed. Obviously, this can also cause an insert exception.

Change the selectcourse relationship table to the following three tables:

Student: Student (school number, name, age);

Course: Course (course name, credits);

Elective relationship: Selectcourse (School number, course name, grade).

Such database tables conform to the second normal form, eliminating data redundancy, updating exceptions, inserting exceptions, and deleting exceptions.

In addition, database tables for all single keys conform to the second normal form because there is no possible combination of keywords.

Third normal form (3NF): On the basis of the second normal form, if no non-critical field exists in the data table, it conforms to the third normal form if the transfer function dependency of any candidate key field is not present. The so-called transfer function dependency means that if there is a "a→b→c" decision relationship, the C transfer function relies on A. Therefore, a database table that satisfies the third normal form should not have the following dependencies:

key field → Non-critical field x→ non-critical field Y

Assuming the student Relationship table is student (school number, name, age, Institute, Institute location, college phone), the keyword is a single keyword "student number" because of the following decision relationships:

(school number) → (name, age, location of college, Institute, Telephone)

This database is 2NF compliant, but does not conform to 3NF because of the following decision relationships:

(school number) → (college) → (Institute location, college Telephone)

That is, there is a non-critical field, "College location", "College Phone" transfer function dependency on key field "learning number".

It will also have data redundancy, update the exception, insert the exception and delete the exception, the reader can be informed by their own analysis.

The Student relations table is divided into the following two tables:

Student: (School number, name, age, college);

College: (College, location, telephone).

Such database tables conform to the third paradigm, eliminating data redundancy, updating exceptions, inserting exceptions, and deleting exceptions.

Boyce-ke Paradigm (BCNF): On the basis of the third normal form, if no field exists in the database table, the transfer function dependency of any candidate key field is in accordance with the third normal form.

Suppose the Warehouse Management relational table is storehousemanage (warehouse ID, store item ID, admin ID, quantity), and one administrator works only in one warehouse; a warehouse can store multiple items. The following decision relationships exist in this database table:

(Warehouse ID, store item id) → (admin id, quantity)

(Admin ID, store item id) → (warehouse ID, quantity)

So, (warehouse ID, store item id) and (admin ID, store item ID) are the candidate keywords for storehousemanage, the only non-critical field in the table is quantity, it is in accordance with the third normal form. However, because of the following decision relationships:

(warehouse id) → (Administrator id)

(Administrator id) → (warehouse id)

There is a key field that determines the key field, so it does not conform to the BCNF paradigm. It will have the following exception:

(1) Delete exception:

When the warehouse is emptied, all the "store ID" and "quantity" messages are deleted, and the warehouse IDs and admin ID messages are deleted.

(2) Insert exception:

You cannot assign an administrator to a warehouse when no items are stored in the warehouse.

(3) Update exception:

If the warehouse is replaced by an administrator, the administrator ID for all rows in the table is modified.

Decompose the Warehouse Management relationship table into two relational tables:

Warehouse Management: Storehousemanage (warehouse ID, admin ID);

Warehouse: Storehouse (warehouse ID, store item ID, quantity).

Such database tables conform to the BCNF paradigm, eliminating the deletion of exceptions, inserting exceptions, and updating exceptions.

Paradigm Application

Let's step through the database of a forum with the following information:

(1) User: username, email, homepage, telephone, contact address

(2) Post: Post title, post content, reply title, reply content

For the first time, we designed the database to exist only in tables:

User name Email home phone Contact Address post title post content reply Title reply content

This database table conforms to the first paradigm, but none of the candidate keywords can determine the entire row of the database table, and the only key field username cannot completely determine the entire tuple. We need to add the "Post ID", "Reply id" field, and modify the table to:

User name Email home phone Contact address post ID post title post content Reply ID reply title reply content

The keywords in the datasheet (user name, post ID, reply ID) can determine the entire line:

(username, post ID, reply id) → (email, homepage, telephone, contact address, post title, post content, reply title, reply content)

However, such a design does not conform to the second paradigm because of the following decision relationships:

(username) → (email, homepage, telephone, contact address)

(post id) → (post title, post content)

(reply id) → (reply title, reply content)

That is, a non-critical field part of a function relies on candidate key fields, and it is clear that this design can lead to a large amount of data redundancy and operational anomalies.

We decompose the database tables into (underlined keywords):

(1) User information: username, email, homepage, telephone, contact address

(2) Post information: Post ID, title, content

(3) Reply message: Reply ID, title, content

(4) Posts: User name, Post ID

(5) Reply: Post ID, reply ID

Such designs are designed to meet the requirements of the 1th, 2, 3 paradigm and bcnf paradigm, but the design is not the best.

Not necessarily.

Observation, the 4th "post" in the "User name" and "Post ID" is the relationship between the 1:n, therefore, we can merge "post" into the 2nd item "Post Information", the 5th "reply" in the "Post ID" and "Reply ID" between the 1:n relationship, so we can put "reply" Merge to the "Reply info" of item 3rd. This allows for a quantitative reduction in data redundancy, with the new design:

(1) User information: username, email, homepage, telephone, contact address

(2) Post information: User name, post ID, title, content

(3) Reply message: Post ID, reply ID, title, content

Table 1 of the database clearly meets the requirements of all paradigms;

In database table 2, there is a partial function dependency of the Non-critical field "title" and "Content" on the key field "Post ID", that is, the requirement of the second normal form is not satisfied, but this design does not result in data redundancy and operation exception;

There is also a partial function dependency of the Non-critical field "title", "Content" on the key field "Reply ID" in database table 3, and does not meet the requirements of the second normal form, but similar to database table 2, this design does not result in data redundancy and manipulation exceptions.

From this can be seen, not necessarily forced to meet the requirements of the paradigm, for the 1:n relationship, when 1 of the side of the merge to the n side, the N side will no longer meet the second normal form, but this design is better.

For m:n relationships, it is not possible to merge M side or N on the other side, which can lead to a failure to conform to the normal paradigm and result in operational exceptions and data redundancy.

For a 1:1 relationship, we can merge the Left 1 or the right 1 to the other side, and the design leads to a pattern that does not conform to the normal paradigm, but does not cause operational anomalies or data redundancy.

Conclusion

The database design which satisfies the paradigm requirement is clear in structure, and can avoid data redundancy and operation Anomaly. This means that the design that does not conform to the normal paradigm must be wrong, and in the case of a 1:1 or 1:n relationship in a database table, it is reasonable to combine the inconsistent paradigm requirements.

When we design a database, we must always consider the requirements of the paradigm.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.