Application of three paradigms of database design analysis of ____ database

Source: Internet
Author: User

The database design paradigm is the specification that the database design needs to satisfy, the database that satisfies these specifications is concise, the structure is clear, at the same time, insert (insert), delete (delete) and update operation exception are not occurred. The reverse is messy, not only causing trouble to programmers in the database, but also being repulsive, potentially storing a lot of unwanted redundancy information.

Design paradigm is not difficult to understand. Not also, the university textbook gives us a bunch of mathematical formulas we certainly can not understand, also can't remember. So many of us do not design databases according to Paradigm.

In essence, the design paradigm can be clearly understood in terms of very vivid and concise words. This article will explain the paradigm in layman's way, and take the example of a simple forum database which I have designed to explain how to apply these paradigms to practical projects.

  Paradigm Description

First normal form (1NF): The fields in the database table are single attributes and cannot be divided. This single attribute is composed of basic types, including integer, real, character, logical, date, etc.

For example, the following database tables are consistent with the first paradigm:

Field 1 Field 2 Field 3 Field 4


and such a database table is inconsistent with the first paradigm:

Field 1 Field 2 Field 3 Field 4
Field 3.1 Field 3.2


It is clear that in any current relational database management system (DBMS), a fool is unlikely to make a database that does not conform to the first paradigm, because the DBMS does not allow you to divide one column of a database table into two or more columns. Therefore, it is impossible for you to design a database that does not conform to the first paradigm in the existing DBMS.

Second Normal form (2NF): Non-critical fields do not exist in database tables partial function dependencies on any candidate key fields (partial function dependencies refer to the existence of certain fields in the combination keyword that determine non-critical fields), i.e. all non-critical fields are entirely dependent on any set of candidate keywords.

Suppose the elective relationship table is Selectcourse (school number, name, age, course name, score, credits), the keyword is the combination of keywords (school number, course name), because there are the following decision relationships:

(School number, course name) → (name, age, grade, credits)

This database table does not satisfy the second normal form because the following decision relationships exist:

(course name) → (credits)

(school number) → (name, age)

That is, there are fields in the combination key that determine the non keyword.

Because it does not conform to 2NF, this elective relationship table has the following problems:

(1) Data redundancy:

The same course is selected by n students, and the "credits" are repeated n-1 times; the same student took the M-class course, and the name and age were repeated m-1 times.

(2) Update exception:

If you adjust the credits for a course, the "credits" values for all the rows in the datasheet are updated, otherwise the same course credits will be different.

(3) Insert exception:

Suppose that a new course is to be opened and no one has been selected for the time being. This way, because there is no "learn number" keyword, the course name and credits are not recorded in the database.

(4) Delete exception:

If a group of students have completed elective courses, these elective records should be removed from the database table. At the same time, however, the course name and credit information have been removed. Obviously, this can also cause an insert exception.

Change the selectcourse relationship table to the following three tables:

Student: Student (school number, name, age);

Course: Course (course name, credits);

Elective relationship: Selectcourse (School number, course name, grade).

Such database tables conform to the second normal form, eliminating data redundancy, updating exceptions, inserting exceptions, and deleting exceptions.

In addition, database tables for all single keys conform to the second normal form because there is no possible combination of keywords.

Third normal form (3NF): On the basis of the second normal form, if no non-critical field exists in the data table, it conforms to the third normal form if the transfer function dependency of any candidate key field is not present. The so-called transfer function dependency means that if there is a "a→b→c" decision relationship, the C transfer function relies on A. Therefore, a database table that satisfies the third normal form should not have the following dependencies:

key field → Non-critical field x→ non-critical field Y

Assuming the student Relationship table is student (school number, name, age, Institute, Institute location, college phone), the keyword is a single keyword "student number" because of the following decision relationships:

(school number) → (name, age, location of college, Institute, Telephone)

This database is 2NF compliant, but does not conform to 3NF because of the following decision relationships:

(school number) → (college) → (Institute location, college Telephone)

That is, there is a non-critical field, "College location", "College Phone" transfer function dependency on key field "learning number".

It will also have data redundancy, update the exception, insert the exception and delete the exception, the reader can be informed by their own analysis.

The Student relations table is divided into the following two tables:

Student: (School number, name, age, college);

College: (College, location, telephone).

Such database tables conform to the third paradigm, eliminating data redundancy, updating exceptions, inserting exceptions, and deleting exceptions.

Boyce-ke Paradigm (BCNF): On the basis of the third normal form, if no field exists in the database table, the transfer function dependency of any candidate key field is in accordance with the third normal form.

Suppose the Warehouse Management relational table is storehousemanage (warehouse ID, store item ID, admin ID, quantity), and one administrator works only in one warehouse; a warehouse can store multiple items. The following decision relationships exist in this database table:

(Warehouse ID, store item id) → (admin id, quantity)

(Admin ID, store item id) → (warehouse ID, quantity)

So, (warehouse ID, store item id) and (admin ID, store item ID) are the candidate keywords for storehousemanage, the only non-critical field in the table is quantity, it is in accordance with the third normal form. However, because of the following decision relationships:

(warehouse id) → (Administrator id)

(Administrator id) → (warehouse id)

There is a key field that determines the key field, so it does not conform to the BCNF paradigm. It will have the following exception:

(1) Delete exception:

When the warehouse is emptied, all the "store ID" and "quantity" messages are deleted, and the warehouse IDs and admin ID messages are deleted.

(2) Insert exception:

You cannot assign an administrator to a warehouse when no items are stored in the warehouse.

(3) Update exception:

If the warehouse is replaced by an administrator, the administrator ID for all rows in the table is modified.

Decompose the Warehouse Management relationship table into two relational tables:

Warehouse Management: Storehousemanage (warehouse ID, admin ID);

Warehouse: Storehouse (warehouse ID, store item ID, quantity).

Such database tables conform to the BCNF paradigm, eliminating the deletion of exceptions, inserting exceptions, and updating exceptions.

Paradigm Application

Let's step through the database of a forum with the following information:

(1) User: username, email, homepage, telephone, contact address

(2) Post: Post title, post content, reply title, reply content

For the first time, we designed the database to exist only in tables:
  

User name Email Home Phone Contact address Post title Post Content Reply title Reply content

This database table conforms to the first paradigm, but none of the candidate keywords can determine the entire row of the database table, and the only key field username cannot completely determine the entire tuple. We need to add the "Post ID", "Reply id" field, and modify the table to:

User name Email Home Phone Contact address Post ID Post title Post Content Reply ID Reply title Reply content

The keywords in the datasheet (user name, post ID, reply ID) can determine the entire line:

(username, post ID, reply id) → (email, homepage, telephone, contact address, post title, post content, reply title, reply content)

However, such a design does not conform to the second paradigm because of the following decision relationships:

(username) → (email, homepage, telephone, contact address)

(post id) → (post title, post content)

(reply id) → (reply title, reply content)

That is, a non-critical field part of a function relies on candidate key fields, and it is clear that this design can lead to a large amount of data redundancy and operational anomalies.

We decompose the database tables into (underlined keywords):

(1) User information: username, email, homepage, telephone, contact address

(2) Post information: Post ID, title, content

(3) Reply message: Reply ID, title, content

(4) Posts: User name, Post ID

(5) Reply: Post ID, reply ID

Such designs are designed to meet the requirements of the 1th, 2, 3 paradigm and bcnf paradigm, but the design is not the best.

Not necessarily.

Observation, the 4th "post" in the "User name" and "Post ID" is the relationship between the 1:n, therefore, we can merge "post" into the 2nd item "Post Information", the 5th "reply" in the "Post ID" and "Reply ID" between the 1:n relationship, so we can put "reply" Merge to the "Reply info" of item 3rd. This allows for a quantitative reduction in data redundancy, with the new design:

(1) User information: username, email, homepage, telephone, contact address

(2) Post information: User name, post ID, title, content

(3) Reply message: Post ID, reply ID, title, content

Table 1 of the database clearly meets the requirements of all paradigms;

In database table 2, there is a partial function dependency of the Non-critical field "title" and "Content" on the key field "Post ID", that is, the requirement of the second normal form is not satisfied, but this design does not result in data redundancy and operation exception;

There is also a partial function dependency of the Non-critical field "title", "Content" on the key field "Reply ID" in database table 3, and does not meet the requirements of the second normal form, but similar to database table 2, this design does not result in data redundancy and manipulation exceptions.

From this can be seen, not necessarily forced to meet the requirements of the paradigm, for the 1:n relationship, when 1 of the side of the merge to the n side, the N side will no longer meet the second normal form, but this design is better.

For m:n relationships, it is not possible to merge M side or N on the other side, which can lead to a failure to conform to the normal paradigm and result in operational exceptions and data redundancy.
For a 1:1 relationship, we can merge the Left 1 or the right 1 to the other side, and the design leads to a pattern that does not conform to the normal paradigm, but does not cause operational anomalies or data redundancy.

   Conclusions

The database design which satisfies the paradigm requirement is clear in structure, and can avoid data redundancy and operation Anomaly. This means that the design that does not conform to the normal paradigm must be wrong, and in the case of a 1:1 or 1:n relationship in a database table, it is reasonable to combine the inconsistent paradigm requirements.

When we design a database, we must always consider the requirements of the paradigm.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.