Analysis of database design paradigm

Source: Internet
Author: User
Tags dname

When designing relational databases, you must follow certain rules. In particular, the database design paradigm briefly introduces 1nf (first paradigm), 2nf (second paradigm), 3nf (third paradigm), and bcnf. The fourth and fifth paradigms will be introduced later. When designing a database, if you can meet these paradigms, you are a master of database design.

1nf: in each specific relational R in relational mode R, if each attribute value is the smallest data unit that cannot be further divided, R is called the relationship of the first paradigm. For example, the employee number, name, and phone number form a table (a person may have an office phone number and a home phone number). There are three methods to standardize 1nf:

The first is to repeat the employee ID and name. In this way, the keyword can only be a phone number.

Second, the employee number is a keyword. The phone number is divided into two attributes: the phone number of the organization and the residential phone number.

Third, the employee number is a keyword, but each record must have only one phone number.

The first method is the least desirable of the above three methods. The last two cases are selected based on the actual situation.

2nf: if all the non-primary attributes in the relational mode R (u, f) depend entirely on any candidate keyword, the relational R belongs to the second paradigm.

For example, in the Course Selection relation SCI (SNO, CNO, grade, and credit), SnO indicates the student ID, CNO indicates the course number, gradege indicates the score, and credit indicates the credit. The preceding condition indicates a combination of keywords (SNO, CNO)

There are the following problems when using the above link mode in applications:

A. data redundancy. Assume that 40 students take the same course and 40 credits are repeated.

B. The update is abnormal. If the credits of a course are adjusted, the corresponding credit values of the corresponding tuples must be updated, and the credits of the same course may be different.

C. Insert exceptions. For example, if you plan to open a new course, because no one takes the course or has no student ID keyword, you can only save the course and credits after someone takes the course.

D. Deletion exception. If the student has completed the course, the electives will be deleted from the current database. Some new students have not yet taken the course, so the course and credits cannot be saved.

Cause: the non-Keyword attribute credit only depends on CNO, that is, the credit part depends on the combined keyword (SNO, CNO) rather than the full dependency.

Solution: It is divided into two relational modes: SC1 (SNO, CNO, grade), C2 (CNO, credit ). The new relationship includes two relational modes, which are connected by the external keyword CNO in SC1. When necessary, they are naturally connected, restoring the original relationship.

3nf: If all non-primary attributes in the relational model R (u, f) do not transmit trust to any candidate keywords, the relational model R belongs to the 3nf.

For example, each attribute of S1 (SNO, sname, DNO, dname, location) represents a student ID,

Name, Department, Department name, and Department address.

The SNO keyword determines the attributes. Because it is a single keyword, there is no part of the dependency problem, it must be 2nf. However, there must be a lot of redundancy in this relationship. The attributes DNO, dname, and location of the student will be stored, inserted, deleted, and modified repeatedly.

Cause: A dependency is transmitted in the link. That is, Sno-> DNO. DNO-> SnO does not exist, DNO-> location. Therefore, the key Liao SnO decides to implement the location function by passing the dependency Sno-> location. That is to say, SnO does not directly determine the non-primary attribute location.

Solution: the transfer dependency cannot be left in each link mode.

Solution: There are two relationships: S (SNO, sname, DNO), D (DNO, dname, location)

Note: The outer keyword DNO cannot be found in link S. Otherwise, the relationship is lost.

Bcnf: If all attributes (including primary and non-primary attributes) of the relational mode R (u, f) do not pass any candidate keywords dependent on R, therefore, relational R belongs to bcnf. Or relational mode R. If each deciding factor contains a keyword (rather than a keyword), then the relational mode of rcnf.

For example, the component management relationship model WPE (wno, PNO, Eno, qnt) lists the Warehouse number, accessory number, employee number, and quantity respectively. The following conditions are met:

A. a warehouse has multiple employees.

B. One employee only works in one warehouse.

C. Special personnel are responsible for one type of accessories in each warehouse, but one person can manage several types of accessories.

D. Parts of the same model can be placed in several warehouses.

Analysis: qnt cannot be determined by the above PNO, and is determined by the combination attribute (wno, PNO). function dependencies (wno, PNO)-> Eno exist. A person is responsible for one type of accessories in each warehouse, and one person can manage several types of accessories. Therefore, the owner can be determined only by the combination attribute (wno, PNO) -> Eno. Because an employee only works in one warehouse, there is ENO-> wno. As a type of accessories in each warehouse is the responsibility of a person, while an employee only works in one warehouse, there are (Eno, PNO)-> qnt.

Find the candidate keyword, because (wno, PNO)-> qnt, (wno, PNO)-> Eno, so (wno, PNO) can determine the entire tuples, is a candidate keyword. Based on ENO-> wno, (Eno, PNO)-> qnt, the entire tuples (Eno, PNO) can also be determined as another candidate keyword. The Eno, wno, and PNO attributes are both primary attributes and only one non-primary attribute qnt. It fully depends on any candidate keyword and is directly dependent on the function. Therefore, the relational mode is 3nf.

Analyze the main attributes. Because ENO-> wno, the main attribute Eno is the deciding factor of wno, but it is not a keyword, but only a part of the combination keyword. This results in the dependency of the master attribute wno on another candidate keyword (Eno, PNO), because (Eno, PNO)-> Eno, but in turn does not exist, p-> wno, so (Eno, PNO)-> wno is also a transfer dependency.

Although there is no dependency on the transfer of non-primary attributes to candidate key Liao, the transmission dependency of primary attributes on candidate keywords can also be troublesome. For example, a new employee is assigned to a warehouse but is currently in the internship phase and is not independently responsible for the management of certain accessories. The link cannot be inserted because a portion of PNO is missing. If an individual changes to safety regardless of accessories, the employee will also be deleted when the accessories are deleted.

Solution: divide it into the management EP (Eno, PNO, qnt), the keyword is (Eno, PNO), the job EW (Eno, wno), and the keyword is Eno

Disadvantage: the persistence of function dependencies after decomposition is poor. In this example, due to decomposition, the function dependency (wno, PNO)-> Eno is lost, and thus the original semantics is damaged. It does not show that a part in each warehouse is under the responsibility of a specialist. A component may be managed by two or more people at the same time. Therefore, the decomposed link mode reduces some integrity constraints.

A link is divided into multiple relationships. To make the decomposition meaningful, the minimum requirement is that the original information is not lost after the decomposition. This information includes not only the data itself, but also the mutual constraints between the data represented by function dependencies. The goal of decomposition is to achieve a higher degree of standardization, but at the same time, two problems must be taken into account: lossless connectivity and function dependency persistence. Sometimes it is often impossible to achieve both lossless connectivity and fully maintain functional dependencies. You need to make trade-offs as needed.

There is a relationship between the four paradigms from 1nf to bcnf:

Bcnf contains 3nf and 2nf contains 1nf.

Summary:

Purpose: normalization aims to make the structure more reasonable, eliminate storage exceptions, and minimize data redundancy to facilitate insertion, deletion, and update.

Principles: follow the concept singularity "one thing and one place" principle, that is, a relational model describes a link between an entity or an entity. The essence of standardization is the simplification of concepts.

Method: link mode projection is divided into two or more link modes.

Requirement: The decomposed link mode set should be "equivalent" to the original link mode, that is, after a natural link, the original link can be restored without losing information, and reasonable links between attributes can be maintained.

Note: If a link mode is completed, different link modes can be obtained. That is to say, the decomposition method is not unique. The Minimum Redundancy must be achieved on the premise that the decomposed database can express all information of the original database. Its fundamental goal is to save storage space, avoid data inconsistency, improve the efficiency of relational operations, and meet application requirements. In fact, it is not necessarily required that all models reach bcnf. Sometimes it is easier to query data by deliberately retaining some redundancy. This is especially true for database systems with low update frequency and high query frequency.

In relational databases, in addition to function dependencies, there are also multi-value dependencies and connection dependencies. This puts forward higher standardization requirements such as the fourth and fifth paradigms. Here, we will discuss it later.

Dear friends, what do you think after reading this article? In fact, any book on basic database theory will talk about these things, considering that many netizens come home halfway to make databases. If you have any questions, don't ask me. I 'd like to find a book on relational database theory, which may be of great help to you. The above is the basic theory. Please think about it. Have you considered following the above several paradigms when designing a database? Are there any problems in database design, think about it. What kind of paradigm does the above mentioned violate?

Few of the database designs I have ever seen are in line with the above several paradigms. Generally, the first paradigm can be followed by very few people who fully abide by the second and third paradigms, the followers must be experts in designing databases. There are few opportunities for the bcnf paradigm to appear, and integrity will be damaged. You can ignore it when designing, of course, the shortcomings can be solved through triggers in Oracle. In the future, when we work together to design, we also hope that you will abide by the above several paradigms.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.