relational database design Paradigm

Source: Internet
Author: User

Paradigm:

  When designing the relational database, we should design a reasonable relational database to conform to different specifications, and the different norm requirements are called different paradigms, and the higher paradigm database redundancy is lower.

The first step is to understand what the "paradigm (NF)" means. According to the textbook definition, the paradigm is "a set of relational patterns that conform to a certain level, which represents the degree of rationalization of the linkages between the various attributes within a relationship". Pretty obscure, huh? In fact, you can roughly interpret it as a level of design criteria that conforms to the table structure of a data table . Like home decoration buy building materials, the most environmentally friendly is the E0 class, followed by E1 class, there are E2 levels and so on. The database paradigm is also divided into 1nf,2nf,3nf,bcnf,4nf,5nf. Generally, when we design relational databases, it is sufficient to consider bcnf. A design that conforms to a high-level paradigm must conform to a low-level paradigm, such as a 2NF-compliant relational model, which must conform to 1NF.


There are currently six paradigms for relational databases: the first paradigm (1NF), the second paradigm (2NF), the third paradigm (3NF), the bath-cod paradigm (BCNF), the fourth normal form (4NF), and the fifth paradigm (5NF, also known as the perfect paradigm)In the design process of relational database, the design criterion (paradigm principle) is the soul of database. 6 Types of paradigmsVernacular ridicule:
  • first normal form (1NF): each attribute in a 1NF-compliant relationship cannot be divided A property is a field, which means that the field can only represent one meaning, with a value
  • The second normal form (2NF): Conforms to 1NF, and the non-primary attribute completely depends on the code to have the primary key
  • Third normal form (3NF): Conforms to 2NF, and eliminates the dependence of the transmission to eliminate redundancy, improve cohesion , reducing coupling
  • BC paradigm (BCNF): Also known as the BC Paradigm, conforms to 3NF, and the primary attribute is not dependent on the primary attribute It is actually a correction to the third paradigm, making database redundancy less
  • Fourth normal form (5NF): Conforms to BCNF, r<u,f>∈1nf, if for each non-trivial multivalued dependency of R x→→y (Y? x), X contains candidate codes, so Many-to-many relationships cannot appear in the R∈4NF table
  • Fifth paradigm (6NF): If each connection dependency in the relational mode r is implied by the candidate Code of R, then the R∈5NF final structure is re-established in the original structure.


Part of the content refers to Ningbo University Liu Yan teacher.
The next step is to explain each level of the paradigm, first of all the first paradigm (1NF). The
conforms to the 1NF relationship (which you can understand as a data table.) The difference between "relationship" and "relationship pattern" is similar to the difference between "class" and "Object" in object-oriented programming. "Relationship" is an example of "relational pattern", you can understand "relationship" as a table with data, and "relational mode" is the table structure of this data table. 1NF is defined as: each attribute in a relationship that conforms to 1NF cannot be divided. The situation shown in table 1 does not meet the requirements of 1NF.
Table 1

In fact, 1NF is the most basic requirement for all relational databases , when you create a data table in a relational database management system (RDBMS), such as SQL Server,oracle,mysql, If the design of the data table does not meet the most basic requirements, then the operation must not be successful. That is, as long as the data table already exists in the RDBMS, it must be 1NF compliant. If we want to represent the data in the table in an RDBMS, we have to design it as table 2 :
Table 2

But only 1NF design, there will still be too large data redundancy, insert exceptions, delete exceptions, modify the problem of the exception, for example, Design in Table 3 :
Table 3

    1. Each student's number, name, name, and department head are repeated several times. Data for each department and corresponding Dean are repeated multiple times- redundant data
    2. If the school has a new department but has not enrolled any students for the time being (for example, it was created in March, but not until August), It is not possible to add the data of the department name and the dean separately to the data table (note 1)-- Insert Exception

      Note 1: Any one of the attributes contained in the relationship (note 2) cannot be empty according to the requirement of entity integrity in three relationship integrity constraints , the combination of all attributes cannot be duplicated. In order to meet this requirement, the table in the diagram can only use the combination of the study number and the class name as the code, otherwise it will not be possible to divide each record in a unique region.

      Note 2: code: A property in a relationship, or a combination of several properties, used to differentiate each tuple (a tuple can be understood as each record in a table, that is, each row) .
    3. If the records of all students in a department are deleted, then the data of all the department and department heads will disappear (none of the students in the department, not that the department is gone). -- Delete exception
    4. if Ted Physico to the law department, then in order to ensure the consistency of data in the database, we need to modify the data of the Department and department director in three records. -- Modify Exception .

Because only 1NF database design has such problems, we need to improve the design standards, to eliminate the factors that lead to the above four problems, so that they conform to a higher-level paradigm (2NF), which is called "normalization."

The second paradigm (2NF) strict definition in relation theory I don't introduce much here (because it involves more bedding), only need to understand 2NF to 1NF of what improvements can be. The improvement is that2NF is based on 1NF, eliminating the partial function dependency of the non-main attribute on the code . Next, the four concepts involved in this sentence-"function dependency","code","Non-Principal attribute", and "partial function dependency" -are explained.

function dependency
We can understand that (but not particularly strict definition): If in a table, the value of the attribute (or attribute group) X is determined to determine the value of the property y, then you can say that the Y function depends on X, writing x→y. That is, no two records exist in the data table, they have the same value on the x attribute (or attribute group), and the value on the Y property is different. This is the origin of the "function dependent" name, similar to the function relationship y = f (x), where the value of x is determined, the value of Y must be deterministic.

For example, for the data in table 3, you cannot find any records, they have the same number and the corresponding names are different. So we can say that the name function depends on the school number , writing study number → name . But conversely, because students with the same name may appear, there may be different two student records, they have the same value on the name, but the corresponding number is different, so we can not say that the study number function depends on the name. Other function dependencies in the table are also as follows:

    • Department name → Department head
    • School Number → Department head
    • (School number, class name) → Score

However, the following function dependencies are not true:

    • School number → class name
    • Study number → Score
    • Class name → Department head
    • (School number, class name) → Name

There are also three concepts that can be expanded from the concept of "function dependency":
Full function dependency
In a table, if x→y, and for any true subset of X (if the attribute group X contains more than one attribute), X ' →y is not true, then we call Y for the X full function dependency , which is recorded as X f→y. (that F should be written right above the arrow, no Way out ...), correct notation ( Figure 1)
Figure 1

For example:

    • School Number f→ Name
    • (School number, class name) f→ score (note: Because the same school number corresponding to the score is not certain, the same class name corresponding to the score is not determined)

Partial function dependency
If the Y function is dependent on X, but at the same time Y is not a complete function dependent on X, then we call the Y function dependent on x, which is recorded as X P→y, as shown in Figure 2.

Figure 2


For example:

    • (School number, class name) p→ name


Pass-through function dependency
If the Z function relies on y, and the Y function depends on X (strictly speaking an x is not contained in Y, and y does not depend on the precondition of Z), then we call the Z-transfer function dependent on X, which is recorded as X t→ Z, such as Figure 3 .


Figure 3

code
Set K as a property or attribute group in a table, if all attributes except K are fully functional dependent on K (this "completely" do not miss), then we call K the candidate code , abbreviated as code . In practice, we can generally understand that: if the k is determined by the case, the table in addition to the value of all attributes except K will be determined, then K is the code. there can be more than one code in a single table. (For convenience in practical applications, you usually select one of the codes as the main code )

For example:
for Table 3, (school number, class name) This attribute group is the code. There is only one code in the table. (assuming that all classes have no duplicate names)

non-primary properties
The property that is contained in any one code becomes the primary property.

For example:
for table 3, there are two main properties, study number and lesson name .


can finally go back to look at 2NF. First of all, we need to determine whether table 3 meets 2NF requirements? According to the definition of 2NF, the basis of judgment is actually to see whether there is a partial function dependency of a non-primary attribute on the code in the datasheet . If present, the data table meets the requirements of 1NF only, and if it does not exist, it meets the requirements of 2NF. The method of judging is:

First step: Find all the codes in the data table.
Step Two: Find all the main properties According to the code obtained in the first step.
Step three: In the datasheet, except for all the main properties, all that is left is the non-primary property .
Step Fourth: see if there is a partial function dependency for non-primary property codes.

for Table 3, according to the previous four steps, we can do this:
First step:

    1. View all of the individual properties, when its value is determined, whether all remaining property values are determined.
    2. View all attribute groups that contain two attributes, and when its value is determined, whether all remaining property values are determined.
    3. ......
    4. View all attribute groups that contain six attributes, that is, all attributes, when its value is determined, whether all remaining property values are determined.

It looks like a lot of trouble, but here's the trick: If A is a code, then all attribute groups that contain a, such as (A, B), (A,c), (a,b,c), and so on, are not yards (because there is one in the code requirement).completelyfunction dependency ").

Figure 4Represents all the function dependencies in a table:
Figure 4

After this step is complete, you can get, table 3 of the code only one, is(School number, classname)

Step Two:
There are two main attributes:School NumberAndclass name


Step Three:
There are four non-primary attributes:nameDepartment NameHead of Departmentscore


Fourth Step:
For(School number, class name) → nameYesSchool number → name, a non-primary attribute existsnameto code(School number, class name)Part of the function dependency.
For(School number, class name) → department nameYesSchool Number → department name, there is a non-main attribute systemnameto code(School number, class name)Part of the function dependency.
For(School number, class name) → Department headYesSchool Number → Department head, there is a non-master property pair code(School number, class name)Part of the function dependency.

Therefore, table 3 existence of non-main properties of the code part of the function dependence, the highest only meet the requirements of 1NF, does not meet the requirements of 2NF.



In order for table 3 to meet 2NF requirements, we must eliminate these partial function dependencies, there is only one way, is to split the big data table into two or more smaller data tables, in the process of splitting, to achieve a higher level of the requirements of the paradigm, this process is called "pattern decomposition." The method of pattern decomposition is not unique, and here is one of the methods:
Elective Course (School number, class name, score)
Student (school number, name, department name, dean)

Let's start by judging the following,Course SelectionTable andStudentDoes the table meet the requirements of 2NF?

ForCourse SelectionTable, whose code is(School number, class name), the main attribute isSchool NumberAndclass name, the non-primary property isscoreSchool NumberDetermined, and cannot be uniquely determinedscoreclass nameDetermined, nor could it be uniquely determinedscore, so there is no non-primary attributescoreFor code(School number, class name)Part of the function is dependent, so this table meets the requirements of 2NF.

ForStudentTable, whose code isstudy number,The main property isSchool Number, the non-primary property isName, department nameAndHead of Department, since the code has only one property, it is impossible to have a partial function dependency of a non-primary attribute on the code, so this table meets the requirements of 2NF.

Figure 5Represents a new function dependency after pattern decomposition
Figure 5

Table 4 shows the new data after the pattern decomposition

Table 4

(It also involves how a pattern decomposition is the correct knowledge point, not the first introduction)

Now let's take a look and do the same thing, do we still have those problems?

    1. Ted Physico to the law department
      You only need to modify the value of the TED corresponding system once. -There are improvements
    2. is the data redundancy reduced?
      The student's name, department name and Dean are not repeated as many times as before. -There are improvements
    3. Delete all student records in a department
      The information of the Department is still lost. -No improvement
    4. Insert a new system with no students yet.
      This operation is not allowed because the student table code is a study number and cannot be empty. -No improvement

Therefore, only meet the requirements of 2NF, in many cases is not enough, and the cause of the problem is that there is still a non-master attributeHead of DepartmentFor codeSchool Numberdependent on the transfer function. In order to further solve these problems, we also need to improve the 2NF requirements of the data table to meet the requirements of 3NF.

Third paradigm (3NF)
3NF on the basis of 2NF, eliminating the transfer function dependency of the non-main attribute on the code。 That is, if there is a non-primary property on the code's transfer function dependency, it does not meet the requirements of 3NF.

Let's look at the design in table 4 to see if it meets the requirements of 3NF.

ForCourse SelectionTable, the main code is (school number, class name), the main attribute isSchool NumberAndclass name,A non-primary attribute has only one, which is a fraction, and cannot have a transitive function dependency, soCourse SelectionThe table is designed to meet the requirements of 3NF.

ForStudentTable, the main code isSchool Number, the main property isSchool Number, the non-primary property isnameDepartment NameAndHead of Department。 Because the school number → department name, at the same time the name → department director, so there are non-main propertiesHead of DepartmentFor codeSchool Numberis dependent on the transfer function, soStudentThe design of the table, does not meet the requirements of 3NF.

In order for the data table design to reach 3NF, we must further the pattern decomposition into the following form:
Elective Course (School number, class name, score)
Student (school number, name, name)
Department (department name, department head)

ForCourse SelectionTable, meet the requirements of 3NF, has been analyzed before.

ForStudentTable, code forSchool Number, the main property isSchool Number, the non-primary property isDepartment Name, it is impossible to have a non-primary attribute dependent on the pass function of the code, so it conforms to the 3NF requirement.

ForDepartmentTable, code forDepartment Name, the main property isDepartment Name, the non-primary property isHead of Department, it is not possible to have a non-primary property for the transfer function dependency of the code (at least three properties are possible to have a transitive function dependency), so meet the requirements of 3NF:


New function Dependencies 6
Figure 6

The new data table is shown in table 5


Table 5


Now let's take a look and do the same thing, do we still have those problems?

    1. Delete all student records in a department
      The information of the Department is not lost. -There are improvements
    2. Insert a new system with no students yet.
      Because the table and student table are currently independent of the two tables, so does not affect. -There are improvements
    3. Data redundancy is even less. -There are improvements


Conclusion
Thus, according to the 3NF requirements of the database design, basically solve the data redundancy is too large, insert the exception, modify the exception, delete the problem. Of course, in practice, often for performance or to meet the needs of expansion, often do 2NF or 1NF, but as a database designer, at least should know, 3NF requirements.

relational database design Paradigm

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.