Database Chapter 6 Relational data theory

Source: Internet
Author: User

The sixth chapter of relational data theory

This chapter explains the relationship normalization theory in detail, which is not only the important theoretical foundation of relational database but also the powerful tool of database design, and the standardization theory provides the theoretical Guide and tool for database design.

Although the normalization theory is based on the relational model, it also has theoretical significance to the general database logic design.

A relational pattern should be a five-tuple:

R (U,D,DOM,F)

1) The Relation name R, which is symbolic tuple semantics;

2) a set of attributes U;

3) The domain D from which the attributes in the attribute group u are from;

4) attribute-to-domain mapping dom;

5) A set of data on the attribute group U is dependent on F;

3, 4 has little to do with pattern design, so it can be simplified as a ternary group in this chapter r<u,f>; When and only if a relationship on U satisfies F, it is called a relationship pattern r<u,f>.

Relationship as a two-dimensional table, a minimum requirement is: Each component must be an irreducible data item, the relational pattern satisfying this condition belongs to the first normal form (INF).

Data dependence is a kind of constraint relationship between attributes and attributes, which is related to data through the equality of values between attributes. The most important of these are function dependencies and multivalued dependencies.

Fully dependent (full) means Y needs all the information in X to determine its own value, and a partial dependency (partly) means that Y only needs a subset of the information in X to determine its own value.

Relational database The relationship between the brother and the boss needs to meet certain requirements, and meet the requirements of different levels for different paradigms.

Meet the minimum requirements called the first normal form, referred to as the INF, in the first paradigm to meet further requirements for the second paradigm, the rest and so on.

Multi-valued dependency (MVD)

If a relational pattern is 4NF, it must be bcnf.

In simple terms, there is no duplicate column in the first paradigm; the second paradigm attribute is completely dependent on the primary key; the third normal attribute does not depend on other non-primary property keys.

Introduction to relational database design paradigm

.1 First normal form (1NF) No duplicate columns
The so-called First paradigm (1NF) refers to the fact that each column of a database table is an indivisible basic data item and cannot have multiple values in the same column, that is, an attribute in an entity cannot have multiple values or cannot have duplicate properties. If duplicate attributes are present, you may need to define a new entity, which is composed of duplicate attributes, and a one-to-many relationship between the new entity and the original entity. In the first normal form (1NF), each row of a table contains only one instance of information. In short, the first paradigm is a column with no duplicates.

Note: In any relational database, the first paradigm (1NF) is the basic requirement for relational schemas, and a database that does not meet the first normal form (1NF) is not a relational database.

1.2 Second normal form (2NF) property fully dependent on primary key [eliminate partial child function dependency]
The second paradigm (2NF) is established on the basis of the first paradigm (1NF), i.e. satisfying the second normal form (2NF) must first satisfy the first paradigm (1NF). The second normal form (2NF) requires that each instance or row in a database table must be divided by a unique region. It is often necessary to add a column to the table to store the unique identity of each instance. For example, the Employee Information table adds the employee number (emp_id) column because each employee's employee number is unique, so each employee can be uniquely differentiated. This unique attribute column is called the primary key or primary key, and the main code.
The second normal form (2NF) requires that the attributes of an entity depend entirely on the primary key. The so-called full dependency is the inability to have a property that depends only on the primary key, and if so, this part of the property and the primary key should be separated to form a new entity, and the new entity is a one-to-many relationship with the original entity. It is often necessary to add a column to the table to store the unique identity of each instance. In short, the second paradigm is that properties depend entirely on the primary key.

1.3 Third normal form (3NF) property does not depend on other non-primary properties [eliminate transitive dependencies]
Satisfying the third normal form (3NF) must first satisfy the second normal form (2NF). In short, the third paradigm (3NF) requires that a database table not contain non-primary key information already contained in other tables. For example, there is a departmental information table, where each department has a department number (dept_id), a department name, a department profile, and so on. Then the department number is listed in the Employee Information table, the department name, department profile and other departments related information can no longer be added to the Employee Information table. If there is no departmental information table, it should be built according to the third paradigm (3NF), otherwise there will be a lot of data redundancy. In short, the third paradigm is that properties do not depend on other non-principal properties.

Ii. Analysis of Paradigm application examples
The following is an example of a school's student system, which illustrates the application of these paradigms. First Paradigm (1NF): The fields in a database table are single attributes and cannot be divided. This single attribute consists of a basic type, including Integer, real, character, logical, date, and so on. In any current relational database management system (DBMS), it is impossible for a fool to make a database that does not conform to the first paradigm, because these DBMS do not allow you to divide a column of a database table into two or more columns. Therefore, it is impossible for you to design a database that does not conform to the first paradigm in your existing DBMS.
First, let's make sure that the content to be designed includes those. School number, student name, age, gender, course, course credit, department, academic results, office address, Department of telephone and other information. For the sake of simplicity we only consider these field information for the time being. There are several aspects to this information that we say are concerned about.

Students have the basic information
What did the students do when they chose those classes?
What are the credits for each lesson?
Students belong to that department, the basic information of the Department is what.

2.1 Example analysis of the second normal form (2NF)
First of all, we consider that all of this information is placed in a table (student, name, age, gender, course, course credit, department, academic results, office address, Office Phone) The following dependencies exist.
(school number) → (name, age, gender, department, office address, Office phone)
(course name) → (credits)
(School number, course) → (academic results)
Analysis of 2.1.1 Problems
Therefore, the following problems will arise if the requirements of the second paradigm are not satisfied

Data redundancy: The same course is taught by n students, and "credits" are repeated n-1 times; the same student took the M course, and the name and age were repeated m-1 times.

Update exception:
1) If the credit of a course is adjusted, the "credit" value of all the lines in the data sheet should be updated, otherwise the same course credit will be different.
2) Suppose that a new course is to be opened and no one has yet been enrolled. Thus, the course name and credits cannot be recorded in the database because there is no "learning number" keyword.

Delete exception: Assuming that a group of students have completed elective courses, these elective records should be removed from the database table. At the same time, however, the course name and credit information were also removed. Obviously, this can also lead to an insertion exception.
2.1.2 Solutions
Change the course of the elective selectcourse to the following three tables:
Student: Student (school number, name, age, gender, department, office address, Office phone);
Course: Course (course name, credits);
Elective relationship: Selectcourse (School number, course name, score).

2.2 Example analysis of the third paradigm (3NF)
Then look at the student table above student (school number, name, age, gender, department, office address, Office Phone), the keyword is a single keyword "study number", because the following decision relationship exists:

(school number) → (name, age, gender, department, office address, Office phone)
But there is a decision relationship
(school number) → (school) → (college location, college phone)
That is, there is a non-critical field "College location", "College phone" to the key field "study number" of the transfer function dependency.
It also has the condition of data redundancy, update exceptions, insert exceptions, and delete exceptions. (Data updates, delete exceptions are not analyzed here, you can refer to 2.1.1 for analysis)

According to the third paradigm, the student relationship table is divided into the following two tables to satisfy the third paradigm:

Student: (School number, name, age, gender, department);
Department: (Department, Office address, Office phone).

Database Chapter 6 Relational data theory

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.