In order to establish a relatively small and reasonable relational database, we must follow certain rules when designing relational database, that is, the design paradigm of relational database.
First normal Form (1NF)
The first paradigm of a relational database requires:
- All fields are indivisible.
For example, the Customer data table contains the customer name and address, and the address is made up of cities and streets. Apps often need to access city or street fields separately.
Data table Customers (name,city, street) conforms to the first paradigm, while the data table customers (name,address) does not meet the requirements of the first paradigm.
Define a data table that satisfies the first paradigm:
courses(student_id, course_id, school_name, president, credit)primary key(student_id, course_id)
The table above still has four questions:
Data redundancy: school_name, President attribute recurrence
Insert exception: If a new school does not start enrolling, it cannot be added to the datasheet.
Delete exception: If all students in a college graduate, the college information disappears
Modify exception: If the dean of a college changes, you need to revise the data of all students in the Institute;
If a student is transferred, it is necessary to revise the data for all the courses selected.
Second paradigm (Second normal Form, 2NF)
The second paradigm is defined on the basis of the first paradigm, which requires:
A candidate code is a function that determines the minimum set of all non-primary attributes, but does not exclude the case where the true subset of the candidate code determines some rather than all of the non-primary attributes.
The 2NF disallow the true subset function of the candidate code to determine any non-primary attribute.
As in the above data sheet:
courses(student_id, course_id, school_name, president, credit)primary key(student_id, course_id)
The School_name and President properties depend on the property student_id
and are not dependent on the property course_id
.
That is to say, the School,president attribute part function relies on the primary key, does not satisfy the second normal form request.
If you split the above table into two data tables:
courses(student_id, course_id, credit)primary key(student_id, course_id)students(student_id, school_name, president)primary key(student_id)
The data table after splitting satisfies the requirements of the second paradigm.
Now analyze the four problems that the first paradigm does not solve:
Data redundancy: Reduces the number of repetitions of School_name and president
Insert exception: not resolved
Delete exception: not resolved
Modify exception: Solve the problem of transfer of students, did not solve the problem of Dean replacement
Third paradigm (third normal Form, 3NF)
The third paradigm is defined on the basis of the second paradigm:
See the definition of the data sheet above
students(student_id, school_name, president)primary key(student_id)
There student_id
is a transitive function school_name
dependency president
, which does not satisfy the third paradigm.
For further splitting:
courses(student_id, course_id, credit)primary key(student_id, course_id)students(student_id, school_name)primary key(student_id)schools(school_name, president)primary key
The transfer function dependency is eliminated after splitting.
The third paradigm is analyzed to deal with the above four problems.
Data redundancy: All properties except primary key appear only once
Delete exception: Allow all students to be deleted and keep college information
Insert exception: Allow the establishment of college without students
Update exception: Change the Dean, student transfer all you need to change a record
The third paradigm can basically solve the above problems.
BCNF
The BC (boyce-codd) Paradigm further eliminates the transfer dependency on the basis of 3NF.
BCNF Requirements:
The relational pattern conforms to 3NF
All functions in the function dependency set F are dependent on x->f, and the left x must contain all the candidate codes for R.
That is, all candidate codes directly determine all non-primary attributes.
BCNF Decomposition algorithm
Given the relationship pattern R and the function dependency set F on it, R is performed to satisfy BCNF's lossless connection decomposition:
Reset Value p = {R}
Check the relationship mode in P if all satisfies bcnf, stop decomposition, otherwise repeat 3.
In P, the relationship pattern s, which is not satisfied with BCNF, must have a non-trivial dependent b->c, and B is not a candidate for s. Decompose s into s1= {b,c} and S2 = R-{C}, substituting s1,s2 for the position of S in P.
Example:
R = {A, B, C} F = {a->b, b->c}
Candidate A, the function depends on b->c, because B is not candidate key, so we want to decompose:
{B, C} {A, B}
The IV paradigm (Forth normal Form, 4NF)
4NF Requirements:
Example:
phone(user_id, phone, cell)
This table design is obviously unreasonable if a user has more than one phone and multiple cells at the same time.
You can use the following split scheme:
type)
The data table after splitting satisfies the 4NF.
"Reprint" Relational Database design paradigm