A look at the database paradigm Introduction (1NF,2NF,3NF,BC nf,4nf,5nf)

Source: Internet
Author: User
Tags ming

Original: http://josh-persistence.iteye.com/blog/2200644

First, the basic introduction

When designing the relational database, we should design a reasonable relational database to conform to different specifications, and the different norm requirements are called different paradigms, and the higher paradigm database redundancy is lower.

There are currently six paradigms for relational databases: the first paradigm (1NF), the second paradigm (2NF), the third paradigm (3NF), the bath-cod paradigm (BCNF), the fourth paradigm (4NF) and the fifth paradigm (5NF, also known as the perfect paradigm). The paradigm that satisfies the minimum requirements is the first paradigm (1NF). The second paradigm (2NF) is further satisfied on the basis of the first paradigm, and the rest of the paradigms are referred to by analogy. In general, the database only needs to meet the third normal form (3NF) on the line.

The containment relationship of the paradigm. A database design if it conforms to the second paradigm, it must also conform to the first paradigm. If it conforms to the third paradigm, it must also conform to the second paradigm ...

Basic concepts of databases

to understand the paradigm, you must first know what a relational database is, and simply say that a relational database is a two-dimensional table to hold data. Between table and table can ... (Omit 10W words), if you are familiar with the database, you can ignore the following concepts.

entity : something that exists in the real world and can be distinguished. such as "A student", "a book", "a course" and so on. It is worth emphasizing that the word "things" here is not only the "things" that can be seen and touched, but also the virtual one, rather "the relationship between the teacher and the school".

attribute : The textbook is interpreted as: "an entity has a certain characteristic", thus, the attribute is initially a logical concept, for example, "sex" is a property of "human". In a relational database, a property is a physical concept, and attributes can be thought of as "a column in a table."

tuples : A row in a table is a tuple.

Component : A property value of a tuple. In a relational database, it is an atomic operation, that is, when the relational database is doing anything, the attribute is "non-divided." Otherwise it's not a relational database.

code: The table can uniquely determine a tuple of a property (or attribute group), if such a code has more than one, then everyone is called candidate code, we choose a candidate code out to do the boss, it is called the main code.

Full code : If a code contains all the attributes, the code is full code.

main attribute : An attribute is the primary attribute as long as it appears in any candidate code.

Non-primary attribute : In contrast to the above, there is no candidate code, this property is a non-primary attribute.

External Code : a property (or attribute group), it is not a code, but its other table code, it is outside the code.

Candidate Code: If the value of a property or attribute group in a relationship can uniquely identify a tuple, and any of its true subsets are no longer identifiable, the attribute group is called a (super code) candidate.

Ii. 6 Paradigms

As mentioned earlier, the higher the paradigm, the less redundant the data. In fact, there is no redundant database design can be done. However, no redundant database is not necessarily the best database , sometimes in order to improve operational efficiency, it is necessary to reduce the paradigm standard, appropriate retention of redundant data. The practice is to adhere to the third paradigm when designing the conceptual data model, and to lower the standard of normalization into the design of the physical data model. Lowering the paradigm is adding fields, allowing redundancy. (The most typical is not only in some data tables as a foreign key user_id, also save user_name, so although the violation of the database paradigm to increase the user_name field, but improve efficiency, reduce the acquisition of user_id and then go to the user table to get the user Operation of name)

So in practice, we just need to consider the database to meet the third paradigm, the following in the most popular way to explain the database paradigm.

First normal form (1NF): attribute is not divided

(1NF is an atomic constraint on a property, requiring that the attribute be atomic and non-decomposed)

Name Tel Age

Mobile Landline

Josh 13612345678 021-9876543 22

Wang 13988776655 010-1234567 21

Obviously, tel This attribute can also be decomposed, broken down into mobile phones and landline these two properties, do not meet the first paradigm of the database, not the relational database! Therefore, we do not have such a "table" in any relational database management system.

The second normal form (2NF): Conforms to 1NF, and the non-primary attribute is completely dependent on the code.

(2NF is a unique constraint on a record, requiring a unique identifier, that is, the uniqueness of the entity, and more commonly, the primary key ID)

If this is not understandable, just look at the following scientific explanations:

The main attribute in a candidate code can also be several. If a primary property is not a single candidate, it cannot determine any of the non-primary properties. Give a counter example: We consider a primary school administration system, students assigned a teacher in class, a textbook, a classroom, a time, we all go to class, no problem. So how is the database designed? (Student class table)

Student Course teacher Teacher title textbook class time

Xiao Ming first year Chinese (on) Big treasure Associate Professor "Elementary language 1" 101 14:30

A student in a class, must be in a particular classroom. So there are classrooms (students, classes)

A student has a class that must be taught by a particular teacher. So there are teachers (students, courses)

A student in a class, his teacher's job title can be determined. So there are (students, courses)--teacher title

A student in a class must be a specific textbook. So there are textbooks (students, courses)

A student in a class, must be at a certain time. So there are (students, courses)--Class time

So (student, course) is a code.

However, a course, must designate a textbook, first-grade language is certainly used in the "primary language 1", then there is the curriculum----teaching materials. (student, course) is a code, but the course decides the textbook, which is called incomplete dependence, or partial dependence. In this case, the second paradigm is not satisfied! What's wrong with it? You can think about:

1, the headmaster to add a new course called "Calculus", the textbook is "College Mathematics", How to do? Students have not chosen a class, and the student is the main attribute, the main attribute can not be empty, how to record the course, textbooks to remember? ...... Are you depressed? (Insert exception)

2, the next semester no students to learn first-year language (on), to learn a first-year language (the next) went, then the table will not exist in the first grade language (on), there is no "primary language 1". At this time, the headmaster asked: first grade Chinese (on) What teaching materials AH? ...... Are you depressed? (Delete exception)

3, the headmaster said: First grade language (on) for the teaching materials, replaced by the "University of Chinese." 10,000 students have chosen such a class, the change is very big Ah! It's exhausting. Are you depressed? (Modify exception)

How should that be solved? Projection decomposition, which decomposes a table into two or more tables

Student Course Teacher Teacher Title class time

Xiao Ming's first year Chinese (on) Da Bao Associate Professor 101 14:30

Student Class Table

Course Materials

First grade language (upper) "Primary language 1"

The third paradigm (3NF): Conforms to 2NF, and eliminates transitive dependencies.

(3NF is a constraint on field redundancy, that is, any field cannot be derived from another field, it requires no redundancy in the field)

As previously said: No data redundancy of the database is not necessarily the best database, so there is no redundant design, to be integrated to consider.

The "Student class table" above is 2NF compliant and can be verified as follows: Two primary properties are used alone, without determining any of the other four non-primary attributes. But it has a transitive dependency! In "Teacher" and "teacher title" here, a teacher must be able to determine a teacher title.

If this transitive dependency is not eliminated, it is possible that:

1, the teacher upgraded, changed the professor, to change the database, the table has n, changed n times ... (Modify exception)

2, no candidate this teacher's class, the teacher's title has not been recorded ... (Delete exception)

3, a new teacher, not assigned to teach what class, his title to remember? ...... (Insert exception)

How should that be solved? As with the above, the projection decomposition:

Student Course Teacher Class time

Xiao Ming first year Chinese (on) Big Treasure 101 14:30

Teacher Teacher Title

Da Bao Associate professor

BCNF: 3NF compliant, and the primary attribute is not dependent on the primary attribute.

If the relationship pattern is in the second paradigm, and each attribute is not passed with a key code, then r belongs to the BC paradigm.

Usually

The conditions of the BC paradigm have multiple equivalents: each non-trivial dependency must have a key code on the left side, and each determinant must contain a key code.

The BC paradigm checks both non-primary and primary properties. When only the non-primary attribute is checked, it becomes the third paradigm. The relationship of the BC paradigm must satisfy the third paradigm.

It is also possible to say that if a relationship reaches the third paradigm, and it has only one candidate code, or if each of its candidate codes is a single attribute, the relationship naturally reaches the BC paradigm.

Generally, a database design conforms to 3NF or BCNF.

The Forth paradigm: requires the deletion of many-to-many relationships within the same table.

Model Five: Re-establishing the original structure from the final structure.

A look at the database paradigm Introduction (1NF,2NF,3NF,BC nf,4nf,5nf)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.