Transferred from: http://blog.chinaunix.net/uid-10073362-id-225057.html
Database paradigm is an essential knowledge in database design, and without the understanding of paradigm, it is impossible to design a database with high efficiency and elegance. Even design the wrong database. And it's not that easy to understand and master the paradigm. In textbooks, relational algebra is generally used to interpret database paradigms. Although this can be very accurate expression of the database paradigm, but more abstract, not too intuitive, not easy to understand, more difficult to remember.
This paper introduces the paradigm in a more straightforward language, which is intended to facilitate understanding and memory, and may present some imprecise representations. But for beginners should be a good introduction. I write these purposes mainly to strengthen the memory, in fact, I also compare dishes, I hope when I am unfamiliar with some concepts, back to look at their notes, can quickly enter the state. If you find a mistake in it, please correct it.
Here's how to get started:
First, the basic concept
To understand the paradigm, you first have to know what a relational database is, and if you don't, I can simply not simply say that a relational database uses a two-dimensional table to hold data. Between table and table can ... (Omit 10W words).
Then you should understand the following concepts:
entity: something that exists in the real world and can be distinguished. such as "A student", "a book", "a course" and so on. It is worth emphasizing that the "things" here are not only the "things" that can be seen, but also virtual, such as "the relationship between teachers and schools".
attribute: The textbook is interpreted as:"an entity has a certain characteristic", thus, the attribute is initially a logical concept, for example, "sex" is a property of "human". In a relational database, a property is a physical concept, and attributes can be thought of as "a column in a table."
tuples: A row in a table is a tuple .
Component: A property value of a tuple. In a relational database, it is an atomic operation , that is, when the relational database is doing anything, the attribute is "non-divided." Otherwise it's not a relational database.
code: The table can uniquely determine a tuple of a property (or attribute group), if such a code has more than one, then everyone is called candidate Code , we choose a candidate code out to do the boss, it is called the main code .
Full code: If a code contains all the attributes, the code is full code.
main attribute: an attribute is the primary attribute as long as it appears in any candidate code.
Non-primary attribute: In contrast to the above, there is no candidate code, this property is a non-primary attribute.
External code: a property (or attribute group), it is not a code, but its other table code, it is outside the code.
Two, 6 paradigms
Well, the above has introduced all the basic concepts we need to master the paradigm, so let's talk about the paradigm. The first thing to understand is the inclusive relationship of the paradigm. A database design if it conforms to the second paradigm, it must also conform to the first paradigm. If it conforms to the third paradigm, it must also conform to the second paradigm ...
• First normal form (1NF): attribute is not divided.
The concept of attribute values has been described earlier, and we say that it is "non-divided". And the first paradigm requires that attributes are also not divided. So what is the difference between it and the attribute value? Give an example:
In this table, the attribute value is "divided". In the "Phone" attribute, the value of "xiaoming" is divided into two.
Neither of these cases satisfies the first paradigm. A database that does not meet the first paradigm, not a relational database! Therefore, we do not have such a "table" in any relational database management system. This can be done for the above situation: in this table, the attribute is "divided". That is, "phone" is divided into "mobile phone" and "landline" two properties.
• Second normal form (2NF): Conforms to 1NF, and the non-primary attribute is completely dependent on the code. (note is that the full dependency can not be part-dependent, with a function dependent w→a, if there is XW, there is x→a established, then called W→a is a local dependency, otherwise it is called w→a is fully functional dependency)
A student has a class that must be taught by a particular teacher. So there are (students, courses) and teachers;
A student in a class, must be in a particular classroom. So there are classrooms (students, courses).
A student in a class, his teacher's job title can be determined. So there are (students, courses) and teacher title;
A student in a class must be a specific textbook. So there are textbooks (students, courses)
A student in a class, must be at a certain time. So there are (students, courses)--Class time
So (student, course) is a code.
However, a course, must designate a textbook, first-grade language is certainly used in the "primary language 1", then there is the curriculum----teaching materials. (student, course) is a code, but the course decides the textbook, which is called incomplete dependence, or partial dependence. In this case, the second paradigm is not satisfied!
What's wrong with it? You can think about:
1, the headmaster to add a new course called "Calculus", the textbook is "College Mathematics", How to do? Students have not chosen a class, and the student is the main attribute, the main attribute can not be empty, how to record the course, textbooks to remember? ...... Are you depressed? (Insert exception)
2, the next semester no students to learn first-year language (on), to learn a first-year language (the next) went, then the table will not exist in the first grade language (on), there is no "primary language 1". At this time, the headmaster asked: first grade Chinese (on) What teaching materials AH? ...... Are you depressed? (Delete exception)
3, the headmaster said: First grade language (on) for the teaching materials, replaced by the "University of Chinese." There are 10,000 students selected this course, the change is great! It's exhausting. Are you depressed? (Modify/Update exception , here you may feel that the textbook "Primary language 1" to replace the "college language" can not be, but the replacement operation, although the computer is very fast, but after all, it will be replaced 10,000 times, resulting in a lot of time overhead)
How should that be solved? Projection decomposition, which decomposes a table into two or more tables
• The third paradigm (3NF): Conforms to 2NF, and eliminates transitive dependencies (that is, each non-primary attribute is not passed dependent on the candidate key, determining the transitive function dependency, which refers to the C pass-through function depends on A if there is a decision relationship of "a→b→c"). )
The "Student Class new Table" above is 2NF compliant, but it has a delivery dependency! Where is it? The problem is in the "teacher" and "teacher title" here. A teacher must be able to determine a teacher's title. (student, course)---teacher---title.
What's the problem? Think about:
1, the teacher upgraded, changed the professor, to change the database, the table has n, changed n times ... (Modify exception)
2, no candidate this teacher's class, the teacher's title has not been recorded ... (Delete exception)
3, a new teacher, not assigned to teach what class, his title to remember? ...... (Insert exception)
How should that be solved? As with the above, the projection decomposition:
· BC Paradigm (BCNF): 3NF compliant, and the primary attribute is not dependent on the primary attribute (that is, no field exists for the transfer function dependency of any of the candidate key fields)
The BC paradigm checks both non-primary and primary properties. When only the non-primary attribute is checked, it becomes the third paradigm. The relationship of the BC paradigm must satisfy the third paradigm.
It is also possible to say that if a relationship reaches the third paradigm, and it has only one candidate code, or if each of its candidate codes is a single attribute, the relationship naturally reaches the BC paradigm.
Here's an example: Suppose a warehouse management Relationship table (warehouse ID, store item ID, Administrator ID, number), and an administrator who works only in one warehouse; a warehouse can store multiple items.
The following decision relationships exist in this database table:
(Warehouse ID, store item id) → (Administrator id, quantity)
(Admin ID, store item id) → (warehouse ID, quantity)
So, (warehouse ID, store item ID) and (Administrator ID, store item ID) are the candidate keywords for storehousemanage, the only non-critical field in the table is the number, which is in accordance with the third paradigm. However, the following decision relationship exists:
(warehouse id) → (Administrator id)
(Administrator id) → (warehouse id)
That is, the key field determines the critical field, so it does not conform to the BCNF paradigm. It will have the following exception:
(1) Delete exception:
When the repository is emptied, all "store item ID" and "quantity" information is deleted, and the "Warehouse ID" and "Administrator ID" information are also deleted.
(2) Insert exception:
You cannot assign an administrator to a warehouse when no items are stored in the warehouse.
(3) Update exception:
If the warehouse has been replaced by an administrator, the administrator ID for all rows in the table is modified.
Break down the Warehouse management relationship table into two relational tables:
Warehouse Management: Storehousemanage (warehouse ID, administrator ID);
Warehouse: Storehouse (warehouse ID, store item ID, quantity).
Such database tables conform to the BCNF paradigm, eliminating deletion exceptions, insert exceptions, and update exceptions.
Generally, a database design conforms to 3NF or BCNF. In the BC paradigm, there are the fourth paradigm and the five paradigm.
• The IV paradigm: requires the deletion of many-to-many relationships within the same table.
• Model five: re-establish the original structure from the final structure.
In fact, the database design paradigm is the key to grasp the 1NF, 2NF, 3NF, BCNF
The following relationships exist between the four paradigms:
Here the main difference between 3NF and bcnf, a word is 3NF is to meet the non-primary attribute to the candidate code transfer function dependency, BCNF is to meet the absence of any one of the attributes (including non-primary and primary properties) on the candidate Code transfer function dependency .
[Database] Understanding database paradigm-Easy to understand