Relational databases are now a widely used type of database, and relational database design is the process of organizing and structuring data. A small database is easy to work with, but as the database grows, the SQL statements used to manipulate it become unwieldy and complex. Worse, there is a good chance that the data will become incomplete and inaccurate. The data design therefore needs to follow a set of rules, known as normal forms.
Three normal forms are most commonly used in real-world development:
1. First normal form (1NF)
Every column of a table is atomic: the information in a column cannot be decomposed any further. As long as the database is a relational database (MySQL, Oracle, DB2, Informix, Sybase, SQL Server), its tables automatically satisfy 1NF.
In other words: split whatever can be split, until it cannot be split any further!
How far to take the first normal form depends on the actual needs of the system. For example, if a system only uses the "address" attribute as a whole, "address" can be stored directly as a single table field. However, if the system often accesses the "city" part of the address, the "address" attribute should be split into province, city, detailed address and so on, so that operating on one part of the address is convenient. This design satisfies the first normal form, as shown in the following table.
The user information shown in the table above follows the first normal form, which makes it convenient to classify users by city and can also improve database performance.
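The address split described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the original table: the table name `user_info`, its columns and the sample rows are all assumptions made for the example.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical user table: the address is stored as three atomic columns
# instead of one free-text "address" field.
conn.execute(
    """
    CREATE TABLE user_info (
        user_id        INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        province       TEXT NOT NULL,
        city           TEXT NOT NULL,
        detail_address TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO user_info VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alice", "Guangdong", "Shenzhen", "1 Example Road"),
        (2, "Bob", "Guangdong", "Guangzhou", "2 Sample Street"),
        (3, "Carol", "Guangdong", "Shenzhen", "3 Demo Avenue"),
    ],
)


def users_per_city(connection):
    """Count users per city; no string parsing of an address is needed."""
    rows = connection.execute(
        "SELECT city, COUNT(*) FROM user_info GROUP BY city ORDER BY city"
    )
    return dict(rows.fetchall())


city_counts = users_per_city(conn)
print(city_counts)  # {'Guangzhou': 1, 'Shenzhen': 2}
```

Because `city` is its own column, grouping or filtering by city is a plain comparison rather than a substring search inside a longer address.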
2. Second normal form (ensure that every column in the table depends on the primary key)
On top of satisfying the first normal form, a table is in second normal form if no non-key field is partially functionally dependent on any candidate key (a partial functional dependency is one where only some of the fields of a composite key determine a non-key field). Equivalently, every non-key field is fully dependent on every candidate key. (In addition, any table whose keys are all single fields conforms to the second normal form, because there is no composite key to depend on in part.)
That is: the second normal form requires that every column in a table depends on the whole primary key, not on just one part of it (this matters mainly for composite primary keys).
1. Use a single-field key whenever you can!
2. Each table should describe only one kind of information; do not put everything into one table!
Records in the table must be unique to satisfy 2NF, and this is usually enforced by designing a primary key.
For example, consider an order information table. Because an order may contain more than one item, the order number and the product number are used together as the composite primary key of the table, as shown in the following table.
This creates a problem: the table uses the order number and the product number as its composite primary key, but the product name, unit, price and similar fields do not depend on the whole primary key. They depend only on the product number. This violates the design principle of the second normal form.
If the order information table is split so that product information goes into one table and order items go into another, the problem disappears, as shown below.
This design greatly reduces redundancy in the database. To get the product information for an order, look up the product number in the product information table.
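A minimal sketch of that split, again using Python's sqlite3 module. The table names (`orders`, `products`, `order_items`), the columns and the sample data are assumptions chosen for illustration; prices are stored as integer cents to keep the arithmetic exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One row per order.
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL
    );
    -- Product attributes depend only on the product number.
    CREATE TABLE products (
        product_id  INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        unit        TEXT NOT NULL,
        price_cents INTEGER NOT NULL
    );
    -- Quantity depends on the whole composite key (order, product).
    CREATE TABLE order_items (
        order_id   INTEGER NOT NULL REFERENCES orders(order_id),
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO orders VALUES (1001, 'Alice'), (1002, 'Bob');
    INSERT INTO products VALUES (1, 'Pen', 'box', 500), (2, 'Notebook', 'piece', 300);
    INSERT INTO order_items VALUES (1001, 1, 2), (1001, 2, 5), (1002, 1, 1);
    """
)


def order_lines(connection, order_id):
    """Return (name, unit, price_cents, quantity) for each item of an order."""
    return connection.execute(
        """
        SELECT p.name, p.unit, p.price_cents, oi.quantity
        FROM order_items AS oi
        JOIN products AS p ON p.product_id = oi.product_id
        WHERE oi.order_id = ?
        ORDER BY p.product_id
        """,
        (order_id,),
    ).fetchall()


lines_before = order_lines(conn, 1001)

# A price change touches exactly one row, however many orders contain the product.
rows_updated = conn.execute(
    "UPDATE products SET price_cents = 550 WHERE product_id = 1"
).rowcount

lines_after = order_lines(conn, 1001)
```

Here `quantity` is the only non-key column in `order_items`, and it depends on both parts of the key, so no partial dependency remains.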
3. Third normal form (ensure that every column depends on the primary key directly, not indirectly)
On top of the second normal form, a table is in third normal form if no non-key field is transitively functionally dependent on any candidate key. A transitive functional dependency exists when there is a chain of determination "A → B → C"; C is then transitively dependent on A. In other words, each field in the table corresponds directly to the primary key, with no intermediate field in between.
That is: the third normal form requires that every column in a table depends on the primary key directly, not indirectly.
Plainly: the value of every field must be determined by the primary key itself!
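The "A → B → C" chain can be checked mechanically on sample data. The helper below is a sketch: the function name `determines` and the student rows are hypothetical, and a check on sample rows can only refute a dependency, since the real rule comes from the meaning of the data.

```python
def determines(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of column names.
    The dependency holds when equal lhs values always come with equal rhs values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for a single wide student table.
students = [
    {"student_id": 1, "college": "Computer Science", "college_location": "Building A"},
    {"student_id": 2, "college": "Computer Science", "college_location": "Building A"},
    {"student_id": 3, "college": "Mathematics", "college_location": "Building B"},
]

a_to_b = determines(students, ["student_id"], ["college"])        # A -> B
b_to_c = determines(students, ["college"], ["college_location"])  # B -> C
b_to_a = determines(students, ["college"], ["student_id"])        # B -> A

# C depends on A only through B, and B is not itself a key: a transitive dependency.
is_transitive = a_to_b and b_to_c and not b_to_a
print(is_transitive)  # True
```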
For example, consider a student table keyed by student number that also stores the college, the college location and the college phone. The college location depends on the college, and the college depends on the student number; the same goes for the college phone. The table therefore does not conform to the third normal form, which can have the following undesirable consequences:
(1) Data redundancy: if N students belong to the same college, the "college", "college location" and "college phone" values are repeated N-1 times.
(2) Update anomaly: if a college changes location, the "college location" value must be updated in every related row of the table; otherwise the same college ends up with different locations.
(3) Insertion anomaly: a newly added college that has no students yet cannot be recorded in the database, because there is no "student number" key value for it.
(4) Deletion anomaly: when a group of students graduates, their records should be deleted from the table. But the college, college location and college phone information is deleted along with them. Obviously, this can in turn lead to the insertion anomaly.
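One way to avoid these anomalies is to move the college attributes into their own table. The sketch below uses Python's sqlite3 module; the table names, columns, phone numbers and rows are illustrative assumptions, not data from the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- College attributes are stored once, keyed by the college itself.
    CREATE TABLE colleges (
        college  TEXT PRIMARY KEY,
        location TEXT NOT NULL,
        phone    TEXT NOT NULL
    );
    -- Students keep only a reference to their college.
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        college    TEXT NOT NULL REFERENCES colleges(college)
    );
    INSERT INTO colleges VALUES
        ('Computer Science', 'Building A', '555-0100'),
        ('Mathematics', 'Building B', '555-0101');
    INSERT INTO students VALUES
        (1, 'Alice', 'Computer Science'),
        (2, 'Bob', 'Computer Science'),
        (3, 'Carol', 'Mathematics');
    """
)

# Insertion: a college with no students yet can be recorded.
conn.execute("INSERT INTO colleges VALUES ('Physics', 'Building D', '555-0102')")

# Update: moving a college changes one row, not one row per student.
updated_rows = conn.execute(
    "UPDATE colleges SET location = ? WHERE college = ?",
    ("Building C", "Computer Science"),
).rowcount

# Deletion: removing graduated students leaves the college data in place.
conn.execute("DELETE FROM students WHERE college = ?", ("Computer Science",))
remaining_colleges = [
    row[0] for row in conn.execute("SELECT college FROM colleges ORDER BY college")
]
print(updated_rows, remaining_colleges)
# 1 ['Computer Science', 'Mathematics', 'Physics']
```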
Denormalization (anti-3NF): a database with no redundancy is not necessarily the best database. Sometimes, to improve operational efficiency, it is necessary to relax the normal form and deliberately keep some redundant data. The usual practice is to follow the third normal form when designing the conceptual data model, and to consider relaxing it when designing the physical data model. Relaxing the normal form means adding fields and allowing redundancy.
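As a rough sketch of that trade-off, the example below keeps a redundant `college_location` column on a hypothetical `students` table so that a common read needs no join. The schema, the sample rows and the helper `move_college` are assumptions for illustration only; whether the extra column pays off depends on the actual workload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE colleges (
        college  TEXT PRIMARY KEY,
        location TEXT NOT NULL
    );
    -- college_location is deliberately redundant: it copies colleges.location.
    CREATE TABLE students (
        student_id       INTEGER PRIMARY KEY,
        name             TEXT NOT NULL,
        college          TEXT NOT NULL REFERENCES colleges(college),
        college_location TEXT NOT NULL
    );
    INSERT INTO colleges VALUES
        ('Computer Science', 'Building A'),
        ('Mathematics', 'Building B');
    INSERT INTO students VALUES
        (1, 'Alice', 'Computer Science', 'Building A'),
        (2, 'Bob', 'Computer Science', 'Building A'),
        (3, 'Carol', 'Mathematics', 'Building B');
    """
)


def move_college(connection, college, new_location):
    """Update the source row and every redundant copy in one transaction."""
    with connection:  # commits both updates together, or rolls both back
        connection.execute(
            "UPDATE colleges SET location = ? WHERE college = ?",
            (new_location, college),
        )
        cursor = connection.execute(
            "UPDATE students SET college_location = ? WHERE college = ?",
            (new_location, college),
        )
    return cursor.rowcount  # number of redundant copies that had to change


copies_updated = move_college(conn, "Computer Science", "Building C")

# The read path needs no join, which is the point of the extra column.
alice = conn.execute(
    "SELECT name, college_location FROM students WHERE student_id = ?", (1,)
).fetchone()
print(copies_updated, alice)  # 2 ('Alice', 'Building C')
```

The cost is visible in `move_college`: every write to the college location must also rewrite the copies, which is exactly the update anomaly that the third normal form removes.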