Three major database design paradigms: BCNF and 4NF

Source: Internet
Author: User

I. Understanding the database paradigm requires understanding several basic concepts:

Code: The table can uniquely identify an attribute (or attribute group) of a single tuples. If there is more than one such code, it is called a candidate code, we pick one from the candidate code to become the boss. It is called the master code. It is equivalent to the key value. Primary attribute: a property that appears in any candidate code is the primary attribute. Non-Primary attribute: In contrast to the above, it has not appeared in any candidate code. This attribute is a non-primary attribute. External code: an attribute (or attribute group). It is not a code, but the code of its other tables. It is an external code.

Ii. paradigm explanation

In order to establish a database with low redundancy and reasonable structure, certain rules must be followed when designing the database. In relational databases, such a rule is called a paradigm. The paradigm is a summary that meets a certain design requirement. To design a rational relational database, a certain paradigm must be met.

There are three most common design paradigms in actual development:

1. The first paradigm (ensuring that each column remains atomic)

The first paradigm is the most basic paradigm. If all the field values in the database table are unrecoverable Atomic values, it indicates that the database table meets the first paradigm.

The rational compliance of the first paradigm needs to be determined based on the actual needs of the system. For example, the "Address" attribute must be used in some database systems. You can directly design the "Address" attribute as a field in a database table. However, if the system frequently accesses the "city" section in the "Address" attribute, therefore, you have to re-split the "Address" attribute into provinces, cities, detailed addresses, and other parts for storage. This makes it very convenient to operate a part of the address. This design satisfies the first paradigm of the database, as shown in the following table.

The user information shown in the table follows the requirements of the first paradigm, which makes it very convenient to classify users using cities and improves the database performance.

2. Second Paradigm (ensure that each column in the table is related to the primary key)

The second paradigm goes further on the basis of the first paradigm. The second paradigm needs to ensure that each column in the database table is related to the primary key, rather than only a part of the primary key (mainly for the joint primary key ). That is to say, in a database table, only one type of data can be saved in one table, and multiple types of data cannot be saved in the same database table.

For example, to design an order information table, because there may be multiple commodities in the order, order numbers and commodity numbers should be used as the joint primary keys of the database table, as shown in the following table.

Order info table

In this case, a problem occurs: This table uses the order number and product number as the joint primary key. In this way, the product name, unit, and price information in the table are not related to the primary key of the table, but only to the product number. This violates the design principles of the second paradigm.

However, if we split the order information table, split the product information into another table, and split the order project table into another table, it would be perfect. As shown below.

This design greatly reduces the database redundancy. If you want to obtain the product information of an order, use the product number to query the product information table.

3. Third Paradigm (ensure that each column is directly related to the primary key column, rather than indirectly related)

The third paradigm needs to ensure that each column of data in the data table is directly related to the primary key, but not indirectly related.

For example, when designing an order data table, you can use the customer number as a foreign key and order table to establish a corresponding relationship. You cannot add other customer information (such as name and company) fields in the order table. As shown in the following two tables, the design is a database table that meets the third paradigm.

In this way, when querying order information, you can use the customer number to reference records in the customer information table. You do not have to enter the customer information multiple times in the order information table, which reduces data redundancy.

4. BCNF paradigm

Bois-ked Paradigm (BCNF is an improved form of 3NF)

A condition that satisfies the relationship pattern of BCNF:

1. All non-primary attributes are fully functional dependencies on each code.

2. All primary attributes are completely functional dependent on each code that does not contain it.

3. No full function of any attribute depends on any set of non-Code attributes.

Assume that the warehouse management relation table is StorehouseManage (warehouse ID, storage item ID, administrator ID, quantity), and one administrator works only in one warehouse. One warehouse can store multiple items. This database table has the following decision relationships:

(Repository ID, storage item ID) → (administrator ID, quantity)

(Administrator ID, storage item ID) → (warehouse ID, quantity)

Therefore, both (repository ID, storage item ID) and (administrator ID, storage item ID) are candidate Keywords of StorehouseManage, and the unique non-Keyword segments in the table are quantity, it conforms to the third paradigm. However, the following decision relationships exist:

(Repository ID) → (administrator ID)

(Administrator ID) → (repository ID)

That is, the keyword segment determines the keyword segment, so it does not conform to the BCNF paradigm.

Break down the warehouse management relationship table into two Relational Tables:

Warehouse Management: StorehouseManage (warehouse ID, administrator ID );

Repository: Storehouse (warehouse ID, storage item ID, quantity ).

Such database tables conform to the BCNF paradigm.

5. 4NF

4NF is the restriction that multi-value dependencies between attributes in the relational mode cannot be non-trivial and non-functional. According to the definition, for each non-trivial multi-value dependency X → Y, X contains the candidate code, so there is X → Y, therefore, the non-trivial multi-value dependencies allowed by 4NF are actually functional dependencies.

Here we will solve several concepts:

Function dependencies are simply described as follows: an attribute set determines another attribute set, indicating that another attribute set depends on this attribute set.

(1) data dependency in computer science, data dependency refers to a State in which data is referenced by a program structure. The most important among them are function dependencies and multi-value dependencies. (2) function dependency: set X. Y is the set of two properties of relational R. When the X attribute values of any two tuples in R are the same at any time, their Y attribute values are also the same, the X function determines that Y or Y function depends on X. (3) trivial function dependency when property set Y in a link is a subset of property set X (Y? X), there is a function dependent on X → Y, that is, a set of attribute functions determine all its subsets. This function dependency is called a trivial function dependency. (4) non-trivial function dependency when the property set Y in the link is not a subset of the property set X and the function dependency X → Y exists, this function dependency is called non-trivial function dependency. (5) completely functional dependency: Set X. Y is the set of two properties of relational R. x' is the true subset of X, and there is X → Y, but there is X for every x '! → Y, which means that the complete function of Y depends on X. (6) Some functions depend on X. Y is the set of two properties of the relational R. X → Y exists. if x' is the true subset of X, there is X' → Y, in this case, some functions of Y depend on X. (7) transmit function dependencies. set X, Y, and Z to a set of different properties in the relational R. X → Y (Y!) exists! → X), Y → Z, that is, the Z-passing function depends on X. (8) multi-value dependency R (U) is a relational mode on the property set U. X, Y, and Z are the subsets of U, and Z = U-X-Y. In the relational mode R (U), multiple values depend on X → Y. if and only if any of the relational R (U), the given pair (x, z) values have a set of values of Y. These values are only determined by the value of x and are not related to the value of z. Ordinary multi-value dependency and non-ordinary multi-value dependency: If X → Y, and Z is an empty set, X → Y is an ordinary multi-value dependency. If Z is not empty, it is called a non-trivial multi-value dependency.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.