Relational Database Standardization Theory

Source: Internet
Author: User

 

Relational Database Standardization Theory
A relational database is composed of a group of relational modes. A relational database consists of a group of attribute names. A relational database is designed to group A group of attribute names that are already associated with each other, and organize each set of attribute names into links.
1. Functions of relationship Standardization
The so-called standardization is the process of replacing the original relationship with a more concise and standardized structure.
2. function dependency
2.1 relationship between attributes
There are two types of relationships between objects: one is the relationship between objects and the other is the relationship between attributes within objects.
Links between attributes can be divided into the following three types:
(1) One-to-one relationship)
Take the employee mode as an example: employee (employee ID, name, title, Department ). If there are no duplicate employees in the enterprise (or organization), the link between the employee ID and the name is. A employee ID uniquely identifies a name. A name can also determine a unique employee ID.
Set X and Y to the two properties (sets) of relational R ). If, for any specific value in X, Y corresponds to at most one value, and vice versa, X and Y are one-to-one links.
(2) One-to-multiple connections (m)
In employee mode, employee IDs and titles are one-to-multiple connections. One employee ID corresponds to only one title (for example, Hu Yimin can only correspond to engineers), but one title can correspond to multiple employee numbers (for example, an engineer can correspond to multiple employees ).
Set X and Y to the two properties (sets) of relational R ). If any specific value in X corresponds to at most one value in Y, but one value in Y can correspond to N values in X, Y indicates that X is a one-to-multiple link.
(3) Multi-to-Multi-contact (M: m)
In employee mode, there are many-to-many connections between titles and departments. One type of titles can be distributed across multiple departments (for example, each department can have engineers), and one department can also have multiple titles.
Set X and Y to the two properties (sets) of relational R ). If M values correspond to any specific value in X, and a value in Y can correspond to N values in X, that is, Y-to-X is multi-to-Multi-contact.
The three relationships between these attributes are actually the reflection of mutual dependency and mutual restriction between attribute values, which is called data dependency between attributes.
There are three types of Data Dependencies: function dependency (FD), multiva-lueddependency (SCC), and joindependency (JD ), the most important among them are function dependencies and multi-value dependencies.
2.2 function dependency
Function dependency is a link between attributes. If the value of an attribute is given, the value of another attribute can be uniquely identified (found.
Definition: function dependency refers to two properties or attribute groups in relation R where X and Y are R. If any of the r relations exist: for each specific value of X, Y has only one specific value corresponding to it, which means that the property y function depends on the property X. In other words, the property x function determines the property y, which is recorded as xà y. X is the deciding factor, and Y is the deciding factor. When Y is a subset of X, it is called an ordinary function dependency. Since the dependencies of ordinary functions are always true, if no special declaration is made, the function dependencies mentioned later in this book do not contain the dependencies of ordinary functions.
This definition can be simply expressed as: if the value of attribute x determines the value of attribute y, the property y function depends on Attribute X.
The three relationships between attributes discussed earlier do not have function dependencies in each of them.
(1) If the relationship between x and y is, there is a function dependency: X dependencies> Y.
(2) If the two Property sets X and Y are M: 1, there is a function dependency: xà y.
(3) If the relationship between x and y is M: N, function dependency does not exist.
2.3. Code definition
Define K as an attribute or attribute group in the relational mode R (u, F), and K 'as any real subset of K. If kà U does not exist, k is the R candidate code (candidatekey.
· If there are more than one candidate code, select one of them as the primary code (primarykey );
· The attribute contained in any candidate code is called the primary attribute );
· Attributes not contained in any candidate code are called non-primary attributes or non-key attributes );
· In the relational mode, the simplest case is that a single attribute is a code, called a singlekey. In the most extreme case, the entire Attribute Group is a code, called a full key ).
Defines two relational modes, R and S. X is the property or attribute group of R, and X is not the code of R, however, if X is the code of S (or has the same meaning as the code of S), X is the external code of R (foreignkey.
2.4 function dependency and code uniqueness
A code is a minimum attribute group consisting of one or more attributes that uniquely identify tuples. The code is always unique in the link, that is, the code function determines other attributes of the link. Therefore, in a link, the code value is always unique (if the code value is repeated, the entire tuples will be repeated ). Otherwise, the entity integrity rules are violated.
Different from the uniqueness of codes, in a link, the deciding factor of a function dependency may be unique or not unique. If we know that a decides B, and A and B are in the same relationship, but we still cannot know whether a can determine all attributes except B, therefore, we cannot know whether a is unique in the link.
3. Normalization of the Relationship Mode
3.1 normalization of Relationship Models
When all components in a link are non-sharable data items, the relationship is normalized.
The relationship can be divided into 5 levels based on the degree of normalization, namely 1nf, 2nf, 3nf (bcnf), 4nf, and 5nf. Those with a higher degree of normalization must be a subset of those with a lower degree of normalization
3.2. 1nf)
Define if the relational mode R does not contain multi-value attributes, r satisfies the first paradigm, or 1nf (firstnor-malform), and is recorded as R belongs to 1nf.
1nf is the minimum requirement for Standardization. The relationship that does not meet 1nf is non-normalized.
3.3 Second Paradigm (2nf)
Defines that X and Y are two different properties or attribute groups of relational R, and X belongs to y. If there is a real subset x'' of X, so that x' is true, it is said that some functions of Y depend on X, and vice versa, it is said that all functions of Y depend on X,
Defines that if a relational R belongs to 1nf, and all its non-primary attributes depend entirely on any candidate code of R, R belongs to the second paradigm, and r belongs to 2nf.
Inference: If the relational mode R contains 1nf and each of its candidate codes is a single code, r belongs to 2nf.
3.4. Third Paradigm (3nf)
Defined in relational R, x, y, and z are three different properties or attribute groups of R. If x, y, however, if y \ --> X and Y is not a subset of X, the transfer of Z depends on X.
Define if the relational mode R belongs to 2nf, and each of its non-primary attributes does not depend on any candidate code, R is called the third paradigm, and r belongs to 3nf.
Inference 1 if the relational mode R belongs to 1nf and each of its non-primary attributes is neither partially dependent nor dependent on any candidate code, then R belongs to 3nf.
Inference 2: The relational mode without non-primary attributes must be 3nf.
3.5. Improved 3nf--bcnf
Defining the relational mode R (u, f) belongs to NF. If any function of F depends on xà y (Y is not a subset of X), X contains a code of R, r belongs to bcnf.
In other words, in relational mode R, if every decision factor contains a code, r belongs to bcnf.
According to the definition of bcnf, the following inference can be obtained: If R belongs to bcnf
· All non-primary attributes in R are fully functional dependencies on each code;
· All primary attributes in R are fully functional dependencies for every code that does not contain it;
· R does not have any attribute that fully functions depend on any set of non-Code attributes.
Theorem: If R belongs to bcnf, r belongs to 3nf.
If a relational mode reaches bcnf, it has been completely isolated within the scope of function dependencies, eliminating data redundancy, insertion, and deletion exceptions.
4. multi-value dependency and the fourth paradigm
The R (u) is a relational mode on the property set U. x, y, and z are the subsets of U, and z = u-x-y. If a pair of (x, z) values are given for any relation R (u), there is a group of y values corresponding to them, this set of y values is only determined by the X value and is not related to the Z value. The multiple values of Y depend on X, or the multiple values of X determine y.
If Z is an empty set in the definition, it is called xà à y as an ordinary multi-value dependency. Otherwise, it is non-trivial multi-value dependency.
Defines that if the relational mode R belongs to 1nf, for each non-trivial multi-value of R, it depends on xà à y (Y is not a subset of X). If X contains code, R is called the fourth paradigm, that is, r belongs to 4nf.
If a relational model belongs to 4nf, it must belong to bcnf, but the relational model of bcnf is not necessarily 4nf. All non-trivial multi-value dependencies in R are actually function dependencies.
5. Relationship Normalization
The purpose of relationship normalization is to solve problems such as data redundancy, insertion and deletion exceptions, and cumbersome updates in the Relationship Mode. The basic idea is to eliminate the inappropriate part of data dependency, so that the relationship modes can be separated to some extent, so that a link can describe a concept, a link between entities or entities. Therefore, the essence of standardization is the simplification of concepts.
The basic principle of standardization is: gradually standardize from low to high, weigh the advantages and disadvantages, and make it happen. Generally, the basic requirement is to satisfy the third paradigm.
It generally takes the following steps to convert an nonstandard data structure into a third paradigm:
(1) break down the structure into several relationships that belong to the first paradigm.
(2) For those relationships that have composite codes and have function dependencies with some non-primary attributes, we must continue to break them down so that the obtained relationships all belong to the second paradigm.
(3) If there is a non-master attribute transfer dependent on code in the Link, continue to break down the link so that the link belongs to the third paradigm.
The normalization process of the Relationship Mode is implemented through projection decomposition, that is, a mode is decomposed into several high-level relationship modes by projection operations. This projection decomposition is not unique.

Reprinted: http://blog.csdn.net/hbrqlpf/archive/2007/11/15/1887204.aspx

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.