Relational Database Design Basics (Functional Dependencies, Lossless Join, Dependency Preservation, Normal Forms)

Last Update:2016-06-10 Source: Internet

Author: User

Relationships between entity sets
1:1 relationship: if each entity in entity set E1 is associated with at most one entity in entity set E2, and vice versa, then E1 and E2 are in a one-to-one relationship, written 1:1.
1:n relationship: one-to-many, written 1:n.
m:n relationship: many-to-many, written m:n.

http://zh.wikipedia.org/wiki/%E5%85%B3%E7%B3%BB%E4%BB%A3%E6%95%B0_(%E6%95%B0%E6%8D%AE%E5%BA%93)

Functional dependency

Definition
Let R(U) be a relation schema with attribute set U={A1,A2,...,An}, and let X and Y be subsets of U. If, for every possible relation r of R(U) and any two tuples u and v in r, u[X]=v[X] implies u[Y]=v[Y], then X functionally determines Y, or Y is functionally dependent on X, written X→Y. Here X is called the determinant and Y the dependent. Equivalently, no possible relation r of R(U) contains two tuples that agree on X but differ on Y.
(1) Functional dependency is a semantic concept: whether a dependency holds can only be decided from the meaning of the attributes.
(2) The definition of X→Y requires the tuples of every possible relation r of R to satisfy the dependency condition, not just the tuples of one particular relation.
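The definition can be turned into a simple check on one relation instance. The sketch below is only an illustration: the function name `fd_holds` and the sample rows are invented here, and, as point (1) says, passing the check on one instance never proves that a dependency holds semantically. It can only refute one.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by this particular instance.

    rows: list of dicts, one per tuple; lhs, rhs: lists of attribute names.
    """
    seen = {}  # lhs values -> rhs values first seen together with them
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on lhs but differ on rhs
    return True


# Hypothetical sample relation, invented for this illustration.
rows = [
    {"student": "s1", "course": "db", "teacher": "Li", "grade": 90},
    {"student": "s2", "course": "db", "teacher": "Li", "grade": 85},
    {"student": "s1", "course": "os", "teacher": "Wang", "grade": 70},
]

print(fd_holds(rows, ["course"], ["teacher"]))  # True
print(fd_holds(rows, ["student"], ["grade"]))   # False
```

Here student s1 has two different grades, so student→grade is refuted, while (student, course)→grade is still satisfied by these rows.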
Terminology
(1) If X→Y, then X is called the determinant.
(2) If X→Y and Y→X, we write X<->Y.
(3) If Y is not functionally dependent on X, we write X-/->Y.
(4) If X→Y and Y is not a subset of X, then X→Y is a non-trivial functional dependency.
(5) If X→Y and Y is a subset of X, then X→Y is a trivial functional dependency.
(6) Full functional dependency: let X and Y be different attribute subsets of the relation schema R(U). If X→Y holds and there is no proper subset X' of X such that X'→Y, then Y is fully functionally dependent on X, written X-f->Y.

(7) Partial functional dependency: let X and Y be different attribute subsets of the relation schema R(U). If X→Y holds and there is a proper subset X' of X such that X'→Y also holds, then Y is partially functionally dependent on X, written X-p->Y.

(8) Let R be a relation schema with attribute set U, and let K be a subset of U. If U is fully functionally dependent on K, then K is a candidate key of R (also called a candidate code). Attributes contained in some candidate key are called prime attributes (also called key attributes); all other attributes are called non-prime attributes (also called non-key attributes). A candidate key uniquely identifies each tuple of the relation. A relation may have more than one candidate key; usually one of them is designated as the primary key used to identify tuples.

A key K therefore satisfies two conditions: (1) uniqueness: K functionally determines U; (2) minimality: no proper subset of K satisfies condition (1). Being a key is a property of the entire entity set, not of a single entity.
Keys come in three kinds: superkey, candidate key and primary key.

An exercise for understanding candidate keys

Student ID	Name	Gender	Age	Department	Major
20020612	Li Hui	Male	20	Computer	Software development
20020613	Zhang Ming	Male	19	Computer	Software development
20020614	Wang Xiaoyu	Female	18	Physical	Mechanical
20020615	Li Shuhua	Female	17	Biological	Zoology
20020616	Zhao Jing	Male	21	Chemical	Food Chemistry
20020617	Zhao Jing	Female	20	Biological	Botany

A database contains the student information table shown above. Which of the following attribute sets cannot be a candidate key? (Choose one.)

A) {student ID} B) {student ID, name} C) {age, department} D) {name, gender} E) {name, major}

"Analysis" By definition, the superkeys include the candidate keys, and the candidate keys include the primary key. The primary key must be unique because it is a special case of a superkey, and every superkey uniquely identifies tuples.
In this exercise, all five options A to E are superkeys: each combination of attributes uniquely identifies a data record (entity) in the table.
The question, however, asks about candidate keys. A candidate key is a superkey that contains no redundant attributes. Look at option B: the student ID alone
already identifies each record uniquely, so the name attribute is redundant. Option B contains a redundant attribute, so it cannot be a candidate key.

The other four options can all serve as candidate keys. If, say, A) {student ID} is the one chosen to identify tuples, that candidate key becomes the primary key.
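The analysis can be checked mechanically against the six rows of the table. The following is a minimal sketch under two assumptions: the column names (`id`, `major`, and so on) are our own renderings of the table headers, and uniqueness is tested only on this one instance, whereas a real key is a semantic constraint on every possible instance.

```python
from itertools import combinations

COLUMNS = ("id", "name", "gender", "age", "department", "major")
STUDENTS = [
    ("20020612", "Li Hui", "M", 20, "Computer", "Software development"),
    ("20020613", "Zhang Ming", "M", 19, "Computer", "Software development"),
    ("20020614", "Wang Xiaoyu", "F", 18, "Physical", "Mechanical"),
    ("20020615", "Li Shuhua", "F", 17, "Biological", "Zoology"),
    ("20020616", "Zhao Jing", "M", 21, "Chemical", "Food Chemistry"),
    ("20020617", "Zhao Jing", "F", 20, "Biological", "Botany"),
]


def is_superkey(attrs):
    """True if no two rows of STUDENTS agree on all attributes in attrs."""
    idx = [COLUMNS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in STUDENTS]
    return len(set(projected)) == len(projected)


def is_candidate_key(attrs):
    """True if attrs is a superkey and no non-empty proper subset is one."""
    if not is_superkey(attrs):
        return False
    return not any(
        is_superkey(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


OPTIONS = {
    "A": ("id",),
    "B": ("id", "name"),
    "C": ("age", "department"),
    "D": ("name", "gender"),
    "E": ("name", "major"),
}
for label, attrs in OPTIONS.items():
    print(label, is_superkey(attrs), is_candidate_key(attrs))
```

All five options pass the superkey test on this data, and only option B fails the minimality test.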

"Answer" b

(9) If an attribute subset X of relation R is a candidate key of another relation S, then X is a foreign key of R referencing S. Primary keys and foreign keys describe the links between relations.
(10) Transitive functional dependency: in the relation schema R(U), if Y→X and X→A, where X does not determine Y (X-/->Y) and A does not belong to X, then A is transitively dependent on Y.

Inference rules for functional dependencies

1. Logical implication
Given a relation schema, it is not enough to consider only the functional dependencies that are stated explicitly; we also need to find the other dependencies that hold on the schema.
Logical implication: let F be the set of functional dependencies of the relation schema R(U). If other dependencies can be proved to hold starting from F, we say they are logically implied by F. "F implies X→Y" means that every relation instance satisfying F also satisfies X→Y.
For example, if F={A→B, B→C}, then the dependency A→C is logically implied by F, written F |= A→C.

3. Armstrong's axioms
Let U be the full set of attributes and F a set of functional dependencies on U. For the relation schema R(U,F) and subsets X, Y, Z of U, the following inference rules hold:
A1: Reflexivity. If Y ⊆ X ⊆ U, then X→Y is implied by F.
A2: Augmentation. If X→Y is implied by F and Z is a subset of U, then XZ→YZ is implied by F. Here XZ and YZ are shorthand for X∪Z and Y∪Z.
A3: Transitivity. If X→Y and Y→Z are implied by F, then X→Z is implied by F.
A dependency obtained by reflexivity is a trivial functional dependency; reflexivity does not depend on F, only on the attribute set U.
Armstrong's axioms are sound and complete, so they can be used to derive the closure F+ of F. Computing F+ directly from the axioms is tedious, however. Three further rules, which follow from A1, A2 and A3, simplify the work:
＊Union rule: from X→Y and X→Z, infer X→YZ.
＊Pseudo-transitivity rule: from X→Y and WY→Z, infer XW→Z.
＊Decomposition rule: from X→YZ, infer X→Y and X→Z.
Armstrong's axioms can be stated in several equivalent ways. For example, augmentation (A2) can be replaced by the union rule, because A2 can be derived from reflexivity (A1), transitivity (A3) and the union rule.
　　 Proof: XZ→X (A1: reflexivity); X→Y (given); XZ→Y (A3: transitivity); XZ→Z (A1: reflexivity); XZ→YZ (union rule).
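For completeness, here is one standard way to derive the three additional rules from A1, A2 and A3. The derivation is our own write-up, not taken from the original article, and assumes the `amsmath` package for the `align*` environment.

```latex
\begin{align*}
&\textbf{Union rule. Given } X \to Y \text{ and } X \to Z: \\
&\quad X \to XY   && \text{A2, augmenting } X \to Y \text{ with } X \\
&\quad XY \to YZ  && \text{A2, augmenting } X \to Z \text{ with } Y \\
&\quad X \to YZ   && \text{A3 on the two lines above} \\[4pt]
&\textbf{Pseudo-transitivity rule. Given } X \to Y \text{ and } WY \to Z: \\
&\quad XW \to WY  && \text{A2, augmenting } X \to Y \text{ with } W \\
&\quad XW \to Z   && \text{A3 with } WY \to Z \\[4pt]
&\textbf{Decomposition rule. Given } X \to YZ: \\
&\quad YZ \to Y,\; YZ \to Z && \text{A1} \\
&\quad X \to Y,\; X \to Z   && \text{A3}
\end{align*}
```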
4. Closure of an attribute set
In principle, for a relation schema R(U,F), the closure F+ of the dependency set F can be computed by applying the preceding rules repeatedly to the known dependencies in F. In practice, deriving all of F+ this way is very laborious and usually unnecessary. Instead, we pick an attribute subset and determine which attributes it functionally determines; this is the idea behind the closure of an attribute set.
(1) Definition of attribute set closure
Let F be a set of functional dependencies on the attribute set U, and let X ⊆ U. The set of all attributes functionally determined by X under F is the closure of X under F, written X+. That is, X+ = {A | X→A can be derived from F}.
(2) The algorithm for computing the attribute set closure X+ is as follows:
Input: X, F
Output: X+
Steps of the iterative algorithm:
① Initialize X+ to X, that is, X+ = X;
② Extend X+: X+ = X+ ∪ Z, where Z satisfies the following condition:
Y is a subset of X+, and a functional dependency Y→Z exists in F. In other words, use the attribute subsets of X+ as determinants, search F for dependencies with such a determinant, and add the attributes Z they determine to X+.
③ Test: if X+ did not change in step ②, or X+ equals U, then X+ is the required result and the algorithm terminates; otherwise go back to ②.
Because U is finite, the iteration terminates after a finite number of steps.

　Example: given the relation schema R(U,F) with U={A,B,C,D,E,G} and F={AB→C, D→EG, C→A, BE→C, BC→D, AC→B, CE→AG}, compute (BD)+.
Solution:
① (BD)+ = {B,D};
② Scan F for dependencies whose left side is B, D or BD. Only D→EG qualifies, so (BD)+ = {B,D,E,G}.
③ Scan F for dependencies whose left side is a subset of BDEG. There are two: D→EG and BE→C. So (BD)+ = {B,D,E,G} ∪ {C} = {B,C,D,E,G}.
④ Scan F for dependencies whose left side is a subset of BCDEG. Besides those already found, there are three new ones: C→A, BC→D and CE→AG. So (BD)+ = {B,C,D,E,G} ∪ {A,D,G} = {A,B,C,D,E,G}.
⑤ Now (BD)+ = U, so the algorithm ends with (BD)+ = {A,B,C,D,E,G}.
Note: this shows that BD is a superkey of the relation schema. Since B+ = {B} and D+ = {D,E,G} are both smaller than U, no proper subset of BD is a superkey, so BD is a candidate key.
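The iterative algorithm above can be sketched in a few lines of Python. The representation is an assumption made for this example: attributes are single letters, and each dependency is a pair of strings such as ("AB", "C"). The function name `closure` is ours.

```python
def closure(attrs, fds):
    """Return the closure of attrs under the dependencies in fds.

    attrs: iterable of single-letter attribute names, e.g. "BD".
    fds: list of (lhs, rhs) pairs of strings, e.g. ("AB", "C") for AB -> C.
    """
    result = set(attrs)
    changed = True
    while changed:  # repeat step 2 until nothing new is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [("AB", "C"), ("D", "EG"), ("C", "A"), ("BE", "C"),
     ("BC", "D"), ("AC", "B"), ("CE", "AG")]

print("".join(sorted(closure("BD", F))))  # ABCDEG
print("".join(sorted(closure("B", F))))   # B
print("".join(sorted(closure("D", F))))   # DEG
```

The first line reproduces the worked example; the other two confirm that neither B nor D alone determines every attribute.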

5. Soundness and completeness of Armstrong's axiom system
① Soundness of Armstrong's axiom system means that every functional dependency derived from F using the axioms is logically implied by F.
② Completeness of Armstrong's axiom system means that every functional dependency logically implied by F can be derived from F using the axioms.

6. Minimal functional dependency set (minimal cover)

Definition: a functional dependency set F is called a minimal dependency set, or minimal cover, if it satisfies the following conditions:
① The right side of every dependency in F consists of a single attribute;
② F contains no dependency X→A such that F and F-{X→A} are equivalent;
③ F contains no dependency X→A for which X has a proper subset Z such that (F-{X→A})∪{Z→A} is equivalent to F.
A minimal dependency set is computed in three steps:
1. Split the right side of every dependency in F into single attributes.
In this example, F={ABD->E, AB->G, B->F, C->J, CJ->I, G->H} already satisfies this.

2. Remove redundant attributes from the left sides.
To do so, drop one attribute from a left side and check whether the right side can still be derived.
In this example, take ABD->E: without A, (BD)+ does not contain E, so A cannot be removed; by the same check neither B nor D is redundant.
AB->G has no redundant attribute either.
For CJ->I, C+={C,J,I} contains I, so J is redundant and CJ->I becomes C->I.
F={ABD->E, AB->G, B->F, C->J, C->I, G->H};

3. Remove all redundant dependencies from F.
To do so, take a dependency X->Y out of F and compute X+ under the remaining dependencies; if Y is in X+, then X->Y is redundant and must be removed.

In this example, removing ABD->E leaves {AB->G, B->F, C->J, C->I, G->H}, under which (ABD)+={A,B,D,F,G,H} does not contain E, so ABD->E is not redundant.

Similarly, without AB->G, (AB)+={A,B,F} does not contain G, so AB->G is not redundant.
Without B->F, B+={B}, so B->F is not redundant; without C->J, C+={C,I}, so C->J is not redundant.
By the same check, C->I and G->H cannot be removed.
So the minimal dependency set is F={ABD->E, AB->G, B->F, C->J, C->I, G->H};
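The three steps can be sketched as follows. This is an illustrative implementation, not the author's: it assumes single-letter attributes and dependencies given as pairs of strings, and the names `closure` and `minimal_cover` are ours. A dependency set can have several minimal covers, and which one this sketch returns depends on the order in which attributes and dependencies are examined.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each fd is a (lhs, rhs) pair of letters."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: split every right side into single attributes.
    cover = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    # Step 2: drop left-side attributes that are not needed.
    for i, (lhs, a) in enumerate(cover):
        for attr in sorted(lhs):
            reduced = cover[i][0] - {attr}
            if reduced and a in closure(reduced, cover):
                cover[i] = (reduced, a)
    cover = list(dict.fromkeys(cover))  # remove duplicates, keep order
    # Step 3: drop dependencies implied by the remaining ones.
    for fd in list(cover):
        rest = [g for g in cover if g != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return cover


F = [("ABD", "E"), ("AB", "G"), ("B", "F"), ("C", "J"), ("CJ", "I"), ("G", "H")]
result = minimal_cover(F)
print(", ".join("".join(sorted(l)) + "->" + r for l, r in result))
# ABD->E, AB->G, B->F, C->J, C->I, G->H
```

On the example above, the only change is that CJ->I is reduced to C->I, matching the hand computation.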

Multivalued dependency

1. Definition

Let R(U) be a relation schema on the attribute set U, let X, Y, Z be subsets of U, and let Z=U-X-Y. The multivalued dependency X→→Y holds on R(U) if and only if, for every relation r of R(U), each given pair of (X,Z) values corresponds to a set of Y values that is determined by the X value alone and is independent of the Z value.

If X→→Y and Z=∅, then X→→Y is called a trivial multivalued dependency. Otherwise X→→Y is a non-trivial multivalued dependency.

Multivalued dependency can also be defined formally: in any relation r of R(U), if for any two tuples t and s with t[X]=s[X] there exist tuples w, v ∈ r (w and v may coincide with s and t) such that w[X]=v[X]=t[X], w[Y]=t[Y], w[Z]=s[Z], v[Y]=s[Y] and v[Z]=t[Z] (that is, the two new tuples obtained by exchanging the Y values of s and t must also be in r), then Y is multivalued dependent on X, written X→→Y. Here X and Y are subsets of U and Z=U-X-Y.

Multivalued dependency is the basis for the definition of 4NF. It is considerably more complex than functional dependency, and many books do not explain it clearly.
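The formal definition translates directly into a check on one relation instance: for every pair of tuples t, s that agree on X, the "swapped" tuple with t's Y values and s's Z values must also be present. The sketch below assumes X and Y are disjoint; the function name `mvd_holds` and the attribute names are invented for the illustration, and the sample rows follow the course, teacher, reference book example used later in this article.

```python
def mvd_holds(rows, attributes, x, y):
    """Check X ->> Y on one instance (X and Y assumed disjoint).

    rows: list of dicts; attributes: all attribute names of the schema.
    """
    z = [a for a in attributes if a not in x and a not in y]

    def proj(row, attrs):
        return tuple(row[a] for a in attrs)

    present = {(proj(r, x), proj(r, y), proj(r, z)) for r in rows}
    for t in rows:
        for s in rows:
            if proj(t, x) == proj(s, x):
                # the tuple w with w[Y] = t[Y] and w[Z] = s[Z] must exist
                if (proj(t, x), proj(t, y), proj(s, z)) not in present:
                    return False
    return True


ATTRS = ["course", "teacher", "book"]
rows = [
    {"course": "physics", "teacher": t, "book": b}
    for t in ("Li Ping", "Wang Qiang", "Liu Ming")
    for b in ("principles of optics", "general physics")
]

print(mvd_holds(rows, ATTRS, ["course"], ["teacher"]))       # True
print(mvd_holds(rows[:-1], ATTRS, ["course"], ["teacher"]))  # False
```

Dropping a single row breaks the dependency, because one teacher is then no longer paired with every reference book of the course.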

2. Put simply

Within a relation schema, a functional dependency cannot express a one-to-many association between attribute values. Some attributes are not directly related to each other but are linked indirectly; data dependencies between such indirectly linked attributes are called multivalued dependencies. For example, teachers and students are not linked directly, but they can be linked through a shared attribute such as a name or a classroom.

3. Examples

"Example 1" Consider the relation <warehouse manager, warehouse number, stock product number>. Suppose a product can be stored in only one warehouse, but a warehouse can have several managers. Then each <warehouse manager, stock product number> pair corresponds to one warehouse number, yet that warehouse number depends only on the stock product number and not on the manager. This is a multivalued dependency.

"Example 2" A value of (C,B) such as (physics, principles of optics) corresponds to a set of T values {Li Ping, Wang Qiang, Liu Ming}. This set is determined by the value of course C alone: another (C,B) value, (physics, general physics), corresponds to the same set of T values {Li Ping, Wang Qiang, Liu Ming}, even though the value of reference book B has changed. So T is multivalued dependent on C, that is, C→→T.

4. Multivalued dependencies have the following properties

Multivalued dependencies are symmetric: if X→→Y, then X→→Z, where Z=U-X-Y.

Multivalued dependencies are transitive: if X→→Y and Y→→Z, then X→→Z-Y.

A functional dependency can be regarded as a special case of a multivalued dependency.

If X→→Y and X→→Z, then X→→YZ.

If X→→Y and X→→Z, then X→→Y∩Z.

If X→→Y and X→→Z, then X→→Y-Z and X→→Z-Y.

Whether a multivalued dependency holds depends on the scope of the attribute set.

If the multivalued dependency X→→Y holds on R(U), X→→Y' does not necessarily hold for a subset Y' of Y. By contrast, if the functional dependency X→Y holds on R, then X→Y' holds for every subset Y' of Y.

Normal forms

• First normal form (1NF): "no repeating columns"

• Second normal form (2NF): if the relation schema R is in first normal form and every non-prime attribute of R is fully functionally dependent on a candidate key of R, then R is in second normal form. "Eliminates partial functional dependencies of non-prime attributes on a key"
Drawbacks: deletion anomalies, insertion anomalies, complicated updates
• Third normal form (3NF): "eliminates partial and transitive functional dependencies of non-prime attributes on a key"
• Boyce-Codd normal form (BCNF): "eliminates partial and transitive functional dependencies of prime attributes on a key"

• Fourth normal form (4NF): if, for every non-trivial multivalued dependency X→→Y (Y ⊄ X) of R, X contains a key, then R∈4NF. "Eliminates non-trivial multivalued dependencies that are not functional dependencies"

http://zh.wikipedia.org/wiki/%E6%95%B0%E6%8D%AE%E5%BA%93%E8%A7%84%E8%8C%83%E5%8C%96

The following discussion is based on this premise:
R is a relation schema with functional dependency set F, and (R1, R2) is a decomposition of R.

First, a concept that may seem unrelated but is very important: the closure of an attribute set.
Let α be an attribute set. The set of all attributes functionally determined by α under the dependency set F is called the closure of α under F, written α+.
Here is an algorithm for computing α+. Its input is the dependency set F and the attribute set α, and its output is stored in the variable result.
Algorithm one:
result := α;
while (result changed) do
    for each functional dependency β→γ in F do
    begin
        if β ⊆ result then result := result ∪ γ;
    end

Attribute set closure has two common uses:
• Testing whether α is a superkey: compute α+ (the closure of α under F) and check whether it contains all attributes of R. If it does, α is a superkey of R.
• Verifying that a functional dependency α→β holds: check whether β ⊆ α+. That is, compute α+ with the closure algorithm and see whether it contains β.

(Note: where ∈ appears between two attribute sets in this article, it stands for set inclusion, ⊆.)

Consider an example, question 37 from the morning paper of the November 2005 exam:

Given the relation R(A1,A2,A3,A4) with functional dependency set F={A1→A2, A3→A2, A2→A3, A2→A4}, the candidate key of R is ________.
(37) A. A1 B. A1A3 C. A1A3A4 D. A1A2A3

First we compute A1+ with the algorithm above.
result = A1
Because A1→A2 and A1 ∈ result, result = result ∪ A2 = A1A2
Because A2→A3 and A2 ∈ result, result = result ∪ A3 = A1A2A3
Because A2→A4 and A2 ∈ result, result = result ∪ A4 = A1A2A3A4
Because A3→A2 and A3 ∈ result, result = result ∪ A2 = A1A2A3A4

The computation gives A1+ = result = {A1,A2,A3,A4}, so A1 is a superkey of R. Since A1 is a single attribute, it has no non-empty proper subset, so it is also a candidate key of R. The answer is A.
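Questions of this kind can also be answered by brute force: try attribute subsets in order of increasing size and keep those whose closure is the whole schema and which contain no smaller key. The sketch below is exponential in the number of attributes, so it is only suitable for small exercises; the names `closure` and `candidate_keys` and the set-based encoding of the dependencies are our own choices.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds; each fd is a (lhs_set, rhs_set) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all candidate keys, smallest first, as tuples of attributes."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) >= set(attributes):
                keys.append(combo)
    return keys


R = ["A1", "A2", "A3", "A4"]
F = [({"A1"}, {"A2"}), ({"A3"}, {"A2"}), ({"A2"}, {"A3"}), ({"A2"}, {"A4"})]
print(candidate_keys(R, F))  # [('A1',)]
```

Because subsets are visited in order of size, any superkey that survives the "contains a smaller key" test is minimal.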

With that groundwork in place, we come to the main topic.
Testing for lossless decomposition.
If R1∩R2 is a superkey of R1 or of R2, then the decomposition (R1,R2) of R is lossless. This is a sufficient condition; it is also a necessary condition when all constraints are functional dependencies (a multivalued dependency is an example of a constraint that is not a functional dependency), but sufficiency is all we need here.
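This test is easy to express with the attribute closure: compute (R1∩R2)+ under F and see whether it covers R1 or R2. A minimal sketch, assuming single-letter attributes and dependencies written as pairs of strings; the function names are ours, and the two sample dependency sets are the ones used in the exam questions discussed in this article.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each fd is a (lhs, rhs) pair of letters."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """True if R1 ∩ R2 functionally determines all of R1 or all of R2."""
    common = set(r1) & set(r2)
    determined = closure(common, fds)
    return set(r1) <= determined or set(r2) <= determined


F_2006 = [("A", "BC"), ("C", "D"), ("BC", "E"), ("E", "A")]
F_2007 = [("B", "A"), ("D", "A"), ("A", "E"), ("AC", "B")]

print(is_lossless_binary("ABCE", "CD", F_2006))  # True
print(is_lossless_binary("ABCE", "CD", F_2007))  # False
```

In the first case C+ = {C,D} covers R2; in the second C+ = {C} covers neither relation.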
Testing for dependency preservation.
If every functional dependency in F holds on one of the relations of the decomposition, then the decomposition preserves dependencies (this is a sufficient condition).
If this test fails, we cannot yet conclude that the decomposition does not preserve dependencies; the following general method must be used for a further check.
The method is as follows:
Algorithm two:
For each α→β in F, run the following procedure:
result := α;
while (result changed) do
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result ∪ t
The attribute closures here are computed under the dependency set F. If result contains all attributes of β, then the dependency α→β is preserved. The decomposition preserves dependencies if and only if every dependency in F is preserved by this procedure.
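Algorithm two can be sketched in Python as follows, again assuming single-letter attributes and dependencies written as pairs of strings. The names `preserves` and `is_dependency_preserving` are ours; the sample dependency sets are those of the two exam questions discussed in this article.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each fd is a (lhs, rhs) pair of letters."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves(fd, schemas, fds):
    """Algorithm two for a single dependency fd = (lhs, rhs)."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in schemas:
            ri = set(ri)
            t = closure(result & ri, fds) & ri  # (result ∩ Ri)+ ∩ Ri
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result


def is_dependency_preserving(schemas, fds):
    return all(preserves(fd, schemas, fds) for fd in fds)


F_2006 = [("A", "BC"), ("C", "D"), ("BC", "E"), ("E", "A")]
F_2007 = [("B", "A"), ("D", "A"), ("A", "E"), ("AC", "B")]

print(is_dependency_preserving(["ABCE", "CD"], F_2006))  # True
print(preserves(("D", "A"), ["ABCE", "CD"], F_2007))     # False
```

The closure is always taken under the full set F, while the intersection with Ri restricts what each relation of the decomposition is allowed to contribute.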

Here is an example, question 43 from the morning paper of the May 2006 exam:

Given the relation schema R<U,F> with U={A,B,C,D,E} and F={A→BC, C→D, BC→E, E→A}, the decomposition ρ={R1(ABCE), R2(CD)} satisfies (43).
(43) A. has the lossless join property and preserves functional dependencies
B. does not have the lossless join property but preserves functional dependencies
C. has the lossless join property but does not preserve functional dependencies
D. does not have the lossless join property and does not preserve functional dependencies

First test for lossless join. R1∩R2={C}; compute C+. result = C
Because C→D and C ∈ result, result = result ∪ D = CD
So C is a superkey of R2, and the decomposition is lossless.

Then test for dependency preservation.
A→BC, BC→E and E→A hold on R1 (all attributes on both sides of each dependency are in R1), and C→D holds on R2, so the decomposition preserves dependencies.

The answer is A.

Now a slightly more complex example, questions 40-41 from the May 2007 exam.

Given the relation schema R<U,F> with U={A,B,C,D,E} and F={B→A, D→A, A→E, AC→B}, the candidate key of R is
(40), and the decomposition ρ={R1(ABCE), R2(CD)} satisfies (41).
(40) A. ABD
B. ABE
C. ACD
D. CD
(41) A. has the lossless join property and preserves functional dependencies
B. does not have the lossless join property but preserves functional dependencies
C. has the lossless join property but does not preserve functional dependencies
D. does not have the lossless join property and does not preserve functional dependencies

Notice how similar this is to the previous question.
For the first question, compute the closure of each of the four options A to D:
(ABD)+ = {A,B,D,E}
(ABE)+ = {A,B,E}
(ACD)+ = {A,B,C,D,E}
(CD)+ = {A,B,C,D,E}
Both ACD and CD are superkeys, but ACD contains CD and so is not minimal. The answer is D.

Now the second question.
First test for lossless join. R1∩R2={C}; compute C+. result = C
No dependency in F has a left side contained in {C}, so C+ = {C}. C is therefore a superkey of neither R1 nor R2, and the decomposition does not have the lossless join property.

Then test for dependency preservation.
B→A, A→E and AC→B hold on R1, but D→A holds on neither R1 nor R2, so a further check is needed.
Since B→A, A→E and AC→B are all preserved (all their attributes are in R1), we only need to determine whether D→A is preserved as well.

Apply algorithm two to D→A:
result = D
For R1: result∩R1 = ∅, so t = ∅ and result = D
For R2: result∩R2 = D, D+ = ADE, t = D+∩R2 = D, and result = D
After one pass result is unchanged, so the final result = D does not contain A. Hence D→A is not preserved, and the decomposition does not preserve dependencies.

The answer is D.
