Application of normalization Theory of database review (eighth time on-machine content)

Source: Internet
Author: User
Tags closure

Disclaimer: This article is for the author to review the database course with a simple record of notes, if there are errors, please point out, thank you.

First, the theoretical basis

1. non-destructive connectivity (Lossless join): Set relationship mode R (u,f) is decomposed into several relational patterns R1 (U1,F1), R2 (U2,F2), ..., Rn (UN,FN), where u=u1u2 ... Un, and there is no Unuj, FI is the projection of F on the UJ, if R is equal to the result of R1,r2,...,rn natural connection, then the decomposition of the relationship mode R is a non-destructive connection.

(Don't look at it!) )

Simply put, if the set of tuples that are naturally connected to the new decomposed relationship is exactly the same as the original, it is called a lossless connection.

2. function Dependency persistence (Preserve Dependency): Set relationship mode R (u,f) is decomposed into several relational patterns R1 (U1,F1), R2 (U2,F2), ..., Rn (UN,FN), where u=u1u2 ... UN, and there is no UNUJ type, FI is the projection of F on uj; if any of the functions contained in F is dependent on it, it must also be (F1 U F2 ... U Fn), it is said that the decomposition of the relationship mode R has a function-dependent persistence.

(The concept is complicated, don't look, the skull hurts)

Simply put, if each function dependency on F is set on a relationship after its decomposition, the decomposition is maintained by the function dependency (note: This is a sufficient condition).

3. The normalization process is not only a non-destructive connection, but also has the function of dependency retention. (Only one side will have problems), the former can ensure that no loss of information, the latter can alleviate or resolve a variety of abnormal situations.

4. Non-normalized technology: Sometimes it is possible to reduce or even discard the paradigm of relational pattern, improve the efficiency of database operation. For example, frequently querying data from two tables, in order to avoid frequent connections, you can have appropriate data redundancy.

(1) Table segmentation: Horizontal segmentation, vertical segmentation.

(2) The main advantages of non-standard design

    • Reduced connection required for query operations
    • Reduced number of foreign keys and indexes
    • Can advance statistical calculation, improve the response speed of the query

(3) The main problems of non-normalization

    • Increased data redundancy
    • affect the integrity of the database
    • Reduces the speed of data updates
    • Increased physical space occupied by the storage table

5. Closure concept

    • Closure of the property set: Given the relationship R (U,F), the set of all X→a A in f,f+ is called the closure of X, which is recorded as x+.
    • Function-dependent closures: If f is a function dependency set of the relational mode R (U), we refer to F and all the set of functions dependent by F logic as the closure of F, denoted as f+.

Rule: If X is a subset of U, x→φ belongs to f+.

Use of property set closures:

    • Determine if α is a super-code, by calculating α+ (the closure of α under f), see if α+ contains all the properties in R. If so, then Alpha is the super Code of R.
    • Verify that the function dependency is established by verifying whether the β∈α+. That is, the α+ is computed with a property closure to see if it contains beta.
Second, the example of the story (the eighth time on the machine)

Basic concept: http://www.cnblogs.com/AlvinZH/p/6856432.html

key points in solving the problem: determining whether the function depends on x→y can be derived from F, can be converted to x+ and determine if Y is a subset of x+. (The problem of F-closure can be translated into the problem of attribute set closure)

The method of solving the problem two: the closure algorithm of the attribute set:

    1. Place a into a +.
    2. For each FD (function dependent), if the left part belongs to A +, the right part is placed in A +.
    3. Repeat to a + cannot be expanded.

The main points of solving methods : The inference rule of function dependence:

    1. If xy->z, then x->z,y->z (error)
    2. If x->y, then Xz->yz
    3. If X->y, X->z, then X->yz
    4. If x->y,z∈y, then x->z
    5. If x->y,y->z, then x->z
    6. If X->yz, then x->y,x->z
    7. If a->b,bc->d, then ac->d

Example 1. A relationship is known to have an attribute: A,b,c,d,e,f. Suppose that the relationship has a function dependent on f={ab->c, Bc->ad, d->e,cf->b}, to determine whether Ab->d,d->a is contained in these function dependencies.

Solution: The former translates to the closure of attribute set AB: {A,b,c,d,e},d is a subset of its closures, so the former is contained in these function dependencies. The latter asks for the closure of attribute set D: {D,e},a is not a subset of its closures, so the latter does not contain it.

Example 2. Known relational patterns R (u,f), U={a,b,c,d,e,g},f={ab->c, C->a,bc->d, Acd->b, D->eg, Be->c, CG->BD, CE->AG}, Judge whether Bd->ac belongs to f+.

Answer: The subject changed a kind of ask law, the truth is the same: Judge Bd->ac into not set up. The closure that is converted to BD: {B,d,e,g,c,a},ac is a subset, so belongs to.

Example 3. The candidate key for the function dependency set f={a1→a2,a3→a2,a2→a3,a2→a4},r on the given relationship R (A1,A2,A3,A4) is __A__.

A. A1 B. a1a3 c. a1a3a4 D. a1a2a3

answer: to ask for a candidate keyword, first ask which attribute set's closure contains all attributes (Super code). A1 closure: {A1,A2,A3,A4}, meets the requirements.

Example 4. There is a relationship mode R (Employee name, project name, salary, department name, department manager). If required, each employee may participate in multiple projects, each with a salary (one person and one project at the same time to determine the salary); Each project belongs to only one department; Each department has only one manager.

① the basic function dependencies and key codes of the relational mode R.

Basic function Dependency: (Employee name, project name) → salary, project name → department name, Department name → Department manager

Key code: (Employee name, project name)
② explains that R is not a 2NF reason and decomposes R into 2NF.

2NF Features: There is no partial function dependency of the non-primary attribute on the candidate code.

(Draw a good understanding of QWQ)

Here (employee name, project name) p→ department name, there is a non-main attribute to the candidate code part of the function dependency, so does not belong to 2NF.

Broken down into 2NF:R1 (employee name, project name, salary), R2 (project name, department name, department manager).
The ③ then decomposes R into 3NF and explains the reason.

Analysis: 3NF Features: The non-primary attribute has no partial function dependency on the candidate code, and no transitive dependency. Obviously there is a transitive dependency: project name t→ Department Manager, need to further split.

Decomposition: R1 (name of employee, project name, salary), R2 (project name, department name), R3 (project name, Department manager).

Reason: The codes of the above three tables are (employee name, project name), (project name), (project name), each non-main attribute has no partial function dependency on the candidate code, and no transfer function dependency.

Example 5. With relational mode R (A,B,C,D,E,F), its function depends on set f={e->d, C->b, Ce->f, b->a}.

① indicates the primary key of R and explains the reason.

Analysis: The primary key is first asked which property set's closure contains all the properties.
Answer: The primary key is CE, because the property set CE's closure is {c,e,d,b,f,a} and contains all the properties.
②r highest in the first paradigm.

Analysis: Ta said is generally the first paradigm, of course, the need for a simple analysis, to facilitate the third question:)
Solution: The highest belongs to the first paradigm, because there is a partial function dependency of the non-primary attribute on the candidate code: C->B.
③ decomposition R is 3NF.

Analysis: This drawing really good understanding, first decomposed into a second paradigm, and then decomposed into a third paradigm.

Second paradigm: (E,d), (C,b,a), (c,e,f) (Elimination of partial function dependencies of non-primary attributes)
Third paradigm: (E,d), (CB), (BA), (c,e,f) (Elimination of non-principal attribute transfer function dependency)

Solution Point Four : Determine whether a pattern decomposition has a non-destructive connectivity :

1. Non-destructive connection theorem: a decomposition of relational mode R (u,f), Ρ={r1<u1,f1>,r2<u2,f2>} The sufficient and necessary condition for a lossless connection is: u1∩u2→u1-u2€f+ or U1∩U2→U2-U1 €f+.

In simple terms, it is

Note: The above theorem can only be judged divided into two relations, if divided into three and above, how to do it (exam will not test so difficult, rest assured), the book 197 pages of Table method, in-depth study: http://blog.csdn.net/qq379666774/article/ details/16969493.

2. Inference ( Sufficient condition ): If the R1∩R2 is a R1 or R2 (The U part depends on K, then the K is the Super code), then the decomposition (R1,R2) on R is a lossless decomposition.

Note: Notice here is sufficient condition, when all constraints are function dependent it is necessary (such as multivalued dependency is a non-function dependent constraint), but this is enough, commonly used this.

Example 6. R (A,b,c,d,e), R's function dependency set F={a->bc,cb->e,b->d,e->a}. Determine if R1 (A,b,c), R2 (A,d,e) is a lossless connection.

answer : r1∩r2= (a), a closure of {a,b,c,e,d}, found that A is the R1 of the super code, or even the primary key, so is a lossless connection. (note is sufficient condition)

key points Five : Determine whether a pattern decomposition has function-dependent persistence:

1. Sufficient conditions : if every function dependency on F is established on a relationship after its decomposition, the decomposition is dependent.

2. If the above judgment fails, it is not possible to assert that decomposition is not dependent, because it is only sufficient, and the following general method is used to make further judgments.

Algorithm: Use the following procedure for each α→β on F:

result:=α;  while  do     for Each decomposed Ri        t= (Result∩ri) + ∩ri        result=result∪t

The (Result∩ri) + attribute closure is computed under the function dependency set F. If result contains all the properties of β, then the function relies on α→β. Decomposition is to remain dependent when and only if all dependencies of f in the above process are persisted.

Example 7. Set the relationship mode R<u, F>, where U={a, B, C, D, e},f={a→bc,c→d,bc→e,e→a}, then decompose Ρ={r1 (ABCE), R2 (CD)} satisfies ___A_ _。

A has non-destructive connectivity, maintains function dependencies
B Non-destructive connectivity, maintaining function dependencies
C Non-destructive connectivity, no maintenance of function dependencies
D Does not have non-destructive connectivity, does not maintain function dependencies

answer : First judge the non-destructive connection. R1∩r2={c}, Calculate c+={c,d}, Visible C is the R2 of the Super code, the decomposition is a lossless decomposition.

Then judge to remain dependent. A→bc,bc→e, E→a are established on the R1 (that is, each function depends on both sides of the property is in the R1), C→d is established on the R2, so the decomposition is to remain dependent. (can be judged by sufficient conditions).

Example 8. Given the relationship pattern R<u, F>,u={a, B, C, D, e},f={b→a,d→a,a→e,ac→b}, the candidate key is _ (1)D_, then decomposition Ρ={r1 (ABCE), R2 (CD)} satisfies _ (2)D_.

(1) A. ABD B. ABE C. ACD D. Cd
(2) A. has non-destructive connectivity, maintains function dependencies
B Non-destructive connectivity, maintaining function dependencies
C Non-destructive connectivity, no maintenance of function dependencies
D Does not have non-destructive connectivity, does not maintain function dependencies

answer : First question, calculate closures for each option.

(ABD) + = {A,b,d,e}
(ABE) + = {A,b,e}
(ACD) + = {A,b,c,d,e}
(CD) + = {A,b,c,d,e}

Choose D.

Second question, first judge the non-destructive connection. R1∩r2={c}, Calculate C+={c}. C is neither R1 nor R2, but it cannot be judged that the decomposition does not have a non-destructive decomposition, because this method of judgment is sufficient condition. We use the theorem to judge:

U1∩u2={c}, whose closure is {C}

U1-u2={abe}

U2-U1={D}

Therefore, u1∩u2→u1-u2€f+ and u1∩u2→u2-u1€f+ are not established, that is, do not have a non-destructive connectivity.

Again judge to remain dependent:

B→a,a→e,ac→b on the R1, d→a in the R1 and R2 are not set up, but still need to make further judgments.

We apply algorithms to D→a:

Result=d
To R1,result∩r1=ф,t=ф,result=d
Again to r2,result∩r2=d,d+ =ade, T=d+∩r2=d,result=d
The result does not change after a loop, so the last Result=d does not contain a, so the d→a is not persisted, so the decomposition is not dependent.

Choose D.

Alvinzh

Source: http://www.cnblogs.com/AlvinZH/

The copyright belongs to the author Alvinzh and blog Park all, welcome reprint and Commercial, but without the consent of the author must retain this paragraph statement, and in the article page obvious location to the original link, otherwise reserves the right to pursue legal responsibility.

Application of normalization Theory of database review (eighth time on-machine content)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.