The description of a link is called relation schema ). A relational model should be a quintuple. It can be formally represented as R (u, D, Dom, F ). Here, r is the relational name, U is the set of attribute names that constitute the link, D is the domain from which the attribute in the attribute group U comes from, and Dom is the set of image of the attribute to the domain, f is the dependency set of data between attributes.
The link mode can be abbreviated as R (A1, A2 ,..., An ). Where R is the link name, A1, A2 ,..., An is the attribute name. Domain Name and Attribute-to-domain images are often directly described as the type and length of attributes.
A link is actually the state or content of the link mode at a certain time point. That is to say, the link mode is a type, and the link is its value. The relational mode is static and stable, while the relational mode is dynamic and constantly changing over time, because relational operations are constantly updating the data in the database. However, in practice, the relationship model and relationship are often collectively referred to as the relationship. Readers can distinguish between them from the context.
Relationship Pattern paradigm
There are four main paradigms: 1nf, 2nf, 3nf, and bcnf. The order from left to right is more strict than that of one. To conform to a certain paradigm, it must also satisfy all its vertices. Generally, the database design of a project can reach 3nf, and redundancy can be appropriately increased according to the specific circumstances, without the need to strictly abide by the so-called norms.
To put it simply, 1nf requires only correlated fields in a table and only one piece of information in a field. This is only the most basic requirement. Although 2nf, 3nf, and bcnf have different descriptions, they are similar in terms of data characteristics. It is like saying that you should not write down a batch of orders from a factory, put the factory area and telephone number in the same table, and use two tables to avoid wasting data storage space. Because the same factory may trade several times, but there is no need to record all the information for each transaction.
Compare the Functional Dependencies allowed by the paradigm, as shown in the association between the four paradigms.
The following describes each paradigm.
2.3.4.2 first paradigm
In each specific relational R in relational mode R, if each attribute value is the smallest data unit that cannot be further divided, R is called the relationship of the first paradigm.
For example, the employee number, name, and phone number form a table (a person may have an office phone number and a home phone number). There are three methods to standardize 1nf:
The first is to repeat the employee ID and name. In this way, the keyword can only be a phone number.
Second, the employee number is a keyword. The phone number is divided into two attributes: the phone number of the organization and the residential phone number.
Third, the employee number is a keyword, but each record must have only one phone number.
The first method is the least desirable of the above three methods. The last two cases are selected based on the actual situation.
2.3.4.3 second paradigm
The second Paradigm (2nf) of a link is defined as: If the relational mode R is 1nf, and every non-primary attribute in R is completely dependent on a candidate keyword of R, R is the second paradigm, which is abbreviated as 2nf.
[Example 2.40]There is a relational model R (student ID s #, course number C #, score g, instructor TN, instructor expertise TS), R-based function dependency set f = {(s #, C #) → G, C # → TN, TN → ts} to determine whether R is 2nf.
Solution:
(1) easy to see,The relational mode R is 1nf.. Because R complies with the definition of the relationship, all attribute values of R are atomic values that cannot be further divided.
Whether R is 2nf should be determined based on the definition of 2nf.
First, you must determine the function dependencies between attributes in the relational mode R. If the function dependency set of R is not directly given, it should be determined according to the semantics. In this example, the R-based function dependency set F has been provided directly. We can use the arms inference rules and the methods described below, further determine which primary attributes are in R, which are non-primary attributes, and which attributes are composed of the candidate keywords.
Method ① writeFunction dependency setFunction dependencies in F to help analysis. Method ① is characterizedDirect.
F = {(s #, C #) → G,
C # → TN,
TN → TS
}
Method ② use a directed graph to indicate the function dependency between attributes, and the node represents the attribute. The box contains multiple nodes to indicate the Attribute combination, and the directed arrow to indicate the function dependency. In this exampleFunction dependency Diagram2.9. Method ② is characterizedIntuitive.
Figure 2.9 function dependency graph example
Method ③ combine the relational mode R with the function dependency set F. The Attribute combination is expressed by an underscore (or an upper line), and the function dependency is expressed by a directed arrow. In this exampleFunction dependency Diagram2.10. Method ③ is characterizedSimple.
Figure 2.10 function dependency example
Alibaba Cloud's inference rules can be introduced by F: (s #, C #) → {s #, C #, G, TN, TS}, that is, Attribute combination (s #, C #) is the candidate keyword of R (R only has this candidate key ). A value of (S #, C #) uniquely identifies a tuples in R (and there are no additional attributes ).
In r, s # And C # Are primary attributes, while other attributes g, TN, and Ts are non-primary attributes.
With the help of the above figure, we can see that the non-master attribute G is completely dependent on the key: (s #, C #) → G. However, for non-primary attributes TN, ts is partially dependent on keys (they only depend on the true subset C # of keys #).Because R existsPartial dependency of non-primary attributes on candidate keysSo the relational mode R is not 2nf.
R is partially dependent on candidate keys by non-primary attributes, which may cause data redundancy, data operation exceptions, and other problems. The relationship R can be broken down into two 2nf relational modes without interrupting the connection:
P = {R1, R2}, R1 = {s #. C #, g}, R2 = {C #, TN, TS }.
[Example 2.41]Among the course selection relation SCI (SNO, CNO, grade, and credit), SnO indicates the student ID, CNO indicates the course number, gradege indicates the score, and credit indicates the credit.
The preceding condition indicates a combination of keywords (SNO, CNO)
There are the following problems when using the above link mode in applications:
A. data redundancy. Assume that 40 students take the same course and 40 credits are repeated.
B. The update is abnormal. If the credits of a course are adjusted, the corresponding credit values of the corresponding tuples must be updated, and the credits of the same course may be different.
C. Insert exceptions. For example, if you plan to open a new course, because no one takes the course or has no student ID keyword, you can only save the course and credits after someone takes the course.
D. Deletion exception. If the student has completed his/her career and the electives record has been deleted from the current database, the course number and credits may be completely deleted from the database, and the course and credits record cannot be saved.
Cause: the non-Keyword attribute credit only depends on CNO, that is, the credit part depends on the combined keyword (SNO, CNO) rather than the full dependency.
Solution: It is divided into two relational modes: SC1 (SNO, CNO, grade), C2 (CNO, credit ). The new relationship includes two relational modes, which are connected by the external keyword CNO in SC1. When necessary, they are naturally connected, restoring the original relationship.
2.3.4.4 third paradigm
3nf: If the relational mode R is 2nf, and every non-primary attribute in R does not pass a candidate keyword dependent on R, R is the third paradigm, which is abbreviated as 3nf.
[Example 2.42] continued the previous example 2.40 (R (student ID s #, course number C #, score g, instructor TN, instructor expertise TS)),Judge link mode R1 = {s #. C #, g}, R2 = {C #, TN, TS} is 3nf.
Solution:
(1) In relational mode R1 = {s #, C #, g}, the candidate keyword is (s #, C #), and the primary attribute is s #, C #, the non-primary attribute is g, and the function dependency is (s #, C #) → G. Because R1 does not have a dependency on the passing of candidate keywords for non-primary attributes, the relational mode R1 is 3nf.
(2) In relational mode r2 = {C #, TN, TS}, the candidate keyword is C #, the master attribute is C #, and the non-master attribute is TN, ts, function dependencies are C # → TN, TN → ts. Because there is a dependency on C # ts for the transfer of candidate keywords by non-primary attributes in R2, the relational mode R2 is not 3nf.
The relationship R2 can be broken down into two 3nf relational modes without interrupting the connection:
P = {R3, R4}, R3 = {C #, tn}, r4 = {tn, TS }.
[Example 2.43]For example, each attribute (SNO, sname, DNO, dname, location) represents a student ID,
Name, Department, Department name, and Department address. Judge link modeS1Whether it is 3nf.
The SNO keyword determines the attributes. Because it is a single keyword, there is no part of the dependency problem, it is 2nf.
However, there is a lot of redundancy in this relationship. The attributes DNO, dname, and location of the student will be stored, inserted, deleted, and modified repeatedly.
Cause: A dependency is transmitted in the link. The SNO keyword decides the location function by passing the dependency: Sno-> DNO, and DNO-> location. That is to say, SnO does not directly determine the non-primary attribute location, not 3nf.
Solution: the transfer dependency cannot be left in each link mode.
Solution: There are two relationships: S (SNO, sname, DNO), D (DNO, dname, location)
Note: The outer keyword DNO cannot be found in link S. Otherwise, the relationship is lost.
2.3.4.5 Boyce-codd paradigm
The Boyce-codd Paradigm (bcnf) of a link is defined as follows: if the relational mode R is 1nf, and every function in R depends on X → y (Y then X ), if X is the Super keyword of R, R is called Boyce-codd paradigm, which is abbreviated as bcnf.
From the definition of bcnf, we can draw the following conclusions:
(1) keys of all non-primary attributes are fully functional dependencies;
(2) All primary attribute pairs do not have full function dependencies on keys that do not contain them;
(3)No attribute full function depends on any Attribute combination of non-Keys.
Unlike 2nf and 3nf, bcnf is defined based on 1nf. However, bcnf is actually an improved form of 3nf. 3nf only takes into account the dependency of non-primary attributes on keys. bcnf also includes the dependency of primary attributes on keys. Bcnf must meet higher requirements than 3nf. If the relational mode R is bcnf, R must be 3nf. Otherwise, it may not be true.
[Example 2.43] continued renewal 2.42 (Student ID s #, course number C #, score g, instructor TN, instructor expertise TS) To determine whether the 3nf relational model R3 = {C #, tn}, r4 = {tn, TS} is bcnf.
Solution: In relational mode R3, functions depend on C # → tn. the determining factor C # is the key of R3;
In the relational mode, R4 has function dependency tn → ts, and the determining factor TN is the key of R4;
R3 and R4 both meet the definition of bcnf. Therefore, the two relational modes are bcnf.
[Example 2.44]The Component Management Relationship Model WPE (wno, PNO, Eno, qnt) lists the Warehouse number, accessory number, employee number, and quantity respectively. The following conditions are met:
A. a warehouse has multiple employees.
B. One employee only works in one warehouse.
C. Special personnel are responsible for one type of accessories in each warehouse, but one person can manage several types of accessories.
D. Parts of the same model can be placed in several warehouses..
Analysis: qnt cannot be determined by the above PNO, and is determined by the combination attribute (wno, PNO). function dependencies (wno, PNO)-> Eno exist. A person is responsible for one type of accessories in each warehouse, and one person can manage several types of accessories. Therefore, the owner can be determined only by the combination attribute (wno, PNO) -> Eno. Because an employee only works in one warehouse, there is ENO-> wno. As a type of accessories in each warehouse is the responsibility of a person, while an employee only works in one warehouse, there are (Eno, PNO)-> qnt.
Find the candidate keyword, because (wno, PNO)-> qnt, (wno, PNO)-> Eno, so (wno, PNO) can determine the entire tuples, is a candidate keyword. Based on ENO-> wno, (Eno, PNO)-> qnt, the entire tuples (Eno, PNO) can also be determined as another candidate keyword. The Eno, wno, and PNO attributes are both primary attributes and only one non-primary attribute qnt. It fully depends on any candidate keyword and is directly dependent on the function. Therefore, the relational mode is 3nf.
Analyze the main attributes. Because ENO-> wno, the main attribute Eno is the deciding factor of wno, but it is not a keyword, but only a part of the combination keyword. This results in the dependency of the master attribute wno on another candidate keyword (Eno, PNO), because (Eno, PNO)-> Eno, but in turn does not exist, p-> wno, so (Eno, PNO)-> wno is also a transfer dependency.
Although there is no dependency on the transfer of non-primary attributes to candidate key Liao, the transmission dependency of primary attributes on candidate keywords can also be troublesome. For example, a new employee is assigned to a warehouse but is currently in the internship phase and is not independently responsible for the management of certain accessories. The link cannot be inserted because a portion of PNO is missing. If an individual changes to safety regardless of accessories, the employee will also be deleted when the accessories are deleted.
Solution: divide it into the management EP (Eno, PNO, qnt), the keyword is (Eno, PNO), the job EW (Eno, wno), and the keyword is Eno
Disadvantage: the persistence of function dependencies after decomposition is poor. In this example, due to decomposition, the function dependency (wno, PNO)-> Eno is lost, and thus the original semantics is damaged. It does not show that a part in each warehouse is under the responsibility of a specialist. A component may be managed by two or more people at the same time. Therefore, the decomposed link mode reduces some integrity constraints.