The relational model has three important components:
- Data. All the data in the database and its interlinkages are organized into "relationships" (two-dimensional forms).
- Data manipulation. The relational model provides a complete set of advanced relational operations to support various operations on the database. Relational operations are divided into two categories: relational algebra and relational calculus.
- Data integrity rules. The data in the database must satisfy the three class integrity rules such as entity integrity, referential integrity, user-defined integrity, and so on.
Five basic operations of relational algebra
Relational algebra is a set of advanced operations that are an operand, in relation to a collection of tuples of the same number of attributes. The operation of relational algebra can be divided into two categories:
- The traditional set operation: the inverse (division) of the Cartesian product, the difference, the intersection, the Cartesian product (multiplication).
- Extended relational Operations: Vertical segmentation (projection) of relationships, horizontal partitioning (selection), bonding of relationships (connections, natural connections).
The relationship R and S have the same relational pattern, and R and S are composed of a set of tuples that belong to R or belong to S, which are recorded as R∪s.
The relationship between R and S has the same relationship pattern, and the difference between R and S is a set of tuples that belong to R but do not belong to S, which is recorded as R-s.
- Cartesian product (Cartesian product)
The values of R and s are set by R and S, and the Cartesian product defining R and S is a set of tuples (R + s), the first R component (attribute value) of each tuple comes from a tuple of R, and the second component of S is from a tuple of s and is recorded as R x S. If R has m tuples and S has n tuples, then R x S has m x n tuples.
For example: the Cartesian product of {a,b},{0,1,2} is {(a,0), (a,1), (a,2), (b,0), (b,1), (b,2)}
This operation is to split a relationship vertically, eliminate some columns, and rearrange the order of the columns. For example, ABC three columns, select the CA two column and reorder it as ΠCA (send symbol, CA as subscript), or π31 (column ordinal).
According to some conditions, the relationship is split horizontally, that is, the qualifying tuple is selected. When writing, arithmetic constants are enclosed in single quotation marks, and attribute ordinals or attribute names do not need to be enclosed in quotation marks.
Traditional collection operations such as:
Four combined operations of relational algebra
The intersection of R and S is a set of tuples that belong to R and belong to S, which is recorded as R∩s. The result of the intersection operation equals R-(r-s) or S-(s-r).
In fact, it is the selection operation from the Cartesian product of relation R and S.
- Natural connection (Natural join)
In general, natural connections are used in cases where R and S have public properties, and if two relationships have no public properties, their natural connections are converted to Cartesian product operations. Plainly, it is similar to the two tables in the database join connection based on the primary foreign key.
Words are not good. Describes this operation, similar to finding the specified condition in a primary table, and is the result of satisfying the specified condition of multiple rows and columns, and then dividing the result set vertically to remove the condition column.
Division represents at least the list of students who are enrolled in the course in COURSE1,COURSE2,COURSE3:
Application examples of relational algebra operations
In the relational algebra operation, it is called the relational algebraic expression that the five basic operations undergo a finite compound equation, and the result of the expression is still a relationship.
There are 4 relationships in which each query statement is expressed in a relational algebraic expression. Note: and (∪), Difference (-), Cartesian product (x), selection (σ), projection (π), natural connection (??). ), or (∨), and (∧)
- Teacher Relationship T (t#,tname,title)
- Course Relationship C (c#,cname,t#)
- Student Relations S (s#,sname,age,sex)
- Course Selection Relationship SC (S#,c#,score)
(1) The study number and achievement of the C2 course number is searched.
SELECT s#, score from SC WHERE C # = ' C2 '; Relational algebra Expressions: Πs#,score (σc#= ' C2 ' (SC))
(2) The study number and name of the C2 course number.
SELECT s#, SNAME from S INNER joins SC on s.s# = sc. s# WHERE SC. C # = ' C2 '; Relational algebra expression: Πs#,sname (σc#= ' C2 ') (S?? SC))
(3) Retrieve the student number and name of at least one of the courses taught by LIU.
Πs#,sname (σtname= ' LIU ' (S??) Sc?? C?? T))
(4) Search the student number for the elective course number C2 or C4.
πs# (σc#= ' C2 ' ∨c#= ' C4 ' (SC))
(5) To retrieve the student number of at least elective course number C2 and C4 course.
πs# (σc#= ' C2 ' ∧c#= ' C4 ' (SC?? SC)) Note: Here you need to do the Cartesian product operation of the relationship itself, a single relationship or table cannot do a field with the operation.
(6) The name and age of the student who did not study C2 course.
Πsname,age (s)-πsname,age (σc#= ' C2 ' (s?? SC)) Note: First find out the student's name and age of the C2 course, then use the whole student and it to do "minus" operation.
(7) The name of the student who studied all the courses.
Πs#,c# (SC) indicates the student's elective course; πc# (C) denotes all courses;
Students who have studied all the courses can use the division operation to indicate that the operation result is the number s# set: πs#,c# (SC) ÷πc# (C);
Finally make a connection get student name: Πsname (S?? (πs#,c# (SC) ÷πc# (C)))
(8) The search course includes the student number of the students who have studied the course number S3.
Πs#,c# (SC) ÷πc# (σs#= ' S3 ' (SC))
Relational algebra Expressions Query summary: Generally, the query involves the relationship to take, perform Cartesian product or natural connection operation to get a large table, and then perform a horizontal split (select operation) and vertical segmentation (projection operation). However, when the query involves negation or all values, it is necessary to use the (poor) operation or (division) operation.
Two extended operations for relational algebra
When the relationship R and S do a natural connection, it is a tuple consisting of equal values on the public properties of the two relationships, which naturally causes some tuples in R or S to be discarded during operation. Sometimes, you may need to save these discarded tuples, you need to do an outer join.
Outer joins are divided into: outer connection, left outer connection, right outer connection. After the connection, empty space is filled in null.
- External and (Outer Union)
The previously defined and operational requirements for R and S have the same relational pattern. But if the relationship between R and S is different, the properties of the new relationship are composed of all the properties of R and S, the public attribute is taken only once, the tuple of the new relationship is composed of the tuple of R or S, and the new added attribute of the tuple is filled with null value, which is an "external and" operation.
Relational calculus
The relational calculus can be divided into the tuple relation calculus and the domain relation calculus, the former takes the tuple as the variable, the latter takes the attribute as the variable, the abbreviation tuple calculus and the domain calculus.
- Tuple relationship Calculus
T is a tuple variable that represents a fixed tuple of tuples; P is a formula, also known as a predicate in mathematical logic, that is, a conditional expression in a computer language. {T | P (t)} represents the collection of all tuple t that satisfy the formula P.
- There are 3 forms of the Atomic formula (Atoms):
- R (s), where R is the relationship name and S is a tuple variable, which means "s is a tuple of relationship R".
- S[i]θu[j],s and u are tuple variables, and θ is an arithmetic comparison operator that represents "the θ relationship between the first and second components of the tuple S and the U-J components". Example: S[1] < u[2].
- A s[i]θa or aθu[j],a is a constant that represents the "I-component value of a tuple s" satisfies the θ relationship with constant a. For example: s[1] = 3.
The concept of "free" and "constrained (Bound)" variables are used when defining relational calculus operations. In a formula, if a tuple variable does not use the "quantifier (?)" or "Universal quantifier (?)" Symbol definition, it is called a "free-tuple variable", otherwise it is called a "constrained tuple variable." A free tuple variable is similar to an external variable or global variable in a programming language, constraining a tuple variable to a local variable defined in a similar procedure.
- The recursive definition of the formula (formulas) is as follows:
- Each atom is a formula in which the tuple variable is a free variable.
- If P1 and P2 are formulas, then ┐p1, P1∨P2, P1∧P2, P1 and P2 are all formulas. Respectively said: "P1 is not true", "P1 or P2 or both are true", "P1 and P2 are true", "if P1 is true P2 must be true."
- If P1 is a formula, then (? s) (P1) and (? s) (P1) are also formulas. The following propositions are represented: "There is a tuple s that makes the formula P1 true" and "for all tuple s makes the formula P1 true."
- The precedence of the various operators in the formula is from high to low: θ,? and?, ┐ (symbol: Non), ∨ (symbol: OR) and ∧ (symbol: And),=>.
- Formulas can only be composed of the above-mentioned forms.
In the function {T | In P (t)}, T is the only free-tuple variable in P (t).
The following table, (a), (b) is the relationship R and S, (c) ~ (e) are the values of the following five tuple expressions, respectively:
- R1 = {T | S (t) ∧t[1] > 2} solution: T is a tuple contained in S, and each t satisfies a condition of t[1] > 2, that is, the value of column A is greater than 2.
- R2 = {T | R (t) ∧┐s (t)} solution: T is a tuple that is contained in R, and also satisfies a non-s (t), which is not a tuple in S.
- R3 = {T | (? u) s (t) ∧r (U) ∧t[3] < u[2]} solution: T is a tuple contained in S, and S.C < r.b is the one that meets the requirements.
(a) Table R (b) Table S
A |
B |
C |
|
A |
B |
C |
1 |
2 |
3 |
|
1 |
2 |
3 |
4 |
5 |
6 |
|
3 |
4 |
6 |
7 |
8 |
9 |
|
5 |
6 |
9 |
(c) R1 (d) R2 (e) R3
A |
B |
C |
|
A |
B |
C |
P align= "Center" > |
A |
B |
C |
3 |
4 |
6 |
|
4 |
5 |
6 |
&NBSP; |
1 |
2 |
3 |
5 |
6 |
9 |
|
7 |
8 |
9 |
&NBSP; |
3 |
4 |
6 |
|
|
|
|
|
|
|
|
|
|
|
Transformation of a relational algebraic expression to a tuple expression
A relational algebraic expression can be converted to a tuple expression in an equivalent way. Since all relational algebraic expressions can be combined with five basic operations, it is possible to represent five basic operations in a tuple calculus expression. For example: R∪s can be used {T | R (t) ∨s (t)}; R–s available {T | R (t) ∧┐s (t)} indicates.
The security constraints and equivalence of relational operations
In database technology, operations that do not produce infinite relationships and infinite validations are called security operations , and the corresponding expressions are called security expressions , and the measures taken are called security constraints . In relational algebra, the basic operation is projection, selection, sum, difference, Cartesian product, there is no "complement" operation of the set, so the relational algebra operation is always safe. But the relational calculus is not, may have the infinite relation and the infinite verification question, the relational calculus must have the security restraint the measure, the relational calculus is safe.
Optimization of relational algebraic expressions
How to arrange the sequence of selection, projection and connection in the relational operation can not only save time and space, but also high execution efficiency, which is the problem that the optimization of relational algebraic expression needs to be solved.
The relationship between R (A, B) and S (C,D) is a two-tuple relationship, and if there is a query that can be expressed as an algebraic expression of the relationship: E1 =πa (σb=c∧d = ' A "(R x s)); This is not a good query, if you choose to multiply the s, the performance will improve: E1 =πa (Σb=c ( R xσd = ' (S) '); Further analysis can be seen that the Cartesian product and the subsequent selection of B = C can be combined into a form of equivalent connection, and further optimization: E1 =πa (R?? B=cσd = ' (S) '). Note: Here the red b=c should be written in?? Symbol below.
Many systems use heuristic optimization method to optimize the relationship expression, this optimization strategy is not related to the storage technology of the relationship, mainly discusses how to arrange the order of operation reasonably, spends less time and space, its basic idea can be represented as three heuristic rules:
- Perform the selection operation as early as possible.
- Perform the projection operation as early as possible.
- Avoid doing the Cartesian product directly, combining the sequence of selections and projections before and after the Cartesian product operation.
Relational operations-Database system principles