Paradigm analysis, paradigm

Source: Internet
Author: User

Paradigm analysis, paradigm

1. the paradigm theory is a Codd method used for data modeling. Practice has proved that it is suitable for micro-analysis of a single link. It is better to use the later ER Method for macro analysis. In fact, the strength of ER analysis lies only in Macro Analysis. Sometimes the results cannot be guaranteed only by Using ER methods. Recalling previous contract examples, there are two kinds of analysis practices: one is to summarize the contract and the goods, and the other is to summarize the contract, the goods and the manufacturer. The principle used by the previous scheme is that attributes are unique to an instance of an object and are classified into that entity. The latter scheme utilizes the principle that attributes can be expressed by new entities as long as they have independent meanings and possibilities. From the perspective of the solution, it seems that the entity of the second solution is more complicated, but not from the perspective of the final link model. For example, the contract (contract number, order date, manufacturer name, vendor address) must be the same as the vendor name, and there must be some redundancy. The second solution does not have such redundancy, such as: contract number (contract number, order date, vendor number) vendor (vendor number, vendor name, vendor address)

So in actual use, we can use the E-R model for macro modeling, the details of the things to the paradigm Analysis Solution (through the paradigm analysis filter to get the final reasonable table structure)

The application database paradigm can bring many benefits, but the most important benefits are as follows:

1. reduce data redundancy (this is the main benefit, and other benefits are included here)

2. Eliminate exceptions (insert exceptions, update exceptions, and deletion exceptions)

3. Make data organization more harmonious...

Paradigm analysis is suitable for solving such problems. Based on the Macro Analysis of ER, the optimal and non-redundant relational model can be obtained by using the paradigm analysis (to put it bluntly, the paradigm analysis solves the data redundancy problem in the relational model ).


2 Basic Concepts these basic concepts are the basis of the entire paradigm analysis theory.

2.1 function dependencies are usually determined and determined dependencies in the relationship, such as the Dependency between non-primary attributes and primary attributes. As a relationship corresponding to real-world entities, a relationship should represent a complete concept. Therefore, non-primary attributes depend on primary attributes, which is a basic requirement. This dependency is called function dependency because it is very similar to the dependent variable relationship in the function.

2.2 The dependency between ordinary functions and non-ordinary function dependency attributes determines their own dependencies. This is called the dependency between ordinary functions. Obviously, this is of little significance. Therefore, the function dependency usually refers to the dependency between non-ordinary functions. The so-called non-trivial function dependency refers to the dependency between an attribute and other attributes.

2.3 Test the relationship between full function dependency and some function dependencies: (student ID, Department, Department Head name, course name, score)

Student ID: department head name Course name score

000001 computer money instructor Java 89

000001 computer money instructor database 70

000002 information instructor Li database 81

000003 Information Retrieval by instructor Li 90

Note: A department has several students, and a student only belongs to one department. A Department has only one director. A student can take multiple courses, and each course has several optional courses; each student has a score for each course.

The primary key of the link is (student ID, course name). Only the score attribute is completely determined by the primary key in non-primary attributes. Such dependency is completely functional dependency. Instead, it is decided by the primary key on the surface, but only by the student ID. This dependency is part of the function dependency.

2.4 The transfer function dependency still examines the above relationship. The head of the department is determined by the primary key on the surface, but only by the department where the student ID is located, therefore, this dependency belongs to the dependent function.

Both part of the function dependency and the passing of the function dependency are manifested as the internal consistency in the broken relationship. In fact, the Department and Department Head name in the above data are obviously redundant, all of them are not completely dependent.

3. The Normal Form is the meaning of the standard Form. According to the degree of compliance, different levels of paradigm can be obtained. Similar to the standard of physical fitness, different standards form different levels. A relationship that does not meet the requirements of the paradigm is called a paradigm violation.

3.1 2NF and 3NF 2NF are the paradigms that eliminate part of function dependencies, while 3NF is the paradigm that eliminates the transfer of function dependencies. The methods for eliminating these non-full dependencies are the same. The split method is used to separate existing relationships into independent and complete subrelationships to achieve full dependency. Step 1: locate the primary key attribute group. 2) identify all dependencies, including full function dependencies, partial function dependencies, and passing function dependencies.

Relationship: (student ID, department, course name, score)

Course name score of the student ID's Department

000001 computer Java 89

000001 computer database 70

000002 Information Database 81

000003 Information Retrieval 90

The primary key attribute group is student ID, and the full function dependency of the Course name is student ID. The function dependency of the Course name-> score part is student ID-> which of the following methods has been split: (student ID, course name, score) (student ID, Department) the above practices eliminate some function dependencies to form 2NF.

All database tables with single keywords comply with the second paradigm, because it is impossible to have a combination of keywords.

Link: (student ID, Department, Department Head name)

Department Head name of student ID

000001 computer instructor Cheng

000001 computer instructor Cheng

000002 information instructor ding

000003 information instructor ding

The primary key attribute group is: Student ID full function dependency: Student ID-> the system where the student ID is located, the transfer function dependency is: department ID-> department head name. The above method has been split: (student ID, the above practices eliminate the 3NF transfer function dependency.

3.2 BCNF, whether 2NF or 3NF, can be included in the BCNF category. BCNF is a broad paradigm concept proposed by Boyce and Codd. The BCNF standard is: The left side of any non-trivial function dependency must contain the key code. From a conceptual point of view, none of the above functional dependencies and the transferred Functional Dependencies meet the requirements of BCNF, if some function dependencies exist or function dependencies are transmitted, the left side of a dependency does not fully contain the key code. Therefore, 2NF and 3NF are not BCNF, and BCNF must be 2NF and 3NF (inverse negation ). The above normalization practices are the solutions of BCNF.

3.3 4NF the preceding dependencies belong to the scope of function dependencies. Function dependencies are a type of Data Dependencies. In addition, Data Dependencies also contain multi-value dependencies. However, BCNF can only achieve the highest degree of decomposition within the scope of function dependencies. In the case of multi-value dependencies, it cannot be solved, just as good antiviral drugs cannot solve genetic diseases. The following table of movie star information is obviously redundant:

Movie name age in the city where movie stars are located

Jackie Chan 123 Hong Kong City Hunter 1990

Jackie Chan 456 Shanghai City Hunter 1990

Jackie Chan 123 Hong Kong 1988

Jackie Chan, 456 Shanghai drunk boxing, 1988

Jackie Chan 123 Hong Kong I am 1995

Jackie Chan, 456 Shanghai I am 1995

The reason for the above redundancy lies in the query requirements. Although the table below is simple, it cannot meet such query: "Which movies have been filmed by movie stars in Hong Kong ?"

Movie name age in the city where movie stars are located

Jackie Chan 123 Hong Kong City Hunter 1990

Jackie Chan 123 Hong Kong 1988

Jackie Chan, 456 Shanghai I am 1995

However, from the perspective of BCNF, there is no problem, because the primary key attribute actually contains all the attributes, so there are no problems such as function dependencies and passing function dependencies. The reason for redundancy is that it has multi-value dependency, and the relationship clearly places the two groups of irrelevant content in a relationship, that is, the movie star's address and the movie star's film. This also requires the use of the split method to normalize it. From the above problems, we can see that, if you do not want to have any problems, we should try to ensure the consistency and integrity of the relationship. The primary attribute completely determines the non-primary attribute, instead of putting a lot of content that does not belong to the same concept into a link, it is easy to generate a paradigm violation and data redundancy.

3.4 1NF the so-called 1NF is the basic requirement for the relational model, that is, one field in a record can only have one value. A value can be assigned to a variable. Obviously, multiple values cannot be assigned to a variable at the same time. Any relational model must satisfy this condition, just as air is the basic condition for human survival. In fact, the relational database management system we use cannot make records that do not meet 1NF requirements. The main content of this paradigm is how to process fields with multiple values. For example, the student basic information table includes the following fields: (student ID, name, gender, birthday, and family member). For the family member field, logically, it must be a value. Although the user may enter multiple values, such as "FATHER: Zhang XX, Mother: Li XX", since it is processed as a field, it represents the information of a family member. If you need to query individual family members, this field is no longer a value logically. Simple Single-field processing is neither convenient to query nor accurate to express the differences between multiple members. There are two solutions: one is horizontal scaling, such as: (student ID, name, gender, birthday, father, mother, the design idea is to logically separate non-integral fields into logically integral fields. But there is a problem that is not easy to expand. It is difficult to express more family members effectively, which can both be expressed and redundant. Second, vertical scaling, for example, (student ID, name, gender, birthday) (student ID, family member category, name). This approach uses ER analysis, it is divided into two entities: Students and family members.

4. Definition of paradigm 4.1 function dependency 4.1.1 Definition R (U) is a relational mode on the property set U, and X and Y are the subsets of U. If any possible relationship between R (U) is r, the attribute values of two tuples on X cannot be the same in r, but the attribute values on Y cannot be the same, it is called "X function determines Y" or "Y function depends on X" and is recorded as X → Y.

4.1.2 principles of decomposition and merging are as follows: X → Y1, X → Y2. Then, X → Y1 and Y2 are mainly used to simplify the dependency expression during normative analysis. Note that they are only limited to the Right of function dependency.

4.1.3 if the conversion rules are X → Y, Y → Z, then X → Z indicates the existence of the passed function dependency. Note that the premise is that there is no Y → X.

4.1.4 If the equivalence principle is "X → Y, Y → X", then "X" is equivalent to "Y", which is recorded as "X → Y. The dependency between equivalent attributes does not need to be considered. The following relationship does not exist: Student (student ID, name, ID card number): Student ID → ID card number, ID card number → name

4.1.5 Supplement 1) function dependency is applicable to all relational instances, not partially satisfied. 2) function dependency is the concept of semantic category, which may change due to local requirements.

4.2 trivial function dependencies and non-trivial function dependencies 4.2.1 are defined in the relational mode R (U). For the subset X and Y of U, if X → Y, but Y belongs to X, X → Y is an ordinary function dependency. If X → Y, but Y does not belong to X, X → Y is a non-trivial function dependency.

4.2.2 ordinary dependency rules A1, A2 ..., An → B1, B2 ..., Bm is equivalent to A1, A2 ..., An → C1, C2 ..., Ck, where C is A subset of B, but C is not included in. The following functional dependencies are all trivial dependencies:

Student ID → student ID, name

Student ID, course number → student ID, score

Can be simplified:

Student ID → name

Student ID, course number → score

4.3 full function dependency and partial function dependency

4.3.1 defined in relational mode R (U), if X → Y, and for any real subset of X, there is no x' → Y, the complete function Y depends on X and is recorded as X-f → Y (f indicates full ). If X → Y, but Y is not fully dependent on X (with X '→ Y), Y is called dependent on X, record as X-p → Y (p indicates partial ).

4.3.2 code is defined according to the definition of full function dependency. You can strictly define K as the attribute or Attribute combination in the relational mode R <U, F>. If K-f → U, K is called a Candidate Key of R ). If the relational mode R has multiple candidate codes, select one of them as the Primary key ). How do I select a primary code? For example, select a student ID or an ID card number? Are sequential numbers or meaningful fields used?

4.4 transfer function dependency in relational mode R (U), if X → Y, Y → Z, and Y is not included in X, and Y → X is not included, in this case, the Z transfer function depends on X. If Y → X, that is, X then → Y, Z is directly dependent on X.

4.5 paradigm

4.5.1 1NF if all attributes of a relational model R are basic data items that cannot be divided, rε 1NF. The first paradigm is the minimum requirement for the relational model. Databases that do not meet the first paradigm cannot be called relational databases. However, the relational model that satisfies the first paradigm is not necessarily a good relational model.

4.5.2 2NF if the relational model rε 1NF, and every non-primary attribute is completely dependent on the R code, rε 2NF. In terms of definition, 2NF must be 1NF, and vice versa.

4.5.3 if such code X, attribute group Y, and non-master attribute Z (Z does not belong to Y) does not exist in the 3NF relational mode R <U, F>, Y → Z, but no Y → X is true, it is called R <U, F> ε 3NF. In terms of definition, 3NF must be 2NF, because some function dependencies can also be understood as passing dependencies, and its left side can be regarded as Y (X → Y), while the right side can be regarded as Z, that is, if some function dependencies do not meet the 3NF definition requirements, 3NF must be 2NF. Otherwise, 2NF is not necessarily 3NF.

4.5.4 BCNF relational mode R <U, F> ε 1NF. If each function of R depends on X → Y, if Y does not belong to X, X must contain a candidate code, so rε BCNF. In terms of definition, BCNF must be 3NF and 2NF. However, 3NF is not necessarily BCNF. Although it is rare, the relationship is as follows:

Students Teachers Courses

Zhang Liang, Li Yushan, Management

Zhang Liang, Liu Weiguo, Computer Basics

Wang Hai Li Yushan Management

Wang Hai Wei Fang English

Luo Jianjun Liu Weiguo Computer Basics

Note: 1) a teacher can only take one course. 2) Once a student chooses a course, the corresponding function dependency of the teacher is determined as Students, courses → Teachers → Courses if Students and Courses are selected as the master code, the existence of Teachers → Courses will undermine BCNF rules. Therefore, the above relationship is not BCNF, however, in the Teachers → Courses dependency, there is no non-primary attribute dependency and partial dependency on code passing, meeting the 3NF condition. This also shows that 3NF's "incomplete" nature is mainly manifested in the possibility of a primary attribute-to-code dependency and transmission dependency.

Summary: The paradigm analysis mainly aims to reduce data redundancy. Practice has proved that it is suitable for micro-analysis of a single link (that is, analysis of a single table). It is better to use the later ER Method for macro analysis. In actual production, we can first carry on the E-R modeling to the demand, obtain the relational table, then use the paradigm Analysis Method "filter" again to get the final reasonable table structure.

General steps of paradigm analysis:

① Find the primary key (Primary attribute group)

② Find the function dependencies (relational decision) of all non-primary attributes and complete the above process to split the table.

Experience shows that pure primary keys (attributes with no practical significance, such as auto-incrementing sequence numbers) can be ignored for paradigm analysis, after the paradigm analysis, add the pure primary key to the primary attribute group (the primary attribute group and the pure primary key must have a one-to-one correspondence relationship ).

Until BCNF is a solution to the function dependency problem, data dependency also includes other dependencies, such as multi-value dependency, which cannot be solved by the BC paradigm.


References:

1. http://dev.yesky.com/424/2220924.shtml

2. http://www.njcie.com/bbs/dispbbs.asp? Boardid = 19 & Id = 1174

(Video (enter the address of thunder ):

Http://www.njcie.com/DB/files/51.wmv

Http://www.njcie.com/DB/files/52.wmv

3. http://www.cnblogs.com/youring2/archive/2012/04/20/2459609.html

4. http://www.njcie.com/bbs/dispbbs.asp? Boardid = 19 & Id = 914

5. http://www.njcie.com/bbs/dispbbs.asp? Boardid = 19 & Id = 132 & page = 2

 

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.