A detailed explanation of the three main paradigms of database

Source: Internet
Author: User


2014-02-25 21:381961 People ReadComments (0)CollectionLift

The database design paradigm is the specification that database design needs to meet, the database that satisfies these specification is concise, the structure is clear, at the same time, does not take place insert (insert), delete and update operation exception. The reverse is a mess, not only to the database programmer to create trouble, and ugly, may have stored a large number of unnecessary redundant information.

Paradigm Description

1.1 First Normal (1NF) non-repeating columns

The so-called First paradigm (1NF) refers to the fact that each column of a database table is an indivisible basic data item and cannot have multiple values in the same column, that is, an attribute in an entity cannot have multiple values or cannot have duplicate properties. If duplicate attributes are present, you may need to define a new entity, which is composed of duplicate attributes, and a one-to-many relationship between the new entity and the original entity. In the first normal form (1NF), each row of a table contains only one instance of information. In short, the first paradigm is a column with no duplicates.

Note: In any relational database, the first paradigm (1NF) is the basic requirement for relational schemas, and a database that does not meet the first normal form (1NF) is not a relational database.

For example, the following database tables are in accordance with the first paradigm:

Field 1

Field 2

Field 3

Field 4

Such database tables do not conform to the first paradigm:

Field 1

Field 2

Field 3

Field 4

Field 3.1

Field 3.2

A field in a database table is a single attribute and cannot be divided. This single attribute consists of a basic type, including Integer, real, character, logical, date, and so on. Obviously, in any current relational database management system (DBMS), it is impossible for a fool to make a database that does not conform to the first paradigm, because these DBMS do not allow you to divide a column of a database table into two or more columns. Therefore, it is impossible for you to design a database that does not conform to the first paradigm in your existing DBMS.

1.2 Second normal form (2NF) property fully dependent on primary key [eliminate partial child function dependency]

If the relationship pattern R is the first paradigm, and every non-primary property in R relies on a candidate key for R, then it is called a second-paradigm pattern.

The second paradigm (2NF) is established on the basis of the first paradigm (1NF), i.e. satisfying the second normal form (2NF) must first satisfy the first paradigm (1NF). The second normal form (2NF) requires that each instance or row in a database table must be divided by a unique region. It is often necessary to add a column to the table to store the unique identity of each instance. This unique attribute column is called the primary key or primary key, and the main code.

For example, the Employee Information table adds the employee number (emp_id) column because each employee's employee number is unique, so each employee can be uniquely differentiated.

In short, the second normal form (2NF) is completely dependent on the primary key for the non-master attribute.

The so-called complete dependency refers to the inability to exist only to rely on the main key part of the property (with function dependent w→a, if there is XW, there is x→a established, then called W→a is a local dependency, otherwise it is called w→a is a complete function dependency). If present, then this part of the attribute and primary key should be separated to form a new entity, and the new entity is a one-to-many relationship with the original entity.

Assume that the selection relationship table is Selectcourse (school number, name, age, course name, score, credits), keyword for the combination of keywords (study number, course name), because of the following decision relationship:

(School number, course name) → (name, age, score, credits)

This database table does not meet the second normal form because of the following decision relationship:

(course name) → (credits)

(school number) → (name, age)

That is, the presence of a field in the combo key determines the non-keyword situation.

Because it does not conform to 2NF, the following questions exist for this class selection relationship:

(1) Data redundancy:

The same course by N students elective, "credit" repeated n-1 times, the same student elective m courses, name and age repeated m-1 times.

(2) Update exception:

If the credit of a course is adjusted, the "credits" value of all the rows in the data sheet should be updated, otherwise the same course credit will be different.

(3) Insert exception:

Suppose a new course is to be opened and no one has yet been enrolled. Thus, the course name and credits cannot be recorded in the database because there is no "learning number" keyword.

(4) Delete exception:

Assuming that a group of students has completed elective courses, these elective records should be removed from the database table. At the same time, however, the course name and credit information were also removed. Obviously, this can also lead to an insertion exception.

Change the course of the elective selectcourse to the following three tables:

Student: Student (school number, name, age);

Course: Course (course name, credits);

Elective relationship: Selectcourse (School number, course name, score).

Such database tables conform to the second paradigm, eliminating data redundancy, update exceptions, insert exceptions, and delete exceptions.

In addition, all single-key database tables conform to the second normal form, as there is no possible combination of keywords.

1.3 Third normal form (3NF) property does not depend on other non-primary properties [eliminate transitive dependencies]

If the relationship pattern R is the second normal, and each non-primary attribute does not pass a candidate key that relies on R, then R is called a third-paradigm pattern.

Satisfying the third normal form (3NF) must first satisfy the second normal form (2NF). The third paradigm (3NF) requires that a database table not contain non-primary key information already contained in other tables.

For example, there is a departmental information table, where each department has a department number (dept_id), a department name, a department profile, and so on. Then the department number is listed in the Employee Information table, the department name, department profile and other departments related information can no longer be added to the Employee Information table. If there is no departmental information table, it should be built according to the third paradigm (3NF), otherwise there will be a lot of data redundancy.

The third paradigm (3NF): On the basis of the second paradigm, if there is no non-critical field in the data table the transfer function dependency on either of the candidate key fields conforms to the third paradigm. In short, the third paradigm is that properties do not depend on other non-principal properties.

The so-called transfer function dependency, refers to if there is a "a→b→c" decision relationship, the C transfer function depends on A.

Therefore, a database table that satisfies the third paradigm should not have the following dependencies:

key field → Non-critical field x→ non-critical field Y

Assume that the Student relationship table is student (school number, name, age, school, college location, college phone), the keyword is a single keyword "study number" because of the following decision relationship:

(school number) → (name, age, school, college location, college phone)

This database is 2NF compliant, but does not conform to 3NF because of the following decision relationship:

(school number) → (school) → (college location, college phone)

That is, there is a non-critical field "College location", "College phone" to the key field "study number" of the transfer function dependency.

It also has data redundancy, update exceptions, insert exceptions, and delete exceptions, which readers can analyze on their own.

The Student relations table is divided into the following two tables:

Student: (School number, name, age, school);

College: (College, location, telephone).

Such database tables conform to the third paradigm, eliminating data redundancy, update exceptions, insert exceptions, and delete exceptions.



Cases:

The most common design paradigms in real-world development are three:

1. First paradigm (ensuring that each column remains atomic)

The first paradigm is the most basic paradigm. If all the field values in a database table are non-exploded atomic values , the database table satisfies the first paradigm.

The rational follow-up of the first paradigm needs to be determined according to the actual needs of the system. For example, some database systems need to use the "address" attribute, the "address" attribute should be directly designed as a database table field. However, if the system often accesses the "city" part of the "address" attribute, then it is not to be the "address" attribute to be re-split into provinces, cities, detailed address and other parts of storage, so that in the address of a part of the operation will be very convenient. This design satisfies the first paradigm of the database, as shown in the following table.

The user information shown in the table above follows the requirements of the first paradigm, which makes it very convenient to classify users using cities, and also improves the performance of the database.

2. Second paradigm (ensure that each column in the table is related to the primary key)

The second paradigm is based on the first paradigm in a more advanced layer. The second paradigm needs to ensure that each column in a database table is related to the primary key, not just one part of the primary key (primarily for the Federated primary key). In other words, in a database table, only one data can be saved in a table, and multiple data cannot be saved in the same database table.

For example, to design an order information table, because there may be more than one item in the order, the order number and the product number are used as the federated primary key for the database table, as shown in the following table.

Order Information Form

This creates a problem: The table is the Union primary key with the order number and the product number. In this table, the product name, unit, commodity price and other information is not related to the table's primary key, but only related to the product number. So this violates the design principle of the second paradigm.

And if the Order Information table is split, the product information is separated into another table, the Order Item table is also separated into another table, it is perfect. as shown below.

This design, to a large extent, reduces the redundancy of the database. If you want to get the product information for an order, use the item number to inquire in the product information sheet.

3. Third paradigm (ensure that each column is directly related to the primary key column, not indirectly)

The third paradigm needs to ensure that each column of data in a data table is directly related to the primary key, not indirectly .

For example, when designing an order data table, the customer number can be used as a foreign key and order table to establish the corresponding relationship. Instead of adding fields to the order form about other customer information (such as name, company, etc.). The design shown in the following two tables is a database table that satisfies the third paradigm.

This way, when querying the order information, we can use the customer number to refer to the records in the Customer information table, and do not have to enter the contents of the customer information in the Order Information table more than once, reducing the data redundancy


A detailed explanation of the three main paradigms of database

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.