Repost: 14 Tips for Database Design (Review)

Source: Internet
Author: User

1. Relationship between original documents and entities
The relationship can be one-to-one, one-to-many, or many-to-many. In general it is one-to-one: one original document corresponds to exactly one entity. In special cases it may be one-to-many or many-to-one, that is, one original document corresponds to multiple entities, or multiple original documents correspond to one entity. An entity here can be understood as a basic table. Clarifying this correspondence is a great help when designing the input interface.
[Example 1]: An employee resume. In a human resources information system it corresponds to three basic tables: the employee basic information table, the social relationship table, and the work history table. This is a typical example of "one original document corresponds to multiple entities".
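As a rough illustration of Example 1, the sketch below stores one resume in three basic tables inside a single transaction. It uses Python's built-in sqlite3 module; the table names, column names, and the save_resume helper are hypothetical and are not taken from any particular HR system.

```python
import sqlite3

# Hypothetical schema: one original document (a resume) feeds three basic tables.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE social_relation (
    emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
    relative TEXT NOT NULL,
    relation TEXT NOT NULL
);
CREATE TABLE work_history (
    emp_id     INTEGER NOT NULL REFERENCES employee(emp_id),
    employer   TEXT NOT NULL,
    start_year INTEGER NOT NULL
);
""")

def save_resume(conn, resume):
    """Write one resume (a dict) into the three tables in one transaction."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO employee (name) VALUES (?)",
                           (resume["name"],))
        emp_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO social_relation VALUES (?, ?, ?)",
            [(emp_id, r["relative"], r["relation"]) for r in resume["relations"]],
        )
        conn.executemany(
            "INSERT INTO work_history VALUES (?, ?, ?)",
            [(emp_id, w["employer"], w["start_year"]) for w in resume["jobs"]],
        )
    return emp_id

resume = {
    "name": "Ann",
    "relations": [
        {"relative": "Li", "relation": "mother"},
        {"relative": "Wei", "relation": "brother"},
    ],
    "jobs": [
        {"employer": "Acme", "start_year": 2015},
        {"employer": "Globex", "start_year": 2019},
    ],
}
emp_id = save_resume(conn, resume)
counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("employee", "social_relation", "work_history")
}
print(counts)
```

An input screen designed around the resume as a whole can call one routine like this, rather than asking the user to fill in three separate forms.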

2. Primary keys and foreign keys
Generally, an entity cannot have neither a primary key nor a foreign key. In the E-R diagram, an entity at a leaf position may define a primary key, or may leave it undefined (because it has no children), but it must have a foreign key (because it has a parent).
The design of primary keys and foreign keys plays an important role in the design of a global database. After the design of a global database was complete, a database design expert in the United States said: "Keys are everywhere. There is nothing except keys." This is his database design experience, and it also reflects his highly abstract thinking about the core of an information system (the data model). The reason: a primary key is a high-level abstraction of an entity, and a primary key paired with a foreign key expresses the connection between entities.
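The following minimal sketch, assuming SQLite through Python's sqlite3 module, shows a parent entity with a primary key and a leaf entity that declares only a foreign key. The department and phone_extension tables are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Parent entity: has a primary key that children can point at.
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
-- Leaf entity: declares no primary key of its own, but must have a
-- foreign key because it has a parent.
CREATE TABLE phone_extension (
    dept_id   INTEGER NOT NULL REFERENCES department(dept_id),
    extension TEXT NOT NULL
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Billing')")
conn.execute("INSERT INTO phone_extension VALUES (1, '4201')")

# A child row whose parent does not exist is rejected by the FK.
try:
    conn.execute("INSERT INTO phone_extension VALUES (99, '0000')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The PK/FK pair is what connects the two entities in a join.
rows = conn.execute("""
    SELECT d.name, p.extension
    FROM department AS d
    JOIN phone_extension AS p ON p.dept_id = d.dept_id
""").fetchall()
print(orphan_rejected, rows)
```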

3. Properties of basic tables
A basic table differs from an intermediate table or a temporary table because it has the following four properties:
(1) Atomicity. The fields in a basic table cannot be decomposed further.
(2) Primitiveness. The records in a basic table are records of raw data (basic data).
(3) Derivability. All output data can be derived from the data in the basic tables and code tables.
(4) Stability. The structure of a basic table is relatively stable, and its records must be kept for a long time.
Once you understand these properties, you can tell basic tables apart from intermediate tables and temporary tables when designing a database.

4. Normal form standards
The relationship between a basic table and its fields should satisfy the third normal form as far as possible. However, a database design that satisfies the third normal form is often not the best design. To improve the operational efficiency of the database, it is often necessary to relax the normal form standard: add redundancy where appropriate, trading space for time.
[Example 2]: There is a basic table for storing commodities, as shown in Table 1. The presence of the "amount" field means the table does not satisfy the third normal form, because "amount" can be obtained by multiplying "unit price" by "quantity", which makes "amount" a redundant field. However, adding the redundant field "amount" speeds up queries and statistics. This is the practice of trading space for time.
In Rose 2002, columns are of two types: data columns and calculated columns. Columns such as "amount" are called "calculated columns", while columns such as "unit price" and "quantity" are called "data columns".
Table 1: Structure of the commodity table
Product name   Product model   Unit price   Quantity   Amount
TV             29              2,500        40         100,000
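The trade-off in Example 2 can be sketched with the row from Table 1. In the hypothetical schema below, commodity_3nf keeps only data columns and derives "amount" in a view at query time, while commodity_denorm stores "amount" as a redundant calculated column. The table and view names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Third normal form: only data columns are stored.
CREATE TABLE commodity_3nf (
    name TEXT, model TEXT, unit_price INTEGER, quantity INTEGER
);
-- "amount" is computed every time the view is read (costs time).
CREATE VIEW commodity_with_amount AS
    SELECT name, model, unit_price, quantity,
           unit_price * quantity AS amount
    FROM commodity_3nf;

-- Relaxed design: "amount" is stored as well (costs space).
CREATE TABLE commodity_denorm (
    name TEXT, model TEXT, unit_price INTEGER, quantity INTEGER,
    amount INTEGER
);
""")
conn.execute("INSERT INTO commodity_3nf VALUES ('TV', '29', 2500, 40)")
conn.execute(
    "INSERT INTO commodity_denorm VALUES ('TV', '29', 2500, 40, 2500 * 40)"
)

derived = conn.execute(
    "SELECT amount FROM commodity_with_amount").fetchone()[0]
stored = conn.execute(
    "SELECT amount FROM commodity_denorm").fetchone()[0]
print(derived, stored)
```

With the stored column, the application (or a trigger) has to recompute "amount" whenever "unit price" or "quantity" changes.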

5. A plain-language understanding of the three normal forms
A plain-language understanding of the three normal forms is of great benefit to database design. To apply them well in database design, the three normal forms should be understood in the following informal, simplified sense (these are not the formal definitions):
First normal form: 1NF is an atomicity constraint on attributes; it requires that attributes be atomic and not decomposable further;
Second normal form: 2NF is a uniqueness constraint on records; it requires that each record have a unique identifier, that is, entity uniqueness;
Third normal form: 3NF is a constraint on field redundancy; it requires that no field can be derived from other fields, that is, that fields are not redundant.
A database design with no redundancy is achievable. However, a database with no redundancy is not necessarily the best database. Sometimes, to improve operational efficiency, the normal form standard must be lowered and some redundant data retained. The specific approach is to follow the third normal form when designing the conceptual data model and to defer lowering the normal form standard until the physical data model is designed. Lowering the normal form means adding fields and allowing redundancy.

6. Be good at identifying and correctly handling many-to-many relationships
If there is a many-to-many relationship between two entities, this relationship should be eliminated. The way to do so is to add a third entity between the two. The original many-to-many relationship then becomes two one-to-many relationships, and the attributes of the original two entities must be distributed sensibly among the three entities. The third entity is essentially a more complex relationship, and it corresponds to a basic table. Generally speaking, database design tools cannot identify many-to-many relationships, but they can handle them.
[Example 3]: In a "library information system", "book" is an entity and "reader" is also an entity. The relationship between these two entities is a typical many-to-many relationship: a book can be borrowed by multiple readers at different times, and a reader can borrow multiple books. To handle this, add a third entity between the two, named "borrow and return", whose attributes are the borrow/return time and a borrow/return flag (0 means borrowing, 1 means returning). It should also have two foreign keys (the primary key of "book" and the primary key of "reader") so that it can be connected to "book" and "reader".
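Example 3 can be sketched as follows, again assuming SQLite via Python's sqlite3 module. The borrow_return table plays the role of the third entity; the column names and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE reader (reader_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

-- The third entity: one row per borrow or return event.
CREATE TABLE borrow_return (
    book_id    INTEGER NOT NULL REFERENCES book(book_id),
    reader_id  INTEGER NOT NULL REFERENCES reader(reader_id),
    event_time TEXT    NOT NULL,
    flag       INTEGER NOT NULL CHECK (flag IN (0, 1))  -- 0 borrow, 1 return
);

INSERT INTO book   VALUES (1, 'SQL Basics'), (2, 'Data Modeling');
INSERT INTO reader VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO borrow_return VALUES
    (1, 1, '2024-01-05', 0),
    (1, 1, '2024-01-20', 1),
    (1, 2, '2024-02-01', 0),
    (2, 1, '2024-02-03', 0);
""")

# One book, many readers.
readers_of_book_1 = [row[0] for row in conn.execute("""
    SELECT DISTINCT r.name
    FROM borrow_return AS br
    JOIN reader AS r ON r.reader_id = br.reader_id
    WHERE br.book_id = 1 AND br.flag = 0
    ORDER BY r.name
""")]

# One reader, many books.
books_of_ann = [row[0] for row in conn.execute("""
    SELECT DISTINCT b.title
    FROM borrow_return AS br
    JOIN book AS b ON b.book_id = br.book_id
    WHERE br.reader_id = 1 AND br.flag = 0
    ORDER BY b.title
""")]
print(readers_of_book_1, books_of_ann)
```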

7. How to choose primary key (PK) values
The PK is a tool that programmers use to join tables. It can be a numeric string with no physical meaning, generated automatically by the program by adding 1. It can also be a physically meaningful field or a combination of fields. The former is better than the latter. When the PK is a combination of fields, the number of fields should not be too large; otherwise the index takes up a lot of space and is slow.
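The two options can be compared in a small sketch. It assumes SQLite, where an INTEGER PRIMARY KEY AUTOINCREMENT column stands in for the "add 1" surrogate key; the order_surrogate and order_composite tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Option 1: a meaningless number generated by adding 1.
CREATE TABLE order_surrogate (
    order_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    customer   TEXT    NOT NULL,
    order_date TEXT    NOT NULL,
    seq        INTEGER NOT NULL
);
-- Option 2: a combination of physically meaningful fields.
CREATE TABLE order_composite (
    customer   TEXT    NOT NULL,
    order_date TEXT    NOT NULL,
    seq        INTEGER NOT NULL,
    PRIMARY KEY (customer, order_date, seq)
);
""")

# The surrogate key is assigned by the database, one step at a time.
ids = []
for seq in (1, 2, 3):
    cur = conn.execute(
        "INSERT INTO order_surrogate (customer, order_date, seq) "
        "VALUES (?, ?, ?)",
        ("acme", "2024-03-01", seq),
    )
    ids.append(cur.lastrowid)

# PRAGMA table_info reports, in its last column, each column's position
# in the primary key (0 means the column is not part of the key).
composite_pk = [row[1] for row in
                conn.execute("PRAGMA table_info(order_composite)")
                if row[5] > 0]
print(ids, composite_pk)
```

A child table that references order_surrogate carries one key column, while a child of order_composite has to carry and index all three.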

8. Correct understanding of data redundancy
Repeating primary keys and foreign keys across multiple tables is not data redundancy. This concept must be clear, yet in fact many people are still unclear about it. The repeated occurrence of non-key fields is data redundancy, and it is low-level redundancy, that is, repetitive redundancy. High-level redundancy is not the repeated occurrence of a field but a field that is derived.
[Example 4]: the "unit price", "quantity", and "amount" fields in the commodity table. "Amount" is derived from "unit price" multiplied by "quantity"; it is redundant, and it is a kind of high-level redundancy. The purpose of redundancy is to speed up processing. Only low-level redundancy increases data inconsistency, because the same data may be entered multiple times at different times, in different places, and by different roles. Therefore, we advocate high-level redundancy (derived redundancy) and oppose low-level redundancy (repetitive redundancy).

9. There is no standard answer for the E-R diagram
There is no standard answer for the E-R diagram of an information system, because its design and drawing are not unique; as long as it covers the business scope and functional content required by the system, it is feasible. Otherwise, the E-R diagram should be revised. Although there is no single standard answer, that does not mean it can be designed arbitrarily. The criteria for a good E-R diagram are: clear structure, concise associations, a moderate number of entities, reasonable attribute allocation, and no low-level redundancy.

10. Views are useful in database design
Unlike basic tables, code tables, and intermediate tables, a view is a virtual table that depends on the real tables of the data source. A view is a window through which programmers use the database, a form of combining basic table data, a method of data processing, and a means of keeping user data confidential. To perform complex processing, increase computing speed, and save storage space, the definition depth of views should generally not exceed three layers. If three layers of views are not enough, define a temporary table on the views and then define views on the temporary table. Defined iteratively in this way, the depth of views is unlimited.
Views play an even more important role in certain information systems related to national political, economic, technical, military, and security interests. After the physical design of the basic tables of such a system is complete, a first layer of views is created on the basic tables; the number and structure of these views are exactly the same as the number and structure of the basic tables. It is also stipulated that all programmers may operate only on the views. Only the database administrator, holding a "security key" jointly controlled by several people, may operate directly on the basic tables. Readers are invited to think about why.
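A minimal sketch of layered views, assuming SQLite via Python's sqlite3 module. The first-layer view mirrors the basic table column for column, and a second-layer view is defined on the first. All names are hypothetical, and SQLite has no user permissions, so the access restriction itself would have to come from a DBMS that supports it and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_base (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT,
    dept   TEXT,
    salary INTEGER
);
INSERT INTO employee_base VALUES
    (1, 'Ann', 'R&D',   9000),
    (2, 'Bob', 'R&D',   7000),
    (3, 'Cy',  'Sales', 6000);

-- Layer 1: same number and structure of columns as the basic table.
CREATE VIEW employee_v1 AS
    SELECT emp_id, name, dept, salary FROM employee_base;

-- Layer 2: defined on layer 1; summarizes the data and hides
-- individual salaries from whoever reads it.
CREATE VIEW dept_payroll_v2 AS
    SELECT dept, COUNT(*) AS headcount, SUM(salary) AS total_salary
    FROM employee_v1
    GROUP BY dept;
""")

# Application code reads the views, never employee_base directly.
payroll = conn.execute("""
    SELECT dept, headcount, total_salary
    FROM dept_payroll_v2
    ORDER BY dept
""").fetchall()
print(payroll)
```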

11. Intermediate tables, reports, and temporary tables
An intermediate table is a table that stores statistical data. It is designed for data warehouses, output reports, or query results, and sometimes it has no primary key or foreign key (except in data warehouses). Temporary tables are designed by programmers to store temporary records for personal use. Basic tables and intermediate tables are maintained by the DBA; temporary tables are maintained by the programmers themselves, automatically through their programs.

12. Integrity constraints take three forms
Domain integrity: implemented with CHECK constraints. In a database design tool, when defining the value range of a field, there is a Check button through which the field's value range is defined.
Referential integrity: implemented with PK, FK, and table-level triggers.
User-defined integrity: business rules, implemented with stored procedures and triggers.
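The three kinds of constraint can be sketched together as below. The example assumes SQLite via Python's sqlite3 module; SQLite has no stored procedures, so the business rule is shown with a trigger only. The account and withdrawal tables, the 5000 limit, and the rejected helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    -- Domain integrity: the value range is limited by CHECK.
    balance    INTEGER NOT NULL CHECK (balance >= 0)
);
CREATE TABLE withdrawal (
    withdrawal_id INTEGER PRIMARY KEY,
    -- Referential integrity: every withdrawal needs an existing account.
    account_id    INTEGER NOT NULL REFERENCES account(account_id),
    amount        INTEGER NOT NULL CHECK (amount > 0)
);
-- User-defined integrity: an invented business rule in a trigger.
CREATE TRIGGER withdrawal_limit
BEFORE INSERT ON withdrawal
WHEN NEW.amount > 5000
BEGIN
    SELECT RAISE(ABORT, 'withdrawal exceeds limit');
END;
INSERT INTO account VALUES (1, 10000);
""")

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        with conn:  # roll back automatically if the statement fails
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

domain_violation = rejected("INSERT INTO account VALUES (2, -1)")
reference_violation = rejected(
    "INSERT INTO withdrawal (account_id, amount) VALUES (99, 100)")
rule_violation = rejected(
    "INSERT INTO withdrawal (account_id, amount) VALUES (1, 6000)")
valid_rejected = rejected(
    "INSERT INTO withdrawal (account_id, amount) VALUES (1, 100)")
print(domain_violation, reference_violation, rule_violation, valid_rejected)
```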

13. The way to prevent patchwork database design is the "three fewer" principle
(1) The fewer tables in a database, the better. Only when the number of tables is small does it show that the system's E-R diagram is concise and refined, that duplicate and redundant entities have been removed, that a high degree of abstraction of the objective world has been formed, and that systematic data integration has been carried out, which prevents patchwork design;
(2) The fewer fields in a table's composite primary key, the better. The primary key serves two purposes: one is to build the primary key index, the other is to act as the foreign key of child tables. Fewer fields in the composite primary key therefore save not only running time but also index storage space;
(3) The fewer fields in a table, the better. Only when the number of fields is small does it show that there is no duplicated data in the system and little data redundancy. More importantly, it pushes readers to learn to "change columns to rows", which prevents fields of a child table from being pulled into the main table and leaving many empty fields there. "Changing columns to rows" means pulling part of the content out of the main table and building a separate child table for it. The method is simple, yet some people are just not used to it, do not adopt it, and do not apply it.
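"Changing columns to rows" can be sketched as follows, assuming SQLite via Python's sqlite3 module. The wide table with three phone columns and the employee_phone child table are hypothetical examples chosen to show the empty fields disappearing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Before: child data pulled into the main table as repeated columns.
CREATE TABLE employee_wide (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT,
    phone1 TEXT, phone2 TEXT, phone3 TEXT
);
INSERT INTO employee_wide VALUES
    (1, 'Ann', '555-0101', NULL,       NULL),
    (2, 'Bob', '555-0102', '555-0103', NULL);

-- After: the repeated columns become rows of a separate child table.
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee_phone (
    emp_id INTEGER NOT NULL REFERENCES employee(emp_id),
    phone  TEXT    NOT NULL
);
INSERT INTO employee SELECT emp_id, name FROM employee_wide;
INSERT INTO employee_phone
    SELECT emp_id, phone1 FROM employee_wide WHERE phone1 IS NOT NULL
    UNION ALL
    SELECT emp_id, phone2 FROM employee_wide WHERE phone2 IS NOT NULL
    UNION ALL
    SELECT emp_id, phone3 FROM employee_wide WHERE phone3 IS NOT NULL;
""")

# Empty fields left behind in the wide design.
blank_cells = conn.execute("""
    SELECT SUM((phone1 IS NULL) + (phone2 IS NULL) + (phone3 IS NULL))
    FROM employee_wide
""").fetchone()[0]

# The child table holds only the values that actually exist.
phone_rows = conn.execute(
    "SELECT emp_id, phone FROM employee_phone ORDER BY emp_id, phone"
).fetchall()
print(blank_cells, phone_rows)
```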
The practical principle of database design is to find the right balance between data redundancy and processing speed. The "three fewer" principle is a holistic concept and a comprehensive viewpoint; no one of the three rules can be taken in isolation. The principle is relative, not absolute. The opposite, a "three more" principle, is certainly wrong. Consider: to cover the same system functionality, an E-R diagram with one hundred entities (one thousand attributes in total) is certainly much better than an E-R diagram with two hundred entities (two thousand attributes in total).
Advocating the "three fewer" principle asks readers to learn to use database design techniques for systematic data integration. The steps of data integration are: integrate file systems into application databases, application databases into subject databases, and subject databases into a global integrated database. The higher the degree of integration, the stronger the data sharing, the fewer the information islands, and the fewer the entities, primary keys, and attributes in the global E-R diagram of the whole enterprise information system.
The purpose of advocating the "three fewer" principle is to prevent readers from using patching techniques to keep adding to, deleting from, and modifying the database, turning the enterprise database into a "garbage dump" of arbitrarily designed tables, or a "junkyard" of tables. In the end the basic tables, code tables, intermediate tables, and temporary tables in the database become disorganized and countless, and the information systems of enterprises and institutions become impossible to maintain and grind to a halt.
Anyone can follow the "three more" principle; it is the theory behind designing databases by the "patching method". The "three fewer" principle requires a high level of database design skill and art, and not everyone can achieve it, because it is the theoretical basis for putting an end to database design by the "patching method".

14. Ways to Improve Database Operation Efficiency
Under given system hardware and system software conditions, the ways to improve the operational efficiency of a database system are as follows:
(1) During physical database design, lower the normal form, add redundancy, use fewer triggers, and use more stored procedures.
(2) When the computation is very complex and the number of records is very large (for example, 10 million records), the complex computation should first be done outside the database, processed as files with C++, and the results then appended to the table. This is experience from designing telecom billing systems.
(3) If a table is found to have too many records, for example more than 10 million, split it horizontally. Horizontal splitting takes some value of the table's primary key PK as the boundary and splits the table's records horizontally into two tables. If a table is found to have too many fields, split it vertically, dividing the original table into two tables.
(4) Optimize the database management system (DBMS) at the system level, that is, tune its system parameters, such as the number of buffers.
(5) When using the data-oriented SQL language in programs, optimize the algorithms as much as possible.
In short, to improve the operational efficiency of the database, work must proceed on three levels at the same time: database system-level optimization, database design-level optimization, and program-level optimization.
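The horizontal and vertical splitting described in item (3) can be sketched at toy scale. The example assumes SQLite via Python's sqlite3 module; the call_record table, the ten sample rows, and the BOUNDARY value are hypothetical, and a real system would pick the boundary from its own key distribution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE call_record (
        call_id INTEGER PRIMARY KEY,
        caller  TEXT,
        seconds INTEGER,
        notes   TEXT
    )
""")
original = [(i, f"user{i % 3}", i * 10, f"note {i}") for i in range(1, 11)]
conn.executemany("INSERT INTO call_record VALUES (?, ?, ?, ?)", original)

BOUNDARY = 5  # hypothetical primary key value used as the split point

conn.executescript(f"""
-- Horizontal split: same columns, rows divided at the PK boundary.
-- (CREATE TABLE ... AS copies data only, not the PK constraint.)
CREATE TABLE call_record_low  AS
    SELECT * FROM call_record WHERE call_id <= {BOUNDARY};
CREATE TABLE call_record_high AS
    SELECT * FROM call_record WHERE call_id >  {BOUNDARY};

-- Vertical split: same rows, columns divided; both halves share the PK.
CREATE TABLE call_core (
    call_id INTEGER PRIMARY KEY, caller TEXT, seconds INTEGER
);
CREATE TABLE call_detail (
    call_id INTEGER PRIMARY KEY REFERENCES call_core(call_id),
    notes   TEXT
);
INSERT INTO call_core   SELECT call_id, caller, seconds FROM call_record;
INSERT INTO call_detail SELECT call_id, notes FROM call_record;
""")

low_count = conn.execute(
    "SELECT COUNT(*) FROM call_record_low").fetchone()[0]
high_count = conn.execute(
    "SELECT COUNT(*) FROM call_record_high").fetchone()[0]

# Joining the two vertical halves on the PK restores the original rows.
rejoined = conn.execute("""
    SELECT c.call_id, c.caller, c.seconds, d.notes
    FROM call_core AS c
    JOIN call_detail AS d ON d.call_id = c.call_id
    ORDER BY c.call_id
""").fetchall()
print(low_count, high_count, rejoined == original)
```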

The fourteen tips above were gradually distilled by many people from a large amount of database analysis and design practice. Readers should not apply this experience rigidly or memorize it by rote; they should digest and understand it, be realistic, and use it flexibly, gradually reaching the point of developing through application and applying through development.
