14 Tips for Database Design

Source: Internet
Author: User

1. Relationship between the original document and the entity
The relationship can be one-to-one, one-to-many, or many-to-many. In general it is one-to-one: an original document corresponds to one and only one entity. In special cases it may be one-to-many or many-to-one, that is, a single original document corresponds to multiple entities, or multiple original documents correspond to one entity. The entities here can be understood as base tables. Understanding this relationship is a great help when designing the data-entry interface.
Example 1: An employee's résumé record in a human resources information system corresponds to three base tables: the employee basic information table, the social relationship table, and the work history table. This is a typical example of "a single original document corresponding to multiple entities".
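
The mapping in Example 1 can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, and sample résumé below are assumptions, not part of any real HR system. One input document is written to three base tables in a single transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    );
    CREATE TABLE social_relation (
        rel_id   INTEGER PRIMARY KEY,
        emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
        relative TEXT NOT NULL,
        relation TEXT NOT NULL
    );
    CREATE TABLE work_history (
        job_id     INTEGER PRIMARY KEY,
        emp_id     INTEGER NOT NULL REFERENCES employee(emp_id),
        employer   TEXT NOT NULL,
        start_year INTEGER NOT NULL
    );
""")

# One original document (a hypothetical resume form).
resume = {
    "name": "Li Ming",
    "relations": [("Li Hua", "father"), ("Wang Fang", "spouse")],
    "jobs": [("Acme Ltd", 2015), ("Globex", 2019)],
}

def save_resume(conn, doc):
    """Store one resume document across the three base tables."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute("INSERT INTO employee (name) VALUES (?)", (doc["name"],))
        emp_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO social_relation (emp_id, relative, relation) VALUES (?, ?, ?)",
            [(emp_id, relative, relation) for relative, relation in doc["relations"]],
        )
        conn.executemany(
            "INSERT INTO work_history (emp_id, employer, start_year) VALUES (?, ?, ?)",
            [(emp_id, employer, year) for employer, year in doc["jobs"]],
        )
    return emp_id

emp_id = save_resume(conn, resume)
counts = [
    conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("employee", "social_relation", "work_history")
]
print(counts)  # one employee row, two relation rows, two job rows
```

Knowing that the form fans out into three tables is what lets the entry screen be designed as one header block plus two repeating grids.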

2. Primary key and foreign key
Generally, an entity cannot lack both a primary key and a foreign key. In an E-R diagram, an entity in a leaf position may define a primary key or may leave it undefined (because it has no descendants), but it must have a foreign key (because it has a parent).
The design of primary keys and foreign keys plays an important role in the design of a global database. After completing the design of a global database, a U.S. database design expert said: "Keys, keys everywhere; there is nothing but keys." This sums up his database design experience and also reflects his highly abstract view of the core of an information system (the data model). The reason is that a primary key is a high-level abstraction of an entity, and a primary key paired with a foreign key represents a connection between entities.

3. Properties of base tables
A base table differs from an intermediate table or a temporary table because it has the following four properties:
(1) Atomicity. The fields in a base table cannot be decomposed further.
(2) Primitiveness. The records in a base table are records of the original (underlying) data.
(3) Deducibility. All output data can be derived from the data in the base tables and code tables.
(4) Stability. The structure of a base table is relatively stable, and its records are kept for a long time.
Once you understand the properties of base tables, you can distinguish base tables from intermediate tables and temporary tables when designing a database.

4. Normal form standards
The relationship between a base table and its fields should satisfy the third normal form as far as possible. However, a database design that satisfies the third normal form is often not the best design. To improve the execution efficiency of the database, it is often necessary to lower the normal form standard and add redundancy appropriately, trading space for time.
Example 2: There is a base table for storing goods, as shown in Table 1. The presence of the "Amount" field shows that the table design does not satisfy the third normal form, because "Amount" can be obtained by multiplying "Unit price" by "Quantity", which means "Amount" is a redundant field. However, adding the redundant "Amount" field speeds up query statistics; this is the practice of trading space for time.
In Rose 2002, there are two types of columns: data columns and computed columns. A column such as "Amount" is called a computed column, and columns such as "Unit price" and "Quantity" are called data columns.
Table 1  Structure of the goods table
Commodity name    Commodity model    Unit price    Quantity    Amount
TV                29"                2,500         40          100,000
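
As a rough sketch of the trade-off in Example 2, the snippet below stores the redundant amount at write time and then compares a query that recomputes the product with one that reads the stored column. It uses Python's built-in sqlite3 module; the table name `goods`, the helper `add_goods`, and the second sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE goods (
        name       TEXT,
        model      TEXT,
        unit_price INTEGER,
        quantity   INTEGER,
        amount     INTEGER   -- redundant: always unit_price * quantity
    )
""")

def add_goods(conn, name, model, unit_price, quantity):
    """Insert a row, filling in the redundant amount when the row is written."""
    conn.execute(
        "INSERT INTO goods VALUES (?, ?, ?, ?, ?)",
        (name, model, unit_price, quantity, unit_price * quantity),
    )

add_goods(conn, "TV", '29"', 2500, 40)    # the row from Table 1
add_goods(conn, "Radio", "R1", 100, 5)    # hypothetical second row

# Without the redundant column: recompute the product for every row.
computed = conn.execute("SELECT SUM(unit_price * quantity) FROM goods").fetchone()[0]
# With the redundant column: read the stored value directly.
stored = conn.execute("SELECT SUM(amount) FROM goods").fetchone()[0]
print(computed, stored)
```

Both queries return the same total only as long as every write path keeps `amount` in step with the two data columns; that maintenance cost is the price paid for the faster read.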
  
5. A popular understanding of the three normal forms
A popular understanding of the three normal forms is a great help in database design. To apply the three normal forms well, a designer needs a working understanding of them (one that is good enough to use, not the most rigorous or precise one):
The first normal form: 1NF is an atomicity constraint on attributes. It requires attributes to be atomic and not decomposable.
The second normal form: 2NF is a uniqueness constraint on records. It requires each record to have a unique identity, that is, the uniqueness of the entity.
The third normal form: 3NF is a constraint on field redundancy. It requires that no field can be derived from other fields, so that fields carry no redundancy.
Strictly speaking, 2NF also requires every non-key field to depend on the whole primary key, and 3NF requires that no non-key field depend on another non-key field; the versions above are shorthand.
A database design with no redundancy is achievable. However, a database without redundancy is not necessarily the best database; sometimes, to improve operating efficiency, it is necessary to lower the normal form standard and keep some redundant data. The concrete approach is to follow the third normal form when designing the conceptual data model, and to consider lowering the standard when designing the physical data model. Lowering the normal form means adding fields and allowing redundancy.

6. Be good at identifying and correctly handling many-to-many relationships
If there is a many-to-many relationship between two entities, the relationship should be eliminated. The way to eliminate it is to add a third entity between the two. The original many-to-many relationship then becomes two one-to-many relationships, and the attributes of the original two entities should be distributed sensibly among the three entities. The third entity is essentially a more complex relationship, and it corresponds to a base table. Generally speaking, database design tools cannot recognize many-to-many relationships, but they can handle them.
Example 3: In a library information system, "book" is an entity and "reader" is also an entity. The relationship between these two entities is a typical many-to-many relationship: a book can be borrowed by multiple readers at different times, and a reader can borrow multiple books. To handle this, add a third entity between the two, named "Borrowed book", with the following attributes: the borrowing time and a borrow/return flag (0 means borrowed, 1 means returned). It should also have two foreign keys (the primary key of "book" and the primary key of "reader") so that it can be connected with both "book" and "reader".
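
Example 3 can be sketched as follows with Python's built-in sqlite3 module. The third entity is called `loan` here, and all table names, column names, and sample rows are illustrative assumptions; only the attributes (borrowing time, a 0/1 return flag, two foreign keys) come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE reader (reader_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

    -- The third entity: one row per borrowing event.
    CREATE TABLE loan (
        loan_id     INTEGER PRIMARY KEY,
        book_id     INTEGER NOT NULL REFERENCES book(book_id),
        reader_id   INTEGER NOT NULL REFERENCES reader(reader_id),
        borrowed_at TEXT    NOT NULL,
        returned    INTEGER NOT NULL DEFAULT 0 CHECK (returned IN (0, 1))
    );

    INSERT INTO book   VALUES (1, 'SQL Basics'), (2, 'Data Modeling');
    INSERT INTO reader VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO loan (book_id, reader_id, borrowed_at, returned) VALUES
        (1, 1, '2024-01-05', 1),
        (1, 2, '2024-02-10', 0),
        (2, 1, '2024-02-11', 0);
""")

# One book, many readers (first one-to-many relationship).
readers_of_book_1 = [row[0] for row in conn.execute("""
    SELECT reader.name FROM loan
    JOIN reader ON reader.reader_id = loan.reader_id
    WHERE loan.book_id = 1
    ORDER BY loan.borrowed_at
""")]

# One reader, many books (second one-to-many relationship).
books_of_ann = [row[0] for row in conn.execute("""
    SELECT book.title FROM loan
    JOIN book ON book.book_id = loan.book_id
    WHERE loan.reader_id = 1
    ORDER BY loan.borrowed_at
""")]

print(readers_of_book_1)
print(books_of_ann)
```

Neither `book` nor `reader` refers to the other directly; every question about who borrowed what goes through the `loan` table.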

7. How to choose primary key (PK) values
The PK is a tool programmers use to join tables. It can be a numeric string with no physical meaning, generated by the program itself, or it can be a physically meaningful field or combination of fields. The former is better than the latter. When the PK is a combination of fields, keep the number of fields small; with too many, the index takes up a lot of space and is also slow.
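
A minimal sketch of a meaningless numeric key, using Python's built-in sqlite3 module. Here SQLite assigns the number, which stands in for the program-generated key the text describes; the tables, columns, and card numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- number with no business meaning
        id_card_no  TEXT NOT NULL UNIQUE,  -- meaningful field, kept out of the key
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    );
""")

ids = []
for card, name in [("A-1001", "Ann"), ("B-2002", "Bob")]:
    cur = conn.execute(
        "INSERT INTO customer (id_card_no, name) VALUES (?, ?)", (card, name)
    )
    ids.append(cur.lastrowid)  # the generated key

conn.execute("INSERT INTO orders (customer_id) VALUES (?)", (ids[1],))

# The meaningful field changes, but the key and the child row stay untouched.
conn.execute(
    "UPDATE customer SET id_card_no = 'B-2002-NEW' WHERE customer_id = ?", (ids[1],)
)
row = conn.execute("""
    SELECT customer.name, customer.id_card_no
    FROM orders JOIN customer USING (customer_id)
""").fetchone()
print(ids, row)
```

Because the child table stores only one small integer per row, the join index stays compact, which is the space and speed argument made above.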

8. Understand data redundancy correctly
The repeated appearance of primary keys and foreign keys in multiple tables is not data redundancy. This concept must be clear, yet many people are still unclear about it. The repeated appearance of non-key fields is data redundancy, and it is a low-level kind: repetitive redundancy. High-level redundancy is not the repeated appearance of a field but the appearance of a field derived from others.
Example 4: Of the three fields "Unit price", "Quantity", and "Amount" for a product, "Amount" is derived from "Unit price" multiplied by "Quantity", so it is redundant, and it is a high-level redundancy. The purpose of redundancy is to improve processing speed. Only low-level redundancy increases data inconsistency, because the same data may be entered multiple times at different times, in different places, or by different roles. Therefore, we advocate high-level redundancy (derived redundancy) and oppose low-level redundancy (repetitive redundancy).

9. The E-R diagram has no standard answer
The E-R diagram of an information system has no standard answer, because its design and drawing are not unique; as long as it covers the business scope and functional content required by the system, it is feasible. Otherwise, the E-R diagram must be revised. Although there is no single standard answer, that does not mean it can be designed arbitrarily. The criteria for a good E-R diagram are: clear structure, simple associations, a moderate number of entities, reasonable allocation of attributes, and no low-level redundancy.

10. View technology is very useful in database design
Unlike base tables, code tables, and intermediate tables, a view is a virtual table that depends on the real tables of its data source. A view is a form in which programmers use the database, a way of synthesizing base-table data, a method of data processing, and a means of keeping user data confidential. To support complex processing, improve computing speed, and save storage space, the definition depth of views should generally not exceed three layers. If three layers of views are still not enough, define a temporary table on the views and then define views on the temporary table. Iterating in this way, the depth of views is unlimited.
Views matter even more for information systems related to national political, economic, technical, military, and security interests. Once the physical design of such a system's base tables is complete, a first layer of views is built on the base tables immediately; the number and structure of these views are exactly the same as the number and structure of the base tables. It is also stipulated that all programmers may operate only on the views. Only the database administrator, holding a "security key" jointly controlled by several people, can operate directly on the base tables. Readers are invited to consider: why is this?
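
The layering described above can be sketched with Python's built-in sqlite3 module. The table, view names, and data are assumptions for illustration. SQLite has no user permissions, so actually restricting programmers to the views would need the privilege system of a server DBMS, which this sketch does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_base (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary INTEGER NOT NULL
    );
    INSERT INTO employee_base VALUES
        (1, 'Ann', 'R&D', 9000),
        (2, 'Bob', 'R&D', 7000),
        (3, 'Cid', 'Ops', 6000);

    -- Layer 1: same structure as the base table; application code reads this.
    CREATE VIEW v_employee AS
        SELECT emp_id, name, dept, salary FROM employee_base;

    -- Layer 2: defined on the layer-1 view, never on the base table.
    CREATE VIEW v_dept_salary AS
        SELECT dept, COUNT(*) AS headcount, SUM(salary) AS total_salary
        FROM v_employee
        GROUP BY dept;
""")

rows = conn.execute("SELECT * FROM v_dept_salary ORDER BY dept").fetchall()
print(rows)
```

Because application queries name only `v_employee` and `v_dept_salary`, the base table can be reorganized later by redefining the layer-1 view without touching that code.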

11. Intermediate tables, reports, and temporary tables
An intermediate table stores statistical data. It is designed for data warehouses, output reports, or query results, and sometimes has no primary key or foreign key (except in a data warehouse). A temporary table is designed by an individual programmer to store temporary records for personal use. Base tables and intermediate tables are maintained by the DBA, while temporary tables are maintained automatically by the programmer's own programs.

12. Integrity constraints take three forms
Domain integrity: implemented with Check constraints. In database design tools, the dialog for defining a field's value range has a Check button, through which the field's range of values is defined.
Referential integrity: implemented with PK, FK, and table-level triggers.
User-defined integrity: business rules, implemented with stored procedures and triggers.
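
The three kinds of constraint can be sketched together using Python's built-in sqlite3 module. The schema, the age range, and the "salary may not decrease" rule are made-up examples. SQLite has no stored procedures, so the business rule is shown with a trigger only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER NOT NULL REFERENCES dept(dept_id),       -- referential integrity
        age     INTEGER NOT NULL CHECK (age BETWEEN 18 AND 65),  -- domain integrity
        salary  INTEGER NOT NULL
    );
    -- User-defined integrity: a hypothetical business rule enforced by a trigger.
    CREATE TRIGGER no_pay_cut BEFORE UPDATE OF salary ON employee
    WHEN NEW.salary < OLD.salary
    BEGIN
        SELECT RAISE(ABORT, 'salary may not decrease');
    END;
    INSERT INTO dept VALUES (1, 'R&D');
    INSERT INTO employee VALUES (1, 1, 30, 5000);
""")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False

results = [
    rejected("INSERT INTO employee VALUES (2, 1, 12, 4000)"),        # age outside the domain
    rejected("INSERT INTO employee VALUES (3, 99, 30, 4000)"),       # no such department
    rejected("UPDATE employee SET salary = 4000 WHERE emp_id = 1"),  # breaks the business rule
    rejected("UPDATE employee SET salary = 6000 WHERE emp_id = 1"),  # valid change
]
print(results)
```

The first three statements are refused by the database itself, so the rules hold no matter which program issues the SQL.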

13. The way to prevent patchwork database design is the "three fewer" principle
(1) The fewer tables in a database, the better. Only when the number of tables is small does it show that the system's E-R diagram is few but good: duplicate and redundant entities have been removed, a high degree of abstraction of the objective world has been achieved, and systematic data integration has been carried out, which prevents patchwork design.
(2) The fewer fields in a table's composite primary key, the better. The primary key serves two purposes: it is used to build the primary key index, and it acts as the foreign key of child tables. Fewer fields in the composite primary key therefore save both execution time and index storage space.
(3) The fewer fields in a table, the better. Only when the number of fields is small does it show that the system has no duplicated data and very little data redundancy. More importantly, readers are urged to learn to "turn columns into rows", which prevents fields that belong in a child table from being pulled into the main table and leaving many unused fields there. "Turning columns into rows" means pulling part of the main table's content out into a separate child table. The method is very simple, but some people are not used to it, do not adopt it, and do not apply it.
The practical principle of database design is to find the right balance between data redundancy and processing speed. "Three fewer" is a holistic concept and a comprehensive viewpoint; no single principle can be applied in isolation. The principle is relative, not absolute. The "three more" principle is certainly wrong. Consider: if they cover the same system functionality, an E-R diagram with 100 entities (1,000 attributes in total) is certainly much better than one with 200 entities (2,000 attributes in total).
Advocating the "three fewer" principle asks readers to learn to use database design techniques for system data integration. The steps of data integration are to integrate the file system into application databases, application databases into subject databases, and subject databases into a global comprehensive database. The higher the degree of integration, the stronger the data sharing and the fewer the information islands, and the fewer entities, primary keys, and attributes there will be in the global E-R diagram of the whole enterprise information system.
The purpose of advocating the "three fewer" principle is to keep readers from using patching techniques, that is, constantly adding to, deleting from, and modifying the database until the enterprise database becomes a "garbage heap" or "jumble" of arbitrarily designed tables. In the end the base tables, code tables, intermediate tables, and temporary tables become disorderly and countless, leaving the enterprise information system unmaintainable and paralyzed.
Anyone can follow the "three more" principle; it is the fallacy behind designing databases by the "patching method". The "three fewer" principle is a principle of few but good. It demands a high degree of database design skill and art, and not everyone can achieve it, because it is the theoretical basis for eliminating "patching method" database design.
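
Item (3) of this tip, pulling part of the main table out into a separate child table, can be sketched with Python's built-in sqlite3 module. The phone-number columns and all table names are hypothetical; they stand in for any group of spare fields in a main table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Wide design: spare columns sit in the main table, mostly empty.
    CREATE TABLE employee_wide (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        phone1 TEXT, phone2 TEXT, phone3 TEXT
    );
    INSERT INTO employee_wide VALUES
        (1, 'Ann', '555-0101', '555-0102', NULL),
        (2, 'Bob', '555-0201', NULL, NULL);

    -- Narrow design: the main table plus a child table with one row per phone.
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee_phone (
        emp_id INTEGER NOT NULL REFERENCES employee(emp_id),
        phone  TEXT NOT NULL,
        PRIMARY KEY (emp_id, phone)
    );
""")

# Move the data: each non-empty phone column becomes a row in the child table.
conn.execute("INSERT INTO employee SELECT emp_id, name FROM employee_wide")
for col in ("phone1", "phone2", "phone3"):
    conn.execute(
        f"INSERT INTO employee_phone SELECT emp_id, {col} FROM employee_wide "
        f"WHERE {col} IS NOT NULL"
    )

phones = conn.execute(
    "SELECT emp_id, phone FROM employee_phone ORDER BY emp_id, phone"
).fetchall()
print(phones)
```

The main table drops from five fields to two, no empty placeholders remain, and a fourth phone number needs a new row in the child table instead of a schema change.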

14. Ways to improve database execution efficiency
With the system hardware and system software given, the ways to improve the execution efficiency of a database system are:
(1) In the physical design of the database, lower the normal form, add redundancy, use triggers less, and use stored procedures more.
(2) When the computation is very complex and the number of records is very large (for example, 10 million), do the complex computation outside the database first, processing it in C++ on the file system, and then append the final results to the table. This is the experience of telecom billing system design.
(3) If a table is found to have too many records, for example more than 10 million, split the table horizontally. Horizontal splitting means dividing the table's records into two tables, using some value of the table's primary key PK as the boundary. If a table is found to have too many fields, for example more than 80, split the table vertically, decomposing the original table into two tables.
(4) Optimize the database management system (DBMS) itself, that is, tune its system parameters, such as the number of buffers.
(5) When programming in the data-oriented SQL language, use optimized algorithms as far as possible.
In short, improving database execution efficiency requires optimizing at three levels at once: the database system level, the database design level, and the program implementation level.
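
The horizontal split in item (3) can be sketched with Python's built-in sqlite3 module. This is a toy illustration under stated assumptions: ten rows stand in for ten million, and the table names, the `BOUNDARY` value, and the helpers `table_for` and `fee_of` are all hypothetical.

```python
import sqlite3

BOUNDARY = 5  # hypothetical PK value used as the dividing line

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE call_record   (rec_id INTEGER PRIMARY KEY, fee INTEGER NOT NULL);
    CREATE TABLE call_record_a (rec_id INTEGER PRIMARY KEY, fee INTEGER NOT NULL);
    CREATE TABLE call_record_b (rec_id INTEGER PRIMARY KEY, fee INTEGER NOT NULL);
""")
conn.executemany(
    "INSERT INTO call_record VALUES (?, ?)", [(i, i * 10) for i in range(1, 11)]
)

# Split the rows into two tables on either side of the boundary.
conn.execute(
    "INSERT INTO call_record_a SELECT * FROM call_record WHERE rec_id <= ?", (BOUNDARY,)
)
conn.execute(
    "INSERT INTO call_record_b SELECT * FROM call_record WHERE rec_id > ?", (BOUNDARY,)
)

def table_for(rec_id):
    """Pick the table that holds a given primary key value."""
    return "call_record_a" if rec_id <= BOUNDARY else "call_record_b"

def fee_of(rec_id):
    """Look up one record, touching only the table that can contain it."""
    row = conn.execute(
        f"SELECT fee FROM {table_for(rec_id)} WHERE rec_id = ?", (rec_id,)
    ).fetchone()
    return row[0] if row else None

sizes = [
    conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("call_record_a", "call_record_b")
]
print(sizes, fee_of(3), fee_of(8))
```

Each lookup now searches a table half the size of the original; the cost is that the application, or a view over both tables, must know where the boundary lies.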

The 14 tips above were gradually summed up by many people through extensive practice in database analysis and design. Readers should not apply this experience mechanically or memorize it by rote; it should be digested and understood, applied realistically, and mastered flexibly, so that readers gradually learn to develop through application and apply through development.
