Database Design Principles

Source: Internet
Author: User

1. The relationship between source documents and entities. This relationship can be one-to-one, one-to-many, or many-to-many. In general it is one-to-one: one source document corresponds to exactly one entity. In special cases it may be one-to-many or many-to-one: one source document corresponds to several entities, or several source documents correspond to one entity. An entity here can be understood as a basic table. Clarifying this correspondence is a great help when designing the input interface.

[Example 1]: An employee resume. In a human resources information system it corresponds to three basic tables: the employee basic information table, the social relationship table, and the work history table. This is a typical case of "one source document corresponds to multiple entities".
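
A minimal sketch of Example 1 in Python, using the standard-library sqlite3 module. The article only names the three basic tables; the column names and the shape of the `resume` dictionary are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: the three table names come from the example,
# every column name is an assumption.
RESUME_SCHEMA = """
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE social_relationship (
    relationship_id INTEGER PRIMARY KEY,
    employee_id     INTEGER NOT NULL REFERENCES employee(employee_id),
    person_name     TEXT NOT NULL,
    relation        TEXT NOT NULL
);
CREATE TABLE work_history (
    history_id  INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    employer    TEXT NOT NULL,
    start_year  INTEGER NOT NULL
);
"""


def save_resume(conn, resume):
    """Write one resume (one source document) into the three basic tables."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO employee (name) VALUES (?)", (resume["name"],)
        )
        employee_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO social_relationship (employee_id, person_name, relation)"
            " VALUES (?, ?, ?)",
            [(employee_id, p, r) for p, r in resume["relationships"]],
        )
        conn.executemany(
            "INSERT INTO work_history (employee_id, employer, start_year)"
            " VALUES (?, ?, ?)",
            [(employee_id, e, y) for e, y in resume["jobs"]],
        )
    return employee_id


def table_counts(conn):
    """Return the row count of each of the three basic tables."""
    tables = ("employee", "social_relationship", "work_history")
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables}


hr_conn = sqlite3.connect(":memory:")
hr_conn.executescript(RESUME_SCHEMA)
save_resume(hr_conn, {
    "name": "Li Lei",
    "relationships": [("Li Ming", "father"), ("Han Mei", "spouse")],
    "jobs": [("Acme Ltd", 2015)],
})
print(table_counts(hr_conn))
```

Knowing that one input form feeds three tables is what tells the interface designer to save all three in a single transaction.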

2. Primary keys and foreign keys. In general, an entity cannot lack both a primary key and a foreign key. In an E-R diagram, an entity at a leaf position may or may not define a primary key (because it has no children), but it must have a foreign key (because it has a parent).

The design of primary keys and foreign keys plays an important role in the design of a global database. After completing the design of a global database, an American database design expert said: "Keys are everywhere. There is nothing except keys." This sums up his database design experience and reflects his highly abstract view of the core of an information system (the data model): the primary key is a high-level abstraction of the entity, and the pairing of a primary key with a foreign key represents the connection between entities.

3. The nature of basic tables. A basic table differs from an intermediate table or a temporary table because it has the following four properties:
(1) Atomicity. The fields in a basic table cannot be decomposed further.
(2) Primitiveness. The records in a basic table are records of the raw data (the base data).
(3) Derivability. All output data can be derived from the data in the basic tables and code tables.
(4) Stability. The structure of a basic table is relatively stable, and its records are kept for a long time.
Once the nature of basic tables is understood, they can be told apart from intermediate tables and temporary tables when designing the database.

4. Normal form standards. The relationship between a basic table and its fields should satisfy third normal form as far as possible. However, a design that satisfies third normal form is often not the best design. To improve the running efficiency of the database, it is often necessary to lower the normal form standard: add redundancy where appropriate, trading space for time.

[Example 2]: Table 1 shows a basic table that stores products. The presence of the "amount" field means the table does not satisfy third normal form, because "amount" can be obtained by multiplying "unit price" by "quantity"; "amount" is therefore a redundant field. Adding the redundant "amount" field nevertheless speeds up queries and statistics, which is the practice of trading space for time. Rose 2002 distinguishes two types of columns: data columns and computed columns. Columns such as "amount" are computed columns, while columns such as "unit price" and "quantity" are data columns.

Table 1  Structure of the product table

Product name   Product model   Unit price   Quantity   Amount
TV             29              2,500        40         100,000
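
The trade-off in Example 2 can be sketched in Python with the standard-library sqlite3 module. The English column names are assumptions; the row is the one from Table 1. The stored `amount` column is read directly, while the normalized alternative recomputes it from the two data columns on every query.

```python
import sqlite3


def build_product_table():
    """Create the product table of Table 1, including the redundant amount column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE product (
            product_name  TEXT NOT NULL,
            product_model TEXT NOT NULL,
            unit_price    INTEGER NOT NULL,  -- data column
            quantity      INTEGER NOT NULL,  -- data column
            amount        INTEGER NOT NULL   -- computed column, stored redundantly
        )
        """
    )
    unit_price, quantity = 2500, 40
    conn.execute(
        "INSERT INTO product VALUES (?, ?, ?, ?, ?)",
        ("TV", "29", unit_price, quantity, unit_price * quantity),
    )
    return conn


def total_from_stored_amount(conn):
    """Read the precomputed amounts: no multiplication at query time."""
    return conn.execute("SELECT SUM(amount) FROM product").fetchone()[0]


def total_from_data_columns(conn):
    """Recompute the amounts from the data columns on every query."""
    return conn.execute("SELECT SUM(unit_price * quantity) FROM product").fetchone()[0]


product_conn = build_product_table()
print(total_from_stored_amount(product_conn), total_from_data_columns(product_conn))
```

The price of the stored column is that every write must keep `amount` in step with `unit_price` and `quantity`.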

5. A plain understanding of the three normal forms. A plain understanding of the three normal forms is a great help in database design. To apply them well in practice, they must be understood in plain terms (a working understanding, not the most rigorous and precise one):
First normal form: 1NF is an atomicity constraint on attributes. Every attribute must be atomic and cannot be decomposed further.
Second normal form: 2NF is a uniqueness constraint on records. Every record must have a unique identifier, that is, each entity instance is unique.
Third normal form: 3NF is a constraint on field redundancy. No field may be derivable from other fields; fields must not be redundant.

A database with no redundancy can be designed. However, a database with no redundancy is not necessarily the best one. Sometimes, to improve running efficiency, the normal form standard must be lowered and some redundant data deliberately kept. In practice, follow third normal form when designing the conceptual data model, and leave any lowering of the normal form standard to the design of the physical data model. Lowering the normal form means adding fields and allowing redundancy.

6. Be good at identifying and correctly handling many-to-many relationships. If two entities have a many-to-many relationship, that relationship should be eliminated by adding a third entity between them. The original many-to-many relationship then becomes two one-to-many relationships, and the attributes of the original two entities should be distributed sensibly among the three entities. The third entity is essentially a more complex relationship, and it corresponds to a basic table. In general, database design tools cannot identify many-to-many relationships, but they can handle them.

[Example 3]: In a "library information system", "book" is an entity and "reader" is also an entity. The relationship between them is a typical many-to-many relationship: a book can be borrowed by several readers at different times, and a reader can borrow several books. A third entity, named "borrow/return", is therefore added between the two. Its attributes are the borrow/return time and a borrow/return flag (0 means borrowing a book, 1 means returning a book). It should also have two foreign keys (the primary key of "book" and the primary key of "reader") so that it can be joined to "book" and "reader".
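
Example 3 can be sketched as follows in Python with the standard-library sqlite3 module. The table and column names and the sample rows are assumptions; the structure (two entities plus a "borrow/return" entity holding a time, a 0/1 flag, and two foreign keys) follows the example.

```python
import sqlite3


def build_library():
    """Create book, reader, and the borrow_return entity that links them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE reader (reader_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
        -- The third entity: each row is one borrow or return event.
        CREATE TABLE borrow_return (
            book_id    INTEGER NOT NULL REFERENCES book(book_id),
            reader_id  INTEGER NOT NULL REFERENCES reader(reader_id),
            event_time TEXT    NOT NULL,
            event_flag INTEGER NOT NULL CHECK (event_flag IN (0, 1))  -- 0 borrow, 1 return
        );
        INSERT INTO book   VALUES (1, 'SQL Basics'), (2, 'Data Models');
        INSERT INTO reader VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO borrow_return VALUES
            (1, 1, '2024-01-05', 0),
            (1, 1, '2024-01-20', 1),
            (1, 2, '2024-02-01', 0),
            (2, 1, '2024-02-03', 0);
        """
    )
    return conn


def readers_of(conn, book_id):
    """Names of all readers who have ever borrowed the given book."""
    rows = conn.execute(
        "SELECT DISTINCT r.name FROM borrow_return AS br"
        " JOIN reader AS r ON r.reader_id = br.reader_id"
        " WHERE br.book_id = ? AND br.event_flag = 0 ORDER BY r.name",
        (book_id,),
    )
    return [name for (name,) in rows]


def books_of(conn, reader_id):
    """Titles of all books the given reader has ever borrowed."""
    rows = conn.execute(
        "SELECT DISTINCT b.title FROM borrow_return AS br"
        " JOIN book AS b ON b.book_id = br.book_id"
        " WHERE br.reader_id = ? AND br.event_flag = 0 ORDER BY b.title",
        (reader_id,),
    )
    return [title for (title,) in rows]


library_conn = build_library()
print(readers_of(library_conn, 1), books_of(library_conn, 1))
```

Both directions of the original many-to-many relationship are answered by joining through the one linking table.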

7. Choosing primary key values. A primary key (PK) is a tool that lets programmers join tables. It can be a numeric string with no physical meaning, incremented by 1 automatically by the program, or it can be a field, or a combination of fields, that has physical meaning. The former is better than the latter. When the PK is a combination of fields, keep the number of fields small; with many fields the index takes up more space and is slower.
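
A small sketch of the first option, in Python with the standard-library sqlite3 module. The `order_line` table and its columns are hypothetical. One difference from the text is assumed here: the incrementing number is assigned by the database engine (SQLite's `INTEGER PRIMARY KEY`) rather than by application code, and the meaningful field combination is kept as a `UNIQUE` constraint instead of serving as the PK.

```python
import sqlite3


def build_order_line_table():
    """Create a table keyed by a meaningless, automatically assigned integer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE order_line (
            line_id    INTEGER PRIMARY KEY,   -- surrogate PK, no business meaning
            order_no   TEXT NOT NULL,
            product_no TEXT NOT NULL,
            quantity   INTEGER NOT NULL,
            UNIQUE (order_no, product_no)     -- the meaningful field combination
        )
        """
    )
    return conn


def add_line(conn, order_no, product_no, quantity):
    """Insert a row without supplying a key and return the key SQLite assigned."""
    cur = conn.execute(
        "INSERT INTO order_line (order_no, product_no, quantity) VALUES (?, ?, ?)",
        (order_no, product_no, quantity),
    )
    return cur.lastrowid


order_conn = build_order_line_table()
first_id = add_line(order_conn, "SO-1001", "TV-29", 40)
second_id = add_line(order_conn, "SO-1001", "DVD-7", 5)
print(first_id, second_id)
```

Child tables then need only the single `line_id` column as their foreign key, rather than repeating both `order_no` and `product_no`.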

8. Understand data redundancy correctly. The repeated appearance of primary keys and foreign keys in multiple tables is not data redundancy. This concept must be clear, yet many people are still unclear about it. Only the repeated appearance of non-key fields is data redundancy, and it is low-level redundancy, that is, repetitive redundancy. High-level redundancy is not the repeated appearance of a field but the derived appearance of a field.

[Example 4]: A product table has the three fields "unit price", "quantity", and "amount". "Amount" is derived by multiplying "unit price" by "quantity", so it is redundant, and it is high-level redundancy. The purpose of redundancy is to speed up processing. Only low-level redundancy increases the risk of data inconsistency, because the same data may be entered several times, at different times, in different places, and by different roles. We therefore advocate high-level (derived) redundancy and oppose low-level (repetitive) redundancy.

9. There is no standard answer for an E-R diagram. The E-R diagram of an information system has no single standard answer, because it can be designed and drawn in more than one way. As long as it covers the business scope and functional content that the system requires, it is feasible; otherwise, the E-R diagram must be revised. Although there is no single standard answer, that does not mean it can be designed arbitrarily. The criteria for a good E-R diagram are: clear structure, concise associations, a moderate number of entities, reasonable allocation of attributes, and no low-level redundancy.

10. View technology is very useful in database design. Unlike basic tables, code tables, and intermediate tables, a view is a virtual table that depends on the real tables of the data source. A view is a window through which programmers use the database, a way of combining base table data, a method of data processing, and a means of keeping user data confidential. To keep complex processing fast and to save storage space, views should generally be nested no more than three layers deep. If three layers are not enough, define a temporary table on the view and then define further views on the temporary table. Repeating this makes the depth of views unlimited.

Views matter even more in certain information systems that bear on national political, economic, technical, military, and security interests. Once the physical design of the basic tables in such a system is complete, a first layer of views is created on the basic tables immediately. The number and structure of the views in this layer are exactly the same as those of the basic tables, and it is stipulated that all programmers may operate only on the views. Only the database administrator, holding a "security key" jointly controlled by several people, can operate directly on the basic tables. Readers may want to think about why this is done.
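
The "window" and "confidentiality" roles of a view can be sketched in Python with the standard-library sqlite3 module. The `staff` table, its columns, and the `staff_public` view are hypothetical. Two limits of this sketch should be stated plainly: the view here deliberately hides a column rather than mirroring the base table, and SQLite has no user accounts or permissions, so the rule that programmers may touch only the views would have to be enforced by a server DBMS, not by this code.

```python
import sqlite3


def build_staff_db():
    """Create a base table with a confidential column and a view that hides it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE staff (
            staff_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            salary   INTEGER NOT NULL          -- confidential column
        );
        INSERT INTO staff VALUES (1, 'Ann', 9000), (2, 'Bob', 7000);
        -- First-layer view: programmers query this instead of the base table.
        CREATE VIEW staff_public AS SELECT staff_id, name FROM staff;
        """
    )
    return conn


def view_columns(conn):
    """Column names visible through the view."""
    cur = conn.execute("SELECT * FROM staff_public")
    return [column[0] for column in cur.description]


def write_through_view_fails(conn):
    """SQLite views are read-only unless an INSTEAD OF trigger is defined."""
    try:
        conn.execute("INSERT INTO staff_public VALUES (3, 'Eve')")
    except sqlite3.Error:
        return True
    return False


staff_conn = build_staff_db()
print(view_columns(staff_conn), write_through_view_fails(staff_conn))
```

Because the view stores no data of its own, a change to the base table is visible through it immediately.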

11. Intermediate tables, reports, and temporary tables. An intermediate table is a table that stores statistical data. It is designed for data warehouses, output reports, or query results, and sometimes it has no primary key or foreign key (except in a data warehouse). Temporary tables are designed by programmers to store temporary records for their own use. Basic tables and intermediate tables are maintained by the DBA, while temporary tables are maintained automatically by the programmer's own programs.

12. Integrity constraints take three forms:
Domain integrity: implemented with Check constraints. In a database design tool, the dialog for defining a field's value range has a Check button, which is used to define the value domain of the field.
Referential integrity: implemented with PKs, FKs, and table-level triggers.
User-defined integrity: business rules, implemented with stored procedures and triggers.
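
The three kinds of constraint can be sketched side by side in Python with the standard-library sqlite3 module. The tables, the age range, and the "salary may never be lowered" rule are all assumptions chosen for illustration. SQLite has no stored procedures, so the business rule is shown with a trigger only.

```python
import sqlite3


def build_payroll_db():
    """Create two tables that carry one constraint of each kind."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE employee (
            emp_id  INTEGER PRIMARY KEY,
            dept_id INTEGER NOT NULL REFERENCES department(dept_id),  -- referential
            age     INTEGER NOT NULL CHECK (age BETWEEN 18 AND 65),   -- domain
            salary  INTEGER NOT NULL
        );
        -- User-defined business rule (assumed): a salary may never be lowered.
        CREATE TRIGGER no_pay_cut BEFORE UPDATE OF salary ON employee
        WHEN NEW.salary < OLD.salary
        BEGIN
            SELECT RAISE(ABORT, 'salary may not decrease');
        END;
        INSERT INTO department VALUES (1, 'R&D');
        INSERT INTO employee VALUES (1, 1, 30, 5000);
        """
    )
    return conn


def is_rejected(conn, sql):
    """Return True if the database refuses to execute the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


payroll_conn = build_payroll_db()
print(is_rejected(payroll_conn, "INSERT INTO employee VALUES (2, 1, 12, 4000)"))
print(is_rejected(payroll_conn, "INSERT INTO employee VALUES (3, 99, 40, 4000)"))
print(is_rejected(payroll_conn, "UPDATE employee SET salary = 4000 WHERE emp_id = 1"))
```

The first statement violates the Check on `age`, the second refers to a department that does not exist, and the third breaks the business rule in the trigger.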

13. The way to prevent patchwork database design is the "three fewer" principle.

(1) The fewer tables in a database, the better. Only a small number of tables shows that the system's E-R diagram is lean and refined, that duplicate and redundant entities have been removed, that the objective world has been abstracted to a high degree, and that systematic data integration has been carried out, which prevents patchwork design.

(2) The fewer fields in a table's composite primary key, the better. A primary key serves two purposes: it is used to build the primary key index, and it is used as the foreign key of child tables. Reducing the number of fields in a composite primary key therefore saves both running time and index storage space.

(3) The fewer fields in a table, the better. Only a small number of fields shows that the system has no duplicated data and little data redundancy. More importantly, readers are urged to learn to "change columns to rows", which prevents fields that belong in a child table from being pulled into the main table and leaving many spare fields there. "Changing columns to rows" means pulling part of the main table's content out into a separate child table. The method is simple, but some people are simply not used to it and so do not adopt or apply it.
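
"Changing columns to rows" can be sketched in plain Python. The customer table with its spare `phone1` to `phone3` columns is a hypothetical example, and the function and parameter names are made up for illustration: each filled spare column becomes one row of a child table that points back to the main table through the key.

```python
def columns_to_rows(wide_rows, key, repeated_columns, value_name):
    """Split wide rows into main-table rows and child-table rows.

    Each non-empty value found in one of the repeated columns becomes one
    child row that refers back to its main row through `key`.
    """
    main_rows, child_rows = [], []
    for row in wide_rows:
        main_rows.append({k: v for k, v in row.items() if k not in repeated_columns})
        for column in repeated_columns:
            value = row.get(column)
            if value is not None:
                child_rows.append({key: row[key], value_name: value})
    return main_rows, child_rows


# Main table with spare columns: most of phone2 and phone3 stay empty.
wide_customers = [
    {"customer_id": 1, "name": "Ann", "phone1": "555-0101", "phone2": "555-0102", "phone3": None},
    {"customer_id": 2, "name": "Bob", "phone1": "555-0199", "phone2": None, "phone3": None},
]

customers, customer_phones = columns_to_rows(
    wide_customers, "customer_id", ["phone1", "phone2", "phone3"], "phone"
)
print(customers)
print(customer_phones)
```

The main table loses its empty spare fields, and a customer with a fourth phone number needs one more child row rather than a schema change.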

The practical principle of database design is to find the right balance between data redundancy and processing speed. "Three fewer" is an overall concept and a comprehensive view; no one of the three principles can be taken in isolation. The principle is relative, not absolute. The opposite, "three more" principle is certainly wrong. Consider: if both cover the same system functionality, an E-R diagram with one hundred entities (one thousand attributes in total) is certainly much better than an E-R diagram with two hundred entities (two thousand attributes in total).

Advocating the "three fewer" principle means asking readers to learn to use database design techniques to integrate a system's data. The steps of data integration are: integrate the file system into application databases, integrate the application databases into subject databases, and integrate the subject databases into a global integrated database. The higher the degree of integration, the more data is shared and the fewer information islands there are, and the fewer entities, primary keys, and attributes there are in the global E-R diagram of the enterprise information system.

The purpose of advocating the "three fewer" principle is to keep readers from using patching techniques to add, delete, and modify the database endlessly, turning the enterprise database into a "garbage heap" of arbitrarily designed tables, or a "jumble" of tables. The end result is a database cluttered with basic tables, code tables, intermediate tables, and temporary tables, and an information system that the enterprise or institution can no longer maintain.

Anyone can follow the "three more" principle; it is the theory behind designing databases by "patching". The "three fewer" principle is a principle of "fewer but better". It demands a high level of database design skill and artistry, and not everyone can manage it, because it is the theoretical basis for avoiding database design by "patching".

14. Ways to improve the running efficiency of a database. Given the system hardware and system software, the running efficiency of a database system can be improved as follows:
(1) In the physical design of the database, lower the normal form, add redundancy, use triggers sparingly, and use stored procedures more.
(2) When a computation is very complex and the number of records is very large (for example, 10 million), perform the complex computation outside the database first, processing it in C++, and only then append the results to the table in the database. This is experience from designing telecom billing systems.
(3) If a table has too many records, for example more than 10 million, split it horizontally: take a value of the table's primary key (PK) as the boundary and divide the table's records horizontally into two tables. If a table has too many fields, for example more than 80, split it vertically, dividing the original table into two tables.
(4) Optimize the database management system (DBMS) itself, that is, tune its system parameters, such as the number of buffers.
(5) When programming in the data-oriented SQL language, use optimized algorithms wherever possible.
In short, improving the running efficiency of a database requires effort at three levels at the same time: optimization at the database system level, at the database design level, and at the program implementation level.
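
The two kinds of split described in (3) can be sketched in plain Python on a list of records. The function names, the `id`/`name`/`bio` fields, and the tiny data set are assumptions for illustration; a real split would be done on tables of millions of rows inside the DBMS.

```python
def split_horizontally(rows, pk, boundary):
    """Divide records into two tables using one PK value as the boundary."""
    below = [row for row in rows if row[pk] < boundary]
    at_or_above = [row for row in rows if row[pk] >= boundary]
    return below, at_or_above


def split_vertically(rows, pk, moved_columns):
    """Move some columns into a second table; both tables keep the PK."""
    kept = [{k: v for k, v in row.items() if k not in moved_columns} for row in rows]
    moved = [{pk: row[pk], **{c: row[c] for c in moved_columns}} for row in rows]
    return kept, moved


users = [{"id": i, "name": f"user{i}", "bio": f"bio {i}"} for i in range(1, 7)]

low_users, high_users = split_horizontally(users, "id", 4)
core_users, user_bios = split_vertically(users, "id", ["bio"])

print([u["id"] for u in low_users], [u["id"] for u in high_users])
print(core_users[0], user_bios[0])
```

A horizontal split keeps every column and divides the rows; a vertical split keeps every row and divides the columns, with the PK repeated in both halves so they can be joined again.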

The fourteen techniques above were gradually distilled by many people from a large body of database analysis and design practice. Readers should not apply this experience rigidly or learn it by rote. Instead, digest and understand it, stay grounded in the facts, and apply it flexibly, so as to gradually achieve: development through application, and application through development.


From: http://www.javaeye.com/topic/281611
