Database design tips [go to]_ database other

Source: Internet
Author: User
1. Relationship between the original document and the entity
Can be a pair of one or one pairs of many, many-to-many relationship. In general, they are one-to-one relationships: a single original document corresponding to one entity. In special cases, they may be one-to-many or many-to-many relationships, i.e. a single original document corresponding to multiple entities, or multiple original documents corresponding to an entity. The entities here can be understood as basic tables. It is good for us to design the input interface after clarifying the corresponding relationship.

Example 1〗: An employee biographical information, in the Human resources information system, corresponds to three basic tables: Employee basic Situation table, social relations table, work resume form. This is a typical example of "a single original document corresponding to multiple entities."

   2. Primary key and foreign key
Generally, an entity cannot have neither a primary key nor a foreign key. In an e-R diagram, an entity in a leaf position can define a primary key or do not define a primary key (because it has no descendants), but must have a foreign key (because it has a father).
The design of primary key and foreign key plays an important role in the design of global database. When the global database was designed, a U.S. database design expert said: "The key, everywhere is the key, in addition to the key, nothing", this is his database design experience, but also reflects his information system core (data model) of the highly abstract thinking. Because: The primary key is the height abstraction of the entity, the pairing of the primary key and the foreign key, representing the connection between the entities.

   3. The nature of the basic table
A basic table differs from an intermediate table, a temporary table, because it has four features as follows:
(1) Atomic nature. The fields in the basic table are not to be decomposed.
(2) Original nature. Records in the base table are records of the original data (underlying data).
(3) Deductive nature. From the basic table and the data in the Code table, you can derive all the output data.
(4) stability. The structure of the base table is relatively stable, and the records in the table are stored for a long time.
Once you understand the nature of a basic table, you can differentiate the base table from the intermediate table and the temporary table when you design the database.

The relationship between the

4. Paradigm Standard
Basic table and its fields should satisfy the third normal form as much as possible. However, database design that satisfies the third paradigm is often not the best design. In order to improve the efficiency of the database, it is necessary to reduce the normal standard: increase the redundancy appropriately and achieve the goal of changing time in space.
Example 2〗: There is a basic table for storing items, as shown in table 1. The existence of the "Amount" field indicates that the design of the table does not satisfy the third normal form, since "amount" can be obtained by multiplying "unit price" by "quantity", stating that "amount" is a redundant field. However, to increase the "amount" of this redundant field, you can improve the speed of query statistics, this is the way to change time in space.
in Rose 2002, you specify two types of columns: data columns and computed columns. Columns such as amount are called computed columns, and columns such as unit price and quantity are called data columns.
Table 1 table structure for commodity tables
Commodity name commodity model price quantity amount
TV 29 inch 2,500 100,000

5. Popular understanding of three paradigms
Popular geography Solving three paradigms is good for database design.  In the database design, in order to better apply the three paradigms, it is necessary to have a popular understanding of the three paradigms (popular understanding is sufficient understanding, not the most scientific and accurate understanding):
First normal form: 1NF is the atomic constraint on the property, requiring the property to be atomic and not to be decomposed;
  The second normal paradigm: 2NF is the uniqueness of the record constraint, requiring the record to have a unique identity, that is, the uniqueness of the entity;
Third normal form: 3NF is a constraint on the redundancy of a field, that is, any field cannot be derived from another field and requires no redundancy in the field.
No redundancy in database design can be done. However, the database without redundancy is not necessarily the best database, sometimes in order to improve the efficiency of operation, it is necessary to reduce the normal standard and keep redundant data appropriately. The practice is to adhere to the third paradigm when designing the conceptual data model, and to reduce the standard of normal work into the physical data model design. The lower paradigm is adding fields, allowing redundancy.

   6. Be good at recognizing and dealing with many pairs of relationships correctly
This relationship should be eliminated if there is a many-to-many relationship between the two entities. The solution was to add a third entity between the two. In this way, it turns out that a many-to-many relationship now turns into a one-to-many relationship of more than two. The properties of the original two entities are reasonably allocated to three entities. The third entity here is essentially a more complex relationship, which corresponds to a basic table. Generally speaking, database design tools do not recognize many-to-many relationships, but can handle many-to-many relationships.

Example 3: In the Library information System, "book" is an entity, "reader" is also an entity. The relationship between the two entities is a typical many-to-many relationship: A book can be borrowed by many readers at different times, and a reader can borrow more books. To do this, to add a third entity between the two, the entity is named "Borrow book," and its properties are: Borrow time, borrow also sign (0 means to borrow books, 1 means return book), in addition, it should also have two foreign keys ("books" PRIMARY Key, "reader" primary key), so that it can with "book" and "Reader" connection.

   7. Primary key PK's value method
PK is for programmers to use the Inter-table connection tool, can be a no physical significance of the number string, automatically added by the program to achieve. It can also be a combination of a field name or a field name that has a physical meaning. But the former is better than the latter. When PK is a combination of field names, the number of recommended fields should not be too many, not only the index occupies a large space, but also slow.

   8. Correct understanding of data redundancy
The notion that primary keys and foreign keys recur in multiple tables, not data redundancy, must be clear, and many people are not yet clear. Non-key fields Repeat, is the data redundancy! It is also a low-level redundancy, i.e. repetitive redundancy. Advanced redundancy is not a recurring occurrence of a field, but a derivation of a field.

For example 4〗: "Unit price, quantity, Amount" Three fields in a product, "Amount" is derived from "unit price" multiplied by "quantity", it is redundancy, and it is an advanced redundancy. The purpose of redundancy is to improve processing speed. Only low-level redundancy increases the inconsistency of the data, because the same data can be entered multiple times, places, and roles at different times. Therefore, we advocate advanced redundancy (derived redundancy) against low-level redundancy (repetitive redundancy).

   9. E–r No standard answer
The E–r diagram of information system has no standard answer, because its design and drawing is not unique, so long as it covers the business scope and function content of the system requirement, it is feasible. Instead, modify the E–r diagram. Although it does not have the only standard answer, it does not mean that it is free to design. The standard of E-R diagram is: Clear structure, concise correlation, moderate number of entities, reasonable distribution of attributes, no low-level redundancy.

   10. View technology is useful in database design
Unlike basic tables, code tables, and intermediate tables, a view is a virtual table that relies on the real table of the data source. A view is a window for programmers to use a database, a form of data synthesis of base tables, a method of data processing, and a means of user data confidentiality. In order to perform complex processing, improve operation speed and save storage space, the definition depth of view is generally not more than three layers. If the three-tier view is still not enough, you should define the temporary table on the view, and then define the view on the temporary table. This iterative definition, the depth of the view is not restricted.

The role of the view is even more important for certain information systems relating to national political, economic, technical, military and security interests. After the basic table of these systems completes the physical design, the first level view is created immediately on the basic table, the number and structure of the view is exactly the same as the number and structure of the base table. And it is stipulated that all programmers are only allowed to operate on the view. Only the database administrator, with multiple personnel to master the "security key" to directly operate on the basic table. Ask the reader to think: Why is this?

   11. Intermediate tables, reports and temporary tables
An intermediate table is a table of statistics that is designed for data warehouses, output reports, or query results, sometimes without primary keys and foreign keys (except for data warehouses). A temporary table is designed by a programmer, storing temporary records for personal use. Base tables and intermediate tables are maintained by DBAs, and temporary tables are automatically maintained by the programmer's own program.

   12. Integrity constraints in three areas
Domain integrity: Using check to implement constraints, in the database design tool, when defining a field's range of values, there is a check button that defines the value of the field by the city. Referential integrity: Use PK, FK, table-level triggers to implement. User-defined integrity: It is a number of business rules that are implemented using stored procedures and triggers.

13. The method of preventing database design from patching is the "three Principles"
(1) The less the number of tables in a database the better. Only the number of tables is less, in order to explain the e–r of the system, the redundant entities have been removed, and the high abstraction of the objective world has been formed, and the data integration of the system has been made to prevent the design of patching;
(2) The fewer fields in a table that combine primary keys, the better. Because of the role of the primary key, one is to build the primary key index, the second is as a foreign key of the child table, so the number of fields of the combination primary key is less, not only save the running time, but also save the index storage space;
(3) The fewer fields in a table, the better. Only the number of fields is less, in order to show that there is no duplication of data in the system, and there is little data redundancy, more importantly, urging readers to learn "column change row", so that prevents the child table from pulling the fields into the main table, leaving a lot of free fields in the main table. The so-called "column change Row", is to pull out a part of the main table, another separate child table. This method is very simple, some people are not accustomed to, do not accept, do not carry out. The practical principle of the
database design is to find the right balance between data redundancy and processing speed. "Three Little" is a holistic concept, integrated perspective, can not isolate a certain principle. The principle is relative, not absolute. The principle of "more than three" is surely wrong. Just imagine: if you cover the same function of the system, 100 entities (a total of 1000 attributes) of the E–r graph, certainly more than 200 entities (a total of 2000 attributes) of the E–r graph, much better. The
advocates the "Three little" principle, is called the reader learns to use the database design technology to carry on the system data integration. The step of data integration is to integrate the file system into the application database, integrate the application database into the subject database, and integrate the subject database into the global comprehensive database. The higher the degree of integration, the stronger the data sharing, the less the information island phenomenon, the whole enterprise information system in the global e-R diagram of the number of entities, the number of primary keys, the number of attributes will be less. The purpose of the
advocating the "Three little" principles, is to prevent readers to use patching technology, and constantly to the database additions and deletions, so that the enterprise database into a random design database table "garbage heap", or database table "clump", and finally caused the database of basic tables, code tables, intermediate tables, temporary tables are disorganized, Countless, leading to enterprises and institutions of information systems can not be maintained and paralyzed. The "three +" principle can be done by anyone, the principle is "patching method" design database of the crooked Neo-Confucianism said. The principle of "three little" is few and few, it requires a higher database design skills and art, not anyone can do it, because the principle is to eliminate the "patching method" design the basis of the database.

   14. Ways to improve the efficiency of database operations
Under the given system hardware and system software condition, the method of improving the operation efficiency of the database system is:
(1) in the physical design of the database, reduce the paradigm, increase redundancy, less use of triggers, more use of stored procedures.
(2) When the calculation is very complex, and the number of records is very large (for example, 10 million), the complex calculation should first be outside the database, in a file system with the C + + language calculation processing completed, the last to add to the table. This is the experience of telecom billing system design.
(3) If you find that there are too many records for a table, such as more than 10 million, you want to split the table horizontally. Horizontal segmentation is done by dividing the table's record horizontally into two tables, with a value of the primary key PK being bounded. If you find that a table has too many fields, such as more than 80, the table is split vertically, and the original table is decomposed into two tables.
(4) Optimizing the DBMS of database management system, that is, optimizing various system parameters, such as the number of buffers.
(5) In the use of data-oriented SQL language programming, as far as possible to take the optimization algorithm.
In a word, in order to improve the efficiency of database, it is necessary to optimize the database system level, optimize the database design level and realize the program level, at the same time, make efforts at the three levels.

The above 14 techniques are summed up by many people in a large number of database analysis and design practices. For the use of these experiences, readers can not be born to help hard sets, rote memorization, but to digest understanding, seeking truth from facts, flexible grasp. and gradually achieve: in the application of development, in the development of applications.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.