Reprint Songdeyouxiang
1. Database naming specification
Use the 26 English letters (case-sensitive), the digits 0-9 (often not needed), and the underscore '_';
Keep names concise and clear (no longer than 30 characters);
For example: user, stat, log; a prefix may also be added to the database name, for example wifi_user, wifi_stat, wifi_log;
Digits 0-9 should be used only for backup databases, for example: user_db_20151210;
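The naming rules above can be checked mechanically. The following is a minimal sketch, assuming the rules mean "ASCII letters, digits, and underscores only, at most 30 characters"; the helper name is_valid_name is hypothetical and not part of the original specification.

```python
import re

# Assumed reading of the rules above: ASCII letters, digits and underscores
# only, between 1 and 30 characters. The helper name is hypothetical.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,30}")


def is_valid_name(name: str) -> bool:
    """Return True if `name` follows the naming rules sketched above."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["wifi_user", "user_db_20151210", "user-login", "x" * 31]:
    print(candidate[:20], is_valid_name(candidate))
```

A check like this could run in a schema review script before any table or database is created.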
2. Database table naming specification
Use the 26 English letters (case-sensitive), the digits 0-9 (often not needed), and the underscore '_';
Keep names simple and clear, with multiple words separated by the underscore '_';
For example: user_login, user_profile, user_detail, user_role, user_role_relation, user_role_right, user_role_right_relation;
The table prefix 'user_' effectively groups related tables so that they are displayed together;
3. Database table field naming specification
Use the 26 English letters (case-sensitive), the digits 0-9 (often not needed), and the underscore '_';
Keep names simple and clear, with multiple words separated by the underscore '_';
For example, the fields of the user_login table: user_id, user_name, pass_word, email, ticket, status, mobile, add_time;
Every table must have an auto-increment primary key and an add_time field (defaulting to the system time);
Associated fields should have the same name in every table that uses them;
4. Database table field type specification
Store each field's data in as little space as possible;
For example: if int will do, do not use varchar or char; if varchar(16) will do, do not use varchar(256);
IP addresses are best stored as int;
Fixed-length values are best stored as char, for example ZIP codes;
If tinyint will do, do not use smallint or int;
Give every field a default value where possible, and preferably declare it NOT NULL;
5. Database table index specification
Keep index names concise and clear, for example: the index on the user_name field of the user_login table should be the unique index user_name_index;
Create a primary key index for every table;
Create reasonable indexes for every table;
Take care when building composite indexes.
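The advice in the field type specification to store IP addresses as integers can be illustrated as follows. This is a minimal sketch for IPv4 only, done in the application layer with Python's standard ipaddress module; the function names ip_to_int and int_to_ip are hypothetical. MySQL also offers the INET_ATON and INET_NTOA functions for the same conversion, and the column needs to be unsigned, because the largest IPv4 value does not fit in a signed 32-bit int.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Convert a dotted IPv4 string to the integer that would be stored."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(value: int) -> str:
    """Convert a stored integer back to dotted IPv4 notation."""
    return str(ipaddress.IPv4Address(value))


stored = ip_to_int("192.168.0.1")
print(stored)             # 3232235521
print(int_to_ip(stored))  # 192.168.0.1
```

The integer form takes 4 bytes, whereas the dotted string needs up to 15 characters.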
6. A simple introduction to the database normal forms
First normal form (1NF): field values are atomic and cannot be divided further (all relational database systems satisfy 1NF);
For example, a name field stores the surname and given name as a whole; if the two must be distinguished, they must be stored in two separate fields;
Second normal form (2NF): a table must have a primary key, so that every row can be uniquely identified (strictly, every non-key field must also depend on the whole primary key rather than on part of it);
Note: 1NF must be satisfied first;
Third normal form (3NF): a table must not contain non-key field information that already exists in other related tables, that is, the table must not have redundant fields;
Note: 2NF must be satisfied first;
Note: in practice we often do not follow 3NF strictly when designing tables, because a reasonable redundant field saves join queries;
For example: the album table carries a picture click-count field, and the album picture table carries a picture click-count field as well;
MySQL Database design principles
1. Core principles
Do not do computation in the database; CPU-bound computation must be moved to the business layer;
Control the number of columns (few but well-chosen fields; fewer than 20 fields is suggested);
Balance normalization and redundancy (efficiency first; normalization is often sacrificed);
Reject the 3 Bs (big SQL statements, big transactions, big batches);
2. Field principles
Use numeric types well (choose an appropriate field type to save space);
Convert characters into numbers (convert whenever possible; this saves space and also improves query performance);
Avoid NULL fields (queries on NULL fields are hard to optimize, indexes on NULL fields need extra space, and composite indexes on NULL fields are ineffective);
Use the text type sparingly (use varchar instead of text wherever possible);
3. Index principles
Use indexes sensibly (they speed up queries and slow down updates; more indexes are definitely not always better);
Character fields must use prefix indexes;
Do not apply operations to indexed columns;
For InnoDB, an auto-increment column is recommended as the primary key (the primary key builds the clustered index, the primary key should not be modified, and strings should not be used as primary keys; this becomes clear once you understand how InnoDB stores its indexes);
Do not use foreign keys (the constraints are guaranteed by the program);
4. SQL principles
Keep SQL statements as simple as possible (a single SQL statement can run on only one CPU; split big statements into small ones to reduce lock time; one big SQL statement can block the entire database);
Keep transactions simple;
Avoid trig/func (do not use triggers and functions; let the client program do the work instead);
Do not use select * (it consumes CPU, IO, memory, and bandwidth, and such programs do not scale);
Rewrite OR as IN (the cost of OR is on the order of n);
Rewrite OR as UNION (MySQL's index merge is weak);
select id from t where phone = '159' or name = 'john';
=>
select id from t where phone = '159' union select id from t where name = 'john';
Avoid negative conditions and leading-% patterns;
Use count(*) with caution;
Page efficiently with limit (the larger the limit offset, the lower the efficiency);
Use union all instead of union (union carries a deduplication cost);
Use join sparingly;
Use group by;
Compare values of the same type;
Break up batch updates;
5. Performance analysis tools
show profile; mysqlsla; mysqldumpslow; explain; slow query log; show processlist;
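The OR-to-UNION rewrite above can be checked for equivalence with a small experiment. The sketch below uses an in-memory SQLite database as a stand-in for MySQL so that it runs without a server; the sample rows are made up, and only the table name t and its columns come from the text. It also shows why UNION ALL is a safe replacement for UNION only when the branches cannot return the same row.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL here; table `t` and its columns
# follow the example statement in the text, the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, phone TEXT, name TEXT)")
conn.execute("CREATE INDEX phone_index ON t (phone)")
conn.execute("CREATE INDEX name_index ON t (name)")
conn.executemany(
    "INSERT INTO t (phone, name) VALUES (?, ?)",
    [("159", "alice"), ("138", "john"), ("159", "john"), ("186", "bob")],
)
conn.commit()

or_sql = "SELECT id FROM t WHERE phone = ? OR name = ?"
union_sql = (
    "SELECT id FROM t WHERE phone = ? "
    "UNION "
    "SELECT id FROM t WHERE name = ?"
)
union_all_sql = union_sql.replace("UNION", "UNION ALL")

params = ("159", "john")
or_ids = sorted(row[0] for row in conn.execute(or_sql, params))
union_ids = sorted(row[0] for row in conn.execute(union_sql, params))
union_all_ids = sorted(row[0] for row in conn.execute(union_all_sql, params))

print(or_ids)         # ids matching either condition
print(union_ids)      # same ids, duplicates removed by UNION
print(union_all_ids)  # id 3 appears twice because it matches both branches
```

Whether the rewritten form is actually faster depends on the MySQL version and the data, so it is worth confirming with explain before adopting it.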
Principles of Database design
1. The relationship between original documents and entities
The relationship can be one-to-one, one-to-many, or many-to-many. In general it is one-to-one: one original document corresponds to one and only one entity.
In special cases it may be one-to-many or many-to-one, that is, one original document corresponds to multiple entities, or multiple original documents correspond to one entity.
The entities here can be understood as base tables. Making this correspondence clear is very helpful when designing the data entry interface.
〖Example 1〗: In a human resources information system, an employee's resume corresponds to three base tables: the employee basic information table, the social relations table, and the work history table.
This is a typical example of "one original document corresponding to multiple entities".
2. Primary keys and foreign keys
In general, an entity cannot lack both a primary key and a foreign key. In an E-R diagram, an entity at a leaf position may or may not define a primary key (because it has no descendants), but it must have a foreign key (because it has a parent).
The design of primary and foreign keys plays an important role in the design of a global database. After completing a global database design, an American database design expert said: "Keys, keys everywhere; apart from keys, there is nothing." This was his database design experience, and it also reflects his highly abstract view of the core of an information system (the data model). The reason is that a primary key is a high-level abstraction of an entity, and a primary key paired with a foreign key represents the connection between entities.
3. The properties of base tables
A base table differs from an intermediate table or a temporary table in that it has the following four properties:
(1) Atomicity. The fields in a base table cannot be decomposed further.
(2) Primitiveness. The records in a base table are records of the original (underlying) data.
(3) Deducibility. All output data can be derived from the data in the base tables and the code tables.
(4) Stability. The structure of a base table is relatively stable, and its records are kept for a long time.
Once you understand these properties, you can distinguish base tables from intermediate and temporary tables when designing a database.
4. Normal form standards
The relationship between a base table and its fields should satisfy the third normal form as far as possible. However, a database design that satisfies the third normal form is often not the best design.
To improve the operating efficiency of the database, it is often necessary to lower the normal form standard: add redundancy where appropriate, trading space for time.
〖Example 2〗: Consider a base table that stores commodities, as shown in Table 1. The presence of the "amount" field means that the table does not satisfy the third normal form, because "amount" can be obtained by multiplying "unit price" by "quantity", which makes "amount" a redundant field. However, adding this redundant "amount" field speeds up query statistics; this is the practice of trading space for time.
Rose 2002 distinguishes two types of columns: data columns and computed columns. A column such as "amount" is called a "computed column", while columns such as "unit price" and "quantity" are called "data columns".
Table 1: Structure of the commodity table
Product name    Model    Unit price    Quantity    Amount
TV              29"      2,500         40          100,000
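The commodity table in Table 1 can be sketched as follows, with "amount" stored as a redundant computed column. This is an illustration only: SQLite stands in for MySQL, the table and column names are hypothetical, and the amount is computed in the application when the row is written, which is one possible way to keep the redundant value consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE commodity (
        commodity_id INTEGER PRIMARY KEY AUTOINCREMENT,
        product_name TEXT    NOT NULL,
        model        TEXT    NOT NULL,
        unit_price   INTEGER NOT NULL,  -- data column
        quantity     INTEGER NOT NULL,  -- data column
        amount       INTEGER NOT NULL   -- computed column: unit_price * quantity
    )
    """
)


def add_commodity(name: str, model: str, unit_price: int, quantity: int) -> None:
    """Insert a row, computing the redundant amount in the application."""
    with conn:
        conn.execute(
            "INSERT INTO commodity (product_name, model, unit_price, quantity, amount) "
            "VALUES (?, ?, ?, ?, ?)",
            (name, model, unit_price, quantity, unit_price * quantity),
        )


add_commodity("TV", '29"', 2500, 40)

# Statistics can read the stored column directly...
stored_total = conn.execute("SELECT SUM(amount) FROM commodity").fetchone()[0]
# ...instead of recomputing the product for every row at query time.
derived_total = conn.execute(
    "SELECT SUM(unit_price * quantity) FROM commodity"
).fetchone()[0]
print(stored_total, derived_total)
```

The price of the faster read is that every write path must keep amount in step with unit_price and quantity.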
5. Understanding the three normal forms in plain terms
A plain-language understanding of the three normal forms is of great benefit to database design. To apply them well in database design, they must be understood in plain terms (a plain understanding is an adequate one, not the most scientific and precise one):
First normal form: 1NF is an atomicity constraint on attributes; it requires attributes to be atomic and not decomposable further.
Second normal form: 2NF is a uniqueness constraint on records; it requires every record to have a unique identifier, that is, entity uniqueness.
Third normal form: 3NF is a constraint on field redundancy; no field may be derivable from other fields, so fields must contain no redundancy.
A database design with no redundancy is achievable. However, a database without redundancy is not necessarily the best database; sometimes, to improve operating efficiency, the normal form standard must be lowered and some redundant data deliberately kept. The practice is to follow the third normal form when designing the conceptual data model, and to consider lowering the normal form standard when designing the physical data model. Lowering the normal form means adding fields and allowing redundancy.
6. Be good at identifying and correctly handling many-to-many relationships
If a many-to-many relationship exists between two entities, it should be eliminated. The way to eliminate it is to add a third entity between the two, so that the original many-to-many relationship becomes two one-to-many relationships. The attributes of the original two entities must be distributed sensibly among the three entities. The third entity is in essence a more complex relationship, and it corresponds to a base table. In general, database design tools cannot recognize many-to-many relationships, but they can handle them.
〖Example 3〗: In a library information system, "book" is an entity and "reader" is also an entity. The relationship between these two entities is a typical many-to-many relationship: a book can be borrowed by several readers at different times, and a reader can borrow several books. A third entity must therefore be added between the two, named "borrowing", with the attributes borrowing time and borrow/return flag (0 means borrowed, 1 means returned). It should also have two foreign keys (the primary key of "book" and the primary key of "reader") so that it can be connected to both "book" and "reader".
Note:
book (1) to "borrowing" (n)
reader (1) to "borrowing" (n)
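The library example can be sketched with three tables, where "borrowing" is the third entity that turns one many-to-many relationship into two one-to-many relationships. This is a minimal illustration using SQLite in place of MySQL; all table names, column names, and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE reader (reader_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

    -- The third entity: one row per borrowing event.
    CREATE TABLE borrowing (
        borrowing_id INTEGER PRIMARY KEY,
        book_id      INTEGER NOT NULL REFERENCES book (book_id),
        reader_id    INTEGER NOT NULL REFERENCES reader (reader_id),
        borrow_time  TEXT    NOT NULL,
        return_flag  INTEGER NOT NULL DEFAULT 0  -- 0 = borrowed, 1 = returned
    );

    INSERT INTO book   (book_id, title)  VALUES (1, 'Database Design'), (2, 'SQL Basics');
    INSERT INTO reader (reader_id, name) VALUES (1, 'Li'), (2, 'Wang');
    INSERT INTO borrowing (book_id, reader_id, borrow_time, return_flag) VALUES
        (1, 1, '2015-12-01', 1),
        (1, 2, '2015-12-10', 0),
        (2, 1, '2015-12-11', 0);
    """
)

# One book, many readers.
readers_of_book_1 = [
    row[0]
    for row in conn.execute(
        "SELECT r.name FROM borrowing b "
        "JOIN reader r ON r.reader_id = b.reader_id "
        "WHERE b.book_id = ? ORDER BY b.borrow_time",
        (1,),
    )
]

# One reader, many books.
books_of_reader_1 = [
    row[0]
    for row in conn.execute(
        "SELECT k.title FROM borrowing b "
        "JOIN book k ON k.book_id = b.book_id "
        "WHERE b.reader_id = ? ORDER BY b.borrow_time",
        (1,),
    )
]
print(readers_of_book_1, books_of_reader_1)
```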
7. How to choose primary key (PK) values
A PK is a tool that programmers use to connect tables. It can be a string of digits with no physical meaning, generated automatically by the program (incremented by 1), or it can be a field name, or a combination of field names, that has physical meaning. The former is better than the latter. When the PK is a combination of field names, it should not include too many fields; otherwise the index not only takes up a lot of space but is also slow.
8. Understanding data redundancy correctly
The repeated appearance of primary keys and foreign keys in multiple tables is not data redundancy. This concept must be clear, yet many people are in fact unclear about it. Only the repeated appearance of non-key fields is data redundancy, and it is a low-level kind: repetitive redundancy. High-level redundancy is not the repeated appearance of a field but a field derived from other fields.
〖Example 4〗: Of the three commodity fields "unit price", "quantity", and "amount", "amount" is derived by multiplying "unit price" by "quantity". It is redundant, and it is high-level redundancy. The purpose of redundancy is to improve processing speed. Only low-level redundancy increases data inconsistency, because the same data may be entered at different times, in different places, and by different roles. Therefore we advocate high-level redundancy (derived redundancy) and oppose low-level redundancy (repetitive redundancy).
9. There is no standard answer for an E-R diagram
The E-R diagram of an information system has no standard answer, because its design and drawing are not unique; any diagram that covers the business scope and functional content required by the system is feasible. Otherwise, the E-R diagram must be revised. Although there is no single standard answer, this does not mean the diagram can be designed arbitrarily. The criteria for a good E-R diagram are: clear structure, concise associations, a moderate number of entities, reasonable attribute allocation, and no low-level redundancy.
10. View technology is very useful in database design
Unlike base tables, code tables, and intermediate tables, a view is a virtual table that depends on the real tables of its data source. A view is a window through which programmers use the database, a way of combining base table data, a method of data processing, and a means of keeping user data confidential. To support complex processing, increase computing speed, and save storage space, the definition depth of views should generally not exceed three layers. If three layers of views are still not enough, define a temporary table on top of the views and then define further views on the temporary table. With this iterative definition, the depth of views is unrestricted.
Views are even more important for certain information systems that bear on national political, economic, technical, military, and security interests. Once the physical design of the base tables of such a system is complete, a first layer of views is immediately built on the base tables; the number and structure of these views are exactly the same as those of the base tables. It is also stipulated that all programmers may operate only on views. Only the database administrator, holding a "security key" controlled jointly by several people, may operate directly on the base tables. The reader is invited to think about why this is so.
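The layering of views described above can be sketched as follows: a first-layer view with the same structure as its base table, and a second-layer view defined on the first that hides a confidential column. SQLite stands in for MySQL, and the table, view, and column names are hypothetical. SQLite has no user accounts, so the rule that programmers may touch only views would, in a real MySQL deployment, be enforced by granting privileges on the views rather than on the base tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT    NOT NULL,
        salary      INTEGER NOT NULL
    );
    INSERT INTO employee (employee_id, name, salary)
    VALUES (1, 'Li', 8000), (2, 'Wang', 9500);

    -- Layer 1: same structure as the base table.
    CREATE VIEW v_employee AS
        SELECT employee_id, name, salary FROM employee;

    -- Layer 2: built on the layer-1 view, hides the salary column.
    CREATE VIEW v_employee_public AS
        SELECT employee_id, name FROM v_employee;
    """
)

public_rows = conn.execute(
    "SELECT employee_id, name FROM v_employee_public ORDER BY employee_id"
).fetchall()
print(public_rows)


def can_read_salary(view_name: str) -> bool:
    """Return True if the given view exposes a salary column."""
    try:
        # view_name comes from trusted constants in this sketch.
        conn.execute(f"SELECT salary FROM {view_name}")
        return True
    except sqlite3.OperationalError:
        return False


print(can_read_salary("v_employee"), can_read_salary("v_employee_public"))
```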
11. Intermediate tables, reports, and temporary tables
An intermediate table stores statistical data. It is designed for data warehouses, output reports, or query results, and sometimes it has no primary key or foreign key (data warehouses excepted). A temporary table is designed by a programmer to store temporary records for personal use. Base tables and intermediate tables are maintained by the DBA, while temporary tables are maintained automatically by the programmer's own programs.
12. Integrity constraints take three forms
Domain integrity: implemented with CHECK constraints. In database design tools, the dialog for defining a field's value range has a Check button, through which the field's value domain is defined.
Referential integrity: implemented with PK, FK, and table-level triggers.
User-defined integrity: business rules, implemented with stored procedures and triggers.
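The first two kinds of integrity can be demonstrated with a small sketch: a CHECK constraint for domain integrity and a foreign key for referential integrity. SQLite stands in for MySQL; the tables, the age range, and the helper try_insert are assumptions made for the example. Note that MySQL enforces CHECK constraints only from version 8.0.16 onward; earlier versions parse them but ignore them. User-defined integrity is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        dept_id     INTEGER NOT NULL REFERENCES department (dept_id),  -- referential
        age         INTEGER NOT NULL CHECK (age BETWEEN 18 AND 65)     -- domain
    );
    INSERT INTO department (dept_id, name) VALUES (1, 'HR');
    """
)


def try_insert(dept_id: int, age: int) -> str:
    """Try to insert an employee and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO employee (dept_id, age) VALUES (?, ?)",
                (dept_id, age),
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


print(try_insert(1, 30))   # valid department, valid age
print(try_insert(1, 12))   # violates the CHECK constraint
print(try_insert(99, 30))  # violates the foreign key
```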
13. The way to prevent patchwork database design is the "three fewer" principle
(1) The fewer tables in a database, the better. Only a small number of tables shows that the system's E-R diagram is small but refined, that duplicate and redundant entities have been removed, that a high degree of abstraction of the objective world has been achieved, and that system data integration has been carried out, which prevents patchwork design;
(2) The fewer fields in a table's composite primary key, the better. The primary key serves two purposes: it builds the primary key index, and it acts as the foreign key of child tables. Fewer fields in the composite primary key therefore save both running time and index storage space;
(3) The fewer fields in a table, the better. Only a small number of fields shows that the system has no duplicated data and very little data redundancy. More importantly, readers are urged to learn to "turn columns into rows", which prevents fields that belong in a child table from being pulled into the main table and leaving many empty fields there. "Turning columns into rows" means pulling part of the main table's content out into a separate child table. The method is very simple, yet some people are not used to it, do not adopt it, and do not apply it.
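"Turning columns into rows" can be illustrated with a hypothetical customer table that carries three phone columns, most of them empty. The sketch below moves the phone columns into a child table with one row per phone number. SQLite stands in for MySQL, and every table and column name is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Wide main table: child-table data pulled into columns, many left empty.
    CREATE TABLE customer_wide (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        phone1      TEXT,
        phone2      TEXT,
        phone3      TEXT
    );
    INSERT INTO customer_wide (customer_id, name, phone1, phone2, phone3) VALUES
        (1, 'Li',   '159', NULL,  NULL),
        (2, 'Wang', '138', '186', NULL);

    -- After the change: a narrow main table plus a child table.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE customer_phone (
        phone_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        phone       TEXT    NOT NULL
    );
    """
)

with conn:
    conn.execute(
        "INSERT INTO customer (customer_id, name) "
        "SELECT customer_id, name FROM customer_wide"
    )
    # Column names come from this fixed tuple, so the f-string is safe here.
    for column in ("phone1", "phone2", "phone3"):
        conn.execute(
            f"INSERT INTO customer_phone (customer_id, phone) "
            f"SELECT customer_id, {column} FROM customer_wide "
            f"WHERE {column} IS NOT NULL"
        )

phones = conn.execute(
    "SELECT customer_id, phone FROM customer_phone ORDER BY customer_id, phone"
).fetchall()
print(phones)
```

The child table holds only the phone numbers that exist, and a customer can later gain a fourth number without a schema change.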
The practical principle of database design is to find the right balance between data redundancy and processing speed. "Three fewer" is a holistic concept and a comprehensive view; no one of the principles can be taken in isolation. The principle is relative, not absolute. The "three more" principle is certainly wrong. Consider: if they cover the same system functions, an E-R diagram with 100 entities (1,000 attributes in total) is certainly much better than one with 200 entities (2,000 attributes in total).
Advocating the "three fewer" principle means asking readers to learn to use database design techniques for system data integration. The steps of data integration are to integrate file systems into application databases, integrate application databases into subject databases, and integrate subject databases into a global, comprehensive database. The higher the degree of integration, the stronger the data sharing, the fewer the information islands, and the fewer the entities, primary keys, and attributes in the global E-R diagram of the whole enterprise information system.
The purpose of advocating the "three fewer" principle is to prevent readers from using patching techniques to keep adding to, deleting from, and modifying the database until the enterprise database becomes a "garbage heap" or "jumble" of arbitrarily designed tables, with base tables, code tables, intermediate tables, and temporary tables in disorder and beyond counting (that is, tables are created dynamically and the number of tables keeps growing), so that the information system can no longer be maintained and breaks down.
Anyone can follow the "three more" principle; it is the fallacy behind designing databases by the "patching method". The "three fewer" principle is a principle of "few but refined". It demands a high level of database design skill and art, and not everyone can achieve it, because it is the theoretical basis for eliminating the "patching method" of database design.
14. Ways to improve database operating efficiency
Given the system hardware and system software, the ways to improve the operating efficiency of a database system are:
(1) In the physical design of the database, lower the normal form, add redundancy, use triggers sparingly, and make more use of stored procedures.
(2) When the computation is very complex and the number of records is very large (for example, 10 million), do the complex computation outside the database first, process the data in the file system using C++, and only then append the results to the table. This is a lesson from the design of telecom billing systems.
(3) If a table has too many records, for example more than 10 million, split it horizontally. A horizontal split takes a value of the table's primary key PK as the boundary and divides the table's records horizontally into two tables (that is, the rows can be split manually into two tables, with a union view over the two tables keeping the split transparent to programs). If a table has too many fields, for example more than 80, split it vertically, decomposing the original table into two tables.
(4) Tune the database management system (DBMS), that is, optimize its system parameters, such as the number of buffers.
(5) When programming in the data-oriented SQL language, use optimized algorithms wherever possible.
In short, to improve the operating efficiency of a database, effort is needed at three levels at the same time: database system optimization, database design optimization, and program implementation optimization.
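The horizontal split in item (3) can be sketched as two tables divided at a primary key boundary, with a union view that keeps reads transparent to programs. SQLite stands in for MySQL; the table names, the boundary value SPLIT_ID, and the helper insert_order are all hypothetical. Because the two key ranges cannot overlap, UNION ALL is safe in the view and avoids the deduplication cost of UNION.

```python
import sqlite3

SPLIT_ID = 1000  # hypothetical primary key boundary between the two tables

conn = sqlite3.connect(":memory:")
conn.executescript(
    f"""
    CREATE TABLE orders_low (
        order_id INTEGER PRIMARY KEY CHECK (order_id < {SPLIT_ID}),
        amount   INTEGER NOT NULL
    );
    CREATE TABLE orders_high (
        order_id INTEGER PRIMARY KEY CHECK (order_id >= {SPLIT_ID}),
        amount   INTEGER NOT NULL
    );
    -- Readers query the view and do not need to know about the split.
    CREATE VIEW orders AS
        SELECT order_id, amount FROM orders_low
        UNION ALL
        SELECT order_id, amount FROM orders_high;
    """
)


def insert_order(order_id: int, amount: int) -> None:
    """Route a new row to the table that owns its primary key range."""
    table = "orders_low" if order_id < SPLIT_ID else "orders_high"
    with conn:
        conn.execute(
            f"INSERT INTO {table} (order_id, amount) VALUES (?, ?)",
            (order_id, amount),
        )


for order_id, amount in [(1, 10), (999, 20), (1000, 30), (2500, 40)]:
    insert_order(order_id, amount)

low_count = conn.execute("SELECT COUNT(order_id) FROM orders_low").fetchone()[0]
high_count = conn.execute("SELECT COUNT(order_id) FROM orders_high").fetchone()[0]
total = conn.execute("SELECT COUNT(order_id), SUM(amount) FROM orders").fetchone()
print(low_count, high_count, total)
```

In this sketch writes are routed by the application, which is one possible design; reads through the view stay unchanged after the split.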
The 14 techniques above were gradually summed up by many people through extensive practice in database analysis and design. Readers should not apply this experience mechanically or memorize it by rote; instead they should digest and understand it, be realistic, and master it flexibly, gradually learning to develop through application and to apply through development.
Reprinted from: http://www.javaeye.com/topic/281611
=================================
How is denormalization explained in databases? Please give an example.
2008-04-01 21:16
Dictionary: denormalization (reverse normalization)
This is what we usually call inverse normalization.
For example, setting two primary keys in one table.
For example, a many-to-many relationship between two tables.
Both of these go against the standard normal forms.
Both normalization and denormalization aim to improve database performance.
Beginners should still try to get normalization right.
MySQL Database design specification