Database optimization consists of three parts: optimization of the database itself, database table optimization, and program operation optimization. This article is part two: Database Performance Optimization II: Database Table Optimization
Optimize ①: Design normalized tables to eliminate data redundancy
A database normal form is a database design guideline that ensures a reasonable database structure, satisfies all kinds of query needs, and avoids anomalous database operations. A table that meets these formal requirements is called a normalized table. In general, a table design that satisfies the first three normal forms is sufficient, so the first three are briefly introduced here.
First, the definitions given by Baidu Encyclopedia:
First Normal Form (1NF): columns are atomic, with no repeating groups
The first normal form (1NF) requires that all domains in a relational model be atomic; that is, every column of a database table is an indivisible atomic data item, not a set, an array, a record, or any other non-atomic data item.
Second Normal Form (2NF): non-key attributes fully depend on the key
On the basis of 1NF, every non-key attribute must be fully functionally dependent on the candidate key [i.e., eliminate partial functional dependencies of non-key attributes on the key].
Third Normal Form (3NF): no transitive dependencies between attributes
On the basis of 2NF, no non-key attribute may depend on another non-key attribute [i.e., eliminate transitive dependencies].
A more popular explanation (perhaps not the most scientific or precise one):
First normal form: an atomicity constraint on attributes (fields); every attribute must be atomic and indivisible. Second normal form: a uniqueness constraint on records; every record must have a unique identifier, that is, an attribute that uniquely identifies the entity. Third normal form: a redundancy constraint on attributes (fields); no field may be derivable from other fields. Put plainly: columns not directly related to the primary key must be eliminated (the way to eliminate them is to create a separate table to hold them; foreign keys are the exception).
If the database design is fully normalized, all tables are joined through keys and no data is duplicated. The advantages of normalization are obvious: it avoids data redundancy, which naturally saves space; it provides the fundamental guarantee of data consistency, eliminating inconsistent data; and it improves efficiency.
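To make the idea concrete, here is a small T-SQL sketch (all table and column names are invented for illustration): a denormalized orders table would repeat the customer's name and phone in every order row, while the normalized version keeps each fact in exactly one place and joins through the key.

```sql
-- Denormalized (avoid): customer data repeated in every order row
-- CREATE TABLE Orders (OrderID int, CustomerName nvarchar(50), CustomerPhone varchar(20), ...)

-- Normalized (3NF): customer attributes live in one place,
-- and orders reference them through a key
CREATE TABLE Customers
(
    CustomerID   int          NOT NULL PRIMARY KEY,
    CustomerName nvarchar(50) NOT NULL,
    Phone        varchar(20)  NULL
);

CREATE TABLE Orders
(
    OrderID    int      NOT NULL PRIMARY KEY,
    CustomerID int      NOT NULL REFERENCES Customers (CustomerID),
    OrderDate  datetime NOT NULL
);
```

Changing a customer's phone number now touches one row in Customers instead of every order row, which is exactly the consistency guarantee normalization provides.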
Optimize ②: Add appropriate redundancy and computed columns
The practical principle of database design is to find the right balance between data redundancy and processing speed.
For example, a "total amount" field can be derived from "quantity" and "unit price", yet storing it as a redundant field speeds up statistical queries; this trades space for time. Reasonable redundancy can relieve the concurrency pressure on tables with large data volumes and speed up special-purpose queries; redundant fields effectively reduce table joins and improve efficiency.
The "total amount" here is a computed column. A database has two kinds of columns: data columns and computed columns. A data column is assigned a value manually or by the program; a computed column is derived from other data in the table, such as the "total amount" above.
To create a computed column in SQL:

CREATE TABLE table1
(
    number decimal(18,4),
    price  money,
    amount AS number * price   -- this is the computed column
)
You can also add or modify this column property directly in the table designer:
For the "Is Persisted" property of a computed column, note the following:
If it is "No", the column is a virtual column: it is recomputed on every query, and it cannot be used in CHECK, FOREIGN KEY, or NOT NULL constraints.
If it is "Yes", the column is physically stored: it does not need to be recomputed every time, and you can create indexes on it, and so on.
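A minimal T-SQL sketch of the persisted variant (the table name table2 is invented; the expression mirrors the earlier example):

```sql
-- PERSISTED stores the computed value physically instead of
-- recomputing it on every query, so the column can be indexed
CREATE TABLE table2
(
    number decimal(18,4) NOT NULL,
    price  money         NOT NULL,
    amount AS (number * price) PERSISTED
);

-- An index on the persisted computed column:
CREATE INDEX IX_table2_amount ON table2 (amount);
```

Note that SQL Server only allows PERSISTED (and indexing) when the expression is deterministic, as the simple multiplication here is.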
Optimize ③: Indexes
Indexes play an extremely important role in table optimization, so a separate chapter, "SQL Index One Step", will explain how to build and optimize them.
Optimize ④: The necessity of primary and foreign keys
The design of primary and foreign keys plays an important role in the overall database design, because the primary key is an abstraction of an entity, and the pairing of a primary key with a foreign key represents a connection between entities.
Primary key: According to the second normal form, there should be a field to identify each record, and the primary key is undoubtedly the best identifier. Not every table strictly needs a primary key, but a table with a large data volume that is queried frequently must have one; a primary key improves efficiency and prevents duplicates, among other advantages.
The choice of primary key also matters. In general, choose a key with a small total length: small keys compare faster, and a small key keeps the B-tree of the primary key index shallower. For a composite primary key, also pay attention to field order; different field orders can differ greatly in performance. In general, put fields with low repetition rates, and fields likely to be queried alone or in combination, at the front.
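A hedged sketch of the field-order advice (names are hypothetical): in an order-detail table, OrderID is frequently queried on its own, so it leads the composite key; with the order reversed, a lookup by OrderID alone could not seek on the key.

```sql
-- OrderID first: queries filtering only on OrderID can still
-- seek on the composite primary key's B-tree
CREATE TABLE OrderDetail
(
    OrderID   int           NOT NULL,  -- often queried alone, so placed first
    ProductID int           NOT NULL,
    Quantity  decimal(18,4) NOT NULL,
    CONSTRAINT PK_OrderDetail PRIMARY KEY (OrderID, ProductID)
);
```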
Foreign key: As a database object, the foreign key is something many people consider troublesome and unnecessary. In fact, foreign keys are very useful in most cases, because a foreign key is the most efficient way to maintain consistency.
Database consistency can be maintained with foreign keys, CHECK constraints, rules, triggers, or client programs, and it is generally believed that the closer the method is to the data, the more efficient it is. Use cascading deletes and cascading updates with caution. They were introduced as new features of SQL Server 2000 and retained in 2005, so they are dependable, but they break through the traditional definition of the foreign key and are somewhat too powerful: before using them you must be sure you have mastered their scope of effect, or cascading operations may modify or delete your data unexpectedly. In terms of performance, however, cascading deletes and cascading updates are more efficient than the other methods.
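As an illustrative sketch (assuming hypothetical Customers and Orders tables that share a CustomerID column), this is how a foreign key with cascading options is declared:

```sql
-- The foreign key maintains consistency declaratively; the CASCADE
-- options are powerful, so understand their reach before enabling them
ALTER TABLE Orders
ADD CONSTRAINT FK_Orders_Customers
    FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)
    ON DELETE CASCADE     -- deleting a customer also deletes its orders
    ON UPDATE CASCADE;    -- key changes propagate to Orders automatically
```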
Optimize ⑤: Proper use of stored procedures, views, and functions
Many people are accustomed to putting complex operations in the application layer, but if you want to optimize data access performance, moving the SQL code into the database (using stored procedures, views, functions, and triggers) is also a big improvement, for the following reasons:
1. Stored procedures reduce the amount of network transmission, processing, and storage; they are compiled and optimized, execute quickly, and are easy to maintain. When the table structure changes, the client application is not affected.
2. Using stored procedures, views, and functions helps avoid the drawbacks of duplicating SQL throughout the application, because the SQL is now processed centrally in one place.
3. Implementing all T-SQL in database objects helps you analyze T-SQL performance issues, manage T-SQL code centrally, and refactor it more easily.
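A minimal example of moving SQL into the database (the procedure name, table, and columns are invented for this sketch):

```sql
-- The SQL lives in one place, is compiled and optimized once,
-- and only the call (not the statement text) crosses the network
CREATE PROCEDURE usp_GetOrdersByCustomer
    @CustomerID int
AS
BEGIN
    SET NOCOUNT ON;
    SELECT OrderID, OrderDate
    FROM   Orders
    WHERE  CustomerID = @CustomerID;
END
GO

-- The client then issues only:
EXEC usp_GetOrdersByCustomer @CustomerID = 42;
```

If the Orders table is later restructured, only the procedure body needs updating; callers are unaffected, which is the maintenance benefit described in point 1.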
Optimize ⑥: The legendary "three-less principle"
①: The fewer tables in the database the better
②: The fewer fields in the table the better
③: The fewer fields in a composite primary key or composite index, the better
Of course, "less" here is relative; it is an important design concept for reducing data redundancy.
Optimize ⑦: Split your table to reduce table size
If you find that a table has too many records, for example more than 10 million, split the table horizontally. Horizontal splitting divides the table's records into two tables, using some value of the table's primary key as the dividing line.
If you find that a table has too many fields, for example more than 80, split the table vertically, breaking the original table into two tables.
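Both kinds of split can be sketched as follows (all names and the boundary value are hypothetical):

```sql
-- Horizontal split: records divided at a primary-key boundary value
CREATE TABLE Orders_Old (OrderID int NOT NULL PRIMARY KEY, OrderDate datetime NOT NULL);  -- OrderID <= 10000000
CREATE TABLE Orders_New (OrderID int NOT NULL PRIMARY KEY, OrderDate datetime NOT NULL);  -- OrderID >  10000000

-- Vertical split: frequently read columns stay in the main table;
-- wide or rarely used columns move to a 1:1 companion table
CREATE TABLE Product
(
    ProductID int           NOT NULL PRIMARY KEY,
    Name      nvarchar(100) NOT NULL,
    Price     money         NOT NULL
);
CREATE TABLE ProductDetail
(
    ProductID   int           NOT NULL PRIMARY KEY
                REFERENCES Product (ProductID),
    Description nvarchar(max) NULL    -- wide, rarely read column
);
```

The vertical split keeps the hot table narrow, so more rows fit per page; the cost is an extra join when the wide column is actually needed.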
Optimize ⑧: Field design principles
A field is the most basic unit of a database, and its design has a significant impact on performance. Note the following:
A. Prefer numeric data types; numeric comparisons are much faster than character comparisons.
B. Keep data types as small as possible, where "smallest" means the smallest type that meets foreseeable future needs.
C. Try not to allow NULL; unless truly necessary, use NOT NULL + DEFAULT instead.
D. Use TEXT and IMAGE sparingly; binary fields are relatively slow to read and write, there are few ways to read them, and in most cases they are best avoided.
E. Use auto-increment (identity) fields with caution; they are not conducive to data migration.
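The principles above, applied in one hypothetical table definition (names and types are illustrative only):

```sql
CREATE TABLE Inventory
(
    ItemID   int           NOT NULL,             -- numeric key: fast comparisons (principle A)
    Status   tinyint       NOT NULL DEFAULT 0,   -- smallest type that fits (principle B)
    Quantity smallint      NOT NULL DEFAULT 0,   -- NOT NULL + DEFAULT, no NULLs (principle C)
    Updated  smalldatetime NOT NULL DEFAULT GETDATE()
);
```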