Database optimization has three parts: optimizing the database itself, optimizing database tables, and optimizing program operations. This article is part 2 of the series, Database Performance Optimization 2: Database Table Optimization.
Optimization ①: Design normalized tables to eliminate data redundancy
A database normal form is a design method that keeps the database structure sound, satisfies a variety of query needs, and avoids anomalies in database operations. A table that satisfies the normal forms is called a normalized table. Normal forms were introduced in the early 1970s. In general, a table design only needs to satisfy the first three normal forms, which are briefly introduced here.
Let's take a look at the definition provided by Baidu Encyclopedia:
First normal form (1NF): no repeating columns
The first normal form (1NF) adds a requirement to the domains of the relational model: every domain must be atomic. That is, each column of a database table holds an indivisible atomic data item, not a set, array, record, or other non-atomic data item.
Second normal form (2NF): attributes fully depend on the primary key
On top of 1NF, every non-key attribute must be fully functionally dependent on the key. [On top of 1NF, eliminate partial functional dependencies of non-prime attributes on the primary key.]
Third normal form (3NF): attributes do not depend on other non-prime attributes
On top of 2NF, no non-prime attribute depends on another non-prime attribute. [On top of 2NF, eliminate transitive dependencies.]
Here is my own explanation (probably not the most rigorous or accurate):
First normal form: an atomicity constraint on attributes (fields). Each attribute must be atomic and cannot be split further.
Second normal form: a uniqueness constraint on records. Each record must have a unique identifier, that is, an attribute that uniquely identifies the object it describes.
Third normal form: a redundancy constraint on attributes (fields). No field can be derived from other fields. Put plainly, data columns that have no direct relationship to the primary key must be removed (by moving them into another table), except for foreign keys.
If the database design is fully normalized, then when all the tables are joined by their keys, no duplicate copies of data appear. The advantages of normalization are obvious: it avoids data redundancy, which naturally saves space, and it gives data consistency a fundamental guarantee, eliminating inconsistent data and also improving efficiency.
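As a rough illustration of the idea, the sketch below splits a flat order table into two normalized tables. All table and column names (Customers, Orders, CustomerCity, and so on) are hypothetical and chosen only for this example.

```sql
-- Hypothetical schema, for illustration only.
-- An unnormalized design would repeat the customer's name and city
-- on every order row, e.g.:
--   OrdersFlat (OrderId, CustomerName, CustomerCity, Amount)

-- Normalized: customer details are stored once and referenced by key.
CREATE TABLE Customers
(
    CustomerId   int          NOT NULL PRIMARY KEY,
    CustomerName nvarchar(50) NOT NULL,
    CustomerCity nvarchar(50) NOT NULL
);

CREATE TABLE Orders
(
    OrderId    int   NOT NULL PRIMARY KEY,
    CustomerId int   NOT NULL REFERENCES Customers (CustomerId),
    Amount     money NOT NULL
);

-- Joining on the key rebuilds the flat view without storing duplicates.
SELECT o.OrderId, c.CustomerName, c.CustomerCity, o.Amount
FROM Orders AS o
JOIN Customers AS c ON c.CustomerId = o.CustomerId;
```

With this layout, changing a customer's city is a single-row update in Customers, so the order rows cannot drift out of sync with it.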
Optimization ②: Add appropriate redundancy and computed columns
The practical principle of database design is to find a proper balance between data redundancy and processing speed.
A table that satisfies the normal forms is normalized, but it is not necessarily the best design. In many cases, to make the database run more efficiently, we need to relax the normal forms and add appropriate redundancy, trading space for time. For example, suppose a table stores product name, unit price, inventory quantity, and total value. This table does not satisfy the third normal form, because "total value" can be obtained by multiplying "unit price" by "quantity", which makes "total value" a redundant field. However, adding the redundant "total value" field speeds up statistical queries; this is what trading space for time means. Reasonable redundancy can spread the concurrency pressure on tables that hold large amounts of data and can speed up particular queries. Redundant fields can also reduce the number of joins between tables and so improve efficiency.
"Total value" is a computed column. A database has two kinds of columns: data columns and computed columns. A data column is assigned manually or by the program; a computed column is calculated from other data in the table, as "total value" is.
Creating a computed column in SQL:
CREATE TABLE Table1
(
    Number decimal(18,4),
    Price  money,
    Amount AS Number * Price -- this is the computed column
)
You can also add or modify the column's properties manually in the table designer, as shown in the figure.
Whether the column is persisted also matters:
If it is "No", the column is a virtual column that is recalculated every time it is queried, and it cannot be used with CHECK, FOREIGN KEY, or NOT NULL constraints.
If it is "Yes", the column is a real, stored column. It does not have to be recalculated on every query, and you can create an index on it.
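The persisted case can be sketched in T-SQL as follows. The table name ProductStock and the index name are hypothetical; the PERSISTED keyword and the optional NOT NULL after it are part of SQL Server's computed-column syntax (SQL Server 2005 and later).

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE ProductStock
(
    ProductId int           NOT NULL PRIMARY KEY,
    Number    decimal(18,4) NOT NULL,
    Price     money         NOT NULL,
    -- PERSISTED stores the computed value in the table instead of
    -- recalculating it on every read; NOT NULL is only allowed
    -- on a persisted computed column.
    Amount AS (Number * Price) PERSISTED NOT NULL
);

-- The stored value can be indexed like an ordinary column.
CREATE INDEX IX_ProductStock_Amount ON ProductStock (Amount);
```

The trade-off is the one described above: the stored Amount costs extra space and a little work on each write, in exchange for cheaper reads.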
Optimization ③: Indexes
Indexes are an important means of table optimization and play an extremely important role in it. I will therefore cover them in a separate chapter, "SQL One-step Indexing", which shows how to create and optimize indexes.
Optimization ④: The necessity of primary keys and foreign keys
The design of primary keys and foreign keys plays an important role in the overall design of a database, because a primary key is an abstraction of an object, and the pairing of a primary key with a foreign key represents a relationship between objects.
Primary key: According to the second normal form, a record needs a field that identifies it, and the primary key is undoubtedly the best identifier. Many tables do not strictly need a primary key, but a table that holds a large amount of data and is queried frequently must have one. A primary key improves efficiency and prevents duplicate rows.
The choice of primary key also matters. In general, a key with a short total length is faster, and a short key lets the primary key's B-tree have fewer levels.
When choosing a composite primary key, pay attention to the order of its fields. The same fields in a different order can perform very differently. In general, put first the field with a low repetition rate, or the field that is most likely to be queried on its own or in combination.
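A small sketch of why field order matters in a composite primary key. The OrderLines table and its columns are hypothetical; the behavior described in the comments is the general rule that an index can be searched efficiently only through its leading column(s).

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE OrderLines
(
    OrderId    int      NOT NULL,
    LineNumber smallint NOT NULL,
    Quantity   int      NOT NULL,
    -- OrderId comes first: it repeats less often than LineNumber
    -- and is the field most queries are expected to filter on.
    CONSTRAINT PK_OrderLines PRIMARY KEY (OrderId, LineNumber)
);

-- Filters on the leading column: the key's index can be searched directly.
SELECT LineNumber, Quantity FROM OrderLines WHERE OrderId = 42;

-- Filters only on the second column: the key order does not help,
-- so the engine generally has to scan.
SELECT OrderId, Quantity FROM OrderLines WHERE LineNumber = 1;
```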
Foreign key: Many people find foreign keys, as database objects, troublesome. In fact, foreign keys are useful in most cases, because they are the most efficient way to maintain consistency.
Database consistency can be enforced with foreign keys, CHECK constraints, rules, triggers, and client programs, in that order. In general, the closer a method is to the data, the more efficient it is.
Use cascading deletes and cascading updates with caution. They were new features in SQL Server 2000 and were retained in SQL Server 2005, so they must have their uses. My caution comes from the fact that they go beyond the traditional definition of a foreign key and are a little too powerful. Before using them, make sure you understand exactly how far their effects reach; otherwise they may modify or delete your data without any obvious cause. In terms of performance, cascading deletes and cascading updates are more efficient than the other methods.
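To make the reach of cascading concrete, here is a minimal sketch with two hypothetical tables, Departments and Employees. ON DELETE CASCADE and ON UPDATE CASCADE are standard options of a SQL Server foreign key constraint.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE Departments
(
    DeptId   int          NOT NULL PRIMARY KEY,
    DeptName nvarchar(50) NOT NULL
);

CREATE TABLE Employees
(
    EmpId   int          NOT NULL PRIMARY KEY,
    EmpName nvarchar(50) NOT NULL,
    DeptId  int          NOT NULL,
    CONSTRAINT FK_Employees_Departments
        FOREIGN KEY (DeptId) REFERENCES Departments (DeptId)
        ON DELETE CASCADE   -- deleting a department deletes its employees
        ON UPDATE CASCADE   -- changing a DeptId rewrites it in Employees
);

-- This single statement also removes every Employees row with DeptId = 10,
-- with no further confirmation.
DELETE FROM Departments WHERE DeptId = 10;
```

Without the two CASCADE options, the same DELETE would fail while employees still reference the department, which is the safer default.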
Optimization ⑤: Proper use of stored procedures, views, and functions
Many people are used to placing complex operations in the application layer. If you want to optimize data access performance, however, moving SQL code into the database (as stored procedures, views, functions, and triggers) is also a major improvement, for these reasons:
1. Stored procedures reduce the work of network transmission, processing, and storage. They are compiled and optimized, so they execute quickly, and they are easy to maintain: when the table structure changes, client applications are not affected.
2. Using stored procedures, views, and functions helps reduce the drawbacks of duplicating SQL in the application, because the SQL is now handled in one place only.
3. Implementing all T-SQL in database objects makes it easier to analyze T-SQL performance problems, and helps you manage the T-SQL code centrally and refactor it better.
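A minimal sketch of the idea: the query lives in one stored procedure and the application only calls it. The procedure name, the dbo.Orders table, and its columns are hypothetical and assumed to exist already; GO is the batch separator understood by tools such as SSMS and sqlcmd, not a T-SQL statement.

```sql
-- Assumes a hypothetical table dbo.Orders (OrderId, CustomerId, Amount).
CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerId int
AS
BEGIN
    SET NOCOUNT ON;  -- suppress the "rows affected" message sent to the client

    SELECT OrderId, Amount
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId;
END;
GO

-- The application sends only this short call, not the query text.
EXEC dbo.GetOrdersByCustomer @CustomerId = 42;
```

If dbo.Orders is later restructured, only the procedure body has to change, as long as its parameter and result columns stay the same.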
Optimization ⑥: The legendary "three fewer" principles
①: The fewer tables in the database, the better.
②: The fewer fields in a table, the better.
③: The fewer composite primary keys and composite indexes on fields, the better.
Of course, "fewer" here is relative. It is an important design idea for reducing data redundancy.
Optimization ⑦: Split your tables to reduce table size
If a table contains too many records, for example more than 10 million, split it horizontally. A horizontal split uses a value of the table's primary key as a boundary and divides the table's records between two tables.
If a table contains too many fields, for example more than 80, split it vertically, dividing the original table into two tables.
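Both kinds of split can be sketched as follows. Every table name, column, and the boundary value of 10,000,000 are assumptions made for this example only.

```sql
-- Horizontal split: same columns, rows divided at a primary key boundary.
CREATE TABLE OrderLog_Low
(
    LogId   int           NOT NULL PRIMARY KEY CHECK (LogId <= 10000000),
    Message nvarchar(200) NOT NULL
);

CREATE TABLE OrderLog_High
(
    LogId   int           NOT NULL PRIMARY KEY CHECK (LogId > 10000000),
    Message nvarchar(200) NOT NULL
);

-- Vertical split: same rows, columns divided between two tables
-- that share the primary key.
CREATE TABLE ProductCore
(
    ProductId   int          NOT NULL PRIMARY KEY,
    ProductName nvarchar(50) NOT NULL,
    Price       money        NOT NULL
);

CREATE TABLE ProductDetail
(
    ProductId       int           NOT NULL PRIMARY KEY
        REFERENCES ProductCore (ProductId),
    LongDescription nvarchar(max) NULL  -- rarely read, kept out of the hot table
);
```

The CHECK constraints document which table owns which key range; frequently used columns stay in the narrow ProductCore table, and the wide, rarely read ones move to ProductDetail.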
Optimization ⑧: Field design principles
Fields are the most basic unit of a database, and their design has a great impact on performance. Note the following:
A. Use numeric types as much as possible. Comparing numeric values is much faster than comparing character values.
B. Make data types as small as possible, where "small" means just large enough to meet foreseeable future needs.
C. Try not to allow NULL unless it is necessary; use NOT NULL + DEFAULT instead.
D. Use TEXT and IMAGE sparingly. Reading and writing binary fields is slow, and there are few ways to read them. In most cases it is best not to use them.
E. Use auto-increment fields with caution; they make data migration harder.
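Points A to C can be illustrated with one small table definition. The UserAccounts table and its columns are hypothetical; the type sizes are assumptions about what such an application might need.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE UserAccounts
(
    UserId     int          NOT NULL PRIMARY KEY,   -- A: numeric key, not a character code
    Status     tinyint      NOT NULL DEFAULT 0,     -- B: tinyint is enough for a handful of states
    LoginCount int          NOT NULL DEFAULT 0,     -- C: NOT NULL + DEFAULT instead of NULL
    Nickname   nvarchar(30) NOT NULL DEFAULT N''    -- C: empty string rather than NULL
);

-- Rows inserted without the optional columns pick up the defaults.
INSERT INTO UserAccounts (UserId) VALUES (1);
```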
Parts of this article may draw on other online articles for reference. This article is for learning purposes only. For more information, see the source.
-------------- AK (old K): 2012-12-28