Database performance Optimization II: Database table optimization

Source: Internet
Author: User
Tags repetition
The database optimization consists of the following three parts: optimization of the database itself, database table optimization, program operation optimization. This article is part two Database Performance Optimization II: Database table Optimization

  optimize ①: Design normalization tables to eliminate data redundancy

Database paradigm is a database design method that ensures reasonable database structure, satisfies all kinds of query needs and avoids abnormal database operation. To meet the formal requirements of the table, called the normalization table, the paradigm was generated in the early 1970s, the general table design to meet the first three normal forms can be, here briefly introduce the first three paradigms

First of all to see the Baidu Encyclopedia to give the definition:

First Normal (1NF) column with no duplicates

The so-called First paradigm (1NF) refers to a specification requirement for domain additions in a relational model, where all domains should be atomic, that is, each column of a database table is an indivisible atomic data item, not a collection, an array, a record, or a non-atomic data item.

Second Normal form (2NF) property

On the basis of 1NF, the non-code attribute must be completely dependent on the code [to eliminate the partial function dependency of the non-principal attribute on the main code on 1NF basis]

Third normal form (3NF) attribute

On a 1NF basis, any non-primary attribute is not dependent on other non-principal properties [Elimination of transitive dependencies on a 2NF basis]

The popular explanation (perhaps not the most scientific, the most accurate understanding)

The first paradigm: the atomic constraint of the attribute (field), which requires that the attribute be atomic and non-divisible;
Second paradigm: A unique constraint on a record that requires a unique identifier for a record, and each record needs to have an attribute to be the unique identity of the entity.
The Third Paradigm: attribute (field) redundancy constraints, that is, any field can not be derived from other fields, in the popular point is: The primary key is not directly related to the data column must be eliminated (the way to eliminate is to create a table to hold them, of course, except the foreign key)

If the database design is fully standardized, then all tables are concatenated with the keyword, and no copy of the data (repetition) occurs. The advantage of standardization is obvious, it avoids data redundancy, saves space naturally, also provides the fundamental guarantee to the consistency of data (consistency), eliminate the phenomenon of data inconsistency, and improve the efficiency.

optimize ②: Appropriate redundancy, increase computed column

The practical principle of database design is to find the right balance between data redundancy and processing speed.

A table that satisfies a paradigm must be a normalized table, but not necessarily the best design. In many cases, in order to improve the efficiency of the database, it is often necessary to reduce the paradigm standard: the appropriate increase in redundancy to achieve the purpose of space-changing time. For example, we have a table, product name, unit price, inventory amount, total value. This table does not satisfy the third paradigm, because "total value" can be obtained by multiplying the "unit price" by "quantity", stating that "amount" is a redundant field. However, increasing the "total value" of this redundant field can increase the speed of query statistics, which is the practice of space-changing time. The reasonable redundancy can disperse the concurrency pressure of the table with large data volume, also can speed up the special query speed, the redundant field can effectively reduce the connection of database table and improve the efficiency.

Where "total value" is a computed column, there are two types in the database: The data column and the computed column, the data column is the column that needs our manual or the program assigns the value, the computed column is derived from other data in the table, such as "total value" here

To create a computed column in sql:

CREATE TABLE table1
(
Number Decimal (18,4),
Price Money,
Amount as Number*price--here is the computed column
)

You can also manually add or modify column properties directly in the table design: The following figure

Is persistent, we also need to be aware of:

If it is ' no ', it indicates that the column is a virtual column, which is computed once per query, and then it is not used to do check,foreign key or NOT NULL constraint.

If it is ' yes ', it is the real column, it does not need to be calculated every time, you can create indexes on this column, and so on.

optimize ③: Index

Index is an important indicator of table optimization, which occupies an extremely important component in table optimization, so we will write a chapter "SQL Index one Step" to tell you how to build and optimize the index.

Optimizing ④: The necessity of primary and foreign keys

The design of primary key and foreign key plays an important role in the design of global database. Because: The primary key is an abstraction of the entity, and the primary key is paired with the foreign key, representing the connection between entities.

Primary key: According to the second paradigm, need to have a field to identify the record, the primary key is undoubtedly the best identification, but many tables do not necessarily need a primary key, but for large data, query frequently database tables, must have a primary key, primary key can increase efficiency, prevent duplication and other advantages.

The choice of the primary key is also more important, generally choose the total length of the small key, small key comparison speed, while the small key can make the primary key of the B-tree structure less hierarchy.
Primary key selection also pay attention to the combination of the primary key field order, for the combined primary key, the different field order of the primary key performance difference may be very large, generally should choose low repetition rate, alone or the combination of large query possibility of the field is placed in front.

FOREIGN key: Foreign key as a database object, many people think that the trouble without, in fact, foreign keys are very useful in most cases, the reason is: foreign key is the most efficient consistency maintenance method

Database consistency requirements, can be used in order to use foreign keys, check constraints, rules constraints, triggers, client programs, it is generally believed that the closer to the data of the method is more efficient.
Careful use of cascading deletes and cascading updates, cascading deletions, and cascading updates as new features of SQL SERVER 2000 in the current year, are reserved in 2005 and should have their availability. I'm cautious here because cascading deletions and cascading updates have broken through the traditional definition of foreign keys, which is a bit too powerful to use before you have to make sure that you have mastered the scope of their functionality, otherwise cascading deletions and cascading updates may make your data inexplicably modified or lost. cascading deletions and cascading updates are more efficient than other methods in terms of performance.

optimized ⑤: Proper use of stored procedures, views, functions

Many people are accustomed to putting complex operations on the application layer, but if you want to optimize data access performance, porting SQL code to the database (using stored procedures, views, functions, and triggers) is also a big improvement for the following reasons:

1. Stored procedures reduce the amount of network transmission, processing and storage, and are compiled and optimized to perform fast, easy to maintain, and when the structure of the table changes, does not affect the client's application

2. Using stored procedures, views, functions can help reduce the drawbacks of SQL replication in your application, because SQL is now centrally processed in only one place

3. Using database objects to implement all TSQL helps to analyze TSQL performance issues, while helping you to centrally manage TSQL code, better refactoring TSQL code

Optimizing ⑥: The legendary ' three-little principle '

①: The fewer tables in the database the better

②: The fewer fields in the table the better

③: A combination of primary keys and fewer composite indexes in a field is as good as possible

Of course, the less is relative, is the reduction of data redundancy of the important design concept.

optimize ⑦: Split your table to reduce the size of the table

If you find that there are too many records for a table, such as more than 10 million entries, you will split the table horizontally. Horizontal segmentation is done by dividing the table's records horizontally into two tables, with a value of the table's primary key as the line.

If you find that there are too many fields for a table, such as more than 80, split the table vertically and break the original table into two tables

Optimizing ⑧: Field design principles

A field is the most basic unit of a database, and its design has a significant impact on performance. The following should be noted:

A, the data type as far as possible with the digital type, the digital type comparison is much faster than the character type.

B, the data type is as small as possible, where the minimum is to meet the foreseeable future needs of the premise.

C, try not to allow NULL, unless necessary, you can use not null+default instead.

D, less with the text and image, binary field read and write is relatively slow, and, read the method is not much, most of the case is best not.

E, self-increment field to use with caution, not conducive to data migration

The above may be part of the article for reference to other network articles, this article only for the study use, reproduced please indicate the source

--------------AK (old K): 2012-12-28

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.