[Several Aspects of SQL Server performance optimization]

Source: Internet
Author: User
Tags bulk insert

SQL Server Performance Optimization

 

(1) Database Design

Can you refer to the recent forum on the emergence of an excellent post http://topic.csdn.net/u/20100415/10/a377d835-acbd-4815-8bcb-b367f88ac8b5.html? 92227
Database design includes physical design and logical design:
Physical DesignYou can use the raid hardware architecture.
To put it simply, the usage policy is as follows:
A. raid0 can be used in a read-only database data table or a copied database. It can also be used if you are not sensitive to data loss. In short, this level is of high performance and no redundancy;
B. RAID 5 differs from RAID 1 in that it has more parity. All parity information is distributed across all disks, and the performance is higher than that of RAID 1. However, in the event of disk I/O failure, this will cause a sharp decline in performance. At the same time, this method is also being compromised between raid0 and raid1, which is a common practice.
C. raid 10 is a combination of RAID 0 and RAID 1. It provides high performance, high availability, and better performance than RAID 5. It is especially suitable for applications with a large number of writes, however, the cost is relatively high. No matter how many disks you store, you will lose half of the disk storage space.

Logic DesignYou can use some database components, such as tables and constraints. Here, we first mention the use of file groups. If you have insufficient funds to purchase a complete RAID system, you can consider file groups.
Document and document group architecture section: http://blog.csdn.net/feixianxxx/archive/2010/01/28/5267290.aspx
The principle is to read multiple hard disks at the same time to reduce competition for disk space and improve Read efficiency.

 

Let me talk about the following points:

A. Separation and merging of paradigm and data table
We generally pursue standards when designing databases, and generally reach at least 3nf. after that, some fields in the table that are often used to store queries may be split from the table. Otherwise, a large number of users will access a hot table at the same time, and the efficiency will be deteriorated.

Of course, sometimes the design of the merged table is also carried out, so that a small amount of redundancy can be used in order to reduce excessive join operations.
For details about the extent to which the design table is to be split and merged, it depends on the business requirements and system capacity. For more information, see this document.

 

B. Primary Key and foreign key
As the relationship between the table and the table, the most direct link is through the main foreign key. A primary key is very important for a table. It can exist as a unique identifier of a row in a table. It must be unique. It cannot be null or minimal and is easy to obtain.

The Minimization means that the primary key must be narrow. If the key value is too large and frequently accessed, it will make the database system inefficient and need to be maintained. Generally, 1-4 bytes of fields are used as the primary key. When serving as a cluster index, it also needs to keep the appropriate field length; otherwise, the index is too long.

The constraint defining foreign keys does not require indexes. However, due to the need of Join Operations, we usually need to manually add necessary indexes after creating foreign keys.

 

C. Table field selection
Try to make the field type of your table "only available", that is, to increase the length at most slightly while meeting the average length requirement, so do not waste too much. Field storage is directly reflected in the page. The shorter the line length, the more record rows can be stored on one page.
When we retrieve data from the hard disk, the unit is page. The more records a page contains, the more records that are stored in the cache area, and the higher the throughput.
1. If the content of the field is relatively fixed, for example, gender ID card number, mobile phone number, etc., the fixed length Char or nchar is used.
2. If the content of a field changes a large margin, consider using varchar or nvarchar.
3. Generally, do not allow the field to be null, which will lead to unnecessary processing consumption and can be replaced by the default value.
4. Use constraints as much as possible to maintain data integrity. Do not rely too much on triggers or stored procedures for maintenance. Do not use front-end program code for some simple record rules.

5. large fields such as nvarchar (max), char (max), binary (max), and text should be cut to an independent table and associated with the original source table with the primary key, because these data types will cause slow access, modification, and other operation locks, and we cannot use these fields during the query.

6. table fields should not be too many. Generally, the length of a row should not be too long. This will lead to too few records stored on each page. If necessary, you can cut some fields, divide a table into two or more tables for storage.

D. Routine Maintenance Plan
Data Maintenance Plan, such as backup, re-indexing, data replication, and other operations that consume resources in batch processing, should be performed whenever the system is idle.

 

E. Front-End program access to basic tables
Try to allow the program to access data through the stored procedures, views, and functions in the database, rather than directly operating on the basic table.

 

F. Separate online analysis from online transaction access

The two are very different. For details, refer to msdn. In simple terms, the former is suitable for queries, while the latter is mostly used for insert, update, delete, and partial queries.

 

G. Make archiving plans

The security of how to extract, query, and query an archive after an archive is archived must be considered.

 

Optimization of database query (T-SQL optimization)
This content is too big, just pick it up

1. Use as many query parameters as possible (Sarg)
Its general format:Field operator constant or variable
Some operators here include: <>=<= between and like, which are not started with %.

Non-Sarg syntax, indexing generally does not work. The following describes some practices against the Sarg Syntax:

A. Calculated data fields
For example (tested): Where col_1 + col_2 = 'AB' cannot use indexes;
Where col1 = 'A' ADN col_2 = 'B' the index on the field can be used;
Where col_3 + 1 = 4 cannot be used; where col_3 = 4-1 can be used

B. Do not use inverse operators for fields
Do not use some inverse operators for fields, such as not in not like (as described in the 2000 book <>! =!> The index will be blocked. I can use the index on 08. You are welcome to test the index in 2000 environments)
Example (tested): Where id not like '1% 'or where id not in (, 3) cannot use indexes.

C. Do not use functions for fields
Use functions such as substring left datediff, But You Can skillfully use the Sarg method to replace some functions.
For example, left (COL, 2) = 'av' can use Col like 'av % 'or ABS (Col-1)> 100 to use Col> 101 or Col <-99

 

D. Do not use the OR operator.
When the OR operator is used, if one of the fields in multiple conditions does not have an appropriate index, the indexes of other fields will lose the index effect and the whole table will be scanned. (Note that the efficiency of index scanning is similar to that of table scanning)
 
However, when using a non-Sarg in the where clause, the index may not be used. Sometimes it contains non-Sarg conditions, but it still performs index utilization on the Sarg part.

 

2. Join between tables

Reference: http://blog.csdn.net/happyflystone/archive/2009/01/15/3786158.aspx

 

 

3. Insert large amounts of data
 
A. when inserting large data volumes, you can use the bcp bulk insert option to block table triggers, constraints, or even delete table indexes as much as possible, which can make the insert operation much faster.
If you are worried about data irregularities, You can first import the data to a temporary table, and then filter out the irregular data in the database, and then insert the temporary table into the target table again.
B. We recommend using BULK INSERT for large-capacity inserts. Because it is usually faster than BCP.
C. Use the table lock instead of the default record lock when inserting with BCP or bulk insert.
D. If the server has multiple cores, try to insert multiple clients into the table in parallel.
E. If the inserted data needs to be converted, do not use tools such as DTS to directly convert the record. You can use the temporary table to be inserted first, and then perform the second import after the temporary table is converted.

 

4. Some other precautions

1. perform less sorting. Keywords such as distinct, order by, and Union can only be used as needed;

2. when processing a large number of data, no log operation must be faster than writing data one by one. Select into is faster than insert select, and truncate table is faster than Delete table (excluding where;

3. When database objects are referenced in versions 05 and later, it is best to use two name formats. Because SQL Server first searches for the object name under the default architecture of the current user for the single-node name object. If the object name cannot be found, it then searches for it under DBO.

4. Pay attention to the selection of Join Operations and subquery operations. Sometimes the results of the two are the same. Although the statement structure is different, the execution plans are the same. Of course, there will be more inconsistencies in the execution plan, although the results are the same, so pay attention to the debugging of different statements;

5. query prompt: Note: In general, you do not need to use hint for query, because the optimizer will not optimize the query. Generally, hint is only used for testing to learn about some query methods and indexes, the impact of lock usage on queries;
 

(3) rational index design
This is a very big piece of content, which can not be completed in just a few words. If there are too few content and my level and practice, let's write a little bit:

Preparations for designing indexes:
1. understand the nature of the database. Whether it is OLTP or OLAP. If it is the former, index creation should be cautious because data is frequently modified; for the latter, many indexes can be created to speed up the search;
2. Understand which queries are commonly used;
3. Understand the index options to improve the index performance from this point;
4. Determine the index storage location. For example, Storing non-clustered indexes and data on different disks improves performance.
Index creation principles:
1. for tables that frequently change, that is, tables that frequently perform insert, update, and delete operations, too many indexes cannot be created. Otherwise, more indexes can be created, this is conducive to the selection of the query optimizer;
2. Make sure that the width of the index column is kept as small as possible;
3. For small tables, that is, tables with a small amount of data, no indexes can be created, because the use of indexes is sometimes not as fast as table scanning;
4. When the view contains aggregation operations, you can use the index view to significantly improve performance when connecting large tables. Note that the table here cannot be changed frequently;
4. Select columns with less repetition as index columns. If columns with large repeatability, such as gender columns, do not create indexes;
5. Use appropriate fill factors;
6. Add an index to the columns that are frequently used to query the predicates and join conditions and group sorting;
7. A composite index (covering indexes) is an index composed of multiple columns. When all the columns required for a query contain these columns, the performance can be significantly improved, because it only needs to index the page content. Note that in these index Columns
To keep frequently-used columns in front of each other as much as possible, the columns are sorted from the least repeated columns and the most repeated columns. Here, an index is also called an include column index, it is a special non-clustered index, which can contain some non-key
Add these columns to the leaf level of the index page. This can also significantly improve the performance when the required columns are included in the query. These two columns are merged into multi-column indexes;

I have provided several articles about indexes:
Http://www.cnblogs.com/bhtfg538/ (several articles in it)
Http://hi.baidu.com/dusongw/blog/item/4090493d6ec0cdee3d6d97a6.html

 

(4) front-end application design
As the control end of data, an application determines when to use data, how to use data, and how to process the obtained data results. It will directly affect the activity of the server, which is critical to the overall performance.
The following are some precautions for Application Design:
1. reduce network traffic
You can disable some affected rows by using the stored procedure or even set nocount settings. What does it mean to return as little data as possible? That is to say, if your program only needs tables
In this example, you do not need to return the data of the entire table, and then filter the data, directly filter the data on the server, and then pass the data.

2. Limited lock timeout
Do not allow your query to run without limit. You should use appropriate APIs to set query timeout.

3. Use the cursor until necessary

4. The transaction should be as short as possible

5. Ensure that the application is designed to avoid deadlocks.

 

(5) Other measures:
1. Optimize server performance
Server configuration options are usually automatically adjusted by default. You can modify some options based on the actual situation and the conditions that are conducive to your system optimization.
For example, you can modify the following aspects:
SQL Server Memory; I/0 subsystem; Windows server option.

2. hardware enhancement
Memory, hard disk capacity, raid upgrade, etc.

3. Design a database Association Server
For large systems, multiple servers are often required to balance the processing load at each layer.

 

 

Recommended:Summary of Performance Optimization of SQL Server written by HAI ye http://blog.csdn.net/Haiwer/archive/2008/08/25/2826881.aspx
Reference:Performance Tuning (Hu baijing)

 

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.