SQL Server Database Performance Optimization Techniques

Source: Internet (http://www.newasp.net/tech/data/2430.html)
Author: User
Tags: SQL Server, query

Abstract: This article discusses the factors that affect SQL Server database performance and the principles of optimizing SQL Server performance.
Keywords: SQL Server; database; performance optimization; query
Designing an application system is not particularly difficult; tuning it for optimal performance is. There are many choices to make among development tools, database design, application structure, query design, and interface selection, and the right ones depend on the specific application requirements and the skills of the development team. This article takes SQL Server as an example and discusses application performance optimization techniques in several of these areas.
1. Database Design
The single most important factor in getting good performance from a SQL Server solution is a good database design. In practice, many SQL Server solutions perform poorly precisely because of poor database design, so good design is essential.
1.1 Logical Database Normalization
In general, a logical database design should satisfy the first three normal forms of normalization:
1. First normal form: no repeating groups or multi-valued columns.
2. Second normal form: every non-key column must depend on the entire primary key and must not depend on only part of a composite primary key.
3. Third normal form: a non-key column must not depend on another non-key column.
A design that follows these rules produces more tables with fewer columns each, which reduces data redundancy and the number of pages needed to store the data. However, relating the tables may then require complex joins, which can hurt performance. Denormalizing to some degree can therefore improve performance. Denormalization can take different forms depending on the performance requirements, but the following approaches have all been shown to help.
1. If the normalized design produces many 4-way or higher join relationships, consider duplicating some attributes (columns) across the entities involved so that those joins can be avoided.
2. Commonly used calculated values (such as totals and maximums) can be stored in database entities. For example, a project-plan management system has a schedule table containing the project number, the year-start plan, the secondary plan, the adjustment plan, and a supplemental plan column; the plan total (year-start plan + secondary plan + adjustment plan + supplemental plan) is frequently needed in user queries and reports. When the table holds many records, it pays to add the plan total to the table as an independent column. A trigger can then be used to keep it consistent with the component columns.
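As a minimal sketch of this approach, the denormalized total and its maintaining trigger might look like the following (the table and column names are hypothetical, in SQL Server 6.x-era T-SQL):

```sql
-- Hypothetical schedule table with a denormalized PlanTotal column.
CREATE TABLE ProjectSchedule (
    ProjectNo      int   PRIMARY KEY,
    YearStartPlan  money NOT NULL DEFAULT 0,
    SecondaryPlan  money NOT NULL DEFAULT 0,
    AdjustmentPlan money NOT NULL DEFAULT 0,
    SupplementPlan money NOT NULL DEFAULT 0,
    PlanTotal      money NOT NULL DEFAULT 0  -- stored sum of the four plans
)
GO

-- The trigger recomputes PlanTotal whenever the component columns
-- change, so queries and reports can read it without summing each time.
CREATE TRIGGER trg_schedule_total
ON ProjectSchedule
FOR INSERT, UPDATE
AS
UPDATE s
SET PlanTotal = s.YearStartPlan + s.SecondaryPlan
              + s.AdjustmentPlan + s.SupplementPlan
FROM ProjectSchedule s, inserted i
WHERE s.ProjectNo = i.ProjectNo
GO
```

The trade-off is exactly the one the article describes: reads of the total become cheap, at the cost of a little extra work on every insert and update.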
3. Redefine entities to reduce the overhead of external attribute data or of row data. The corresponding denormalization approaches are:
(1) Split one entity (table) into two tables by dividing its attributes into two groups, so that frequently accessed data is separated from rarely accessed data. This method requires the primary key to be duplicated in each table, and the design is advantageous when queries seldom need columns from both groups.
(2) Split one entity (table) into two tables by dividing its rows into two groups. This method suits entities (tables) that hold large amounts of data. Applications often retain history that is rarely used, so frequently accessed data can be separated from rarely accessed historical data. This method also works well when rows naturally form subsets along a logical working group (department, sales region, geographic area, and so on).
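A minimal sketch of the first approach, splitting one table's columns into a frequently accessed table and a rarely accessed one (all names here are illustrative):

```sql
-- Frequently accessed customer columns.
CREATE TABLE CustomerMain (
    CustomerID int PRIMARY KEY,
    Name       varchar(60) NOT NULL,
    Phone      varchar(20) NULL
)

-- Rarely accessed columns; the primary key is duplicated so the two
-- tables can be joined back together when needed.
CREATE TABLE CustomerDetail (
    CustomerID int PRIMARY KEY REFERENCES CustomerMain (CustomerID),
    Notes      text  NULL,
    Photo      image NULL
)
```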
1.2 Generating the Physical Database
To choose the right physical implementation strategy, you need to understand the database's access patterns and the operating characteristics of the hardware resources, mainly memory and disk-subsystem I/O. This is a broad topic, but the following guidelines may help.
1. The data type of each table column should reflect the minimum storage space the data requires; this is especially true for indexed columns. For example, if smallint suffices, do not use int: index fields can then be read faster, and more data rows fit on each data page, which reduces I/O operations.
2. Placing a table on one physical device and, via a SQL Server segment, placing its nonclustered indexes on a different physical device improves performance, particularly when the system uses multiple smart disk controllers and data-striping techniques.
3. Using SQL Server segments to split a frequently used large table across two separate smart disk controllers can also improve performance, because multiple disk heads seek in parallel.
4. Using SQL Server segments to store text or image column data on a separate physical device also improves performance.
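As an illustration of guideline 1, a narrow key type keeps indexes compact (the table and columns are hypothetical):

```sql
-- smallint (2 bytes) instead of int (4 bytes) for values that stay
-- small: more key entries fit on each index page, so fewer pages
-- must be read to traverse the index.
CREATE TABLE OrderItem (
    OrderID int      NOT NULL,
    LineNo  smallint NOT NULL,  -- line numbers never approach 32,767
    Qty     smallint NOT NULL,
    PRIMARY KEY (OrderID, LineNo)
)
```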
2. Hardware for SQL Server
The hardware relevant to SQL Server includes the system processor, memory, disk subsystem, and network; these four parts essentially form the hardware platform on which Windows NT and SQL Server run.
2.1 System Processor (CPU)
Determining the CPU configuration for your specific needs is a process of estimating the CPU workload on the hardware platform. From past experience, the CPU should be at least an 80586/100 processor. That is enough if there are only two or three users, but if you plan to support more users and critical applications, a Pentium Pro or PII CPU is recommended.
  
2.2 Memory (RAM)
Determining appropriate memory settings for a SQL Server solution is crucial to good performance. SQL Server uses memory for the procedure cache, data and index page caching, static server overhead, and configurable overhead. SQL Server can use at most 2 GB of virtual memory, which is also its maximum configurable value. Another consideration is Windows NT and its own memory requirements.
Windows NT provides every Win32 application with a 4 GB virtual address space, which the Windows NT virtual memory manager (VMM) maps onto physical memory; physical memory can reach 4 GB on some hardware platforms. A SQL Server application sees only virtual addresses, so it cannot access physical memory directly; that access is controlled by the VMM. Windows NT allows the virtual address space to exceed the available physical memory, but when SQL Server is assigned more memory than the physical memory available, paging degrades SQL Server's performance.
These address spaces are set aside for the SQL Server system, so if other software (such as file and print sharing or application services) runs on the same platform, remember that it also occupies part of the memory. In general, the hardware platform needs at least 32 MB of memory, of which Windows NT requires at least 16 MB. A simple rule is to add 100 KB of memory for each concurrent user. For example, 100 concurrent users require at least 32 MB + 100 users * 100 KB = 42 MB of memory; the actual amount should then be adjusted according to observed behavior.
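The rule of thumb above can be written out as a quick calculation (using the article's round figure of 1000 KB per MB):

```sql
-- 32 MB base + 100 KB per concurrent user, for 100 users.
DECLARE @users int, @base_mb int, @per_user_kb int
SELECT @users = 100, @base_mb = 32, @per_user_kb = 100
SELECT @base_mb + (@users * @per_user_kb) / 1000 AS approx_mb  -- 32 + 10 = 42
```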
2.3 Disk Subsystem
Designing a good disk I/O system is an important aspect of a good SQL Server solution. The disk subsystem discussed here consists of at least one disk controller and one or more hard disk units, along with the disk-configuration and file-system choices. An intelligent SCSI-2 disk controller or disk array controller is a good choice. Its desirable features are:
(1) A high-speed controller cache.
(2) An on-board processor on the bus board, which reduces interrupts to the system CPU.
(3) Asynchronous read/write support.
(4) 32-bit RAID support.
(5) Fast SCSI-2 drives.
(6) Advanced read caching (at least one track).
3. Query Strategy
With the hardware platform carefully chosen and a good database design in place that meets the users' needs, the queries and indexes must now be designed. Two things are essential to good query and index performance on SQL Server: the first is to build queries and indexes based on knowledge of how the SQL Server optimizer works; the second is to take advantage of SQL Server's performance characteristics.
3.1 The SQL Server Optimizer
The Microsoft SQL Server database kernel uses a cost-based query optimizer that automatically optimizes data-manipulation queries submitted in SQL. A data-manipulation query is any query that supports the WHERE or HAVING keywords, such as SELECT, DELETE, and UPDATE. The cost-based query optimizer estimates the cost of each clause from the statistics it keeps.
An easy way to understand how the optimizer processes a query is to examine the output of the SHOWPLAN command. If you use a character-based tool (such as isql), you can enter SET SHOWPLAN ON to get the SHOWPLAN output. If you use a graphical query tool, such as the query tool in SQL Enterprise Manager or ISQL/W, you can set a configuration option to provide the same information.
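For example, in isql the plan for a query can be displayed without running it (using the sample pubs database; SQL Server 6.x-era syntax):

```sql
SET SHOWPLAN ON   -- print the optimizer's chosen plan
SET NOEXEC ON     -- compile and show the plan only, do not execute
GO
SELECT au_lname FROM authors WHERE au_lname LIKE 'Ring%'
GO
SET NOEXEC OFF
SET SHOWPLAN OFF
GO
```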
SQL Server performs optimization in three phases: query analysis, index selection, and join selection.
1. Query analysis
In the query-analysis phase, the SQL Server optimizer examines each clause represented in the regular query tree and decides whether it can be optimized. SQL Server generally tries to optimize the clauses that restrict a scan, such as search or join clauses. But not all valid SQL syntax can be broken into optimizable clauses, for example clauses containing the inequality operator "<>". Because "<>" is an exclusive rather than an inclusive operator, the selectivity of the clause cannot be determined before the entire table is scanned, so when a relational query contains an unoptimizable clause, the execution plan uses a table scan to access that part of the query. For the clauses in the query tree that can be optimized, the optimizer then performs index selection.
2. Index selection
For each optimizable clause, the optimizer checks the database system tables to determine whether a related index is available for accessing the data. An index is considered useful only when a leading prefix of the columns in the index exactly matches the columns in the query clause. Exact matching is required because an index is ordered by its column sequence. For a clustered index, the underlying data is also sorted in index-column order. Trying to access data through a secondary index column is like trying to find every entry with a given first name in a directory sorted by surname: the ordering is essentially useless, because you still have to examine every row to decide whether it qualifies. If a clause has a usable index, the optimizer then determines its selectivity.
Therefore, during design you should carefully examine all queries against the following query-design principles, which are based on the optimizer's characteristics:
(1) Narrow indexes are highly efficient. With a narrow index, more index rows fit on each page and there are fewer index levels, so more index pages fit in cache, which also reduces I/O operations.
(2) The SQL Server optimizer can analyze a large number of indexes and join possibilities, so many narrow indexes give the optimizer more choices than a few wide ones. But do not keep indexes you do not need, because they add storage and maintenance overhead. For a compound (composite, or multi-column) index, the SQL Server optimizer keeps distribution statistics only on the first column.
(3) Too many indexes on a table hurt the performance of UPDATE, INSERT, and DELETE, because every index must be adjusted accordingly. In addition, all page-split operations are written to the log, which also increases I/O operations.
(4) Indexing a column that is frequently updated seriously hurts performance.
(5) Because of storage overhead and I/O cost, smaller, more compact indexes provide better index performance.
(6) Try to analyze the usage frequency of each important query; the most heavily used indexes can then be identified and tuned first.
(7) Any column appearing in a query's WHERE clause is a candidate index column, because the optimizer concentrates on those subclauses.
(8) Indexing a small table of less than one extent is not cost-effective, because a table scan is usually faster for small tables.
(9) Columns used with ORDER BY or GROUP BY are generally good candidates for a clustered index. If the columns in an ORDER BY clause carry a clustered index, no worktable is generated, because the rows are already sorted.
(10) Do not build a clustered index on a frequently changing column, because every change then moves the entire row. Pay special attention to this in large transaction-processing systems, where data tends to change frequently.
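Guidelines (1) and (9) can be sketched as follows, with hypothetical table and column names:

```sql
-- Clustered index on the ORDER BY / GROUP BY column: rows are stored
-- in this order, so sorted queries need no worktable.
CREATE CLUSTERED INDEX ix_orders_date ON Orders (OrderDate)

-- Narrow nonclustered index on a frequent WHERE-clause column.
CREATE NONCLUSTERED INDEX ix_orders_cust ON Orders (CustomerID)
```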
3. Join selection
When index selection ends and every clause has a processing cost based on its access plan, the optimizer begins join selection. Join selection finds an effective order in which to combine the clauses' access plans. To do this, the optimizer compares different orderings of the clauses and then selects the plan with the lowest processing cost in terms of physical disk I/O. Because the number of clause combinations grows extremely quickly with query complexity, the SQL Server query optimizer uses tree-pruning techniques to keep the cost of these comparisons to a minimum. When the join-selection phase ends, the SQL Server query optimizer has produced a cost-based query execution plan that makes full use of the available indexes and accesses the data with the least cost.
3.2 Efficient Query Design
From the three phases of query optimization above, it is clear that the main goal of efficient query design is a plan with the least physical and logical I/O and a good balance between processor time and I/O time. In other words, the aim is to design queries that make full use of indexes, read and write the disk as little as possible, and use memory and CPU resources most efficiently.
The following suggestions are distilled from the optimization strategies of the SQL Server optimizer.
1. If a unique index exists, a WHERE clause with the "=" operator performs best, followed by closed ranges, then open ranges.
2. From the standpoint of database access, WHERE clauses containing discontinuous connectors (OR and IN) generally perform poorly. The optimizer may therefore adopt the OR strategy, which generates a worktable containing a row identifier for every possibly matching row. The optimizer treats these row identifiers (page number plus row number) as a "dynamic index" pointing at the matching rows of one table: it simply scans the worktable, retrieves each row identifier, and then fetches the corresponding row from the data table. The cost of the OR strategy is the generation of the worktable.
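For instance, a WHERE clause like the following (against the sample pubs database) contains discontinuous connectors and may invoke the OR strategy:

```sql
-- OR and IN across different columns: the optimizer may build a
-- worktable of row identifiers (a "dynamic index") to resolve this.
SELECT au_lname, city
FROM authors
WHERE au_lname = 'Ringer' OR city IN ('Oakland', 'Berkeley')
```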
3. WHERE clauses containing NOT, <>, or != are of no use for the optimizer's index selection. Because such clauses are exclusive rather than inclusive, their selectivity cannot be determined before the entire original data table is scanned.
4. Limit data conversions and string operations. The optimizer generally will not choose an index based on an expression or data-type conversion in the WHERE clause. For example:
paycheck * 12 > 36000 or substring(lastname, 1, 1) = "L"
Even if the table has indexes on paycheck and lastname, they cannot be used for optimization here. Rewriting the conditional expression as:
paycheck > 36000/12 or lastname like "L%"
leaves the indexed columns bare, so the indexes can be used.
5. Local variables in a WHERE clause are treated as unknown by the optimizer and are not considered when it selects an index. The exception is parameters defined for stored procedures, whose values are known when the procedure's plan is compiled.
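A sketch of the difference, using the sample pubs database:

```sql
-- Local variable: its value is unknown at optimization time, so the
-- optimizer cannot estimate selectivity for an index on au_lname.
DECLARE @name varchar(40)
SELECT @name = 'Ringer'
SELECT * FROM authors WHERE au_lname = @name
GO

-- Stored procedure parameter: the value is available when the plan
-- is built, so the index can be considered.
CREATE PROCEDURE find_author @name varchar(40)
AS
SELECT * FROM authors WHERE au_lname = @name
GO
```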
6. If no index supports a join clause, the optimizer builds a worktable holding the smaller of the tables in the join and then creates a clustered index on that worktable to complete an efficient join. The cost of this approach is the generation of the worktable and of its clustered index; the process is called reformatting. Therefore, pay attention to the size of the tempdb database, whether in RAM or on disk (SELECT INTO statements aside). In addition, if these kinds of operations are common, placing tempdb in RAM performs better.
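When such joins are frequent, it may be cheaper to create the missing index up front than to let the optimizer reformat on every query (the names here are illustrative):

```sql
-- An explicit index on the join column lets the optimizer join
-- directly instead of building and indexing a worktable in tempdb.
CREATE CLUSTERED INDEX ix_detail_order ON OrderDetail (OrderID)
```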
4. Other Performance Considerations
The main factors that affect SQL Server performance are listed above; the influence of the operating system must not be overlooked either. Under Windows NT, the choice of file system, the network protocols, the services that are enabled, and the SQL Server priority all affect SQL Server performance to varying degrees.
So many factors affect performance, and applications differ so much, that looking for a universal optimization plan is unrealistic. The system must be tuned continually as it is developed and maintained. In fact, most tuning and adjustment work can be done on the server, independently of the client, so this approach is also practical.
