SQL Performance 2

Source: Internet
Author: User

1. To optimize the query, try to avoid full table scanning. First, consider creating an index on the columns involved in where and order.

2. Try to avoid null value determination on the field in the WHERE clause. Otherwise, the engine will discard the index and perform full table scanning, for example:
Select ID from t where num is null
You can set the default value 0 on num to make sure that the num column in the table does not have a null value, and then query it like this:
Select ID from t where num = 0

3. Try to avoid using it in the WHERE clause! = Or <> operator. Otherwise, the engine will discard the index for full table scanning.

4. Try to avoid using or in the WHERE clause to connect to the condition. Otherwise, the engine will discard the index and perform full table scanning, for example:
Select ID from t where num = 10 or num = 20
You can query it as follows:
Select ID from t where num = 10
Union all
Select ID from t where num = 20

5. Use in and not in with caution. Otherwise, a full table scan may occur, for example:
Select ID from t where num in (1, 2, 3)
For continuous values, you can use between instead of in:
Select ID from t where num between 1 and 3

6. The following query will also cause a full table scan:
Select ID from t where name like '% ABC %'
To improve efficiency, you can consider full-text search.

7. If a parameter is used in the WHERE clause, a full table scan is performed. Because SQL only parses local variables at run time, but optimizesProgramThe access plan cannot be postponed to runtime; it must be selected during compilation. However, if an access plan is created during compilation, the value of the variable is still unknown and thus cannot be used as an input for index selection. The following statement performs a full table scan:
Select ID from t where num = @ num
You can change it to force query to use the index:
Select ID from T with (index name) where num = @ num

8. Avoid performing expression operations on fields in the WHERE clause as much as possible. This will cause the engine to discard the use of indexes for full table scanning. For example:
Select ID from t where num/2 = 100
Should be changed:
Select ID from t where num = 100*2

9. Avoid performing function operations on fields in the WHERE clause as much as possible, which will cause the engine to stop using the index for full table scanning. For example:
Select ID from t where substring (name, 1, 3) = 'abc' -- id whose name starts with ABC
Select ID from t where datediff (day, createdate, '2017-11-30 ') = 0 -- '2017-11-30' generated ID
Should be changed:
Select ID from t where name like 'abc %'
Select ID from t where createdate> = '2014-11-30 'and createdate <'2014-12-1'

10. do not perform functions, arithmetic operations, or other expression operations on the left side of "=" in the WHERE clause. Otherwise, the system may not be able to correctly use the index.

11. when using an index field as a condition, if the index is a composite index, you must use the first field in the index as the condition to ensure that the system uses the index, otherwise, the index will not be used, and the field order should be consistent with the index order as much as possible.

12. Do not write meaningless queries. If you need to generate an empty table structure:
Select col1, col2 into # T from t where 1 = 0
This typeCodeIf no result set is returned but system resources are consumed, change it to the following:
Create Table # T (...)

13. In many cases, replacing in with exists is a good choice:
Select num from a where num in (select num from B)
Replace the following statement:
Select num from a where exists (select 1 from B where num = A. Num)

14. not all indexes are valid for queries. SQL queries are optimized based on the data in the table. When there is a large number of duplicate data in the index column, SQL queries may not use indexes, for example, if a table contains sex fields, male and female are almost half of each other, indexing sex does not play a role in query efficiency.

15. the more indexes, the better. Although the index can improve the efficiency of the SELECT statement, it also reduces the efficiency of insert and update, because the insert or update statements may recreate the index, therefore, you need to carefully consider how to create an index, depending on the actual situation. It is recommended that the number of indexes in a table be no more than 6. If there are too many indexes, consider whether the indexes on some columns that are not frequently used are necessary.

16. update the clustered index data column should be avoided as much as possible, because the order of the clustered index data column is the physical storage order of the table records. Once the column value changes, the order of the entire table record will be adjusted, it will consume a considerable amount of resources. If the application system needs to frequently update the clustered index data column, consider whether to create the index as a clustered index.

17. use numeric fields whenever possible. If fields containing only numerical information are not designed as numeric fields, this will reduce query and connection performance and increase storage overhead. This is because the engine compares each character in the string one by one during query and connection processing, and only one comparison is required for the number type.

18. try to use varchar/nvarchar instead of Char/nchar, because the first step is to reduce the storage space of the variable-length field, which can save storage space. Secondly, for queries, searching in a relatively small field is obviously more efficient.

19. Do not use select * from t anywhere, replace "*" with a specific field list, and do not return any fields that are not used.

20. Try to use table variables instead of temporary tables. If the table variable contains a large amount of data, note that the index is very limited (only the primary key index ).

21. Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.

22. Temporary tables are not unavailable. Using them appropriately can make some routines more effective. For example, when you need to reference large tables or a data set in common tables repeatedly. However, it is best to use the export table for one-time events.

23. when creating a temporary table, if a large amount of data is inserted at one time, you can use select into instead of create table to avoid creating a large number of logs to increase the speed. If the data volume is small, to ease system table resources, create table first and then insert.

24. if a temporary table is used, you must explicitly delete all temporary tables at the end of the stored procedure. First truncate the table and then drop the table, so that the system table can be locked for a long time.

25. Avoid using a cursor whenever possible, because the efficiency of the cursor is poor. If the cursor operation has more than 10 thousand rows of data, you should consider rewriting.

26. before using the cursor-based or temporary table method, you should first find a set-based solution to solve the problem. The set-based method is generally more effective.

27. Like a temporary table, the cursor is not unavailable. Using a fast_forward cursor for a small dataset is usually better than other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. A routine that includes "sum" in the result set is usually faster than a cursor. If the development time permits, you can try both the cursor-based method and the set-based method to see which method works better.

28. Set nocount on at the beginning of all stored procedures and triggers, and set nocount off at the end. You do not need to send the done_in_proc message to the client after executing each statement of the stored procedure and trigger.

29. Avoid large transaction operations as much as possible to improve the system concurrency capability.

30. Avoid returning a large amount of data to the client as much as possible. If the data volume is too large, consider whether the corresponding requirements are reasonable.

The specific SQL statement needs to be written according to the actual application in many cases, which is not described here.

 

SQL Performance 2

1. logical database and Table Design

The logical design of databases, including the relationship between tables, is the core of optimizing the performance of relational databases. A good logical database design can lay a good foundation for optimizing databases and applications.

Standardized database logic design includes replacing long data tables with multiple and correlated narrow tables. The following are some benefits of using standardized tables.

A: Because tables are narrow, sorting and indexing can be performed more quickly.

B: Because there are multiple tables, it is possible to have multiple indexes.

C: narrower and more compact Indexes

D: Each table can have fewer indexes, so it can improve the speed of insert update Delete and so on, because these operations will have a great impact on system performance when there are many indexes.

E: fewer null values and fewer redundant values increase the database's closeness. Due to standardization, it increases the complexity of the number of referenced tables and the connection relationships during data acquisition. Too many tables and complex connections can reduce the server performance. Therefore, you need to consider both of them comprehensively.

When defining primary keys and Foreign keys with relevant relationships, note that the primary keys used to connect multiple tables must have the same data type as the referenced keys.

2 Index Design

A: Try to avoid table scanning.

Check the WHERE clause of your query statement, because this is an important concern of the optimizer. Each column in The WHERE clause is a possible candidate index. To achieve optimal performance, consider the following example: column1 is given in the WHERE clause.

The following two conditions can improve the query performance of indexes!

First, there is a single index in the column1 column of the table.

Second, there are multiple indexes in the table, but column1 is the first index column.

Avoid defining Multiple indexes while column1 is the second or later index. Such indexes cannot optimize server performance.

For example, the following example uses the pubs database.

Select au_id, au_lname, au_fname from authors
Where au_lname = 'white'

Indexes created on the following columns will be useful to the optimizer.

? Au_lname
? Au_lname, au_fname

The indexes created in the following columns will not play a good role in the optimizer.

? Au_address
? Au_fname, au_lname

Consider using narrow indexes on one or two columns. Narrow indexes are more effective than multi-index and composite indexes. With narrow indexes, there will be more rows and fewer index levels on each page (relative to multi-index and composite index), which will promote system performance.

For multi-column indexes, SQL Server maintains the density statistics (used for Union) on the indexes of all columns and the histogram (column chart) Statistics on the first index. According to the statistical results, if the first index on the composite index is rarely used, the optimizer will not use the index for many query requests.

Useful indexes can improve the performance of select statements, including insert, uodate, and delete.

However, changing the content of a table will affect the index. Each insert, update, and delete statement may degrade the performance. Experiments show that you should not use a large number of indexes on a single table, or use overlapping indexes on shared columns (reference constraints used in multiple tables.

Check the number of unique data in a column and compare it with the number of data rows in the table. This is the data selectivity. This comparison result will help you decide whether to use a column as the selected index column. If necessary, you can create an index. You can use the following query statement to return the number of different values in a column.

Select count (distinct cloumn_name) from table_name
Suppose column_name is a 10000-row table, then the returned values of column_name are used to determine whether to use and what indexes should be used.

Unique values Index

5000 nonclustered Index
20 clustered Index
3 No Index

Selection of secondary indexes and non-secondary Indexes

<1:> the physical order of rows is the same as that of indexes. Page-level and low-level indexes all contain actual data pages. A table can have only one distinct index. Because the update and delete statements require a relatively large number of read operations, secondary indexes can often accelerate such operations. In a table with at least one index, you should have a secondary index.

In the following situations, you can consider using the explain index:

For example, the number of different values in a column is limited (but not very small)

The State names in the customer table are about 50 abbreviated values of different States. You can use the region index.

For example, you can use partial indexes for columns that return values within a certain range, for example, using between,>, >=, <, <= to operate on columns.
Select * from sales where ord_date between '2014/1/93 'and '2014/3/93'
For example, you can use the suffix index for columns that return a large number of results during a query.
Select * From phonebook where last_name = 'Smith'

When a large number of rows are being inserted into the table, avoid creating a secondary index on a column with a natural growth (for example, the identity column) in the table. If you have created an explain index, the insert performance will be greatly reduced. Because each inserted row must be at the end of the table and the last data page of the table.

When a data entry is being inserted (this data page is locked), all other inserted rows must wait until the current insertion ends.

 

The leaf-level page of an index contains the actual data pages, and the order of the data pages on the hard disk is the same as the logic order of the secondary index.

<2:> A non-distinct index means that the physical order of rows is different from the index order. The leaf level of a non-linear index contains a pointer to the row data page.

A table can have multiple non-secondary indexes. You can consider using non-secondary indexes in the following situations.

You can consider using non-linear indexes for columns with many different values.

For example, a part_id is listed in a part table.
Select * from employee where emp_id = 'pcm9809f'

You can use the limit index for the columns of the order by clause in a query statement.

3. query statement Design

By analyzing query statements, the SQL Server optimizer automatically optimizes the query and determines the most effective execution plan. The optimizer analyzes the query statement to determine which clause can be optimized, and selects useful indexes for clauses that can be optimized. Finally, the optimizer compares all possible execution plans and selects the most effective one.

When executing a query, A where clause is used to limit the number of rows that must be processed. Unless necessary, you should avoid unlimited read and processing of all rows in a table.

For example,
Select qty from sales where stor_id = 7131

It is very effective than the unrestricted query below

Select qty from sales

Avoid returning a large number of result sets for the customer's final data. It is more effective to allow SQL Server to run functions that meet its purpose to limit the size of the result set.

This reduces network I/O and improves application performance when multiple users have concurrent connections. The optimizer focuses on the query of the WHERE clause to use useful indexes. Each index in a table may become a candidate index included in the where clause. For the best performance, you can follow the following index for a given column column1.

First, there is a single index in the column1 column of the table.

Second, there are multiple indexes in the table, but column1 is the first index column. Do not use a query statement without the column1 column index in the WHERE clause, avoid using a non-first index with Multiple indexes in the WHERE clause.

In this case, Multiple indexes are useless.

For example, given a multicolumn index on the au_lname, au_fname columns of the authors table in
The pubs database,
The following query statement uses the index on au_lname
Select au_id, au_lname, au_fname from authors
Where au_lname = 'white'
And au_fname = 'johnson'
Select au_id, au_lname, au_fname from authors
Where au_lname = 'white'
The following query does not use an index because it uses a non-first index with Multiple indexes.
Select au_id, au_lname, au_fname from authors
Where au_fname = 'johnson'

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.