A database query optimization method for database query optimization

Source: Internet
Author: User

I. Building an index

Microsoft's SQL Server provides two types of indexes: Clustered indexes (clustered index, also called clustered indexes, clustered indexes), and nonclustered indexes (nonclustered index, also called nonclustered indexes, non-clustered indexes).

Clustered Index

The body of our Chinese dictionary is itself a clustered index. For example, we have to check the word "Ann", it will be very natural to open the first few pages of the dictionary, because "ann" Pinyin is "an", and alphabetical order of Chinese characters in the dictionary is the English letter "a" beginning and "Z", then the word "Ann" naturally ranked in the front of the dictionary. If you have turned over all the parts that begin with "a" and still cannot find the word, then it means that you do not have the word in your dictionary, and if you look up the word "Zhang", you will also turn your dictionary into the last part, because the pinyin of "Zhang" is "Zhang". That is, the body part of the dictionary is itself a directory, and you do not need to look up other directories to find what you need to find. We refer to this body of content itself as a directory of certain rules, called a "clustered index."

Nonclustered Indexes

You may encounter the word you do not know, do not understand its pronunciation, at this time, you can not follow the method just find the word you want to check, and need to go to the "radicals" to find the word you are looking for, and then according to the page number after the word directly to a page to find the word you are looking for. But the sort of words you find in combination with the "radicals" and "gept" is not really the sort method of the body, for example, you check the word "Zhang", we can see in the Gept table after the Radicals "Zhang" page number is 672 pages, gept table "Zhang" above is "Chi" word, but the page number is 63 pages, "Zhang" below is "crossbow "Word, page is 390 pages. Obviously, these words are not really in the "Zhang" the word of the upper and lower side, now you see the continuous "Chi, Zhang, crossbow" three words is actually their order in the nonclustered index, is the dictionary body of words in the non-clustered index mapping. We can find the words you need in this way, but it takes two procedures to find the results in the catalog and then turn to the page numbers you need. We put this kind of directory purely as a directory, the body is purely the sort of body is called "nonclustered index".

Further, we can easily understand that each table can have only one clustered index, because the catalog can only be sorted in one way.

1. Using aggregate index is faster than primary key with not aggregate index
Here is the instance statement: (all extracts 250,000 data)
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi= ' 2004-9-16 ' usage time: 3326 ms
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen where gid<=250000 usage time: 4470 ms
Here, the aggregate index is nearly 1/4 faster than the primary key speed that is not an aggregated index.
2, using the aggregate index than the general primary key for the order by when the speed, especially in the case of small data volume
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen order by Fariqi spents: 12936
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen order by GID Spents: 18843
Here, it is 3/10 faster to use the aggregate index than the general primary key for order by. In fact, if the amount of data is very small, it is much faster to use the clustered index as the rank sequence than the non-clustered index, and if the data volume is large, such as more than 100,000, the speed difference between the two is not obvious.
3. Using the time period within the aggregated index, the search time is scaled down by the percentage of the data in the data table, regardless of how many of the aggregated indexes are used
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi> ' 2004-1-1 ' spents: 6343 milliseconds (extract 1 million)
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi> ' 2004-6-6 ' spents: 3170 milliseconds (extract 500,000)
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi= ' 2004-9-16 '
Time: 3326 milliseconds (identical to the result of the previous sentence.) If the number of acquisitions is the same, then the greater than and equals sign are the same)
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi> ' 2004-1-1 ' and fariqi< ' 2004-6-6 ' spents: 3280 ms
4. The date column will not slow down the query speed because there is a minute or seconds input
In the following example, there are 1 million data, 500,000 data after January 1, 2004, but only two different dates, the date is accurate to the day, before the data 500,000, there are 5,000 different dates, the date is accurate to the second.
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi> ' 2004-1-1 ' ORDER by Fariqi Spents: 6390 ms
Select Gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi< ' 2004-1-1 ' ORDER by Fariqi Spents: 6453 ms

5. Appropriate use of the index

Indexes can help improve retrieval performance, but too many or improper indexes can cause system inefficiencies. Because the user adds an index to the table, the database will do more work. Too many indexes can even cause index fragmentation.

II. Optimization of SQL statements

1. Minimize use of SELECT *

Use * to return all fields, but when the database server resolves, the structure of the table is parsed and all the field names of the table are then listed. This increases the time for analysis.

2. Use the alias of the table

When you concatenate multiple tables in an SQL statement, use the alias of the table and prefix the alias to each column, which reduces the parsing time and reduces the syntax errors caused by the column ambiguity.

3. Use less in

In a subquery, the NOT IN clause performs an internal sort and merge. In either case, the not in is the least effective because it performs a full table traversal of the table in the subquery, thus reducing the use of not. To avoid using not, we can change it to an outer join (Outer Joins) or not EXISTS.

4. Connection order in the WHERE clause

SQL Server parses the WHERE clause in a bottom-up order, and according to this principle, the connection between tables must be written before other where conditions, and those conditions that can filter out the maximum number of records must be written at the end of the WHERE clause. For example:

(Low efficiency)

SELECT * from EMP E

WHERE SAL > 50000

and JOB = ' MANAGER '

< (SELECT COUNT (*) from EMP WHERE mgr=e.empno)

Efficient

SELECT * from EMP E

where < (SELECT COUNT (*) from EMP WHERE mgr=e.empno)

and SAL > 50000

and JOB = ' MANAGER '

5. Avoid calculating on index columns

In the WHERE clause, if the index column is part of a function, the optimizer uses a full table scan without using the index.

For example, the following statement is inefficient:

SELECT ... From DEPT WHERE SAL * > 25000

And the following statement will be efficient:

SELECT ... From DEPT WHERE SAL > 25000/12

It is important to note that the index columns are not processed in the query, such as: Trim,substring,convert, and so on.

6, reduce the use of cursors

Any kind of cursor can degrade SQL Server performance. One or more of these below replace the use of cursors: temporary tables, while loops, derived tables, correlated subqueries, case statements, multiple queries. If you cannot avoid using cursors, at least try to increase their speed and find out how to speed up cursors.

A database query optimization method for database query optimization

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.