What is Sarg: http://www.ddvip.com/mc/db/3618.html)

Source: Internet
Author: User

Definition of Sarg: An operation used to restrict search, because it usually refers to a specific match, a matching within a worthy range or an and connection between two or more conditions. The format is as follows:

Column name operator <constant or variable>

Or

<Constant or variable> the column name of the operator column can appear on one side of the operator, while the constant or variable can appear on the other side of the operator. For example:

Name = 'zhang san'

Price> 5000

5000 <price

Name = 'zhangsan' and price> 5000 if an expression cannot meet the Sarg format, it cannot limit the search range, that is, SQL Server must determine whether each row meets all the conditions in the WHERE clause. Therefore, an index is useless for expressions that do not meet the Sarg format.

If a phase can be used as a scan parameter (Sarg), it is called an Optimized phase and the required data can be quickly obtained using the index.

To sum up the experience of using Sarg and drawing different conclusions from some materials in practice: 1. Whether the like statement belongs to Sarg depends on the type of the wildcard used

For example, name like 'zhang % ', which belongs to Sarg

However, name like '% Zhang' does not belong to Sarg.

The reason is that the wildcard % is enabled in the string so that the index cannot be used.

2. or may cause full table scan.

For example: Name = 'zhang san' and price> 5000 symbol Sarg,

The price of name = 'zhang san' or> 5000 does not match Sarg.

Using or will cause a full table scan.

3. statements not in the Sarg format caused by non-operators and functions

The most typical case for statements that do not meet the requirements of the Sarg format is to include non-operator statements, such as: Not ,! =, <& Gt ;,! <,!> , Not exists, not in, not like, and other functions. Below are several examples that do not meet the Sarg format:

ABS (price) <5000

Name like '% 3'

Some expressions, such:

Where price * 2> 5000

SQL Server will also regard it as Sarg, and SQL server will convert this formula:

Where price> 2500/2

However, we do not recommend this because sometimes SQL Server cannot guarantee that this conversion is completely equivalent to the original expression.

4. The role of in is equal to or.

Statement:

Select * From Table1 where TID in (2, 3)

And

Select * From Table1 where tid = 2 or tid = 3

If they are the same, they will cause a full table scan. If there is an index on the TID, its index will also become invalid.

5. Use not as few as possible

6. The execution efficiency of exists and in is the same.

A lot of data shows that exists is more efficient than in, and not exists should be used as much as possible to replace not in. But as a matter of fact, I tried it and found that the execution efficiency between the two is the same, whether the front is not or not. Because subqueries are involved, we tried to use the pubs database that comes with SQL Server this time. Before running SQL Server, we can open the statistics I/O status of SQL Server.

(1) Select title, price from titles where title_id in

(Select title_id from sales where qty> 30)

The execution result of this sentence is:

Table 'sales '. The scan count is 18, logical reads are 56, physical reads are 0, and pre-read is 0.

Table 'titles '. Scan count 1, logical read 2, physical read 0, pre-read 0.

(2) Select title, price from titles where exists

(Select * from sales where sales. title_id = titles. title_id and qty> 30)

The execution result of the second sentence is:

Table 'sales '. The scan count is 18, logical reads are 56, physical reads are 0, and pre-read is 0.

Table 'titles '. Scan count 1, logical read 2, physical read 0, pre-read 0.

We can see that the execution efficiency of exists and in is the same.

7. The usage efficiency of the charindex () function is the same as that of the like function with the wildcard %.

As mentioned above, if we add wildcard % before like, it will cause a full table scan, so its execution efficiency is low. However, according to some documents, the use of the charindex () function to replace like will greatly improve the speed. After my experiments, I found this explanation is also incorrect:

Select GID, title, fariqi, reader from tgongwen

Where charindex ('criminal investigation detachment ', Reader)> 0 and fariqi> '2017-5-5'

Time: 7 seconds, 4 scans, 7155 logical reads, 0 physical reads, and 0 pre-reads.

Select GID, title, fariqi, reader from tgongwen

Where reader like '%' + 'criminal investigation detachment '+' % 'and fariqi> '2017-5-5'

Time: 7 seconds, 4 scans, 7155 logical reads, 0 physical reads, and 0 pre-reads.

8. Union is not necessarily more efficient than or.

We have already mentioned that using or in the WHERE clause will cause full table scanning. In general, all the materials I have seen are recommended to use Union here to replace or. Facts have proved that most of these statements are applicable.

Select GID, fariqi, neibuyonghu, reader, title from tgongwen

Where fariqi = '2017-9-16 'or GID> 2004

Time: 68 seconds. 1 scan count, 404008 logical reads, 283 physical reads, and 392163 pre-reads.

Select GID, fariqi, neibuyonghu, reader, title from tgongwen

Where fariqi = '2017-9-16'

Union

Select GID, fariqi, neibuyonghu, reader, title from tgongwen where GID> 9990000

Time: 9 seconds. The number of scans is 8, the number of logical reads is 67489, the number of physical reads is 216, and the number of pre-reads is 7499.

In general, union is much more efficient than or.

However, after the experiment, I found that if the query columns on both sides of or are the same, the execution speed of Union is much lower than that of or. Although union scans indexes here, or scans the entire table.

Select GID, fariqi, neibuyonghu, reader, title from tgongwen

Where fariqi = '2017-9-16 'or fariqi = '2017-2-5'

Time: 6423 milliseconds. Scan count 2, logical read 14726, physical read 1, pre-read 7176.

Select GID, fariqi, neibuyonghu, reader, title from tgongwen

Where fariqi = '2017-9-16'

Union

Select GID, fariqi, neibuyonghu, reader, title from tgongwen

Where fariqi = '2014-2-5'

Time: 11640 milliseconds. The number of scans is 8, the number of logical reads is 14806, the number of physical reads is 108, and the number of pre-reads is 1144.

9. Field extraction should follow the principle of "How much is required and how much is requested" to avoid "select *"

Let's do a test:

Select top 10000 GID, fariqi, reader, title from tgongwen order by GID DESC

Time: 4673 milliseconds

Select top 10000 GID, fariqi, title from tgongwen order by GID DESC

Time: 1376 milliseconds

Select top 10000 GID, fariqi from tgongwen order by GID DESC

Time: 80 ms

From this point of view, each time we extract one field less, the data extraction speed will be correspondingly improved. The speed of improvement depends on the size of the discarded field.

10. Count (*) is no slower than count (field ).

In some documents, * is used to count all columns, which is obviously less efficient than column names in a world. This statement is actually unfounded. Let's take a look:

Select count (*) from tgongwen

Time: 1500 milliseconds

Select count (GID) from tgongwen

Time: 1483 milliseconds

Select count (fariqi) from tgongwen

Time: 3140 milliseconds

Select count (title) from tgongwen

Time: 52050 milliseconds

From the above, we can see that if Count (*) and count (primary key) are used at the same speed, while count (*) it is faster than other fields except the primary key. The longer the field, the slower the summary speed. I think, if Count (*) is used, SQL Server may automatically search for the smallest field for summary. Of course, if you write the count (primary key) directly, it will be more direct.

11. Order by sorting by clustered index columns with the highest efficiency

Let's look: (GID is the primary key, and fariqi is the aggregate index column)

Select top 10000 GID, fariqi, reader, title from tgongwen

Time: 196 milliseconds. Scan count 1, logical read 289, physical read 1, and pre-read 1527.

Select top 10000 GID, fariqi, reader, title from tgongwen order by GID ASC

Time: 4720 milliseconds. 1 scan count, 41956 logical reads, 0 physical reads, and 1287 pre-reads.

Select top 10000 GID, fariqi, reader, title from tgongwen order by GID DESC

Time: 4736 milliseconds. 1 scan count, 55350 logical reads, 10 physical reads, and 775 pre-reads.

Select top 10000 GID, fariqi, reader, title from tgongwen order by fariqi ASC

Time: 173 milliseconds. Scan count 1, logical read 290, physical read 0, pre-read 0.

Select top 10000 GID, fariqi, reader, title from tgongwen order by fariqi DESC

Time: 156 milliseconds. Scan count 1, logical read 289, physical read 0, pre-read 0.

From the above, we can see that the speed of not sorting and the number of logical reads are equivalent to the speed of "order by clustered index columns, however, these queries are much faster than those of "order by non-clustered index columns.

At the same time, when sorting by a field, whether in positive or reverse order, the speed is basically equivalent.

12. Efficient top

In fact, when querying and extracting ultra-large data sets, the biggest factor affecting the database response time is not data search, but physical I/0 operations. For example:

Select top 10 * from (

Select top 10000 GID, fariqi, title from tgongwen

Where neibuyonghu = 'Office 'order by gid desc) as

Order by GID ASC

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.