Paging optimization in SQL Server and the problems with ROW_NUMBER() paging


Recently a project site reported that, while server CPU usage was high, our event query page was very slow: fetching a few records took more than 4 minutes, and the second page took just as long. That is clearly unacceptable, so we had the on-site staff capture the statement with SQL Server Profiler.

Paging with ROW_NUMBER()

Let's look at the paging statement captured on site:

    select top 20 a.*, ag.Name as AgentServerName, d.Name as MgrObjTypeName, l.UserName as UserName
    from EventLog as a
    left join MgrObj as b on a.MgrObjId = b.Id and a.AgentBm = b.AgentBm
    left join AddrNode as c on b.AddrId = c.Id
    left join MgrObjType as d on b.MgrObjTypeId = d.Id
    left join EventDir as e on a.EventBm = e.Bm
    left join AgentServer as ag on a.AgentBm = ag.AgentBm
    left join LoginUser as l on a.CfmOper = l.LoginGuid
    where a.OrderNo not in (
            -- rows that precede the current page (top 0 because this is page 1)
            select top 0 OrderNo
            from EventLog as a
            left join MgrObj as b on a.MgrObjId = b.Id
            left join AddrNode as c on b.AddrId = c.Id
            where 1=1
              and a.AlarmTime >= '2014-12-01 00:00:00' and a.AlarmTime <= '2014-12-26 23:59:59'
              and b.AddrId in ('02109000',......,'02109002')
            order by AlarmTime desc)
      and 1=1
      and a.AlarmTime >= '2014-12-01 00:00:00' and a.AlarmTime <= '2014-12-26 23:59:59'
      and b.AddrId in ('02109000',......,'02109002')
    order by AlarmTime desc

This is the typical double-TOP paging pattern. The principle: first select the pageSize*(pageIndex-1) records that precede the current page (call this set T1), then take the TOP pageSize records that are not in T1; those are the records of the current page. This kind of query is inefficient mainly because it uses NOT IN. As I mentioned in my previous article, "How the program ape solves SQL Server's cpu100%": "The index is useless for an expression that does not use a SARG operator."
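
The principle can be sketched in parameterized form. This is only an illustration, not the captured statement: @PageIndex and @PageSize are hypothetical parameters, the joins and filters are left out, and OrderNo is assumed to be a unique, non-null key of EventLog.

```sql
-- Sketch of double-TOP paging. @PageIndex / @PageSize are hypothetical;
-- OrderNo is assumed to be unique and NOT NULL.
DECLARE @PageIndex int = 3;    -- 1-based page number
DECLARE @PageSize  int = 20;

SELECT TOP (@PageSize) a.*
FROM dbo.EventLog AS a
WHERE a.OrderNo NOT IN (
          -- T1: the pageSize * (pageIndex - 1) rows before the current page
          SELECT TOP ((@PageIndex - 1) * @PageSize) OrderNo
          FROM dbo.EventLog
          ORDER BY AlarmTime DESC
      )
ORDER BY a.AlarmTime DESC;
```

The inner TOP grows with the page number, so the NOT IN list gets longer for every later page.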

Then I switched to ROW_NUMBER paging:

    with cte as (
        select a.*, ag.Name as AgentServerName, d.Name as MgrObjTypeName, l.UserName as UserName, b.AddrId,
               row_number() over (order by AlarmTime desc) as RowNo
        from EventLog as a with (forceseek)
        left join MgrObj as b on a.MgrObjId = b.Id and a.AgentBm = b.AgentBm
        left join AddrNode as c on b.AddrId = c.Id
        left join MgrObjType as d on b.MgrObjTypeId = d.Id
        left join EventDir as e on a.EventBm = e.Bm
        left join AgentServer as ag on a.AgentBm = ag.AgentBm
        left join LoginUser as l on a.CfmOper = l.LoginGuid
        where a.AlarmTime >= '2014-12-01 00:00:00' and a.AlarmTime <= '2014-12-26 23:59:59'
          and b.AddrId in ('02109000',......,'02109002')
    )
    select * from cte where RowNo between 1 and 20;

The execution time dropped from 14 seconds to 5 seconds, which shows that ROW_NUMBER paging is more efficient, and it is also far more elegant than double-TOP paging.
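
For comparison with the double-TOP form, here is a minimal parameterized sketch of ROW_NUMBER paging. @PageIndex and @PageSize are hypothetical parameters, and the joins and filters of the real statement are omitted.

```sql
-- Sketch of ROW_NUMBER paging. @PageIndex / @PageSize are hypothetical.
DECLARE @PageIndex int = 3;    -- 1-based page number
DECLARE @PageSize  int = 20;

WITH cte AS (
    SELECT a.*,
           ROW_NUMBER() OVER (ORDER BY a.AlarmTime DESC) AS RowNo
    FROM dbo.EventLog AS a
)
SELECT *
FROM cte
-- first and last row number of the requested page
WHERE RowNo BETWEEN (@PageIndex - 1) * @PageSize + 1
                AND @PageIndex * @PageSize;
```

With @PageIndex = 1000 and @PageSize = 20 the range is 19981 to 20000.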

"Cheating" the query engine so that it runs the query the way you want

But why does it take 5 seconds to fetch 20 records, especially when the table has an index on the time column (see the index mentioned in "How the program ape solves SQL Server's cpu100%")?

I tried removing the condition AND b.AddrId in ('02109000',……,'02109002'). Without it, all 538 records came back in less than 1 second; with the location filter in place, the result is 204 rows. Neither result set is large, so why is the time so different? Looking at the execution plan, I found that the query was using a different index instead of the time index.

I posted the question in a SQL Server group, and soon Takakuwa replied: to get the same speed as without the location filter, use AddrId+'' in the IN condition.

What did that mean? For a moment I did not understand; had Takakuwa not read my statement? Soon someone else added that the point was to "cheat" the query engine. "Cheat"? I still did not understand, but I did as I was told: I copied the CTE statement above unchanged, replaced AND b.AddrId in ('02109000',……,'02109002') with AND b.AddrId+'' in ('02109000',……,'02109002'), and ran it. It finished in less than 1 second! Comparing the execution plans, the query was indeed using the time index.

Later I remembered a principle of query optimization: if a condition applies an operator or a function to a column, the query engine gives up optimizing that condition and falls back to a scan. Then it dawned on me. Before b.AddrId+'' was used, the query engine tried to bring the MgrObj table into its optimization, and joining the two tables greatly inflated the estimated number of records. With b.AddrId+'', the query engine first filters the records through the time index and only then applies the IN condition. In effect, the CTE is forced to run first and the IN condition is applied to its result, instead of the IN filter being evaluated inside the CTE. I see! Sometimes an over-eager query engine produces the opposite of the intended effect, and if you know the principles of optimization, a few small tricks let you make the query engine optimize the way you want.
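
A minimal contrast of the two predicate forms, as a sketch only. It assumes AddrId is a character column (the article compares it with string literals) and uses just the two list values that are visible in the article.

```sql
-- Form 1: plain column predicate. The optimizer can use an index and
-- statistics on MgrObj.AddrId and may build the plan around this filter.
SELECT b.Id
FROM dbo.MgrObj AS b
WHERE b.AddrId IN ('02109000', '02109002');

-- Form 2: the column is wrapped in an expression (concatenation with '').
-- The same rows qualify, but the predicate is no longer a search argument,
-- so it cannot drive an index seek on AddrId and is applied as a filter.
SELECT b.Id
FROM dbo.MgrObj AS b
WHERE b.AddrId + '' IN ('02109000', '02109002');
```

Whether the second form is faster depends on the rest of the plan; in the article it helped because the time index on EventLog then drove the query.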

ROW_NUMBER() paging when the page number is large

That was not the end of it. Later a colleague reported that the query stalled again on later pages. What? I re-ran the statement above with the time range set to 2011-12-01 through 2014-12-26 and the row range set to 19981 through 20000. Sure enough, the query took about 30 seconds. The execution plan was the same, so why?

Takakuwa suspected that too many key lookups were the cause and suggested paging out the RIDs first and doing the key lookup afterwards. I did not know what that meant, so I looked at the execution plan and the IO statistics.
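
One possible reading of that suggestion, as a rough sketch: number only the key column first, so the numbering can be served from a narrow index, then fetch the full rows for the 20 keys of the requested page. This assumes OrderNo is a unique key of EventLog; the CTE name PageKeys is hypothetical, and the joins and location filter are omitted.

```sql
-- Sketch: page the keys first, look up the wide rows afterwards.
-- Assumes OrderNo uniquely identifies a row in EventLog.
WITH PageKeys AS (
    SELECT a.OrderNo,
           ROW_NUMBER() OVER (ORDER BY a.AlarmTime DESC) AS RowNo
    FROM dbo.EventLog AS a
    WHERE a.AlarmTime >= '2011-12-01 00:00:00'
      AND a.AlarmTime <= '2014-12-26 23:59:59'
)
SELECT e.*, k.RowNo
FROM PageKeys AS k
JOIN dbo.EventLog AS e ON e.OrderNo = k.OrderNo   -- lookup for one page only
WHERE k.RowNo BETWEEN 19981 AND 20000
ORDER BY k.RowNo;
```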

Looking at the IO, it is clear that the further back the page is, the more pages are read from the other joined tables. My guess is that with ROW_NUMBER paging, if the query joins tables, every record that sorts ahead of the returned rows has to take part in the joins. Later pages therefore get slower and slower, because more of the joined tables has to be scanned.
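
The per-table IO figures quoted in this article are the kind of output SQL Server prints when STATISTICS IO is switched on for the session. A minimal sketch; the query in the middle is just a placeholder for the paging statement.

```sql
-- Print per-table read counts and timings to the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder: run the paging query under test here.
SELECT COUNT(*) FROM dbo.EventLog;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```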

Is there no way out? This time Song Sang stepped up heroically: "Add a FORCESEEK hint after the table and that should get you through." It sounded like music from heaven, so I tried it immediately.

Use the FORCESEEK hint to force the table to use an index seek

I looked up the following:

The FORCESEEK hint, introduced in SQL Server 2008, can be used to replace index scans with index seeks.

So what happens when the hint is added to the EventLog table?

Sure enough, the query plan changed, and with the hint in place the plan reported a missing index with included columns. I added it quickly, and with that the query time dropped to 18 seconds. Progress! But the IO, as above, had not gone down. Still, I had learned a new skill, and Song Sang was kind enough to keep helping in the evening.
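
The article does not say which columns the missing-index suggestion listed. As a rough sketch only, here is the hint together with an index that has included columns; the index name and the INCLUDE list are hypothetical and should be taken from the actual execution plan.

```sql
-- Hypothetical index: the name and the included columns are illustrative.
-- Use the columns that the plan's missing-index suggestion actually lists.
CREATE NONCLUSTERED INDEX IX_EventLog_AlarmTime
ON dbo.EventLog (AlarmTime DESC)
INCLUDE (MgrObjId, AgentBm);

-- FORCESEEK asks the optimizer to reach the table through an index seek.
SELECT TOP (20) a.OrderNo, a.AlarmTime
FROM dbo.EventLog AS a WITH (FORCESEEK)
WHERE a.AlarmTime >= '2014-12-01 00:00:00'
  AND a.AlarmTime <= '2014-12-26 23:59:59'
ORDER BY a.AlarmTime DESC;
```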

Move tables that do not take part in the WHERE clause outside the CTE

Given the IO above, someone soon suggested moving the other left join tables outside the CTE. That is, every table except EventLog, MgrObj and AddrNode is moved outside. The statement is as follows:

    with cte as (
        select a.*, b.AddrId, b.Name as MgrObjName, b.MgrObjTypeId,
               row_number() over (order by AlarmTime desc) as RowNo
        from EventLog as a
        left join MgrObj as b on a.MgrObjId = b.Id and a.AgentBm = b.AgentBm
        left join AddrNode as c on b.AddrId = c.Id
        where a.AlarmTime >= '2011-12-01 00:00:00' and a.AlarmTime <= '2014-12-26 23:59:59'
          and b.AddrId+'' in ('02109000',...,'02109002')
    )
    -- lookup tables are joined only after the page has been selected
    select a.*, ag.Name as AgentServerName, d.Name as MgrObjTypeName, l.UserName as UserName
    from cte a
    left join EventDir as e on a.EventBm = e.Bm
    left join MgrObjType as d on a.MgrObjTypeId = d.Id
    left join AgentServer as ag on a.AgentBm = ag.AgentBm
    left join LoginUser as l on a.CfmOper = l.LoginGuid
    where RowNo between 19981 and 20000;

It worked: the IO was greatly reduced and the query time dropped to 16 seconds.

    Table 'LoginUser'. Scan count 1, logical reads 63, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'AgentServer'. Scan count 1, logical reads 1617, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'MgrObjType'. Scan count 1, logical reads 126, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'EventDir'. Scan count 1, logical reads 42, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'AddrNode'. Scan count 1, logical reads 119997, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'EventLog'. Scan count 1, logical reads 5027, physical reads 3, read-ahead reads 5024, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'MgrObj'. Scan count 1, logical reads 24, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

We can see that the number of reads on the AddrNode table is still very large, so there is room for improvement. This time I thought of first joining the three tables AddrNode, MgrObj and MgrObjType and putting the result into a temporary table, then doing an inner join between EventLog and that table, and finally left joining the result with the other tables. This also reduces IO.

Use a temporary table for the table joins to reduce IO

    if object_id('tmpMgrObj') is not null drop table tmpMgrObj

    -- pre-filter the managed objects by location once, into a small table
    select m.Id, AddrId, MgrObjTypeId, AgentBm, m.Name, a.Name as AddrName
    into tmpMgrObj
    from dbo.MgrObj m
    inner join dbo.AddrNode a on a.Id = m.AddrId
    where AddrId in ('02109000',......,'02109002');

    with cte as (
        select a.*, b.AddrId, b.MgrObjTypeId,
               row_number() over (order by AlarmTime desc) as RowNo,
               ag.Name as AgentServerName, d.Name as MgrObjTypeName, l.UserName as UserName
        from EventLog as a
        inner join tmpMgrObj as b on a.MgrObjId = b.Id and a.AgentBm = b.AgentBm
        left join MgrObjType as d on b.MgrObjTypeId = d.Id
        left join AgentServer as ag on a.AgentBm = ag.AgentBm
        left join LoginUser as l on a.CfmOper = l.LoginGuid
        where AlarmTime > '2011-12-01 00:00:00' and AlarmTime <= '2014-12-26 23:59:59'
    )
    select * from cte where RowNo between 19981 and 20000;

    if object_id('tmpMgrObj') is not null drop table tmpMgrObj

This query took only 10 seconds. Let's take a look at IO:

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'MgrObj'. Scan count 1, logical reads 24, physical reads 2, read-ahead reads 23, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'AddrNode'. Scan count 1, logical reads 6, physical reads 3, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    ----------
    Table 'LoginUser'. Scan count 0, logical reads 24, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'EventLog'. Scan count 93, logical reads 32773, physical reads 515, read-ahead reads 1536, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'tmpMgrObj'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'MgrObjType'. Scan count 1, logical reads 6, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'AgentServer'. Scan count 1, logical reads 77, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Apart from EventLog, the IO of the other tables is greatly reduced, isn't it?

The difference between inner join and left join

However, after a few more tests I found that the statement above still had a problem: querying the first page also took 5 seconds, and a query limited to the current month took close to 5 seconds as well. Why? This time Song Sang helped again and provided another SQL statement, which returned the first few pages in 1 second and did not change much on later pages. I compared the two statements carefully: I had used inner join, and Song Sang had used left join. What is the difference between the two? After comparing the query plans, I found that with inner join the query engine executes the join first rather than the subquery, whereas with left join it executes the subquery first. So with inner join the time index is not used effectively when querying one month of data. In the end I arrived at the statement below. Queries for the latest data or for the first few pages come back in about 1 second, and queries for later pages take about 10 seconds, which basically solves the problem.

    if object_id('tmpMgrObj') is not null drop table tmpMgrObj

    -- pre-join the three lookup tables and filter by location once
    select m.Id, AddrId, MgrObjTypeId, AgentBm, m.Name, a.Name as AddrName, t.Name as MgrObjTypeName
    into tmpMgrObj
    from dbo.MgrObj m
    inner join dbo.AddrNode a on a.Id = m.AddrId
    inner join dbo.MgrObjType t on m.MgrObjTypeId = t.Id
    where AddrId+'' in ('02109000',......,'02109002');

    select tmp.*, ag.Name as AgentServerName, l.UserName as UserName
    from (
        select a.*, b.MgrObjTypeName, b.AddrId,
               row_number() over (order by AlarmTime desc) as RowNo
        from (
            -- the time filter runs first, so the time index can be used
            select * from EventLog
            where AlarmTime >= '2011-12-01 00:00:00' and AlarmTime <= '2014-12-26 23:59:59'
        ) as a
        left join tmpMgrObj as b on a.MgrObjId = b.Id and a.AgentBm = b.AgentBm
    ) tmp
    left join EventDir as e on tmp.EventBm = e.Bm
    left join AgentServer as ag on tmp.AgentBm = ag.AgentBm
    left join LoginUser as l on tmp.CfmOper = l.LoginGuid
    where tmp.RowNo between 1 and 20;

    if object_id('tmpMgrObj') is not null drop table tmpMgrObj

Other optimization references

In a discussion in another group, I found that ROW_NUMBER paging getting slower and slower on later pages really does trouble a lot of people.

Some people asked: who would be so bored as to page thousands of pages in? I thought so too at first, but after talking with others I found that the scenario really exists: our software provides a "last page" function, and the result ... Of course, one approach is to remove the "last page" function when designing the software. Another idea is to query in reverse order once the requested page is past the halfway point, so that a query for the last page is actually a query for the first page.
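
The reverse-order idea can be sketched as follows. This is an illustration under assumptions, not a tested solution: the variables are hypothetical, the joins are omitted, and OrderNo is assumed to be unique so that it can break ties in AlarmTime and make the ascending order an exact mirror of the descending one.

```sql
-- Sketch: serve pages in the second half by numbering from the other end.
DECLARE @PageIndex int = 1000;
DECLARE @PageSize  int = 20;
DECLARE @From datetime = '2011-12-01 00:00:00';
DECLARE @To   datetime = '2014-12-26 23:59:59';
DECLARE @Total int;

SELECT @Total = COUNT(*)
FROM dbo.EventLog
WHERE AlarmTime >= @From AND AlarmTime <= @To;

-- Row range of the page in the normal (descending) order.
DECLARE @First int = (@PageIndex - 1) * @PageSize + 1;
DECLARE @Last  int = CASE WHEN @PageIndex * @PageSize > @Total
                          THEN @Total ELSE @PageIndex * @PageSize END;

IF @First > @Total / 2
    -- Row r in descending order is row (@Total - r + 1) in ascending order.
    SELECT t.*
    FROM (SELECT a.*,
                 ROW_NUMBER() OVER (ORDER BY a.AlarmTime ASC, a.OrderNo ASC) AS RevNo
          FROM dbo.EventLog AS a
          WHERE a.AlarmTime >= @From AND a.AlarmTime <= @To) AS t
    WHERE t.RevNo BETWEEN @Total - @Last + 1 AND @Total - @First + 1
    ORDER BY t.RevNo DESC;          -- present the page in descending order
ELSE
    SELECT t.*
    FROM (SELECT a.*,
                 ROW_NUMBER() OVER (ORDER BY a.AlarmTime DESC, a.OrderNo DESC) AS RowNo
          FROM dbo.EventLog AS a
          WHERE a.AlarmTime >= @From AND a.AlarmTime <= @To) AS t
    WHERE t.RowNo BETWEEN @First AND @Last
    ORDER BY t.RowNo;
```

The extra COUNT(*) has its own cost, so whether this pays off depends on how cheaply the total can be obtained.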

Others suggested putting the query results into a temporary table that has an indexed auto-increment ID column, so that records can be filtered quickly by that identity ID. This is a method I intend to try later. However, it has problems too: it cannot be made generic, because a temporary table has to be built for each table. In addition, for large result sets, inserting that many records is slow because of the index, and doing so on every query would probably strain the CPU as well. Still, it is an idea.

Do you have any good suggestions? Feel free to put your ideas in the comments so we can discuss them together.
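
A rough sketch of that idea, not something the article tested. The temporary table #EventPage and its column RowId are hypothetical names, OrderNo is assumed to be a unique int key of EventLog, and the joins and location filter are omitted.

```sql
-- Sketch: materialize the sorted keys once, then page by identity value.
CREATE TABLE #EventPage (
    RowId   int IDENTITY(1,1) PRIMARY KEY,  -- indexed auto-increment ID
    OrderNo int NOT NULL                    -- assumed type of the EventLog key
);

-- With INSERT ... SELECT ... ORDER BY, identity values follow the ORDER BY.
INSERT INTO #EventPage (OrderNo)
SELECT a.OrderNo
FROM dbo.EventLog AS a
WHERE a.AlarmTime >= '2011-12-01 00:00:00'
  AND a.AlarmTime <= '2014-12-26 23:59:59'
ORDER BY a.AlarmTime DESC;

-- Any page is now a range seek on RowId plus 20 lookups into EventLog.
SELECT e.*
FROM #EventPage AS p
JOIN dbo.EventLog AS e ON e.OrderNo = p.OrderNo
WHERE p.RowId BETWEEN 19981 AND 20000
ORDER BY p.RowId;

DROP TABLE #EventPage;
```

The insert is the expensive step, which matches the concern above: it only pays off if the temporary table is reused for several page requests.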

Summary

Now, let's summarize what we learned in this optimization process:

    • In SQL Server, ROW_NUMBER paging is probably the most efficient approach, and it is compatible with SQL Server 2005 and later
    • You can steer part of the query engine's optimization by "cheating" the query engine
    • ROW_NUMBER paging has a performance problem at large page numbers, which can be mitigated with a few tricks
      • Use the index as much as possible through the CTE
      • Move tables that do not take part in the WHERE clause outside the paging CTE
      • If too many tables take part in the WHERE clause, consider putting the tables that do not take part in paging into a temporary table to reduce IO
    • inner join is executed before the subquery, whereas left join is not
    • with (forceseek) can force the query to use an index seek
