One day my colleague threw me a seemingly complex query (the actual design of two tables, set to set off) said just changed the date condition, but a query 5 seconds out of the data, a fundamental can not find out. Now tidy up the solution process, and the knowledge points involved.
If there are any shortcomings, please understand and welcome the criticism, not very grateful.
Please respect the author's labor results, reproduced please indicate the original link:
Http://www.cnblogs.com/zzry/p/5857751.html
I. Description of the problem
Environment: SQL Server 2008r2
Phenomenon:
Query design to
Ods_table_a daily data 7 million now totals more than 6 billion. Index + partition already established
Mid_table_b Daily data 200,000 total 30 million. Index is not partitioned
When the etldate is ' 2016-08-12 ' and the previous time, this query 5 seconds out of data,
When etldate for ' 2016-08-16 ' and later, this query does not come up with data.
Paste the problem sql: Did the data field processing, for this topic note points in the query because the choice of date causes the query time to become super slow, rather than change the SQL notation such as with temporary table, forced index on.
----------"code Start"
Select COUNT(distinct( Case whenCol_userid3 is NULL ThenCol_userid6End)) as 'AA',COUNT(distinct( Case whenCol_userid3 is NULL andCol_userid7 is not NULL ThenCol_userid6End)) as 'BB',COUNT(distinct( Case whenCol_userid3 is not NULL ThenCol_userid6End)) as 'cc',COUNT(distinct( Case whenCol_userid3 is not NULL andCol_userid7 is not NULL ThenCol_userid6End)) as 'DD',SUM( Case whenCol_userid3 is not NULL ThenEeEnd) as 'ee' from( SelectC.col_userid3,c.ee,g.col_userid6 from ( SelectB.col_userid2 asCOL_USERID3,COUNT(B.COL_USERID2) asee from ( SelectCol_userid asCOL_USERID1,min(eventtime) astime1 fromods_table_awhereEtldate= '2016-08-12' andColid like 'heihei%' Group byCol_userid ) asaJoin ( SelectCol_userid asCol_userid2,eventtime astime2 fromods_table_awhereEtldate= '2016-08-12' andItemId= '1111111111101' andColid like 'haha-%' andColid not like 'haha-skill%' andColid not like 'haha-fine%' ) asb onA.col_userid1=B.col_userid2 andA.time1>b.time2Group byB.col_userid2) asC Right Join ( Select DISTINCTD.col_userid4 asCol_userid6 from ( Select distinctCol_userid asCol_userid4 fromMid_table_bwhereEtldate= '2016-08-12' ) asDJoin ( SelectCol_userid asCOL_USERID5 fromods_table_awhereEtldate= '2016-08-12' andColid like 'heihei%' ) asF onD.col_userid4=f.col_userid5) asg onC.col_userid3=g.col_userid6) asI Left Join( SelectCol_userid asCol_userid7 fromMid_table_bwhereEtldate= '2016-08-12' andIstodaypay= '1') ash onI.col_userid6=H.col_userid7
----------The end of code
Two. Resolution process
1. Take a look at the execution plan for the above code as it seems to need to use the index everywhere. There should be no big problem.
You may notice that the missing index information for the system hint, plus the same effect, does not solve the ' 2016-08-16 ' query slow problem.
2. On the date of modification, the"
All"
etldate= ' 2016-08-12 ' The change into
etldate= ' 2016-08-16 '
take a look at the execution plan:
I'm sorry I didn't come out for half an hour. View estimated execution execution is similar to the above diagram.
reduce the amount of design data plus top 1 I'll look at the execution plan again:
no mapping, even if there's less parallelism than the one above .The initial thought is that the optimizer because the estimated number of rows and other reasons do not choose the degree of parallelism, hurriedly find the code to let it force this way. Find a Song master: Forcing SQL Server execution plan to use parallel elevation performance under complex query statements
http://www.cnblogs.com/CareySon/p/3851113.htmlApart plus keywordsOPTION (Querytraceon 8649)But the application to the actual discovery query efficiency without any improvement, long-lasting results. Later asked the song master (thank Song). He said that some operations can not be parallel, said the update statistics to try first.
a hit!
a hit!
a hit! Execute the following code:
Update STATISTICS ods_table_a
by default, the query optimizer updates the statistics as needed to improve the query plan, but in some cases, you can improve query performance by using update STATISTICS or stored procedure sp_updatestats to update statistics more frequently than the default updates.
new inserted data no statistics, large table Automatic UPDATE STATISTICS trigger Automatic Update mechanism is not enough frequency, preferably updated regularly.
about the
update STATISTICS is not exhausted:
give the relevant technology to attach the link
UPDATE STATISTICS Portal: https://msdn.microsoft.com/zh-cn/library/ms187348.aspx to this problem resolution. Three. Summarize
For large table new inserted data no statistical information, resulting in the above phenomenon, a date for the speed of the watershed (check 12th before 5 seconds out of data, check 12th after the dead or alive not come out. )
The workaround is that large tables automatically update statistics that trigger automatic update mechanisms that are not frequent enough to update regularly.
SQL Tuning--remember that statistics are not updated in time to cause queries to be super slow