Query optimization for millions of rows in SQL Server

A table with millions of rows is not really "big data", but it is exactly the scale at which the performance of ordinary query statements gets put to the test: different ways of writing the same query can differ enormously in performance, and all of it shows at this level. It tests not only your SQL statements but also your thinking as a programmer.
A query interface in our company's system had recently become very slow, with responses taking 6 to 8 seconds or even longer. Investigation showed the problem was on the database side: the query itself was the time sink. The interface joins data from multiple tables: the main table has about 1.5 million rows, the largest child table has more than 7 million associated rows, and the other tables hold between hundreds of thousands and millions of rows. At this data volume the query should respond in milliseconds, not seconds, so it was time to track down the problem.
Because this interface is mainly for information retrieval, the queries are complex and the conditions can be combined in too many ways; stored procedures are too restrictive here, so the query is built by dynamically concatenating SQL statements. The two most common query patterns are: 1. counting the total number of rows; 2. paging through the data. Straight to the code (with some of the conditions):
    SELECT Numb = COUNT(DISTINCT t1.tlntcode)
    FROM ZWOMMAINM0 t1
    INNER JOIN ZWOMMLIBM0 t2 ON t1.tlntcode = t2.tlntcode
    JOIN ZWOMEXPRM0 cp ON t1.tlntcode = cp.tlntcode
    JOIN ZWOMILBSM0 i ON i.tlntcode = t1.tlntcode
    JOIN ZWOMILBSM0 p ON p.tlntcode = i.tlntcode
    JOIN ZWOMILBSM0 l ON l.tlntcode = i.tlntcode
    WHERE ISNULL(t2.deletefg, '0') = '0'
      AND cp.companyn LIKE '%ibm%' AND cp.sequence = 0
      AND i.mlbscode IN ('i0100','i0101','i0102','i0103','i0104','i0105','i0106')
      AND i.locatype = '10'  -- the locatype values for i and p were garbled in the original; '10' matches the surviving l.locatype
      AND p.mlbscode IN ('p0100','p0102','p0104','p0200','p0600')
      AND p.locatype = '10'
      AND l.mlbscode IN ('l030')
      AND l.locatype = '10'
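The second pattern, data paging, follows the same join structure; the article only shows the count query. Before moving on to timings, here is a minimal sketch of what the paging variant could look like (not from the original article), assuming SQL Server 2012+ OFFSET/FETCH and using tlntcode purely for illustration as the ordering column:

    -- Hypothetical paging companion to the count query above:
    SELECT DISTINCT t1.tlntcode
    FROM ZWOMMAINM0 t1
    INNER JOIN ZWOMMLIBM0 t2 ON t1.tlntcode = t2.tlntcode
    JOIN ZWOMEXPRM0 cp ON t1.tlntcode = cp.tlntcode
    WHERE ISNULL(t2.deletefg, '0') = '0'
      AND cp.companyn LIKE '%ibm%' AND cp.sequence = 0
    ORDER BY t1.tlntcode
    OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY;  -- page 2, 20 rows per page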
Back to the count query; view its execution time:
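SQL Server reports parse/compile time and execution time separately; a minimal way to see the numbers yourself is the standard STATISTICS TIME switch (the output lines below are illustrative, with the timings observed in this case):

    SET STATISTICS TIME ON;
    -- run the count query from above ...
    SET STATISTICS TIME OFF;
    -- Output (roughly what was observed here):
    --   SQL Server parse and compile time: elapsed time = 4000 ms.
    --   SQL Server Execution Times:        elapsed time = 700 ms.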
As you can see, of the total query time, parsing and compilation take about 4 seconds while execution takes only 0.7 seconds. Nothing is wrong with the query statement itself, so the problem lies in compilation: if the statement could reuse its previous execution plan, compilation would be skipped and the result would come back in 0.7 seconds. So how do we precompile the statement, or reuse an existing execution plan?
SQL Server has an optimization mechanism that caches the execution plan of every previously executed SQL statement; the cached plans are exposed through the sys.syscacheobjects view. If an incoming SQL statement matches an entry in the cache, the matching plan is executed and no recompilation takes place. The first solution that comes to mind is therefore a stored procedure (this is the "stored procedures are precompiled" fact we all recite in interviews or in theory, usually without any deep impression of it). Yes, a stored procedure would be precompiled, but it is too constrained for a query this complex: if it joins tables whose conditions are not actually used in a given search, performance suffers. Having ruled out stored procedures, the next thing we think of is sp_executesql:
    EXEC sp_executesql @Sql, N'@p NVARCHAR(4000)', @p  -- the declared length was garbled in the original; 4000 is a placeholder
Why can sp_executesql reuse a query plan when an ordinary SQL statement cannot? We can see the answer in the cache table:
    SELECT bucketid, cacheobjtype, objtype, objid, sql, sqlbytes
    FROM sys.syscacheobjects
    WHERE cacheobjtype = 'Compiled Plan'
The sql column in this view holds the statement text behind each cached execution plan; if an incoming statement matches that text, the cached plan is reused. Ordinary SQL statements match poorly, because the cached text contains not only the statement structure but also the literal parameter values, so the reuse rate is very low. With sp_executesql only the parameterized structure is cached and the parameter values are passed separately, so the reuse rate is high. Solution found; straight into action.
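To make the difference concrete, here is a minimal sketch (not from the original article) contrasting an ad hoc statement, whose literal value becomes part of the cached text, with its parameterized equivalent, which produces one reusable cache entry. DBCC FREEPROCCACHE clears the plan cache and is handy for such experiments, but only on a test server:

    -- Ad hoc: each distinct literal produces its own cache entry.
    SELECT COUNT(*) FROM ZWOMEXPRM0 WHERE companyn LIKE '%ibm%';
    SELECT COUNT(*) FROM ZWOMEXPRM0 WHERE companyn LIKE '%sap%';  -- separate plan

    -- Parameterized: one cache entry, reused for any @cpny value.
    EXEC sp_executesql
        N'SELECT COUNT(*) FROM ZWOMEXPRM0 WHERE companyn LIKE ''%'' + @cpny + ''%''',
        N'@cpny NVARCHAR(100)',  -- length chosen for illustration
        @cpny = N'IBM';

    -- Clear the plan cache between experiments (test server only!):
    -- DBCC FREEPROCCACHE;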
    -- Declared lengths were garbled in the original; 100/1000 below are placeholders.
    DECLARE @Sql NVARCHAR(MAX), @cpny NVARCHAR(100) = 'IBM'
    DECLARE @i VARCHAR(1000) = 'i0100,i0101,i0102,i0103,i0104,i0105,i0106,i0107,i0109',
            @p VARCHAR(1000) = 'p0100,p0101,p0102,p0103,p0104,p0107,p0201',
            @l VARCHAR(1000) = 'l030'

    SET @Sql = '
    SELECT value INTO #i FROM F_csplit(@i, '','')
    SELECT value INTO #p FROM F_csplit(@p, '','')
    SELECT value INTO #l FROM F_csplit(@l, '','')

    SELECT Numb = COUNT(DISTINCT t1.tlntcode)
    FROM ZWOMMAINM0 t1
    INNER JOIN ZWOMMLIBM0 t2 ON t1.tlntcode = t2.tlntcode
    JOIN ZWOMILBSM0 i ON i.tlntcode = t1.tlntcode
    JOIN ZWOMILBSM0 p ON p.tlntcode = t1.tlntcode
    JOIN ZWOMILBSM0 l ON l.tlntcode = t1.tlntcode
    JOIN ZWOMEXPRM0 cp ON t1.tlntcode = cp.tlntcode
    WHERE ISNULL(t2.deletefg, ''0'') = ''0''
      AND i.mlbscode IN (SELECT value FROM #i) AND i.locatype = ''10''
      AND p.mlbscode IN (SELECT value FROM #p) AND p.locatype = ''10''
      AND l.mlbscode IN (SELECT value FROM #l) AND l.locatype = ''10''
      AND cp.companyn LIKE ''%'' + @cpny + ''%'' AND cp.sequence = 0'

    EXEC sp_executesql @Sql,
         N'@cpny NVARCHAR(100), @i VARCHAR(1000), @p VARCHAR(1000), @l VARCHAR(1000)',
         @cpny, @i, @p, @l
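F_csplit is the author's own table-valued split function; its definition is not shown in the article. For completeness, here is a minimal sketch of such a function, with the name and signature inferred from the calls above (everything else is assumed):

    -- Hypothetical reconstruction of F_csplit based on how it is called above:
    CREATE FUNCTION F_csplit (@list VARCHAR(MAX), @sep CHAR(1))
    RETURNS @t TABLE (value VARCHAR(100))
    AS
    BEGIN
        DECLARE @pos INT = CHARINDEX(@sep, @list);
        WHILE @pos > 0
        BEGIN
            INSERT INTO @t (value) VALUES (LEFT(@list, @pos - 1));
            SET @list = SUBSTRING(@list, @pos + 1, LEN(@list));
            SET @pos = CHARINDEX(@sep, @list);
        END
        INSERT INTO @t (value) VALUES (@list);
        RETURN;
    END

On SQL Server 2016 and later, the built-in STRING_SPLIT function could serve the same purpose.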
The query now takes about 0.5 seconds, and no matter how the parameters change it stays around 0.5 seconds, which basically meets our requirements. For further gains, the tables could also be partitioned, among other optimizations.
When you find that a query is slow, it may well be that parsing and compilation are taking most of the time. Simplifying the query statement and reusing execution plans can get you out of trouble.