The hidden costs of insert Exec
Insert and Exec: relational brothers in arms, helping you get your data and
Then put it somewhere. But like all brothers, their relationship has its ups and
Downs and sometimes you need to look just below the surface to see the real
Issues. In this post I will cover hidden penalties that a demo--- seemingly
Purely aesthetic -- can bring to the fore.
To perform strate the first of these examples I will be using the following
Table:
USE tempdb
GO
CREATE TABLE dbo.MyTable
(
[SalesOrderID] [int] NOT NULL,
[SalesOrderDetailID] [int] NOT NULL,
[CarrierTrackingNumber] [nvarchar](25) NULL,
[OrderQty] [smallint] NOT NULL,
[ProductID] [int] NOT NULL,
[SpecialOfferID] [int] NOT NULL,
[UnitPrice] [money] NOT NULL,
[UnitPriceDiscount] [money] NOT NULL,
[LineTotal] [money] NOT NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL
)
GO
Consider the following queries, disregarding the fact that this isn't a good
Dynamic SQL example (I'm trying to keep it simple), and take a moment to think
About the differences between them from a behavioral point of view:
EXEC
('
INSERT dbo.MyTable
SELECT *
FROM AdventureWorks.Sales.SalesOrderDetail
')
GO
INSERT dbo.MyTable
EXEC
('
SELECT *
FROM AdventureWorks.Sales.SalesOrderDetail
')
GO
It's quite common to insert some data into a table from dynamic SQL, and
As long as I can remember, I 've believed the choice between these two forms
The query to be primarily an issue of style. My preference has been for
Latter style, as I feel that it's a bit more readable. It also has the benefit
Of better supporting ownership chains, but that's something that, in most cases,
We don't have to worry about any longer in SQL Server 2005 or 2008. But let's
Move beyond style and potential security issues and get to the substance. What
Actually happens when we run these two queries?
In the first case -- insert done inside of the exec -- the first step is that
Outer statement -- the exec itself -- is parsed and validated. Now we jump down one
Level of context, into the string that was passed to exec, And that string is
Parsed and validated. The referenced objects are resolved, and the plan cache is
Checked. If there is no plan, the query is compiled. And then it's executed.
Data is streamed from salesorderdetail into mytable and when the query inside
The exec is done control returns to the calling context -- exec -- and assuming
There were no errors, the exec is now completed.
In the second case, something quite a bit different occurs. The insert is
Parsed and validated, and an insert exec plan is generated. This plan does not
Involve salesorderdetail, as the query optimizer doesn't know at this point in
The process where the data will be inserted from, since that step is dynamic. So
The plan references something called the "parameter table" (Plan truncated
Simplicity ):
|--Table Insert(OBJECT:([tempdb].[dbo].[MyTable]))
|--Top(ROWCOUNT est 0)
|--Parameter Table Scan
Once this plan has been generated, we again jump down one level of context,
And the inner string is parsed and validated, the referenced object resolved,
The cache checked, and the compiled plan executed. But what is this parameter
Table thing?
Here's where things start really diverging. Data is not, at this point,
Streamed from salesorderdetail directly into mytable. Rather, it is streamed
From salesorderdetail into the parameter table. And the parameter table, as it
Turns out, is in actuality a hidden temporary table. And not until all of
Data has streamed into that hidden temporary table is control resumed by
Exec context, and only then does the data start moving into its final home.
The natural question you might be asking yourself at this point is, just how
Much overhead does this parameter table introduce into the equation? The answer
Might just send you racing to check your stored procedures: the additional cost
For the parameter table is well over 100% as compared with doing the insert
Inside of the exec. The fact that all of the data must be spooled to
Parameter table before the insert can even begin tells us that this must be
True, and it can be verified using a simple check against
SYS. dm_exec_requests DMV, as in the following example:
EXEC
('
INSERT dbo.MyTable
SELECT *
FROM AdventureWorks.Sales.SalesOrderDetail
')
SELECT
r.writes AS [INSERT inside EXEC writes]
FROM sys.dm_exec_requests r
WHERE
r.session_id = @@SPID
GO
INSERT dbo.MyTable
EXEC
('
SELECT *
FROM AdventureWorks.Sales.SalesOrderDetail
')
SELECT
r.writes AS [INSERT EXEC writes]
FROM sys.dm_exec_requests r
WHERE
r.session_id = @@SPID
GO
So that's that. We shoshould avoid insert exec and try to do our inserts in
Same context in which the select is running -- right?
Well, yes and no. There is another element at play here which I haven't yet
Mentioned. What if we were only inserting a few rows, and the table we were
Inserting Into looked something like the following:
CREATE TABLE #MyTable
(
[SalesOrderID] [int] NOT NULL,
[SalesOrderDetailID] [int] NOT NULL,
[CarrierTrackingNumber] [nvarchar](25) NULL,
[OrderQty] [smallint] NOT NULL,
[ProductID] [int] NOT NULL,
[SpecialOfferID] [int] NOT NULL,
[UnitPrice] [money] NOT NULL,
[UnitPriceDiscount] [money] NOT NULL,
[LineTotal] [money] NOT NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL
)
GO
The only difference between this table and the previous one is that this is
Temporary table and the other is not. But temporary tables have their own
Interesting little twists, especially when it comes down to one of the key
Enemies in a highly-transactional system: recompilation. As it turns out, doing
The insert inside the exec will cause the internal statement to recompile every
Time a new temp table is encountered. This means that if you have a stored
Procedure that creates a temp table, puts together a bit of dynamic SQL, and
Does an insert inside of that dynamic SQL, You'll now have yourself
Recompilation problem.
To see this already strated, try the following script. here only a single row is
Inserted as a result of the dynamic query, But it's complex enough that
Compile time more than overshadows the overhead of the parameter table:
USE tempdb
GO
DBCC FREEPROCCACHE
GO
CREATE TABLE #AvgTimes
(
CPU_time DECIMAL(19,4) NOT NULL,
insert_type VARCHAR(25) NOT NULL
)
GO
CREATE TABLE #MyTable
(
[SalesOrderID] [int] NOT NULL,
[SalesOrderDetailID] [int] NOT NULL,
[CarrierTrackingNumber] [nvarchar](25) NULL,
[OrderQty] [smallint] NOT NULL,
[ProductID] [int] NOT NULL,
[SpecialOfferID] [int] NOT NULL,
[UnitPrice] [money] NOT NULL,
[UnitPriceDiscount] [money] NOT NULL,
[LineTotal] [money] NOT NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL
)
EXEC
('
INSERT #MyTable
SELECT TOP(1)
sod.*
FROM AdventureWorks.Sales.SalesOrderDetail sod
WHERE
sod.UnitPrice > 10
AND sod.LineTotal > 100
AND EXISTS
(
SELECT *
FROM AdventureWorks.Sales.SalesOrderHeader soh
JOIN AdventureWorks.Person.Contact c ON
c.ContactID = soh.CustomerID
WHERE
soh.SalesOrderID = sod.SalesOrderID
AND c.LastName LIKE ''Ab%''
)
AND EXISTS
(
SELECT *
FROM AdventureWorks.Production.Product p
WHERE
p.ProductID = sod.ProductID
AND p.Color IS NULL
)
AND NOT EXISTS
(
SELECT *
FROM AdventureWorks.Sales.SalesOrderHeader soh
JOIN AdventureWorks.Sales.SalesPerson sp ON
soh.SalesPersonID = sp.SalesPersonID
WHERE
soh.SalesOrderID = sod.SalesOrderID
AND sp.CommissionPct > 50
)
')
INSERT #AvgTimes
SELECT
r.cpu_time,
'INSERT inside EXEC'
FROM sys.dm_exec_requests r
WHERE
r.session_id = @@SPID
DROP TABLE #MyTable
GO 5
CREATE TABLE #MyTable
(
[SalesOrderID] [int] NOT NULL,
[SalesOrderDetailID] [int] NOT NULL,
[CarrierTrackingNumber] [nvarchar](25) NULL,
[OrderQty] [smallint] NOT NULL,
[ProductID] [int] NOT NULL,
[SpecialOfferID] [int] NOT NULL,
[UnitPrice] [money] NOT NULL,
[UnitPriceDiscount] [money] NOT NULL,
[LineTotal] [money] NOT NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL
)
INSERT #MyTable
EXEC
('
SELECT TOP(1)
sod.*
FROM AdventureWorks.Sales.SalesOrderDetail sod
WHERE
sod.UnitPrice > 10
AND sod.LineTotal > 100
AND EXISTS
(
SELECT *
FROM AdventureWorks.Sales.SalesOrderHeader soh
JOIN AdventureWorks.Person.Contact c ON
c.ContactID = soh.CustomerID
WHERE
soh.SalesOrderID = sod.SalesOrderID
AND c.LastName LIKE ''Ab%''
)
AND EXISTS
(
SELECT *
FROM AdventureWorks.Production.Product p
WHERE
p.ProductID = sod.ProductID
AND p.Color IS NULL
)
AND NOT EXISTS
(
SELECT *
FROM AdventureWorks.Sales.SalesOrderHeader soh
JOIN AdventureWorks.Sales.SalesPerson sp ON
soh.SalesPersonID = sp.SalesPersonID
WHERE
soh.SalesOrderID = sod.SalesOrderID
AND sp.CommissionPct > 50
)
')
INSERT #AvgTimes
SELECT
r.cpu_time,
'INSERT EXEC'
FROM sys.dm_exec_requests r
WHERE
r.session_id = @@SPID
DROP TABLE #MyTable
GO 5
SELECT
AVG
(
CASE a.insert_type
WHEN 'INSERT inside EXEC' THEN a.CPU_time
ELSE NULL
END
) AS [CPU time - INSERT inside EXEC],
AVG
(
CASE a.insert_type
WHEN 'INSERT EXEC' THEN a.CPU_time
ELSE NULL
END
) AS [CPU time - INSERT EXEC]
FROM #AvgTimes a
GO
DROP TABLE #AvgTimes
GO
So what have we learned today? The choice between insert exec and insert
Inside of exec is not purely stylistic and has definite Performance
Implications. Here are the rules I'll be following from now on:
- When working with permanent tables, always avoid insert exec if possible.
There are some cases where it won't be possible to avoid. I only showed dynamic
SQL in this post, but insert exec also applies to stored procedures. Can you
Safely rip apart all of the stored procedures in your system in order to avoid
This issue? Maybe not quite as easily as you can rip apart the dynamic SQL
Within stored procedures.
- When working with temporary tables, evaluate the complexity of
Operations, the amount of data to be inserted, and most importantly, test every
Solution! The more rows that are inserted as a result of the insert exec,
More the overhead you'll get from the parameter table. On the flip side,
More complex the dynamic statement, the more overhead you'll get from
Recompilations. Every scenario is bound to be different and you may just learn
Something about your processes by doing this extra bit of analysis.