What is a parameterized query?
A simple way to understand a parameterized query is to think of it as just a T-SQL query that accepts parameters that control what the query returns. By using different parameters, a parameterized query returns different results. To get a parameterized query, you need to write your code in a specific way, or it needs to meet a specific set of criteria.
There are two different ways to create parameterized queries. The first way is to have the query optimizer automatically parameterize your query. Another way is to program a parameterized query by writing your T-SQL code in a specific way and passing it to the sp_executesql system stored procedure. This method is described in the later part of this article.
The key to parameterized queries is that the query optimizer creates a reusable cache plan. By using parameterized queries automatically or programmatically, SQL Server can optimize processing similar to T-SQL statements. This optimization eliminates the need to create a cache plan for each execution of these similar T-SQL statements using noble resources. and by creating a reusable plan, SQL Server also reduces the memory usage required to store similar execution plans in the process cache.
Now let's look at the different ways in which SQL Server creates parameterized queries.
How are parameterized queries created automatically?
The people who write the query optimizer code are doing their best to optimize how SQL Server handles your T-SQL commands. I think this is the origin of the query optimizer name. One way to minimize the resources and maximize the performance of the query optimizer is to look at a T-SQL statement and determine whether they can be parameterized. To understand how this works, let's look at the following T-SQL statements:
SELECT * From AdventureWorks.Sales.SalesOrderHeader WHERE SalesOrderID = 56000; Go |
Here, you can see that this command has two features. First it's simple, second it contains a specified value for the SalesOrderID value in the WHERE predicate. The query optimizer recognizes the simplicity of this query and SalesOrderID has one parameter ("56000"). Therefore, the query optimizer can automatically parameterize this query.
If you use the following SELECT statement to view a clean buffer pool that contains only the cache plan for the above statement, you will see that the query optimizer rewrites a T-SQL query as a parameterized T-SQL statement:
SELECT Stats.execution_count as CNT, P.size_in_bytes as [size], [SQL]. [Text] As [Plan_text] From Sys.dm_exec_cached_plans P OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) SQL JOIN sys.dm_exec_query_stats Stats on stats.plan_handle = P.plan_handle; Go |
When I run this command on an instance of SQL Server 2008, I get the following output (note that the output is reformatted so that it is easier to read):
CNT size Plan_text
--- ------- --------------------------------------------------------------
1 49152 (@1 int) SELECT * from [AdventureWorks]. [Sales]. [SalesOrderHeader]
WHERE [salesorderid]=@1
If you look at the Plan_text field in the output above, you'll see that it's not like the original T-SQL text. As mentioned earlier, the query optimizer has rewritten the query as a parameterized T-SQL statement. Here, you can see that it now has a variable (@1) with a data type (int), which is defined in the previous SELECT statement. Also at the end of the Plan_text, the value "56000" is replaced with the variable @1. Since this T-SQL statement is rewritten and stored as a cache plan, if the future of a T-SQL command is roughly the same as it is, it can be used for reuse only if the SalesOrderID field is assigned a different value. Let's look at it in action.
If I run the following command on my machine:
DBCC Freeproccache; Go SELECT * From AdventureWorks.Sales.SalesOrderHeader WHERE SalesOrderID = 56000; Go SELECT * From AdventureWorks.Sales.SalesOrderHeader WHERE SalesOrderID = 56001; Go SELECT Stats.execution_count as CNT, P.size_in_bytes as [size], [SQL]. [Text] As [Plan_text] From Sys.dm_exec_cached_plans P OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) SQL JOIN sys.dm_exec_query_stats Stats on stats.plan_handle = P.plan_handle; Go I get the following output from the last SELECT statement (note that the output is reformatted so that it is easier to read): CNT size Plan_text --- -------- -------------------------------------------------------------- 2 49152 (@1 int) SELECT * from AdventureWorks]. [Sales]. [SalesOrderHeader] WHERE [salesorderid]=@1 |
Here, I first release the process cache, and then I perform two different, but similar, nonparametric queries to see if the query optimizer will create two different cache plans or create a cache plan for these two queries. Here, you can see that the query optimizer is actually very smart, and it parameterize the first query and caches the plan. Then, when a second query similar to the one with a different SalesOrderID value is sent to SQL Server, the optimizer recognizes that a plan has been cached and then reuses it to process the second query. You can say that because the CNT field now indicates that the plan was used two times.
The database configuration options parameterization can affect how T-SQL statements are automatically parameterized. There are two different settings for this option, simple and forced. When the parameterization setting is set to simple, only the simplest T-SQL statements are parameterized. To introduce this, look at the following command:
SELECT SUM (linetotal) as LineTotal From AdventureWorks.Sales.SalesOrderHeader H JOIN AdventureWorks.Sales.SalesOrderDetail D on d.salesorderid = H.salesorderid WHERE H.salesorderid = 56000 |
This query is similar to my previous example, except that I added an additional join standard here. When the database AdventureWorks parameterization option is set to simple, the query is not automatically parameterized. The simple parameterization setting tells the query optimizer to simply parameterize the query. However, when an parameterization is set to forced, the query is automatically parameterized.
When you set the database option to use force parameterization, the query optimizer attempts to parameterize all queries, not just simple queries. You might think it's good. However, in some cases, when the database settings are parameterization to forced, the query optimizer will select a query plan that is not ideal. When the database setting parameter is forced, it changes the literal constants in the query. This can result in an invalid schedule when the index and indexed views are not selected to participate in the execution plan when the query involves a calculated field. The forced parameterization option may be a good solution for improving the performance of a database with a large number of similar, slightly different queries for passing parameters. An online sales application whose customers perform a number of similar searches for your product, with different product values, may be a good application type to benefit from forced parameterization.
Not all query clauses will be parameterized. For example, the top of the query, Tablesample, having, GROUP by, order BY, OUTPUT ... into or FOR XML clauses are not parameterized.
Use Sp_execute_sql to parameterize your T-SQL
You do not need to rely on the parameterization option of the database to make the query optimizer parameterize a query. You can parameterize your own query. You do this by rewriting your T-SQL statement and using the "sp_executesql" System stored procedure to execute the rewrite statement. As already seen, the SELECT statement, which includes a "JOIN" clause, is not automatically parameterized when the parameterization of the database is set to simple. Let me rewrite this query so that the query optimizer will create a reusable parameterized query execution plan.
To illustrate, let's look at two similar T-SQL statements that are not automatically parameterized and create two different cache execution plans. Then I'll rewrite the two queries so that they all use the same cache parameterization execution plan.
Let's take a look at this code:
DBCC Freeproccache Go SELECT SUM (linetotal) as LineTotal From AdventureWorks.Sales.SalesOrderHeader H JOIN AdventureWorks.Sales.SalesOrderDetail D on d.salesorderid = H.salesorderid WHERE H.salesorderid = 56000 Go SELECT SUM (linetotal) as LineTotal From AdventureWorks.Sales.SalesOrderHeader H JOIN AdventureWorks.Sales.SalesOrderDetail D on d.salesorderid = H.salesorderid WHERE H.salesorderid = 56001 Go SELECT Stats.execution_count as CNT, P.size_in_bytes as [size], Left ([sql].[ Text] as [Plan_text] From Sys.dm_exec_cached_plans P OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) SQL JOIN sys.dm_exec_query_stats stats on stats.plan_handle = P.plan_handle; Go |
Here, I freed up the process cache and then ran the two different, not simple T-SQL statements that contained one join. Then I'll check the cache schedule. This is the output of this SELECT statement that uses the DMV (note that the output is reformatted so that it is easier to read):
cnt ; Size plan_text --- ----------- ------------------------------------------------------------ ------------------- 1 49152 select sum (linetotal) as linetotal from Adventureworks.sales.salesorderheader h Join adventureworks.sales.salesorderdetail d on d.salesorderid = h.salesorderid where h.salesorderid = 56001 1 49152 select sum (linetotal) as linetotal From adventureworks.sales.salesorderheader h Join adventureworks.sales.salesorderdetail d On d.salesorderid = h. SalesOrderID where h.salesorderid = 56000 |
As you can see from this output, the two SELECT statements are not parameterized by the query optimizer. The optimizer created two different cache execution plans, each of which was executed only once. We can help the optimizer create a parameterized execution plan for these two different SELECT statements by using the sp_executesql system stored procedure.
The following code is rewritten to use the sp_executesql system stored procedure:
DBCC Freeproccache; Go EXEC sp_executesql N ' SELECT SUM (linetotal) as LineTotal From AdventureWorks.Sales.SalesOrderHeader H JOIN AdventureWorks.Sales.SalesOrderDetail D on d.salesorderid = H.salesorderid WHERE H.salesorderid = @SalesOrderID ', N ' @SalesOrderID INT ', @SalesOrderID = 56000; Go EXEC sp_executesql N ' SELECT SUM (linetotal) as LineTotal From AdventureWorks.Sales.SalesOrderHeader H JOIN AdventureWorks.Sales.SalesOrderDetail D on d.salesorderid = H.salesorderid WHERE H.salesorderid = @SalesOrderID ', N ' @SalesOrderID INT ', @SalesOrderID = 56001; Go SELECT Stats.execution_count as Exec_count, P.size_in_bytes as [size], [SQL]. [Text] As [Plan_text] From Sys.dm_exec_cached_plans P OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) SQL JOIN sys.dm_exec_query_stats stats on stats.plan_handle = P.plan_handle; Go |
As you can see, I have rewritten these two SELECT statements so that they are executed by using the "EXEC sp_executesql" statement. I pass three different arguments to each of these exec statements. The first parameter is the basic SELECT statement, but I replace the value of the SalesOrderID with a variable (@SalesOrderID). In the second argument, I identified the data type of the @salesorderid, which in this case is an integer. And then, in the last argument, I passed the value of SalesOrderID. This parameter will control the results generated by my select based on the SalesOrderID value. The first two parameters in each execution of sp_executesql are the same. But the third argument is different because each has a different SalesOrderID value.
Now when I run the above code, I get the following output from the DMV SELECT statement (note that the output is reformatted so that it is easier to read):
CNT size Plan_text --- ----------- ----------------------------------------------------------------------------------------- 2 49152 (@SalesOrderID INT) SELECT SUM (linetotal) as LineTotal From AdventureWorks.Sales.SalesOrderHeader H JOIN AdventureWorks.Sales.SalesOrderDetail D on d.salesorderid = H.salesorderid WHERE H.salesorderid = @SalesOrderID |
From this output, you can see that I have a parameterized cache plan that was executed two times for each EXEC statement.
Use parameterized queries to conserve resources and optimize performance
Each T-SQL statement needs to be evaluated before the statement can be executed, and an execution plan needs to be established. Creating an execution plan consumes valuable CPU resources. When the execution plan is created, it uses memory space to store it in the procedure cache. One way to reduce CPU and memory usage is to use parameterized queries. Although the database can be set to force parameterized for all queries, this is not always the best choice. By understanding which of your T-SQL statements can be parameterized and then using sp_executesql stored procedures, you can help SQL Server save resources and optimize the performance of your queries.