This article introduces parameterized query. I will discuss how SQL Server optimizer attempts to parameterize a query and how you can create your own parameterized query if the query can be parameterized.
This article introduces parameterized query. I will discuss how the SQL Server optimizer attempts to parameterize a query and how you can create your own parameterized query.
What is parameterized query?
A simple way to understand parameterized queries is to think of it as just a T-SQL query that accepts parameters that control what the query returns. By using different parameters, a parameterized query returns different results. To get a parameterized query, you need to write your code in a specific way, or it needs to meet a specific set of standards.
There are two different ways to create a parameterized query. The first method is to allow the query optimizer to automatically parameterize your query. Another way is to program a parameterized query by writing your T-SQL code in a specific way and passing it to the sp_executesql system stored procedure. This method is described later in this article.
The key to parameterized query is that the query optimizer will create a reusable cache plan. By automatically or programmatically using parameterized queries, SQL Server can optimize processing like T-SQL statements. This optimization eliminates the need to create a cache plan for each execution of these similar T-SQL statements using noble resources. In addition, by creating a reusable plan, SQL Server also reduces the memory usage required for similar execution plans stored in the cache during the process.
Now let's take a look at the different ways in which SQL Server creates parameterized queries.
How is a parameterized query automatically created?
The person who writes the query optimizer code at Microsoft is doing their best to optimize the way SQL Server handles your T-SQL commands. I think this is the source of the query optimizer name. One of these ways to minimize resources and maximize query optimizer execution performance is to view a T-SQL statement and determine if they can be parameterized. To understand how this works, let's look at the following T-SQL statement:
Here, you can see that this command has two features. First, it is simple, and second, it contains a specified value for the SalesOrderID value in the where predicate. The query optimizer can identify that this query is relatively simple and that SalesOrderID has a parameter ("56000 "). Therefore, the query optimizer can automatically parameterize this query.
If you use the following SELECT statement to view a clean buffer pool that only contains the cache plan for the preceding statement, then you will see the query optimizer override the T-SQL Query into a parameterized T-SQL statement:
When I run this command on an SQL Server 2008 instance, I get the following output (note that the output is reformatted to make it easier to read ):
Cnt size plan_text
------------------------------------------------------------------------
1 49152 (@ 1 int) SELECT * FROM [AdventureWorks]. [Sales]. [SalesOrderHeader]
WHERE [SalesOrderID] = @ 1
If you look at the plan_text field in the output above, you will see that it is not like the original T-SQL text. As described above, the query optimizer rewrites this query into a parameterized T-SQL statement. Here, you can see that it now has a variable (@ 1) of the data type (int), which is defined in the previous SELECT statement. In addition, at the end of plan_text, the value of "56000" is replaced with the variable @ 1. Since this T-SQL statement is overwritten and stored as a cache plan, if a future T-SQL command is roughly the same as it, only the SalesOrderID field is assigned a different value, it can be used for reuse. Let's look at it in action.
If I run the following command on my machine:
Here, I first release the process cache, then I run two different but similar non-parametric queries to see if the query optimizer creates two different cache plans or creates a cache plan for these two queries. Here, you can see that the query optimizer is actually very clever, it parameterized the first query and cached the plan. Then, when the second similar query with a different SalesOrderID value is sent to SQL Server, the optimizer can identify that a plan has been cached and then reuse it to process the second query. You can say this because the "cnt" field now indicates that this plan was used twice.
Database Configuration option PARAMETERIZATION can affect how T-SQL statements are automatically parameterized. There are two different settings for this option, SIMPLE and FORCED. When PARAMETERIZATION settings are set to SIMPLE, only SIMPLE T-SQL statements are parameterized. To introduce this, see the following command:
This query is similar to my previous example, except that an additional JOIN standard is added here. When the PARAMETERIZATION option of database AdventureWorks is set to SIMPLE, this query will not be automatically parameterized. The simple parameterization setting tells the query optimizer to only parameterize SIMPLE queries. However, when PARAMETERIZATION is set to FORCED, the query is automatically parameterized.
When you set the database option to force parameterization, the query optimizer tries to parameterize all the queries, not just simple queries. You may think this is good. However, in some cases, when PARAMETERIZATION is set to FORCED in the database, the query optimizer selects a query plan that is not ideal. When the database sets PARAMETER to FORCED, it changes the literal constant in the query. This may result in an invalid plan because the index and index view are not selected to participate in the execution plan when a calculated field is involved in the query. The forced parameterization option may be a good solution to improve the performance of databases with a large number of similar and slightly different passed parameters. An online sales application whose customers perform a large number of similar searches for your product with different product values may be a good application type that can benefit from forced parameterization.
Not all query clauses are parameterized. FOR example, the TOP, TABLESAMPLE, HAVING, group by, order by, OUTPUT... INTO, or for xml clauses queried are not parameterized.
Use sp_execute_ SQL to parameterize your T-SQL
You do not need to rely on the database PARAMETERIZATION option to parameterize a query by the query optimizer. You can parameterize your own queries. You're done by re-writing your T-SQL statement and executing the override statement using the "sp_executesql" system stored procedure. As you can see, the SELECT statement that includes a "JOIN" clause is not automatically parameterized when the database's PARAMETERIZATION is set to SIMPLE. Let me re-compile this query so that the query optimizer will create a reusable parameterized query execution plan.
For illustration, Let's see two similar T-SQL statements that are not automatically parameterized and create two different cache execution plans. Then I will re-compile these two queries so that they all use the same cache parameterized execution plan.
Let's take a look at this Code:
Here I release the process cache and then run the two non-simple T-SQL statements that contain a JOIN. Then I will check the cache plan. This is the output of the SELECT statement using DMV (note that the output is reformatted to make it easier to read): As you can see from this output, the two SELECT statements are not parameterized by the query optimizer. The optimizer creates two different cache execution plans, each of which is executed only once. We can use the sp_executesql system stored procedures to help the optimizer create a parameterized execution plan for these two different SELECT statements.
The following is the stored procedure of the sp_executesql system after the code is rewritten: As you can see, I re-compiled the two SELECT statements, execute them by using the "EXEC sp_executesql" statement. For each of these EXEC statements, I pass three different parameters. The first parameter is a basic SELECT statement, but I replace the value of SalesOrderID with a variable (@ SalesOrderID. In the second parameter, I have determined the Data Type of @ SalesOrderID. In this example, it is an integer. Then, in the last parameter, I passed the value of SalesOrderID. This parameter controls the results generated by my SELECT based on the SalesOrderID value. The first two parameters in each execution of sp_executesql are the same. However, the third parameter is different because each parameter has a different SalesOrderID value.
Now when I run the above code, I get the following output from the DMV SELECT statement (note that the output is reformatted to make it easier to read ):
From this output, you can see that I have a parameterized cache plan that is executed twice and each EXEC statement is executed once.
Use parameterized query to save resources and optimize performance
Before a statement can be executed, each T-SQL statement needs to be evaluated and an execution plan needs to be created. Creating an execution plan consumes valuable CPU resources. When an execution plan is created, it uses the memory space to store it in the process cache. One way to reduce CPU and memory usage is to use parameterized queries. Although the database can be set to FORCE parameterization for all queries, this is not always the best choice. By understanding which of your T-SQL statements can be parameterized and then using sp_executesql stored procedures, you can help SQL Server save resources and optimize the performance of your queries.