Analysis of performance problems caused by connection with table-valued functions
Table value functions
SQL Server provides functions similar to other programming languages, and functions are typically encapsulated by code and return values. In SQL Server, functions can return a set, that is, a table, in addition to simple data types (such as Int and Varchar.
Based on whether to directly return a set or definition before returning a set, Table value functions are divided into inline User-Defined Table value functions and user-defined Table value functions (hereinafter referred to as Table value functions, saves the word "user-defined ).
Inline Table value functions
Inline table valued functions are no different from normal functions. The only difference is that the returned result is a set (table) instead of a simple data type, A simple inline Table value function is shown in code list 1 (from MSDN ).
CREATE FUNCTION Sales.ufn_CustomerNamesInRegion( @Region nvarchar(50) )RETURNS tableASRETURN (SELECT DISTINCT s.Name AS Store, a.CityFROM Sales.Store AS sINNER JOIN Person.BusinessEntityAddress AS bea ON bea.BusinessEntityID = s.BusinessEntityID INNER JOIN Person.Address AS a ON a.AddressID = bea.AddressIDINNER JOIN Person.StateProvince AS sp ON sp.StateProvinceID = a.StateProvinceIDWHERE sp.Name = @Region);GO
Code List 1. A simple table Value Function
User-Defined Table value functions
To define a table-valued function, you need to define the returned table structure at the beginning of the function. Then, you can write any code to perform data operations and insert it into the defined table structure before returning it, an example of a slightly responsible user-defined Table value function is shown in Listing 2 (from MSDN ).
CREATE FUNCTION dbo.ufnGetContactInformation(@ContactID int) RETURNS @retContactInformation TABLE ( -- Columns returned by the function ContactID int PRIMARY KEY NOT NULL, FirstName nvarchar(50) NULL, LastName nvarchar(50) NULL, JobTitle nvarchar(50) NULL, ContactType nvarchar(50) NULL ) AS -- Returns the first name, last name, job title, and contact type for the specified contact. BEGIN DECLARE @FirstName nvarchar(50), @LastName nvarchar(50), @JobTitle nvarchar(50), @ContactType nvarchar(50); -- Get common contact information SELECT @ContactID = BusinessEntityID, @FirstName = FirstName, @LastName = LastName FROM Person.Person WHERE BusinessEntityID = @ContactID; -- Get contact job title SELECT @JobTitle = CASE -- Check for employee WHEN EXISTS(SELECT * FROM Person.Person AS p WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'EM') THEN (SELECT JobTitle FROM HumanResources.Employee AS e WHERE e.BusinessEntityID = @ContactID) -- Check for vendor WHEN EXISTS(SELECT * FROM Person.Person AS p WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'VC') THEN (SELECT ct.Name FROM Person.ContactType AS ct INNER JOIN Person.BusinessEntityContact AS bec ON bec.ContactTypeID = ct.ContactTypeID WHERE bec.PersonID = @ContactID) -- Check for store WHEN EXISTS(SELECT * FROM Person.Person AS p WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'SC') THEN (SELECT ct.Name FROM Person.ContactType AS ct INNER JOIN Person.BusinessEntityContact AS bec ON bec.ContactTypeID = ct.ContactTypeID WHERE bec.PersonID = @ContactID) ELSE NULL END; -- Get contact type SET @ContactType = CASE -- Check for employee WHEN EXISTS(SELECT * FROM Person.Person AS p WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'EM') THEN 'Employee' -- Check for vendor WHEN EXISTS(SELECT * FROM Person.Person AS p WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'VC') THEN 'Vendor Contact' -- Check for store WHEN EXISTS(SELECT * FROM Person.Person AS p WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'SC') THEN 'Store Contact' -- Check for individual consumer WHEN EXISTS(SELECT * FROM Person.Person AS p WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'IN') THEN 'Consumer' -- Check for general contact WHEN EXISTS(SELECT * FROM Person.Person AS p WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'GC') THEN 'General Contact' END; -- Return the information to the caller IF @ContactID IS NOT NULL BEGIN INSERT @retContactInformation SELECT @ContactID, @FirstName, @LastName, @JobTitle, @ContactType; END; RETURN; END; GO
Code Order 2. Table Value Function
Why use a table-valued function?
It seems that the table value function does nothing different than the stored procedure, but it is actually different. This is because the table value function can be used to write data to other queries, but the stored procedure does not work. In addition, the use of table valued functions and the Apply operator can greatly simplify the join operation.
If the stored procedure meets one of the following conditions, you can consider rewriting to a table value function.
• The Stored Procedure logic is very simple. It is just a Select statement. The reason for not using a view is only because parameters are required.
• There is no update operation in the stored procedure.
• There are no dynamic SQL statements in the stored procedure.
• Only one result set is returned in the stored procedure.
• The primary purpose of a stored procedure is to generate a temporary result set and store the result set in a temporary table for other queries to call.
User-Defined Table value functions
Unlike the inline Table value function, the inline Table value function is more like a view in the process, which means that in the query optimization stage, the Inline Table value function can be used to optimize the query optimizer. For example, you can push the filter condition (Where) to the bottom of the algebra tree, which means you can Join the Where clause first, this allows you to use index search to reduce IO and improve performance.
Let's look at a simple example. The following code example shows a simple Join operation with a table-valued function:
First, we create Table value functions, which are inline Table value functions and Table value functions, as shown in code listing 3.
-- Create table value row create function tvf_multi_Test () RETURNS @ SaleDetail TABLE (ProductId INT) as begin insert into @ SaleDetail SELECT ProductID FROM Sales. salesOrderHeader soh inner join Sales. salesOrderDetail sod ON soh. salesOrderID = sod. salesOrderID return end -- CREATE inline TABLE value function create function tvf_inline_Test () returns table as return select ProductID FROM Sales. salesOrderHeader soh inner join Sales. salesOrderDetail sod ON soh. salesOrderID = sod. salesOrderID
Code List 3. create two different functions
Now, we use the same query to Join the two table value functions. The code is shown in Listing 4.
-- The Table value function performs Join SELECT c. personid, Prod. name, COUNT (*) 'numer of unit 'FROM Person. businessEntityContact c inner join dbo. tvf_multi_Test () tst ON c. personid = tst. productId inner join Production. product prod ON tst. productId = prod. productID group by c. personid, Prod. name -- inline Table value function for Join SELECT c. personid, Prod. name, COUNT (*) 'numer of unit 'FROM Person. businessEntityContact c inner join dbo. tvf_inline_Test () tst ON c. personid = tst. productId inner join Production. product prod ON tst. productId = prod. productID group by c. personid, Prod. name
Code list 4. Join between table valued functions and inline table Valued Functions
The execution cost is 1.
Figure 1. Costs of the two methods
From the IO perspective, it is obvious that the sub-optimal execution plan is selected, and BusinessEntityContact chooses 121317 searches instead of one scan. The Inline table function correctly knows that the cost of one scan is much lower than that of one query.
The root cause of the problem is the inline Table value function. for SQL Server, it is the same as the view, this means that the inline Table value function can be involved in the algebraic operation (or algebraic tree optimization) of the logical execution plan, which means that the restrained table can be further split (1, for the query of the second inline table, the execution plan knows that the restrained table is the SalesOrderHeader table and the SalesOrderDetail table. Because only one column is selected for the query, the execution plan is optimized until the SalesOrderHeader table does not need to be scanned ), for inline table valued functions, the execution plan can fully understand the indexes of the involved tables, related statistics, and other metadata.
On the other hand, as shown in the first part of Table value function 1, the Table value function is a black box for the entire execution plan, neither knowing the statistics nor indexing. The table involved in the Table value function is unknown in the execution plan (in figure 1, the temporary table # AE4E5168 is not specified ), therefore, for the entire execution plan, the result set SQL Server assumes that the returned results are very small. When the table value function returns a large number of results (as shown in this example ), A relatively poor execution plan is generated.
Therefore, when a table-valued function returns an extremely small number of results, the performance may not be affected. However, if more results are returned, the quality of the Execution Plan will be affected.
How to handle
First, in SQL Server, we need to find the existing Join statements for table valued functions. by mining the execution plan, we can find such statements, the code used is shown in code list 5.
WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p) SELECT st.text, qp.query_plan FROM ( SELECT TOP 50 * FROM sys.dm_exec_query_stats ORDER BY total_worker_time DESC ) AS qs CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp WHERE qp.query_plan.exist('//p:RelOp[contains(@LogicalOp, "Join")]/*/p:RelOp[(@LogicalOp[.="Table-valued function"])]') = 1
Code List 5. Find the Join query with the table value function from the execution plan Cache
Result 2 is displayed.
Figure 2. Join queries with table-valued functions in the execution plan Cache
Summary
This article describes the concept of a table valued function, why does a table valued function affect performance, and how to locate a Join query with a table valued function in the execution plan cache. It may not affect the number of rows returned by applying to a table-valued function or a very small number of rows returned by the table-valued function. However, if a Join operation is performed on a table-valued function with a large number of returned results, performance problems may occur, you can rewrite the table value function to an inline Table value function or save the result of the Table value function to a temporary table before performing the Join operation. This improves the performance.