SQL Server Tuning Series: Union Operators Summary

Source: Internet
Author: User
Tags: logical operators

Objective

The previous two articles covered how to read a query plan and some common optimization techniques for join operators. This article summarizes how the union operators are used and how to optimize them.

Without further ado, let's go straight to the subject.

Technical preparation

The examples are based on SQL Server 2008 R2 and use Northwind, Microsoft's compact sample database.

1. UNION operators

Of the so-called union operators, two are used most often: UNION ALL and UNION.

Both are simple to use. The former merges the results of two datasets; the latter merges them and then removes duplicates. Anyone who has written T-SQL will be familiar with them.
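
The difference is easy to see on a tiny dataset. The following is a minimal sketch using made-up inline rows rather than Northwind tables (the aliases a, b and the sample city names are assumptions for illustration); table value constructors require SQL Server 2008 or later.

```sql
-- UNION ALL keeps every row from both inputs, so 'London' appears twice (4 rows in total)
SELECT City FROM (VALUES (N'London'), (N'Seattle')) AS a(City)
UNION ALL
SELECT City FROM (VALUES (N'London'), (N'Berlin')) AS b(City);

-- UNION removes duplicates after merging, so 'London' appears once (3 rows in total)
SELECT City FROM (VALUES (N'London'), (N'Seattle')) AS a(City)
UNION
SELECT City FROM (VALUES (N'London'), (N'Berlin')) AS b(City);
```

Without an ORDER BY, the order of the returned rows is not guaranteed in either case.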

Let's look at how these two operators appear in the execution plan. For example:

SELECT FirstName + N' ' + LastName, City, Country FROM Employees
UNION ALL
SELECT ContactName, City, Country FROM Customers

In the plan, this query shows up with the Concatenation icon, which is how the UNION ALL operator is displayed.

This operator is simple: it scans the two datasets and concatenates their results.

Now let's look at the UNION operator. For example:

SELECT City, Country FROM Employees
UNION
SELECT City, Country FROM Customers

As we can see, UNION adds a Distinct Sort operation after the Concatenation operator, which removes the duplicate values from the result set.
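
One way to check the operators on your own instance is to ask SQL Server for the estimated plan as text. This is a sketch that assumes a Northwind database and a client such as SSMS or sqlcmd that understands the GO batch separator; the plan you get may differ from the one described here depending on version, statistics, and indexes.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch
SET SHOWPLAN_TEXT ON;
GO
-- The query is not executed; only its estimated plan is returned
SELECT City, Country FROM Employees
UNION
SELECT City, Country FROM Customers;
GO
SET SHOWPLAN_TEXT OFF;
GO
```

In the text output, look for a Concatenation step and the operator above it that performs the deduplication.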

As we have stressed before, sorting a large table is a very resource-intensive operation!

So here is something we can optimize: either get rid of the sort or change how the deduplication is done.

The alternative to a Distinct Sort is hash aggregation. The memory a Distinct Sort needs is proportional to the amount of data before duplicates are removed, whereas the memory hash aggregation needs is proportional to the size of the result set after duplicates are removed!

So if the rows contain many duplicate values, hash aggregation consumes less memory.
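
A quick way to judge how many duplicates a column holds is to compare its row count with its distinct count. The sketch below assumes the Northwind Orders table; the column aliases TotalRows and DistinctValues are made up for illustration.

```sql
-- TotalRows approximates the input a Distinct Sort has to hold;
-- DistinctValues approximates what a hash aggregation has to hold.
SELECT COUNT(*)                    AS TotalRows,
       COUNT(DISTINCT ShipCountry) AS DistinctValues
FROM Orders;
```

The larger the gap between the two numbers, the more the memory argument above favors hash aggregation.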

Let's look at an example.

SELECT ShipCountry FROM Orders
UNION
SELECT ShipCountry FROM Orders

This example has no practical use; it is only a demonstration. Let's look at the resulting plan.

We know that the ShipCountry column is full of duplicate values, so deduplicating with a Hash Match is the best choice here.

In fact, there is a more lightweight way to deduplicate than a hash match: the merge join.

I analyzed this operator earlier in the context of joining two datasets. It works similarly here: the inputs must be sorted before it can be applied!

As we know, an index can provide that order and so reduce the cost of sorting.

Let's reproduce this way of deduplicating. We create a new table and then build indexes on it. The code is as follows:

-- Create the new table
SELECT EmployeeID, FirstName + N' ' + LastName AS ContactName, City, Country
INTO NewEmployees
FROM Employees
GO
-- Add the indexes
ALTER TABLE NewEmployees ADD CONSTRAINT PK_NewEmployees PRIMARY KEY (EmployeeID)
CREATE INDEX ContactName ON NewEmployees (ContactName)
CREATE INDEX ContactName ON Customers (ContactName)
GO
-- New query: an explicit ORDER BY is needed here for the merge join to appear
SELECT ContactName FROM NewEmployees
UNION ALL
SELECT ContactName FROM Customers
ORDER BY ContactName

Here the index scans supply the rows in order, which avoids an explicit sort operation.

Now let's change UNION ALL to UNION, so that the two datasets are deduplicated.

-- New query: an explicit ORDER BY is needed here for the merge join deduplication to appear
SELECT ContactName FROM NewEmployees
UNION
SELECT ContactName FROM Customers
ORDER BY ContactName

As we know, UNION deduplicates the result. In the plan above this is done with a Stream Aggregate operation. Stream Aggregate is generally used for grouping; here the grouping is used to remove duplicates.
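
The demonstration leaves a new table and an extra index on Customers behind in Northwind. A minimal cleanup sketch, assuming the object names NewEmployees and ContactName used in this article:

```sql
-- Remove the index that was added to the Customers table for the demonstration
DROP INDEX ContactName ON Customers;
-- Dropping the table also drops its primary key and its ContactName index
DROP TABLE NewEmployees;
GO
```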

In real applications the merge join is the most common approach, but one case suits a hash match best: a union of a small table and a large table, especially when there are many duplicate values.
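
To compare the two strategies on your own data, T-SQL offers the query hints MERGE UNION and HASH UNION. The following is a sketch for experimentation only, assuming the Northwind Employees and Customers tables; hints override the optimizer's own choice, so they are usually better suited to testing than to production code.

```sql
-- Report I/O and timing for each statement in the Messages output
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO
-- Ask the optimizer for a merge-based union
SELECT City, Country FROM Employees
UNION
SELECT City, Country FROM Customers
OPTION (MERGE UNION);
GO
-- Ask the optimizer for a hash-based union
SELECT City, Country FROM Employees
UNION
SELECT City, Country FROM Customers
OPTION (HASH UNION);
GO
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

On tables as small as these the measured difference will be negligible; the comparison becomes meaningful with larger inputs.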

The hash algorithm really is a good thing!

Conclusion

That's it for this article. It is short and easy to follow, and mainly covers the union operators in query plans. Next we will analyze parallel operations in SQL Server: in today's world of multi-core, hyper-threaded machines, we will see how SQL Server uses parallelism to make the most of the available hardware and improve performance. If you are interested, stay tuned. SQL Server performance tuning covers a wide range of topics, which later articles will analyze in turn.

Once you dig into SQL Server, you find that it runs very deep, practically bottomless. If you want to study performance tuning, follow this series and we can learn together!

Many people still misunderstand SQL Server, or hold views about it that need correcting. I have had customers tell me to my face things like "SQL Server goes down as soon as you import a lot of data" ...

"SQL Server can only handle small amounts of data" ... "It is not as good as Oracle" ...!

Some folks simply think SQL Server is child's play with no technical depth ... that it is all very simple.

I don't want to rant about these views. I just hope that those who really know SQL Server will join me in proving what it can do.
