The principle that the union operator in SQL Server is de-weighed when merging data and the default sorting of the result of the Union operator query

Source: Internet
Author: User
Tags getdate rand

Original: The principle that the union operator in SQL Server is de-weighed when merging data and the default sorting of the result of the Union operator query

The source of this article: http://www.cnblogs.com/wy123/p/7884986.html

There's another discussion about union and union all, and for union and union all, the biggest thing on the net is the performance problem (I don't want to say it: Union ALL is faster than union)
Actually don't want to fry Union and union all this bowl of leftovers,
Every time I see the internet saying use this without that, the list of one of the kind of article, as long as you see that the union all better than the union performance is ...

For a merged result set, the Union is de-weighed, and union all is not heavy, and the de-weight and non-weight are two purposes, respectively, by Union and UNION all
Two function (function) different things, put together than performance what is the meaning?
The problem is really boring, like "some middle-back on the football field and some Qian Yao which ability is stronger" is not comparable,
Their role is different, do not say that the ability of the central defender can not, take him down, with a good Qian Yao player instead of the central defender, or Qian Yao ability, withdraw him with a good central defender instead?
This is a functional difference, as for performance, I personally think that the comparison does not make any sense.
If you want to put it together, do the same data merge,
Union because it is going to be heavy, relative to union all, of course, (relative) consumes more resources (the amount of resources spent is not related to performance, more things to do, of course, more resources are needed)
But be sure to figure out, when merging the data, whether or not to remove the duplicate data, this is the final result of the right and wrong problem, not performance problems!

The performance of Union and union all is not discussed here,
Start a problem from another point
The biggest difference between union and union is that the Union is going to be heavy, so the question is, how does this weight go? What is the effect of the de-reset on the default order set of the query?

The implementation of the Union de-weight

Test the implementation principle of the union operator de-weight

Create TableTestUnion1 (Id1INT PRIMARY KEY, Id2tinyint, Namevarchar( -));Create TableTestUnion2 (Id1INT PRIMARY KEY, Id2tinyint, Namevarchar( -));Insert  intoTestUnion1Values( -,9,'AAA')Insert  intoTestUnion1Values( the,3,'CCC')Insert  intoTestUnion1Values( $,7,'Eee')Insert  intoTestUnion2Values( -,2,'BBB')Insert  intoTestUnion2Values( -,8,'DDD')Insert  intoTestUnion2Values( -,5,'FFF')--the same data in TestUnionALL1 and TestUnionALL2Insert  intoTestUnion1Values( -,6,'XXX')Insert  intoTestUnion2Values( -,6,'XXX')

UNION in the process of de-weight, the execution plan used is the merge join,union all is not heavy, the same steps corresponding to the execution plan is concatenation

The de-action of the Union here is achieved through merge, where merge join is not a merge join between table and table
As can be seen here, the merge produced by the Union is different from the merge created by the inner join.

For this action of the de-weight of union, to be sure not to say that only the merge join one, here is only two results of the data are just ordered to use the merge join to come and go heavy

If the first field in the order of the query fields is a clustered index (or a primary key), as mentioned above, the Union's sides will be in the merge mode area
If you query the order of the fields for a nonclustered index, the union process is now merging two result sets (the concatenation mentioned above) and then doing sort sorting to re-

  


Final sort result of result set after union

The final sort result of the result set after the Union is related to the order of the query fields,
If the first field in the order of the query fields is a clustered index (or a primary key), as mentioned above, the Union's sides will be in the merge mode area
If the first field in the order of the query fields is a non-clustered index field, the union process is now merging two result sets (concatenation mentioned above) and then doing sort sorting to re-
The following example illustrates this problem, and when the order of the query fields changes, the execution plan is completely inconsistent.

Or look at a case, when name is in the front, the end result is sort by name.

Some people may wonder if the data volume is too small, is not a coincidence, here you can increase the test database, in the query condition, let the nonclustered index participate in the operation

Create TableTestUnion1 (Id1INT PRIMARY KEY, Id2tinyint, Namevarchar( -), CreateDatedatetime);Create TableTestUnion2 (Id1INT PRIMARY KEY, Id2tinyint, Namevarchar( -), CreateDatedatetime);begin Tran    Declare @i int = 0     while @i<1000000    begin        Insert  intoTestUnion1Values(@i,Rand()* $,newid(),getdate()-Rand()* +)        Insert  intoTestUnion2Values(@i,Rand()* $,newid(),getdate()-Rand()* +)        Set @i=@i+1    EndCommitCreate IndexIdx_createdate onTestUnion1 (createdate)Create IndexIdx_createdate onTestUnion2 (CreateDate)

Reference, once the query result set is not sorted by the query field clustered index,
For example, this is the index of the CreateDate time field, and the execution plan is to merge the result set in the usual way, i.e. concatenation
Then, in the sort (Distinct) way, the final ordering of the results of the de-weight is related to the first field of the query result, and the results are always sorted by the first field of the query.

Change the order of the query fields, and look at the results, still sorted by the first field in the check sequence.

When the union operator is going to weigh,
If the first field of the query field is a clustered index, the merge join is used to combine + go-weight.
If the first field of the query field is not a nonclustered index, then the first two (or more) result sets will be combined in a normal way, and then the sort distinct.
and the default ordering after the union operation is affected by how the fields are queried before and after.

Summarize:

The Union and UNION all function differently, and it doesn't make any sense to put them together, and I really don't want to go through this.
Merge result set, need to go to the union of the weight, do not need to go to the union all, if there is no duplicate result set in two result sets, use UNION ALL,
This is purely demand-driven, not a performance issue for union and union all.

To withdraw one more sentence:
Once the big night received an interview phone, no opening remarks, the first sentence is "We phone interview can you?", replied, the second sentence is "What is the difference between Union and union all, there is no performance difference."
I really don't want to discuss the performance of Union and union all again.

The principle that the union operator in SQL Server is de-weighed when merging data and the default sorting of the result of the Union operator query

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.