Analysis of the reason that SQL query statements do not explicitly specify the sort method and cannot guarantee that the same query will be consistent for each sort result

Source: Internet
Author: User

  

The source of this article: http://www.cnblogs.com/wy123/p/6189100.html

The title is a bit awkward,
First throw the question: A query does not explicitly specify the ordering method, then, the second time the same query execution, the query results will be the first query results sorted exactly the same way?
The answer is uncertain, two exactly the same query, the result is exactly the same, two times (multiple) query results may be sorted in the same way, it may be inconsistent.
If not, what causes the same query to be sorted differently by default?
Here are a few simple scenarios that explain why querying the same query will result in a different default sort. Of course, for this issue, including but not limited to the following situations.

Scenario 1: Parallel queries result in a random ordering of the default result set

As a rule, build a table for testing.

Create TableTestDefaultOrder1 (IDint Identity(1,1)Primary Key, col2varchar( -), Col3varchar( -), Col4varchar( -), Col5varchar( -), Col6varchar( -), Col7varchar( -), Col8varchar( -), CreateDateDatetime)GoDeclare @i int =0 begin Tran     while @i<500000    begin        Insert  intoTestDefaultOrder1Values(NEWID(),NEWID(),NEWID(),NEWID(),NEWID(),NEWID(),NEWID(),GETDATE()-RAND()* -)        Set @i=@i+1    EndCommit

Test scenario:

The performance issues such as indexing are not considered here first,
is an example of a test result, you can see that the conditions of the two queries are exactly the same, and none of them explicitly specify the sorting sequence, the default result is completely different.

You can even do three queries (more times) with the same conditions, and the results are still completely inconsistent.

  

Cause Analysis:

Why the same query, the order of each query result is different, as mentioned above, this case is caused by parallel query.
What execution plan is used by the query engine is cost-based, and if it is found that the execution cost of a query exceeds a certain threshold, it can be handled in a parallel way,
If a parallel query is used, multiple threads are used to decompose the entire query task, and the amount of task assigned by each thread cannot be fixed, and the order of the results of merging each thread is not fixed.
This leads to the order of the final query results is not fixed.
This is an example of the amount of task allocated per thread for a parallel query.

, the current query, the number of rows returned by the first thread is 2, but the number of rows returned by the first thread of the second query is not guaranteed to be 2.
Even if the number of rows returned is 2 for the second time, there is no guarantee that the 2 rows returned will be the same as the first two rows of data returned.
At the same time, when merging the result set of each thread, depending on the time of the thread's return, it is theoretically indeterminate, with multiple uncertainties together, resulting in a final ordering of the result set (which can be considered) to be random.

  

Scenario 2: Physical storage causes randomness of default results

Similarly, first create a case for the test data, as below, creating a heap table,

Create TableTestDefaultOrder2 (IDint Identity(1,1), col2Char( the))GoDeclare @i int =0 begin Tran     while @i< -    begin        Insert  intoTestDefaultOrder2Values(NEWID())        Set @i=@i+1    EndCommit

Test scenario:

This scenario excludes the effect of these parallel queries because there are only 50 data, and no parallel queries are enabled at all
For example, the two-time query performs a rebuild of the table, and the data itself is not changed, and the default order of two queries is completely different

Even after rebuilding, the query results are still different from the two above.

Cause Analysis:

The characteristics of the heap table determine that the data rows and data pages in the heap are not in any fixed order, and that the data in the entire heap changes after the physical storage has changed.
In the process of querying (scanning a heap table), you do not get the exact same order as before the physical storage changes.
In addition to the above rebuilt tables, the default order of the queries is inconsistent, and other operations that affect the physical space affect the physical storage location of the heap table data pages.

For example, the database contraction, after the contraction of the query and contraction before the query order is still not the same, I do not touch your table and any of the data in your table, but you can not prevent my normal database maintenance operations.
In summary, once the physical storage location is affected, the default scan result order for the heap table may be different.

  

The above only through a single table query to explain, if not explicitly specify the sort, even if the same query conditions, the order of the query results can not be guaranteed every time consistent,
If it is a multi-table association, or if you consider the index, database maintenance and other operations, the situation will become more complex, such as this is also more interesting: http://www.cnblogs.com/wy123/p/5425946.html
What's special is that you don't explicitly specify the sort method,
1, a period of time, the query results may be sorted according to the expected results, a period of time is not (the impact of physical storage changes);
2, some query conditions are sorted according to the expected results, change the query conditions, the sorting result becomes unrecognizable (the impact of the execution plan change).
In a nutshell: No sorting is performed explicitly, and it is not expected that the query results will be sorted every time, even every time.

Summarize:

This article is based on two simple examples,
From two aspects of execution planning and physical storage, "If query SQL does not explicitly specify a sort method, the order of the query results is not guaranteed to always follow your expectations."
Of course, it cannot be confined to either of these situations.
However, the words can not be said to die, some conditions do not explicitly specify the sorting method, you may get the desired sort results, but this expectation is often unreliable.
  

"Yesterday the order of the system query results is good, how changed today?" ”
"Why do I use a conditional query is sorted by time, according to the B criteria query is not?" ”
If you don't explicitly specify a sort order, don't ask me if the database is having a problem (or the SQL Server database is "not working", or if the DBA says it's internal because it's a bluff).


So classmates, if you expect the query results to be sorted, regardless of whether the default is the sort you expect, explicitly specify the sort method.

Analysis of the reason that SQL query statements do not explicitly specify the sort method and cannot guarantee that the same query will be consistent for each sort result

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.