Start with Jerome Wong, a blog site.
He raised this question.
I have been writing SQL statements for several years and have never noticed this problem.
For example:
The data table is as follows:
IDEMPNONAMEAGE
1 26929 Jerome 28
2 28394 Quince27
3 20983 Green 30
4 27189 Mike 30
5 23167 Arishy 30
6 26371 Yager 29
I wrote an SQL statement to get 3rd or 4 pieces of data and test paging.
- select top 2 * from (select top 4 * from Member ) m
- order by m.RowID desc
Subquery in the middle section: select top 4 * from Member
Obtained:
1 26929 Jerome 28
2 28394 Quince27
3 20983 Green 30
4 27189 Mike 30
However, the results of the entire SQL statement are:
5 23167 Arishy 30
6 26371 Yager 29
I really don't know how this happens. Please advise me.
In fact, no matter whether you are a newbie or an expert in writing a program, you will often encounter details like this.
Next, I will make a series of analyses on the issues raised by the Jerome Wong netizens regarding select top (here I would like to thank the Jerome Wong netizens for proposing this question)
Preparations
- If object_id ('zhuisuo ') is not null
- Drop table zhuisuo
- Go
- Create table zhuisuo
- (
- Id int null, name varchar (20) null
- )
- Insert into zhuisuo values (1, 'recourse 1 ')
- Insert into zhuisuo values (2, 'recourse 2 ')
- Insert into zhuisuo values (3, 'recourse 3 ')
- Insert into zhuisuo values (4, 'recourse 4 ')
- Insert into zhuisuo values (5, 'recourse 5 ')
- Insert into zhuisuo values (6, 'recourse 6 ')
- Insert into zhuisuo values (7, 'recourse 7 ')
- Insert into zhuisuo values (8, 'recourse 8 ')
- Insert into zhuisuo values (9, 'recourse 9 ')
- Insert into zhuisuo values (10, 'recourse 10 ')
- Go
Next, let's write two simple Select statements.
- select top 2 * from (select top 4 * from zhuisuo) m order by m.id desc
- select top 2 * from (select top 4 * from zhuisuo order by id asc) m order by m.id desc
The execution results will show that
Most people usually think that the execution results of these two statements will be the same.
How can this happen?
You can see clearly from this query plan
First, after scanning the zhuisuo table, first descending order (top N Sort) and then the first two rows in the range of four rows
The second method is to scan the zhuisuo table and then take four rows (top N Sort) in ascending order, and then take the four rows (top N Sort) in descending order)
Here we have to briefly talk about the table subqueries in SQL statements.
Table subquery, and the table that appears in the from clause is called a derived table.
The derived table is virtual and has not been physically embodied. That is to say, when the table is compiled
For example, (select top 2 * from (select top 4 * from zhuisuo) m order by m. id desc)
External queries and internal queries are merged and a plan is generated.
Now let's take a look at the above execution plan.
Note: order by is generally not allowed in a derived table unless top
That is to say, the select top 2 * from (select * from zhuisuo order by id asc) m order by m. id desc statement cannot be executed)
A derived table is a quasi-table to be referenced outside, and order by returns a cursor instead of a table. Therefore, it is restricted only when order by is used.
Why is it okay to use top plus order?
This is because top can select a specified number from the cursor returned by order by to generate a table and return
Next I will give an example of the top-level precautions.
1. Using top to return random rows, many people will think of using the RAND function to obtain such a statement.
- select top 4 id,name from zhuisuo order by rand();
After multiple queries, you will be disappointed to find that it does not return random rows.
This is because each query calls it only once instead of once per line.
Now we can change RAND to Newid.
- select top 4 id,name from zhuisuo order by newid();
Now you will get the desired result. Here we can realize that NEWID has better distribution characteristics.
2. Note that top is used in insert.
- insert top (4) into zhuisuo
- select * from zhuisuo order by id desc
Many netizens will explain how to insert the last four data entries in the zhuisuo table into the table.
However, after the execution is completed, you will be disappointed. The first four items are inserted.
The correct method for inserting the top part in flashback should be
- insert into zhuisuo
- select top (4) * from zhuisuo order by id desc
What are the differences between the two statements?
In fact, the above statement is not sorted (Top N Sort)
3. What if I want to delete the last five data records in a data table?
Order by cannot be used when delete and update use top.
Now we can solve this problem.
- Delete zhuisuo
- Where id in (select top (5) id from zhuisuo order by id desc)
- Update zhuisuo
- Set name = 'suo suo '+ namewhere id in (select top (5) id from zhuisuo order by id desc)
This is disguised to implement Top N sort update or delete data, but this is not the optimal method, because it also requires the root id to match
In this case, we can use this method.
- With cte_del as (select top (5 )*
- From zhuisuo order by id desc)
- Delete from cte_del
- With cte_del
- (Select top (5) * from zhuisuo order by id desc)
- Update cte_del set name = 'suo suo '+ name
4. In addition to these features, top has more functions. For example, I used Top N sort and apply to answer a netizen's question.
How to query a user's reading records with a accuracy rate greater than 60% in the last month, and display only the one with the highest accuracy rate per day
Here, I will only mention a few interesting details about apply. In the future, I will write them in a casual manner.
The last part is the procedure for using the serial number to represent the logical query processing.
Edit recommendations]