SQL Server random data Extraction efficiency optimization
2013-05-11 1 Comments Vegetable Light Collection I want to contribute
SQL Server random data Extraction efficiency optimization
Hermit I recently encountered a very tangled matter, is how to quickly extract a random data from the data table.
I generated a simple data table and imported 500W data for testing.
is the SQL statement that was used for the first time:
Declare @d Datetime
Set @d=getdate ()
SELECT Top 1[activityid]
, [Cardno]
, [Password]
, [State]
, [Createtime]
, [GetTime]
from [Tgbus_card]. [dbo]. [Ka_card] Where ActivityID = 501 and state = 0 Order by NEWID ()
Select [Statement execution takes time (milliseconds)]=datediff (Ms,@d,getdate ())
This is my first thought of the solution, NEWID () every time to retrieve the entire data table, for each row of data to produce a uniqueidentifier type unique value, so it is not very efficient execution, each execution will be about 4 seconds.
Is there any better way to do that?
I was thinking, because select is executed before order by, so can I use it in order by when I generate NEWID () in the select?
Then there is the second version:
Declare @d Datetime
Set @d=getdate ()
SELECT Top 1[activityid]
, [Cardno]
, [Password]
, [State]
, [Createtime]
, NEWID () as Random
from [Tgbus_card]. [dbo]. [Ka_card] Where ActivityID = 501 and state = 0 Order by Random
Select [Statement execution takes time (milliseconds)]=datediff (Ms,@d,getdate ())
Really improved!! But it's not obvious ...
Is there a better way?!
So I thought about the new features of 2005 Tablesample, try it now!
This is the third version of the Code:
Declare @d Datetime
Set @d=getdate ()
SELECT Top 1[activityid]
, [Cardno]
, [Password]
, [State]
, [Createtime]
, [GetTime]
from [Tgbus_card]. [dbo]. [Ka_card] Tablesample ($ Rows) Where ActivityID = 501 and state = 0
Select [Statement execution takes time (milliseconds)]=datediff (Ms,@d,getdate ())
The speed is too fast!!! 10 milliseconds to show it!
I thought it was done! The heart is also flattered! So small study of tablesample, do not see, the original Tablesample randomly selected units is the data page!! Holy shit!! That is, if the amount of data is very small, it is likely to have no data, try it! This is not a reliable guy!
The problem is back to the beginning ... Are you really going to use newid? Just when I feel the feeling of the time on MSDN I inadvertently see this text:
Feel the chance!!
Then wrote this fourth version of the Code:
Declare @d Datetime
Set @d=getdate ()
SELECT Top 1* from [Tgbus_card]. [dbo]. [Ka_card]
WHERE 0.01 >= cast (CHECKSUM (NEWID (), Cardno) & 0x7fffffff as float)/cast (0x7fffffff as int)
and ActivityID = 501 and state = 0
Select [Statement execution takes time (milliseconds)] = Datediff (Ms,@d,getdate ())
That's the way I ended up with SQL execution at about 1 milliseconds.
Source:
SQL Server random data Extraction efficiency optimization