2014-10-15 10:06:48
goal : Random ordering, using efficient SQL statement queries to get random data samples.
anti-pattern : Using rand () random function
SELECT * from Employees as e ORDER by RAND () Limit 1
Cons: Indexes cannot be used and are different and unpredictable each time they are selected. Perform full table traversal with very poor performance.
how to recognize anti-patterns : May be anti-pattern when the following conditions occur
1, in SQL, return a random row speed is very slow;
2, to get all the records and then randomly one. How do you increase the amount of memory your program can use?
3, some columns appear more frequently than other columns, the stochastic algorithm is not very random.
Rational use of anti-patterns : When the amount of data is very small, you can use random sorting.
solution : no specific sort ...
1. Randomly select between 1 and Max values
Select b1.*
From Bugs as B1
Join (Select Ceiling (rand () * (select Max (BugID) from Bugs) as Randid) as B2
On b1.bugid = B2.bugid
Because the ID may not be contiguous, it may not be possible to query the results at times.
2. Select the next maximum value
Select b1.*
From Bugs as B1
Join (Select Ceiling (rand () * (select Max (BugID) from Bugs) as Randid) as B2
where B1.bugid >= B2.bugid-in order to avoid the ID does not exist, we look for the last piece of data above the ID.
ORDER by B1.bugid limit 1;
Consider using this scenario when the gap is not very large and the importance of each value being equal to the probability selection is not very high.
3, request all the key values, randomly select one, and then use this randomly selected primary key to find the complete record.
4. Use offset to select random rows
MySQL, PostgreSQL, sqlite support limit keyword;
Oracle, MSSQL server uses the Row_number () function.
5. Proprietary Solutions
Each database provides a proprietary solution:
SQL Server2005 adds the tablesample () function to the above version: SELECT * from Employees tablesample (1 rows);
Oracle uses a similar sample () function:
SELECT * FROM (SELECT * from Employees SAMPLE (1)
ORDER BY Employees.id )
where rownum = 1
conclusion : Some queries can not be optimized, the other way to try.
SQL anti-Pattern Learning Note 16 sorting using random numbers