You only need to use newid () for the SQL Server random record fetch method.
Sqlserver has a built-in newid () function, which returns a GUID, so we can sort the table records by the newid (), because newid () the results are different each time, so the query results ensure that the results of each sort are different. The following uses the orders table in the Northwind database as an example to obtain 10 random records:
Syntax
The code is as follows: |
Copy code |
Select top n * From TABLE Order By NewID () Example Select top 10 * FROM Northwind. Orders order by newid () |
The NewID () function creates a unique value of the uniqueidentifier type.
After executing the command several times, it seems that it meets the requirements. Each time it comes out, it is different. As for the value range distribution, the random function itself is also pseudo-random.
A simple test was conducted.
The code is as follows: |
Copy code |
Create table # t (cname char (50), sysid uniqueidentifier) DECLARE @ I int Set @ I = 0 WHILE (@ I) <1760 BEGIN Insert into # t select top 1 cname, sysid from tVIPMember order by newid () SET @ I = @ I + 1 END Select count (*) from (Select cname as a from # t group by cname, sysid) |
There are a total of 1760 records in the database, and the result of several executions is about 1100-1200.
However, if the above method has over records, you will find it very slow. We will then optimize newid ()
The code is as follows: |
Copy code |
Create definer = 'root' @ '%' PROCEDURE 'Rand _ data' (IN 'tbname' VARCHAR (50), IN 'rowcnt 'INT, IN 'tbkey' VARCHAR (50 )) LANGUAGE SQL NOT DETERMINISTIC CONTAINS SQL SQL SECURITY DEFINER COMMENT 'Random Acquisition of several records, only applicable to a single primary key table' BEGIN # Obtain the primary key name IF tbKey IS NOT NULL THEN SET @ tbKey = tbKey; # The parameter already exists. This situation is faster. ELSE # No in the parameter. It is time consuming to query the primary key from the system table. SELECT @ tbKey: = c. COLUMN_NAME FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS t, INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS c WHERE T. TABLE_NAME = c. TABLE_NAME AND t. TABLE_SCHEMA = c. TABLE_SCHEMA AND t. TABLE_SCHEMA = database () AND t. TABLE_NAME = tbName AND t. CONSTRAINT_TYPE = 'primary key '; End if; # Obtain the maximum id, minimum id, and number of records SET @ getMaxIdSql = CONCAT ('select @ maxId: = MAX (', @ tbKey ,'),', '@ MinId: = MIN (', @ tbKey ,'),', '@ TotalCnt: = COUNT (', @ tbKey ,')', 'FROM '', tbName ,'';'); PREPARE getMaxId FROM @ getMaxIdSql; EXECUTE getMaxId; Deallocate prepare getMaxId; # Creating a temporary table Drop table if exists rand_tt; SET @ temTbSql = CONCAT ('create temporary table rand_tt SELECT 0 aid, tb. * FROM '', tbName,'' tb LIMIT 0 ;'); PREPARE temTb FROM @ temTbSql; EXECUTE temTb; Deallocate prepare temTb; # Construct an SQL statement to obtain a record SET @ randRowSql = CONCAT ('Insert INTO rand_tt SELECT @ cnt: = @ cnt + 1 aid, tb. * from ', TbName, 'TB WHERE tb. ', @ tbKey,' = ?; '); PREPARE addRow FROM @ randRowSql; # Generate random records SET @ cnt = 0; InsertLoop: LOOP SET @ id = FLOOR (RAND () * (@ maxId-@ minId) + @ minId ); If not exists (SELECT id FROM rand_tt WHERE id = @ id) THEN EXECUTE addRow USING @ id; IF @ cnt> = rowCnt OR @ cnt >=@ totalCnt THEN LEAVE insertLoop; End if; End if; End loop insertLoop; Deallocate prepare addRow; # Returned data Alter table rand_tt drop column aid; SELECT * FROM rand_tt;
|
In this case, you will find that million data can be processed in a few steps.