Repeat
Recently in the forum to see someone asked, how to quickly generate 1 million of the 8-bit repeat number, for this problem, there are several points to note:
1. How to generate 8-bit random numbers, the more random the number of generated, the likelihood of repetition is less
2. Control not repeat
3. Consider performance
In response to this question, I have written the following examples to address, hoping to provide guidance to those who have this type of demand
Example of generating 1 million 8-bit non-duplicate data
Use tempdb
Go
--Create a test table
CREATE TABLE TB (ID char (8))
--Create a unique index to automatically filter duplicate values
CREATE UNIQUE INDEX IX_TB on TB (ID)
With Ignore_dup_key
Go
--Test the processing time of data inserts, record the point at which the processing begins
DECLARE @dt datetime
SET @dt = GETDATE ()
--Inserting random data
SET NOCOUNT on
DECLARE @row int
Set @row = 1000000--Set total number of records
While @row >0
BEGIN
--Displays a message indicating that you need to insert multiple rows of data
RAISERROR (' Need%d rows ', 1, @row) with nowait
--inserting random bit-coded data
SET RowCount @row
INSERT TB SELECT
id = right (100000000 + CONVERT (bigint, ABS (CHECKSUM (NEWID ())), 8)
From syscolumns C1, syscolumns C2
SET @row = @row-@ @ROWCOUNT
End
--Displays the time used to insert data
SELECT begindate = @dt, EndDate = GETDATE (),
Second = DATEDIFF (Second, @dt, GETDATE ()),
Go
--Shows whether the final result record is correct
SELECT COUNT (*) from TB
Go
--Delete test
DROP TABLE TB
Solutions to the skills used:
1. Control the resulting data is not duplicated, directly using the IGNORE_DUP_KEY option in the unique index to automatically filter the duplicate values in the inserted data to avoid manual processing of duplicate
2. Using CHECKSUM with the NEWID () function, make the generated data as random as possible, the general generation of random numbers will consider using the RAND () function, but this function is to produce pseudo random value, with the following statement test, you will find that the resulting data is the same, this does not apply to the batch generation of multiple random numbers, The NEWID () function generates a GUID that is basically not duplicated, and then translates it into numbers by checksum, which is less likely to produce duplicates.
SELECT Top 10
RAND ()
From sysobjects
3. In terms of efficiency control, use loops + batch generation instead of the traditional one-by-one generation. In SQL Server, each INSERT statement has an internal transaction, and if inserted, the transaction is too expensive to be efficient; Regardless of the one-time generation of 1 million data, one because the generated data may be duplicated, remove the duplication is not 1 million, Two is a one-time generation of 1 million of data, consumption of memory and CPU resources are very high, the general computer may not withstand