- Recently in the forum to see someone asked, how to quickly generate 1 million non-repeating 8-digit number, for this issue, there are a few things to note:
- 1. How to generate a 8-bit random number, the more random the generated number, the more likely the repetition is, of course, the smaller
- 2. Control does not repeat
- 3. Consider performance
- In response to this problem, I have written the following examples to solve, hoping to provide guidance to those who have such needs
- */
- Use tempdb
- GO
- --Create a test table
- CREATE TABLE TB (ID char (8))
- --Create a unique index for automatic filtering of duplicate values
- CREATE UNIQUE INDEX ix_tb on TB (ID)
- With Ignore_dup_key
- GO
- --Test the processing time of data insertion, and record the point in time at which processing begins
- DECLARE @dt datetime
- SET @dt = GETDATE ()
- --Inserting random data
- SET NOCOUNT on
- DECLARE @row int
- Set @row = 1000000 --Sets the total number of records
- While @row >0
- BEGIN
- -Displays a message indicating that multiple rows of data are also required
- RAISERROR (' need%d rows ', ten, 1, @row) with NOWAIT
- --Inserting random bits encoded data
- SET ROWCOUNT @row
- INSERT TB SELECT id = right (100000000 + CONVERT (bigint, ABS (CHECKSUM (NEWID ())), 8)
- From syscolumns c1,syscolumns C2
- SET @row = @row-@ @ROWCOUNT
- END
- --Displays the time that the data was inserted for use
- SELECT begindate = @dt, EndDate = GETDATE (), Seconds = DATEDIFF (Second, @dt, GETDATE ())
- GO
- --Shows whether the final result record is correct
- SELECT COUNT (*) from TB
- SET ROWCOUNT 0
- SET NOCOUNT OFF
- GO
- --Delete test
- DROP TABLE TB
- /*
- Techniques used to solve the problem:
- 1. Control the resulting data is not duplicated, directly using the IGNORE_DUP_KEY option in a unique index, so that duplicate values in the inserted data are automatically filtered to avoid manual processing of duplicates
- 2. Use CHECKSUM with the NEWID () function to make the generated data as random as possible, and generally generate random numbers to consider using the RAND () function.
- But this function is to produce pseudo-random values, test with the following statement, you will find that the resulting data are all the same, this does not apply to bulk generation of multiple random numbers,
- The NEWID () function produces a GUID that is basically not duplicated, and then translates it into a number by checksum, so that the likelihood of duplication is smaller
- SELECT TOP ten RAND () from sysobjects
- 3. In efficiency control, use the Loop + batch generation method instead of the traditional one-by-one generation. In SQL Server, each INSERT statement will have an internal transaction,
- If you insert, the cost of the transaction is too large, the efficiency will be very low, do not consider generating 1 million data at a time, one because the generated data may be duplicated,
- Remove the repetition there is no 1 million, two one-time generation of 1 million data, consumption of memory and CPU resources is also very high, the general computer may not withstand
- */
How SQL Server quickly generates 1 million random 8-bit numbers that are not duplicated