An Efficient Method for Randomly Generating 13-Digit Random Numbers with Absolutely No Duplicates

Last Update: 2018-12-08 Source: Internet

Author: User


Problem description: find an efficient method for randomly generating 13-digit random numbers with absolutely no duplicates.

Ideas:

1. Generate all non-repeating random numbers in advance, store them, and take one whenever it is needed.

2. Generate a random number and immediately compare it with all previously generated numbers. If it already exists, generate it again.

3. Find a good collision-free hash algorithm (or one with a low probability of collision).

4. Generate pseudo-random numbers with an algorithm that guarantees no similarity, or low similarity, within a certain order of magnitude.

Because the numbers must not repeat, no algorithm here can be truly random. It can only prevent frequent collisions and similarity to a certain extent, so that the results appear random to the outside world.

Methods and problems related to idea 2:

1. Use the generated number as the primary key of a database table and let the database detect duplicates. If the insert fails, the number is a duplicate and is generated again.

This method is easy to implement.

2. Keep the generated random numbers sorted, use binary search to check for a duplicate, and insert each new number at the appropriate position in the sorted sequence.

3. Build a balanced binary search tree (red-black or AVL) by hand, and check whether the number already exists in the tree on each insert.

All three methods work, but the third is the most efficient. Relying on database exceptions to detect duplicates clearly performs worst. For the second method, inserting into a linked list is much cheaper than inserting into an array, but binary search needs random access, so an array-like sequence must be used. The red-black tree of the third method avoids this trade-off and is the most efficient.
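As a rough illustration of the second method, the sketch below keeps the accepted numbers in a sorted Python list and uses the standard `bisect` module for the binary search. The function name `generate_unique_sorted` and its parameters are hypothetical and do not come from the original article.

```python
import bisect
import random


def generate_unique_sorted(count, digits=6, seed=None):
    """Draw `count` distinct numbers below 10**digits, retrying on duplicates."""
    limit = 10 ** digits
    if count > limit:
        raise ValueError("cannot draw more distinct values than the range holds")
    rng = random.Random(seed)
    seen = []    # kept sorted so that membership can be tested by binary search
    result = []  # numbers in the order in which they were accepted
    while len(result) < count:
        candidate = rng.randrange(limit)
        pos = bisect.bisect_left(seen, candidate)
        if pos < len(seen) and seen[pos] == candidate:
            continue  # duplicate found: generate again
        seen.insert(pos, candidate)  # O(n) element shift, the array's weak point
        result.append(candidate)
    return result
```

The lookup costs O(log n), but each insert shifts array elements, which is the cost a balanced tree would avoid.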

However, all of these methods share a defect. Once more than 50% of the values have been generated, lookups in the tree are still fast, but the tree (or array) has grown to nearly terabytes in size. From this perspective, implementing idea 1 would be simpler. In addition, once a large share of the possible values has been generated, a new random number is very likely to repeat one of them. Every new number therefore needs more and more retries, each of which searches the tree, and the number of retries keeps growing as the tree grows, which is unacceptable.
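The growth in retries can be estimated with a short calculation. Assuming every draw is uniform over the whole range, the number of draws needed to hit an unused value follows a geometric distribution, so its expected value is total / (total - used). The helper name `expected_draws` is hypothetical.

```python
def expected_draws(used, total):
    """Expected number of uniform draws until a not-yet-used value appears."""
    if not 0 <= used < total:
        raise ValueError("used must satisfy 0 <= used < total")
    # Each draw succeeds with probability (total - used) / total.
    return total / (total - used)


if __name__ == "__main__":
    total = 1_000_000
    for used in (100_000, 500_000, 800_000, 990_000):
        print(used, expected_draws(used, total))
```

With half of the range used, two draws are expected per new number; with 80% used, five; with 99% used, one hundred.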

Methods and problems related to idea 3:

1. Hash the numbers 1-00000000000000000 in sequence, and check the results for collisions.

2. Use MD5 or a similarly strong hash algorithm to hash numbers within a fixed range. The outputs are not expected to repeat, and the hash cannot be reversed. (The output length must be adjusted with a one-way, unique mapping.)

3. For more information about how to generate a GUID, see.

Method 1 must store the hashed results to detect collisions. As with method 3 of idea 2, the storage volume is very large. Methods 2 and 3 share the same problem. MD5 produces a 128-bit value, normally written as 16 or 32 hexadecimal characters (0-9, a-f), and a GUID is likewise 128 bits. It is difficult to find a way to map 16 or 32 such characters down to the required length, because each position must also be mapped from the larger character set to the 10 digits (0-9), and a mapping between number sets of such different sizes cannot stay free of collisions.
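The mapping problem can be made concrete with a small sketch. It reduces an MD5 digest to a fixed number of decimal digits by taking the remainder modulo 10**digits, which is only one possible mapping and an assumption of this example. The names `md5_to_digits` and `first_collision` are hypothetical.

```python
import hashlib


def md5_to_digits(value, digits=13):
    """Reduce the 128-bit MD5 digest of `value` to a number below 10**digits."""
    digest = hashlib.md5(str(value).encode("ascii")).digest()  # 16 bytes
    return int.from_bytes(digest, "big") % 10 ** digits


def first_collision(count, digits):
    """Hash 0..count-1 in sequence and return the first colliding input pair."""
    seen = {}  # reduced hash -> input that produced it (the storage cost)
    for n in range(count):
        key = md5_to_digits(n, digits)
        if key in seen:
            return seen[key], n
        seen[key] = n
    return None
```

Calling `first_collision(10_001, 4)` must return a pair, because 10,001 inputs cannot fit into 10,000 four-digit outputs. The same pigeonhole argument applies to 13 digits, and detecting the collision still requires storing every result.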

Methods and problems related to idea 4:

This idea is recommended and is easy to implement.

Segmented concatenation method:

First, we split the 13 digits into segments of 6 + 6 + 1, which reduces the amount of data to generate to the order of millions. This split was chosen for its time efficiency and to suit the concatenation method described later.

The following shows the time needed to generate six-digit random numbers:

1000000 random numbers are generated in 3 s.

A Random Number of 1000000 digits is generated in 16 s.

A Random Number of 1000000 digits is generated in 43 s.

When a random number of 1000000 bits is generated, the value is greater than 1 h.

Therefore, we chose the third option.

The generated data is as follows:

Use the same method to generate a second set of six-digit random numbers.

For concatenation, we join horizontally adjacent, unrelated rows:

(Here we do not use a Cartesian product in the database. Instead, a program reads part of the data and automatically caches the next part when the data runs low.)

With this method, we can obtain up to 0.8 million records without duplicates, with no similarity within a million records (symmetric concatenation). In addition, the overall space consumption meets the requirements.
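The 6 + 6 + 1 concatenation can be sketched as follows. This is a minimal illustration rather than the article's program: the two segment lists are produced with `random.sample`, and the final digit is simply drawn at random, which is an assumption because the article does not say how the 13th digit is chosen. The names `unique_segments` and `concatenate` are hypothetical.

```python
import random


def unique_segments(count, digits=6, rng=random):
    """Return `count` distinct zero-padded segments in random order."""
    return [f"{n:0{digits}d}" for n in rng.sample(range(10 ** digits), count)]


def concatenate(left, right, rng=random):
    """Join row i of `left` with row i of `right` and append one more digit."""
    # Every left segment is used exactly once, so the joined keys stay unique.
    return [a + b + str(rng.randrange(10)) for a, b in zip(left, right)]


if __name__ == "__main__":
    rng = random.Random(7)
    left = unique_segments(800_000, rng=rng)
    right = unique_segments(800_000, rng=rng)
    keys = concatenate(left, right, rng)
    print(len(keys), len(set(keys)))
```

Only the two six-digit lists have to be stored, which is what keeps the space consumption small compared with storing 13-digit values directly.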

The following SQL script generates the random six-digit keys in the database.

USE Job

CREATE TABLE tb2 (id char(6))

-- The unique index silently discards duplicate keys instead of raising an error.
CREATE UNIQUE INDEX IX_tb2 ON tb2 (id)
WITH IGNORE_DUP_KEY

DECLARE @dt datetime
SET @dt = GETDATE()

SET NOCOUNT ON

DECLARE @row int
SET @row = 800000

-- Repeat until the requested number of distinct rows has been inserted.
WHILE @row > 0
BEGIN
    RAISERROR('need %d rows', 10, 1, @row) WITH NOWAIT

    -- Limit this pass to the number of rows still missing.
    SET ROWCOUNT @row

    INSERT tb2 SELECT
        id = RIGHT(100000000 + CONVERT(bigint, ABS(CHECKSUM(NEWID()))), 6)
    FROM syscolumns c1, sysobjects o --, syscolumns c2

    -- @@ROWCOUNT counts only the rows that were actually inserted.
    SET @row = @row - @@ROWCOUNT
END

SELECT BeginDate = @dt, EndDate = GETDATE(), Second = DATEDIFF(Second, @dt, GETDATE())

SELECT COUNT(*) FROM tb2
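The control flow of the script can be mirrored in a few lines of Python for readers without a SQL Server instance at hand. This is a sketch under assumptions: a `set` stands in for the unique index with IGNORE_DUP_KEY, and `random.Random` replaces CHECKSUM(NEWID()). The name `fill_unique` is hypothetical.

```python
import random


def fill_unique(target, digits=6, seed=None):
    """Insert random keys in batches until `target` distinct keys exist."""
    limit = 10 ** digits
    if target > limit:
        raise ValueError("target exceeds the number of possible keys")
    rng = random.Random(seed)
    table = set()  # duplicates are dropped silently, like IGNORE_DUP_KEY
    need = target  # plays the role of @row
    passes = 0
    while need > 0:
        before = len(table)
        # Each pass tries to insert exactly as many keys as are still missing.
        table.update(f"{rng.randrange(limit):0{digits}d}" for _ in range(need))
        need -= len(table) - before  # subtract only the rows actually added
        passes += 1
    return table, passes
```

Each pass asks only for the rows that are still missing, so the batches shrink quickly, although more passes are needed as the target approaches the size of the key space.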

Data shuffling script

SELECT IDENTITY(int, 1, 1) AS rownumber, * INTO tmp_tb FROM tb ORDER BY NEWID();

SELECT IDENTITY(int, 1, 1) AS rownumber, * INTO tmp_tb2 FROM tb2 ORDER BY NEWID();

Data merging is controlled by the program:

For details about the merge method, see. The key point is to record the current position in the first table and the offset N.
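One possible reading of the position-and-offset bookkeeping is sketched below. It assumes that row i of the first table is joined with row (i + N) of the second table, wrapping around at the end, and that the pair (position, N) is saved so that generation can resume later. The generator name `staggered_pairs` is hypothetical, and the 13th digit is left out for brevity.

```python
def staggered_pairs(left, right, position=0, offset=0):
    """Yield (position, offset, key) tuples, resuming from a saved state."""
    size = len(right)
    while offset < size:
        while position < len(left):
            # Row `position` of the first table meets the row `offset` places on.
            key = left[position] + right[(position + offset) % size]
            yield position, offset, key
            position += 1
        position = 0  # first table exhausted: start again with the next offset
        offset += 1


if __name__ == "__main__":
    left = ["000001", "000002", "000003"]
    right = ["111111", "222222", "333333"]
    for state in staggered_pairs(left, right):
        print(state)
```

Because each (position, offset) pair selects a different combination of rows, no joined key repeats as long as the values within each table are distinct.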

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
