Generation of Random Arrays

Source: Internet
Author: User

While generating test data for a hashmap, a problem came up: how to generate random numbers. Truly random numbers have to come from data sampled from a running hardware device, and whether such samples can be considered random depends on whether they follow a uniform distribution. For more on this, see http://www.cnblogs.com/hehehu/archive/2005/08/23/221125.html

Here we discuss the problem of randomly permuting an array. The problem is stated as follows:

Given an array of positive integers a[n], generate a random ordering of its elements. For example, given a[n] = {1, 2, 3, 4, 5}, one possible result is 3 2 4 5 1. Every possible ordering must be generated with the same probability.

We wrap the problem in a function void sequence(int *a, int count). There are two ways to implement it:

Method 1: assign a random priority P[i] to each element A[i] of the array, then sort the elements of A by priority. This procedure appears on page 59 of the second edition of Introduction to Algorithms.

    PERMUTE-BY-SORTING(A)
    1  n ← length[A]
    2  for i ← 1 to n
    3      do P[i] = RANDOM(1, n^3)
    4  sort A, using P as sort keys
    5  return A

Note that this procedure needs RANDOM(1, n^3), that is, a random integer between 1 and n^3, so random number generation is involved. For now we set that issue aside and build the random permutation on top of an assumed random number source; the generation of the random numbers themselves is discussed separately later. Assuming that RANDOM(1, n^3) really returns a uniformly random integer between 1 and n^3, the procedure produces a uniform random permutation; for the proof, see Introduction to Algorithms. In line 3, the range 1 to n^3 is chosen to make it likely that all priorities in P are unique. Line 4 performs a sort, so the complexity of the algorithm is that of the sorting step. Comparison sorting needs Ω(n lg n) time, which becomes the bottleneck when there is a lot of data.
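Method 1 can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the name permute_by_sorting and the struct keyed are hypothetical, and the priorities are built from two rand() calls rather than a true RANDOM(1, n^3), so they are only approximately uniform and may occasionally collide.

```c
#include <assert.h>
#include <stdlib.h>

/* Pairs an element with its random priority so both move together in qsort. */
struct keyed {
    unsigned long key;  /* random priority P[i] */
    int value;          /* element A[i] */
};

/* Orders two keyed entries by ascending priority. */
static int compare_keyed(const void *lhs, const void *rhs)
{
    const struct keyed *a = lhs;
    const struct keyed *b = rhs;
    return (a->key > b->key) - (a->key < b->key);
}

/* Hypothetical helper: permutes a[0..count-1] by sorting on random keys.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int permute_by_sorting(int *a, int count)
{
    assert(a != NULL && count >= 0);
    if (count < 2)
        return 0;

    struct keyed *tmp = malloc((size_t)count * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    for (int i = 0; i < count; i++) {
        /* Assumption: two rand() calls stand in for RANDOM(1, n^3).
         * This widens the key range but is not exactly uniform. */
        tmp[i].key = ((unsigned long)rand() << 15) ^ (unsigned long)rand();
        tmp[i].value = a[i];
    }

    /* The sort dominates the running time, as noted above. */
    qsort(tmp, (size_t)count, sizeof *tmp, compare_keyed);

    for (int i = 0; i < count; i++)
        a[i] = tmp[i].value;

    free(tmp);
    return 0;
}
```

Calling permute_by_sorting(a, 5) on {1, 2, 3, 4, 5} leaves the same five values in a, in an order determined by the current rand() state.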

Method 2: a better way to generate a random permutation is to permute the given array in place. The procedure RANDOMIZE-IN-PLACE runs in O(n) time. In iteration i, the element A[i] is chosen at random from the elements A[i] through A[n]. After iteration i, A[i] is never changed again.

    RANDOMIZE-IN-PLACE(A)
    n ← length[A]
    for i ← 1 to n
        do swap A[i] ↔ A[RANDOM(i, n)]

Here too we rely on RANDOM(i, n), which returns a random integer between i and n. Introduction to Algorithms proves by induction (using a loop invariant) that the resulting permutation is uniformly random, that is, each permutation occurs with probability 1/n!.
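A minimal C sketch of Method 2, using the sequence signature from the problem statement and 0-based indices. The helper random_between is hypothetical and stands in for RANDOM(i, n); it is built on rand() with rejection sampling, because a plain rand() % range is slightly biased whenever the range does not divide RAND_MAX + 1 evenly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for RANDOM(lo, hi): returns an integer in [lo, hi].
 * Values of rand() that fall in the uneven tail are rejected and redrawn.
 * Assumes 0 <= hi - lo <= RAND_MAX. */
static int random_between(int lo, int hi)
{
    unsigned int range = (unsigned int)(hi - lo) + 1u;
    unsigned int buckets = ((unsigned int)RAND_MAX + 1u) / range;
    unsigned int limit = buckets * range;
    unsigned int r;

    do {
        r = (unsigned int)rand();
    } while (r >= limit);

    return lo + (int)(r / buckets);
}

/* Permutes a[0..count-1] in place in O(count) time. */
void sequence(int *a, int count)
{
    assert(a != NULL || count <= 0);

    /* The last position needs no swap: only one candidate remains. */
    for (int i = 0; i < count - 1; i++) {
        int j = random_between(i, count - 1);
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

The quality of the result is still bounded by rand(): the permutation is only as random as the generator behind it.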

Both methods depend on a random number generator. Let's look at the pseudo-random number generator rand() provided by the C runtime library and its seeding function srand(). The following source code is taken from VC9 as an example:

    void __cdecl srand (
            unsigned int seed
            )
    {
            _getptd()->_holdrand = (unsigned long)seed;
    }

    int __cdecl rand (
            void
            )
    {
            _ptiddata ptd = _getptd();

            return( ((ptd->_holdrand = ptd->_holdrand * 214013L
                + 2531011L) >> 16) & 0x7fff );
    }

ptd is a per-thread data structure that the runtime library uses in a multi-threaded environment; we don't need to go into its details. It is enough to know that the _holdrand value set in srand is the one used by rand, and that it changes every time a number is generated. The update is k = k * 214013L + 2531011L, with the arithmetic wrapping around at the width of unsigned long. The two constants look like primes, although 214013 is actually 17 × 12589. What matters is that each new value is computed purely from the previous one, with no outside input. Therefore, for a given seed, the generated sequence is fixed.
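To make the determinism concrete, here is a standalone sketch of the same recurrence. The names lcg_srand and lcg_rand are hypothetical, and a single global variable replaces the per-thread _holdrand, so this is not the runtime library's actual code. It assumes unsigned long is 32 bits wide in the original, which uint32_t reproduces here.

```c
#include <assert.h>
#include <stdint.h>

/* Generator state; a stand-in for the per-thread _holdrand field. */
static uint32_t lcg_state = 1u;

/* Hypothetical counterpart of srand(): stores the seed as the state. */
void lcg_srand(unsigned int seed)
{
    lcg_state = (uint32_t)seed;
}

/* Hypothetical counterpart of rand(): advances the state with
 * k = k * 214013 + 2531011 (wrapping at 32 bits) and returns
 * bits 16..30 of the new state, a value in [0, 32767]. */
int lcg_rand(void)
{
    lcg_state = lcg_state * 214013u + 2531011u;
    return (int)((lcg_state >> 16) & 0x7fffu);
}
```

With seed 1, the first two calls to lcg_rand() return 41 and 18467, and they do so on every run. Reseeding with the same value restarts exactly the same sequence.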

Setting aside how well the sequence produced by these two constants covers the range of possible values (this is also a problem, though in general its impact is small), the problem comes down to the seed value passed to srand(unsigned int seed). The seed is usually taken from something that changes constantly, such as the current time, interrupt timing, or the mouse position. That question, however, goes beyond pure computation.
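As one illustration of seeding from a changing value, the sketch below takes the seed from the current time. The helper name seed_from_clock is hypothetical. time(NULL) typically has one-second resolution, so two processes started within the same second would get the same seed; this is a convenience for test data, not a source of true randomness.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: seeds rand() from the wall clock and returns the
 * seed so that a run can be reproduced later by calling srand(seed). */
unsigned int seed_from_clock(void)
{
    unsigned int seed = (unsigned int)time(NULL);
    srand(seed);
    return seed;
}
```

Calling it once at program start is enough. Logging the returned seed makes it possible to replay the exact same test data when a failure needs to be reproduced.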

The reasoning we usually rely on to generate a random permutation, namely the methods above, is correct. Its premise, however, is not satisfied: the numbers fed into it cannot be shown to be truly random. In engineering practice, pseudo-random numbers are nevertheless good enough and are used in production.
