Generation of pseudo-random numbers

Source: Internet
Author: User
Tags: random, seed

In the true sense, random numbers (or random events) are generated according to the probability distribution exhibited by an experimental process, and the outcome is unpredictable and cannot be known in advance. The random functions in a computer, by contrast, are simulated by an algorithm, so their results are deterministic and can be known in advance; we can regard the probability of the predicted result as 100%. Therefore, the "random numbers" produced by a computer's random function are not truly random. They are pseudo-random numbers.

Generation Method

Generally, there are three main methods to generate pseudo-random numbers:

(1) Direct method: the numbers are generated from the physical meaning of the distribution function. Its disadvantage is that it applies only to random numbers with certain special distributions, such as the binomial distribution and the Poisson distribution.

(2) Inversion method: assume that U follows the uniform distribution on the interval [0, 1] and let X = F^(-1)(U); then the cumulative distribution function (CDF) of X is F. This method is simple in principle, easy to program, and widely applicable.

(3) Acceptance-rejection method: assume that the probability density function (PDF) of the random numbers to be generated is f. First find a random number generator whose PDF is g and a constant c such that f(x) <= c g(x) for all x, and then generate the numbers with the acceptance-rejection algorithm. Since the algorithm needs c iterations on average to obtain one random number, c should be as small as possible. The obvious disadvantage of this algorithm is that a suitable g and c can be difficult to find.
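As an illustration of method (3), here is a minimal C++ sketch. The target density f(x) = 2x on [0, 1], the uniform proposal g(x) = 1, the constant c = 2, and the function name sample_triangular are all assumptions chosen for this example; std::mt19937 merely stands in for whatever uniform generator is available.

```cpp
#include <random>

// Acceptance-rejection sketch (illustrative assumptions, not from the article):
// target density f(x) = 2x on [0, 1], proposal g(x) = 1 (uniform), c = 2,
// so f(x) <= c * g(x) holds everywhere on [0, 1].
// If tries is non-null, it is incremented once per candidate drawn.
double sample_triangular(std::mt19937& gen, int* tries = nullptr)
{
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    for (;;) {
        double x = unif(gen);   // candidate drawn from g
        double u = unif(gen);   // uniform number for the accept test
        if (tries) ++*tries;
        if (u <= x) return x;   // accept with probability f(x) / (c * g(x)) = x
    }
}
```

With c = 2, about two candidates are drawn per accepted number on average, which is why a small c matters.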

Therefore, pseudo-random number generators (PRNGs) generally adopt the inversion method, which builds on the uniform distribution. The quality of the uniform PRNG determines the quality of the entire random number system [7]. The following describes the uniform PRNG.
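The inversion method can be sketched in C++ as follows. The exponential distribution, with CDF F(x) = 1 - exp(-lambda x) and inverse F^(-1)(u) = -ln(1 - u) / lambda, is chosen here only as an example; the function names are hypothetical, and std::mt19937 stands in for the uniform PRNG.

```cpp
#include <cmath>
#include <random>

// Inverse CDF of the exponential distribution with rate lambda:
// F(x) = 1 - exp(-lambda * x)  =>  F^(-1)(u) = -ln(1 - u) / lambda.
double exponential_from_uniform(double u, double lambda)
{
    return -std::log(1.0 - u) / lambda;
}

// Draw u uniformly from [0, 1) and push it through the inverse CDF.
double sample_exponential(std::mt19937& gen, double lambda)
{
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    return exponential_from_uniform(unif(gen), lambda);
}
```

Any defect in the uniform numbers passes straight through F^(-1), which is why the quality of the uniform generator matters so much.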

Program instance

C language program example

The following shows a C program:

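// rand01.c (16-bit DOS, Turbo C style)
// The header names were lost in the source; the ones below are the usual
// Turbo C headers for printf, FP_SEG/FP_OFF and movedata.

#include <stdio.h>
#include <dos.h>
#include <mem.h>

static unsigned int rand_seed;

/* Advance the generator and return the new value. */
unsigned int random(void)
{
    rand_seed = (rand_seed * 123 + 59) % 65536;
    return rand_seed;
}

/* Seed the generator from the BIOS timer at 0040:006Ch. */
void random_start(void)
{
    int temp[2];
    movedata(0x0040, 0x006c, FP_SEG(temp), FP_OFF(temp), 4);
    rand_seed = temp[0];
}

int main(void)
{
    unsigned int i;
    random_start();
    for (i = 0; i < 10; i++)
        printf("%u\t", random());
    printf("\n");
    return 0;
}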

This program (rand01.c) shows the complete process of generating random numbers.

First, the main program calls random_start(). The interesting statement in random_start() is:

movedata(0x0040, 0x006c, FP_SEG(temp), FP_OFF(temp), 4);

This function copies memory. FP_SEG (far pointer to segment) yields the segment address of the temp array, and FP_OFF (far pointer to offset) yields its offset address. movedata copies the double word (4 bytes) stored at 0040:006Ch into the two elements of the temp array. In this way, the 16-bit number at 0040:006Ch reaches rand_seed through temp[0].

random() calculates the next random number from the current value of the seed rand_seed. The statement is:

rand_seed = (rand_seed * 123 + 59) % 65536;

This is the formula that computes the random numbers. The formula differs between computers, and even between operating systems installed on the same computer. I tried it on Linux and on Windows: the random numbers generated from the same seed on the two operating systems are different, which shows that they use different formulas.
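To make the formula concrete, here is a small portable C++ sketch of the same recurrence. The names Lcg16 and first_values are hypothetical; the constants 123, 59 and 65536 are those of rand01.c. A 32-bit state is used so that the arithmetic matches the 16-bit original on modern compilers.

```cpp
#include <cstdint>
#include <vector>

// Portable sketch of the recurrence used in rand01.c:
// state = (state * 123 + 59) mod 65536.
struct Lcg16 {
    std::uint32_t state;
    explicit Lcg16(std::uint32_t seed) : state(seed % 65536u) {}
    std::uint32_t next()
    {
        state = (state * 123u + 59u) % 65536u;
        return state;
    }
};

// Return the first `count` values produced from `seed`.
std::vector<std::uint32_t> first_values(std::uint32_t seed, int count)
{
    Lcg16 gen(seed);
    std::vector<std::uint32_t> out;
    for (int i = 0; i < count; ++i) out.push_back(gen.next());
    return out;
}
```

For example, first_values(1, 3) yields 182, 22445, 8282, and it yields exactly the same values every time it is called with seed 1.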

Now we know where the random seed is obtained and how the random number is calculated from it. But why is the seed read from 0040:006Ch in memory? What is stored at 0040:006Ch?

Those who have studied computer organization and interfacing may remember that the ROM BIOS clock interrupt service routine relies on the Intel 8253 timer/counter, which signals the Intel 8259 interrupt controller so that the interrupt service routine runs. Driven by the timer/counter, the interrupt controller on the motherboard interrupts the processor 18.2 times per second. This timer/counter is what the computer uses to keep track of the current system time. Each time a clock interrupt occurs, a counter is incremented by one. Where is the value of this counter stored? That's right: at 0040:006Ch in memory. In fact, this memory area is defined as follows:

TIMER_LOW   DW ?   ; address 0040:006Ch
TIMER_HIGH  DW ?   ; address 0040:006Eh
TIMER_OFL   DB ?   ; address 0040:0070h

In the clock interrupt service routine, whenever TIMER_LOW overflows, that is, whenever the 16-bit value at TIMER_LOW wraps around to zero, TIMER_HIGH is incremented by one. In rand01.c, the statement

movedata(0x0040, 0x006c, FP_SEG(temp), FP_OFF(temp), 4);

copies the two 16-bit values TIMER_LOW and TIMER_HIGH into the temp array, and temp[0], which holds TIMER_LOW, is then assigned to rand_seed. This is how the "random seed" is obtained.
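The carry from the low word to the high word can be modelled in a few lines of C++. This is only a simplified sketch: the names BiosTimer and seed_from_timer are hypothetical, and the overflow byte and the midnight reset of the real BIOS counter are left out.

```cpp
#include <cstdint>

// Simplified model of the two 16-bit timer words described in the text.
struct BiosTimer {
    std::uint16_t timer_low;
    std::uint16_t timer_high;

    // One clock interrupt: increment the low word and carry into the high word.
    void tick()
    {
        ++timer_low;                        // wraps from 0xFFFF to 0
        if (timer_low == 0) ++timer_high;
    }
};

// rand01.c takes only the low word (temp[0]) as its seed.
std::uint16_t seed_from_timer(const BiosTimer& t)
{
    return t.timer_low;
}
```

Starting from timer_low = 0xFFFF and timer_high = 1, a single tick leaves timer_low = 0 and timer_high = 2, and the seed read at that moment is 0.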

Now we can be sure that the random seed comes from the system clock; specifically, from the timer/counter value that the motherboard keeps in memory. Let us summarize the analysis so far and discuss how these conclusions apply in programs:

1. A random number is calculated from the random seed by a fixed formula. Therefore, as long as the formula and the seed are fixed, the random numbers generated do not change.

C ++ example

See the following C ++ program:

// rand02.cpp

#include <iostream>
#include <cstdlib>
using namespace std;

int main()
{
    unsigned int seed = 5;       // fixed seed
    srand(seed);
    unsigned int r = rand();
    cout << "r = " << r << endl;
    return 0;
}

On the same platform, once the program has been compiled into an EXE, it displays the same random number every time it runs. This is because, on the same platform, the random number is calculated from the seed by the same formula, and the seed is the same on every run.

2. If the program does not fix the seed itself, the seed is usually taken from the system clock (that is, the timer/counter value), for example with srand((unsigned)time(NULL)).
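The same point can be checked inside a single program. In this sketch the helper name draw_with_seed is hypothetical; it only relies on the standard guarantee that srand with the same seed restarts rand() on the same sequence.

```cpp
#include <cstdlib>
#include <vector>

// Seed the C library generator and return the next `count` values.
std::vector<int> draw_with_seed(unsigned int seed, int count)
{
    std::srand(seed);
    std::vector<int> values;
    for (int i = 0; i < count; ++i) values.push_back(std::rand());
    return values;
}
```

Calling draw_with_seed(5, 4) twice returns two identical vectors on a given platform; the actual numbers depend on the C library.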

C ++ Example 2

See the following C ++ program:

// rand03.cpp

#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;

int main()
{
    srand((unsigned)time(NULL));   // seed from the system clock
    unsigned int r = rand();
    cout << "r = " << r << endl;
    return 0;
}

Here the program does not choose a fixed seed; the value of the system clock is used as the seed instead. Therefore, on the same platform, the compiled EXE displays a different pseudo-random number each time it runs, provided the runs start at different clock values.

3. Suggestion: if you want to generate a random number sequence in a program, set the random seed exactly once, before the first random number is generated.

Generate a random string

See the following C++ program, which generates a random string. (I could not compile the original program, so I changed it and added some header files.)

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <stdlib.h>
#include <time.h>
#include <cmath>
using namespace std;

inline void keep_window_open() { char ch; cin >> ch; }

int main()
{
    int rnum = 0;
    int m = 20;
    char *ch = new char[m];

    // Seed once, before the loop.
    srand((unsigned)time(NULL));

    for (int i = 0; i < m; ++i) {
        // Map rand() to 1..26. Dividing by RAND_MAX + 1.0 keeps the ratio
        // strictly below 1, so every value has a matching case.
        rnum = 1 + (int)((rand() / (RAND_MAX + 1.0)) * 26);
        switch (rnum) {
        case 1:  ch[i] = 'a'; break;
        case 2:  ch[i] = 'b'; break;
        case 3:  ch[i] = 'c'; break;
        case 4:  ch[i] = 'd'; break;
        case 5:  ch[i] = 'e'; break;
        case 6:  ch[i] = 'f'; break;
        case 7:  ch[i] = 'g'; break;
        case 8:  ch[i] = 'h'; break;
        case 9:  ch[i] = 'i'; break;
        case 10: ch[i] = 'j'; break;
        case 11: ch[i] = 'k'; break;
        case 12: ch[i] = 'l'; break;
        case 13: ch[i] = 'm'; break;
        case 14: ch[i] = 'n'; break;
        case 15: ch[i] = 'o'; break;
        case 16: ch[i] = 'p'; break;
        case 17: ch[i] = 'q'; break;
        case 18: ch[i] = 'r'; break;
        case 19: ch[i] = 's'; break;
        case 20: ch[i] = 't'; break;
        case 21: ch[i] = 'u'; break;
        case 22: ch[i] = 'v'; break;
        case 23: ch[i] = 'w'; break;
        case 24: ch[i] = 'x'; break;
        case 25: ch[i] = 'y'; break;
        case 26: ch[i] = 'z'; break;
        default: ch[i] = '?'; cout << "No!\n"; break;
        }
        cout << ch[i] << '\n';
    }

    delete[] ch;
    return 0;
}

If srand((unsigned)time(NULL)); is placed inside the for loop, the output shows that every character of the random string is the same, that is, the generated character sequence is not random: the whole loop finishes within a single clock value, so the generator is reseeded with the same value before every call to rand(). The srand call must therefore be moved out of the for loop and placed before the for statement, as in the listing above. The character sequence is then random and differs from run to run (it could in principle repeat, but the probability is very small).

If you change srand((unsigned)time(NULL)); to srand(2);, the characters generated within a single run are still random, but every run generates the same string. The same happens if the srand statement is removed from the program altogether.
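The effect of the position of srand can be reproduced deterministically. In the sketch below, the function names are hypothetical and the parameter seed_value stands in for time(NULL), which returns the same value for every call made within one second.

```cpp
#include <cstdlib>
#include <string>

// Reseeding inside the loop: the sequence restarts on every iteration,
// so rand() returns the same first value each time.
std::string letters_reseeding_each_time(unsigned int seed_value, int length)
{
    std::string s;
    for (int i = 0; i < length; ++i) {
        std::srand(seed_value);
        s += static_cast<char>('a' + std::rand() % 26);
    }
    return s;
}

// Seeding once before the loop: rand() walks through its sequence.
std::string letters_seeding_once(unsigned int seed_value, int length)
{
    std::srand(seed_value);
    std::string s;
    for (int i = 0; i < length; ++i)
        s += static_cast<char>('a' + std::rand() % 26);
    return s;
}
```

letters_reseeding_each_time returns one letter repeated length times, while letters_seeding_once starts with that same letter and then continues through the generator's sequence.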

You may also run into the following situation. When programming with a timer control, you find that a group of random numbers generated at equal time intervals looks regular, whereas random numbers generated in response to user command events look random. Why? The analysis above gives the answer quickly. A timer control measures its interval precisely with the computer's clock counter. Since the intervals are equal, the differences between successive counter values are equal. The clock values, and hence the random seeds, therefore change linearly, and the random numbers generated from them are regular too.

Random numbers triggered by user key events are indeed more random, because the events come from a person pressing keys, and a person cannot keep strictly equal intervals between key presses; even with great care the timing cannot be perfectly exact. As soon as the intervals differ even slightly, the differences between successive counter values are no longer equal, the seeds lose their linear pattern, and the generated random numbers become more irregular, that is, more random. This reminds me of the lottery programs used at galas: if the lucky audience member is selected by someone pressing a button, the principle of randomness is well served and the result is fairer.
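The regularity is easy to see with the recurrence from rand01.c, used here only as an assumed stand-in for the library generator; the function names are hypothetical. If the seeds are s, s + d, s + 2d, ..., the first number produced from each seed moves by the constant step 123 * d (mod 65536).

```cpp
#include <cstdint>

// First number produced by the rand01.c recurrence from a given seed.
std::uint32_t first_output(std::uint32_t seed)
{
    return ((seed % 65536u) * 123u + 59u) % 65536u;
}

// Difference (mod 65536) between the first outputs of two seeds.
std::uint32_t step_between(std::uint32_t seed_a, std::uint32_t seed_b)
{
    return (first_output(seed_b) + 65536u - first_output(seed_a)) % 65536u;
}
```

For seeds 1000, 1018 and 1036 (equal spacing of 18 ticks), step_between gives 2214 both times, so the "random" numbers form an arithmetic progression.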

Summary

1. A computer's pseudo-random numbers are calculated from the random seed by a fixed formula. Therefore, as long as the formula and the seed are fixed, the random numbers generated are fixed.

2. If the program does not fix the seed itself, the seed is usually taken from the system clock.

Disadvantages of pseudo-random number generator

Repeat N = 10000 tests, each time generating a sample of size S = 20 or S = 100 from the uniform PRNG, and use the Kolmogorov-Smirnov hypothesis test to determine whether each sample follows the uniform distribution. The hypotheses are:

① The null hypothesis states that the sample is uniformly distributed; ② the alternative hypothesis states that the sample is not uniformly distributed.

The P value (in [0, 1]) is used as the measure. The closer the P value is to 0, the stronger the grounds for rejecting the null hypothesis, that is, for concluding that the sample does not follow the uniform distribution. The closer the P value is to 1, the weaker the grounds for rejecting the null hypothesis, that is, the more consistent the sample is with the uniform distribution.

As Figures 1 and 2 show, the smaller the P value, the less uniform the sample. In practice we would like the P value to be as large as possible. However, statistical theory shows that the P values must themselves be uniformly distributed, regardless of N and S. This means that, owing to randomness, the sample obtained in any particular draw is always uncertain and may even be far from uniform. In addition, different sample sizes lead to different test standards: intuitively, a uniform sample with S = 100 looks more even than one with S = 20. Therefore the deviation of a PRNG's output from the uniform distribution is particularly serious in small samples.
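For readers who want to try such a test, the following C++ sketch computes the Kolmogorov-Smirnov statistic D of a sample against the uniform distribution on [0, 1], together with an approximate P value. It is not the code behind the experiment above: the function names are hypothetical, and the P value uses the asymptotic Kolmogorov series with lambda = sqrt(n) * D, which is only a rough approximation for small samples such as S = 20.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// KS statistic D: largest gap between the empirical CDF of the sample
// and the uniform CDF F(x) = x on [0, 1].
double ks_statistic_uniform(std::vector<double> sample)
{
    std::sort(sample.begin(), sample.end());
    const double n = static_cast<double>(sample.size());
    double d = 0.0;
    for (std::size_t i = 0; i < sample.size(); ++i) {
        double above = (i + 1) / n - sample[i];  // empirical CDF just after x_i
        double below = sample[i] - i / n;        // empirical CDF just before x_i
        d = std::max(d, std::max(above, below));
    }
    return d;
}

// Asymptotic P value: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lambda^2).
double ks_p_value_asymptotic(double d, std::size_t n)
{
    const double lambda = std::sqrt(static_cast<double>(n)) * d;
    if (lambda < 0.2) return 1.0;  // the series converges slowly here; P is within 1e-11 of 1
    double sum = 0.0;
    for (int k = 1; k <= 100; ++k) {
        double sign = (k % 2 == 1) ? 1.0 : -1.0;
        sum += sign * std::exp(-2.0 * k * k * lambda * lambda);
    }
    return std::min(1.0, std::max(0.0, 2.0 * sum));
}
```

A sample spread evenly over [0, 1], such as 0.1, 0.3, 0.5, 0.7, 0.9, gives D = 0.1 and a P value close to 1; a sample concentrated at one point gives a large D and a small P value.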

 
