Suppose you have a text file containing several lines, and you need to return one line chosen at random, with every line equally likely to be selected.
There are two cases:
1. The file is too large to fit entirely in memory.
2. The input is a stream rather than a file.
A large file cannot be loaded into memory for random access, and a stream can be read only once.
Pseudocode:
i = 1
chosen_line = ""
line = first line
while line exists:
    # random() returns a uniform random number in [0, 1)
    if random() < 1/i:
        chosen_line = line
    i += 1
    line = line.next
return chosen_line
A similar problem is random selection: given integers m and n with m < n, output a sorted list of m random integers in the range 0 to n-1, with no duplicates.
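// Bigrand() is assumed to return a large non-negative random integer.
void Genknuth(int m, int n)
{
    for (int i = 0; i < n; ++i)
    {
        // Select i with probability m / (n - i); m counts the integers still needed.
        if (Bigrand() % (n - i) < m)
        {
            cout << i << endl;
            --m;
        }
    }
}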
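At each step, i is selected with probability equal to the number of integers still needed divided by the number of candidates still remaining. Working out the probability that each i is output gives m/n, so every integer is equally likely to be selected.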
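For experimenting with this selection method, here is a self-contained sketch that returns the chosen integers in a vector instead of printing them. The name sample_sorted is hypothetical, and std::mt19937 with std::uniform_int_distribution is swapped in for the random number helper, which is an assumption rather than part of the original code.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Returns m distinct integers from [0, n-1] in increasing order.
// The name and signature are illustrative assumptions, not from the original.
std::vector<int> sample_sorted(int m, int n, std::mt19937& rng) {
    assert(0 <= m && m <= n);
    std::vector<int> out;
    out.reserve(m);
    for (int i = 0; i < n && m > 0; ++i) {
        // Uniform integer in [0, n-i-1]; it is below m with probability m/(n-i).
        std::uniform_int_distribution<int> dist(0, n - i - 1);
        if (dist(rng) < m) {
            out.push_back(i);
            --m;  // one fewer integer still needed
        }
    }
    return out;
}
```

When the number of remaining candidates n-i equals the number still needed, the test always succeeds, so exactly m integers are returned. Because i only increases, the output is already sorted and contains no duplicates.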