1. How do you read a file with an unknown number of rows and select and output one row at random?
When we read row I (I > 0), we select it with probability 1/I, replacing the previously selected row.
That is, row 1 is always selected, row 2 replaces it with probability 1/2, row 3 replaces the current selection with probability 1/3, and so on.
When the end of the file is reached, every row has had the same probability of being the final selection.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_LINE_LEN 4096

int main()
{
    srand(time(NULL));
    const char *filename = "input.txt";
    FILE *file = fopen(filename, "r");
    if (file == NULL)                       /* file missing or unreadable */
        return 1;
    char line_buffer[MAX_LINE_LEN];
    char selection[MAX_LINE_LEN] = "";      /* stays empty if the file has no rows */
    int i = 1;                              /* number of the row being read */
    while (fgets(line_buffer, MAX_LINE_LEN, file)) {
        if (rand() % i == 0)                /* true with probability 1/i */
            strcpy(selection, line_buffer);
        ++i;
    }
    puts(selection);
    fclose(file);
    return 0;
}
At row 1, there is nothing to prove: the row is selected with probability 1.
At row 2, row 2 is selected with probability 1/2, so row 1 remains selected with probability 1/2 as well.
At row 3, row 3 is selected with probability 1/3, and rows 1 and 2 each remain selected with probability (1/2) * (2/3) = 1/3.
In general, at row I, each of rows 1 ~ I is selected with probability 1/I, and this holds all the way to the last row of the file.
2. How do you read a file with an unknown number of rows and select and output K rows at random (assuming that K is smaller than the total number of rows in the file)?
Read rows 1 ~ K and store them. From then on, when reading row I (I > K), keep it with probability K/I and, if it is kept, use it to replace one of the K stored rows, chosen at random.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_LINE_LEN 4096

int main()
{
    int K = 5;
    srand(time(NULL));
    char line_buffer[MAX_LINE_LEN];
    char **selections = (char **)malloc(K * sizeof(char *));
    for (int i = 0; i
Assume that after reading I rows (I >= K), each of rows 1 ~ I is selected with probability K/I. The base case I = K holds because the first K rows are all stored, each with probability K/K = 1. When we read row I + 1, we retain it with probability K/(I + 1) and let it replace a randomly chosen stored row (each stored row is replaced with probability 1/K). Row I + 1 is therefore selected with probability K/(I + 1). Each of the other rows remains selected with probability
(K/I) * (1 - K/(I + 1)) + (K/I) * (K/(I + 1)) * (1 - 1/K),
where the first term covers the case in which row I + 1 is not retained, and the second covers the case in which row I + 1 is retained but replaces a different row. The sum simplifies to K/(I + 1), so after row I + 1 every row is still selected with the same probability. By induction, this also holds at the end of the file.