Shuffling algorithm and reservoir sampling

Source: Internet
Author: User

I came across this today and am recording it here.

Shuffling Algorithm

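Recursive method: first shuffle cards 1 ~ n-1, then pick a random number k (1 <= k <= n) and swap card n with card k. Note that k must be allowed to equal n (card n stays where it is); if k is limited to 1 ~ n-1, some orderings, such as the original one, can never be produced. The code is very simple:

```java
import java.util.concurrent.ThreadLocalRandom;

class RecursiveShuffle {
    // Shuffle the first n cards in place; array indices are 0-based (0..n-1).
    static int[] shuffle(int[] cards, int n) {
        if (n <= 1) {
            return cards;
        }
        shuffle(cards, n - 1);
        // k is uniform over 0..n-1, so the last card may also stay in place
        int k = ThreadLocalRandom.current().nextInt(n);
        int temp = cards[k];
        cards[k] = cards[n - 1];
        cards[n - 1] = temp;
        return cards;
    }
}
```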

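It can also be converted to a non-recursive form:

```java
import java.util.concurrent.ThreadLocalRandom;

class IterativeShuffle {
    // Shuffle in place: swap cards[i] with a uniformly chosen cards[k], 0 <= k <= i.
    static void shuffle(int[] cards) {
        for (int i = 0; i < cards.length; i++) {
            int k = ThreadLocalRandom.current().nextInt(i + 1); // uniform over 0..i inclusive
            int temp = cards[k];
            cards[k] = cards[i];
            cards[i] = temp;
        }
    }
}
```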

The following is reposted from http://hankjin.blog.163.com/blog/static/3373193720109141128016/:

Method 1:
1. Generate a random number x in 1 ~ N. Card x is the first card.
2. Generate a random number y in 1 ~ (N-1). If y < x, card y is the second card. Otherwise, card y + 1 is the second card.
3. For the i-th card (i = 3, 4, 5... 54), generate a random number z in 1 ~ (N-i+1) and take the z-th card that has not been drawn yet.
The complexity of this algorithm is O(n^2), because each random number has to be compared against the cards already drawn, about n/2 of them on average.
In terms of real playing cards, this algorithm is equivalent to repeatedly drawing one card at random from the deck and putting it on a new pile until the deck is empty. The new pile is the shuffled order.
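
The drawing procedure of Method 1 can be sketched roughly as follows. This is only an illustration under some assumptions: the class name `DrawShuffle`, the use of an `ArrayList` for the undrawn cards, and 0-based indices are choices made here, not part of the original post. Removing from the middle of an `ArrayList` shifts the later elements, which is where the O(n^2) cost shows up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class DrawShuffle {
    // Method 1 sketch: repeatedly draw a random card from the remaining deck
    // and place it on a new pile. The input array is left unchanged.
    static int[] shuffle(int[] cards, Random rng) {
        List<Integer> remaining = new ArrayList<>();
        for (int card : cards) {
            remaining.add(card);
        }
        int[] pile = new int[cards.length];
        for (int i = 0; i < pile.length; i++) {
            // z is uniform over the cards that have not been drawn yet
            int z = rng.nextInt(remaining.size());
            // remove(int index) returns the removed card and shifts the rest: O(n) per draw
            pile[i] = remaining.remove(z);
        }
        return pile;
    }
}
```

A call such as `DrawShuffle.shuffle(new int[]{1, 2, 3, 4}, new Random())` returns a new array holding the same cards in the drawn order.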

Method 2:
1. Generate a random number x in 1 ~ N, and swap card x with card 1.
2. Generate a random number y in 1 ~ N, and swap card y with card 2.
3. For the i-th card (i = 3, 4, 5... 54), generate a random number z in 1 ~ N, and swap card z with card i.
The complexity of this algorithm is O(n).

Method 3:
Because there are N! possible shuffling results in total, we can generate a random number x uniformly between 1 and N!, and then generate the corresponding permutation from x. That permutation is the shuffling result.
For more information, see http://baike.baidu.com/view/437641.htm. Is the complexity of this method O(lg n)?
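
One possible way to turn a number into a permutation, as Method 3 requires, is the factorial number system. The sketch below is an assumption about how that step could be done, not something taken from the original post. The name `nthPermutation` is hypothetical, x is taken 0-based (0 <= x < n!) instead of 1-based, and `long` arithmetic limits it to n <= 20, so a 54-card deck would need `BigInteger` instead.

```java
import java.util.ArrayList;
import java.util.List;

final class PermutationByIndex {
    // Decode x (0 <= x < n!) into a permutation of 0..n-1.
    // Each x maps to a different permutation, so a uniform x gives a uniform shuffle.
    static int[] nthPermutation(int n, long x) {
        if (n < 0 || n > 20) {
            throw new IllegalArgumentException("n must be in 0..20 to fit n! in a long");
        }
        long[] factorial = new long[n + 1];
        factorial[0] = 1;
        for (int i = 1; i <= n; i++) {
            factorial[i] = factorial[i - 1] * i;
        }
        if (x < 0 || x >= factorial[n]) {
            throw new IllegalArgumentException("x must be in 0..n!-1");
        }
        List<Integer> remaining = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            remaining.add(i);
        }
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            long block = factorial[n - 1 - i]; // orderings of the cards after position i
            int index = (int) (x / block);     // which remaining card goes to position i
            x %= block;
            result[i] = remaining.remove(index);
        }
        return result;
    }
}
```

With this decoding, the picking loop still removes from a list, so the sketch itself runs in O(n^2) time; it is meant to show the mapping from x to a permutation rather than to settle the complexity question above.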

About Method 2:
Method 2 seems good at first glance, and many articles on the Internet use this algorithm, but it is actually wrong. Method 2 has n^n equally likely sequences of random choices, while n cards have only n! possible orderings. Because n^n % n! != 0 (for n > 2), the n^n sequences cannot be divided evenly among the n! orderings, so the orderings do not all have the same probability.
So how can we fix this problem? As in Method 1, the i-th step should generate a random number in i ~ n instead of 1 ~ n. The number of equally likely sequences is then n!, exactly one per ordering, so every ordering has the same probability.

That is to say, the problem is in step 3 of Method 2, where card z is swapped with card i: because z ranges over all of 1 ~ N, including positions before i that have already been handled, the number of possible sequences becomes N^N, so the result is incorrect. The improved algorithm is similar to the one I wrote earlier.
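
The counting argument can be checked by brute force for a small deck. The following sketch tries every possible sequence of random choices exactly once and counts how many sequences lead to each ordering. The class and method names are hypothetical, positions are 0-based, and the `fixed` flag switches between Method 2 as described (swap with any position) and the corrected version (swap with a position from i onward).

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

final class ShuffleBias {
    // Returns ordering -> number of choice sequences that produce it.
    static Map<String, Integer> countOutcomes(int n, boolean fixed) {
        Map<String, Integer> counts = new TreeMap<>();
        int[] cards = new int[n];
        for (int i = 0; i < n; i++) {
            cards[i] = i + 1;
        }
        enumerate(cards, 0, fixed, counts);
        return counts;
    }

    private static void enumerate(int[] cards, int i, boolean fixed, Map<String, Integer> counts) {
        if (i == cards.length) {
            counts.merge(Arrays.toString(cards), 1, Integer::sum);
            return;
        }
        int lowest = fixed ? i : 0; // corrected version never reaches back before i
        for (int z = lowest; z < cards.length; z++) {
            swap(cards, i, z);
            enumerate(cards, i + 1, fixed, counts);
            swap(cards, i, z); // undo the swap before trying the next choice
        }
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

For n = 3, `countOutcomes(3, false)` distributes 27 sequences over 6 orderings, which cannot come out even, while `countOutcomes(3, true)` has 6 sequences and assigns exactly one to each ordering.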

Reservoir sampling

Randomly select k entries from n elements. The procedure is as follows: first put the first k entries into the result set; then each later entry i is selected with probability k/i and, if selected, replaces a randomly chosen entry of the result set. With S as the input and R as the result set (1-based indices):

Case 1: When 0 < i <= k, R[i] = S[i];

Case 2: When k < i <= n, j = random(1, i); if (j <= k) { R[j] = S[i]; }
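
The two cases translate fairly directly into code. Below is a minimal sketch, assuming the input is an `int[]` that fits in memory and using 0-based indices, so the 1-based `random(1, i)` becomes `rng.nextInt(i + 1)`. The class name `ReservoirSampler` and the explicit `Random` parameter are choices made for this illustration.

```java
import java.util.Random;

final class ReservoirSampler {
    // Pick k entries uniformly at random from s in a single pass.
    static int[] sample(int[] s, int k, Random rng) {
        if (k < 0 || k > s.length) {
            throw new IllegalArgumentException("need 0 <= k <= s.length");
        }
        int[] r = new int[k];
        // Case 1: the first k entries fill the result set.
        for (int i = 0; i < k; i++) {
            r[i] = s[i];
        }
        // Case 2: entry number i + 1 (1-based) is selected with probability k / (i + 1).
        for (int i = k; i < s.length; i++) {
            int j = rng.nextInt(i + 1); // uniform over 0..i
            if (j < k) {
                r[j] = s[i];            // replace the entry in slot j
            }
        }
        return r;
    }
}
```

Because each entry is looked at once and only k slots are kept, the same loop also works when the entries arrive as a stream whose length is not known in advance.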

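Proof: The probability that the m-th object ends up in the result set = the probability that object m is selected when it arrives * the product, over every object i that comes after m, of (the probability that object i is not selected + the probability that object i is selected * the probability that it does not replace object m). Each of these factors simplifies to (i-1)/i, so the product telescopes and the probability is k/n for every object.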
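
Written out with the symbols used in the two cases (k entries kept, n entries in total, object m selected with probability k/m when m > k), the argument can be reconstructed as follows. This is a standard derivation filled in here, not the formula from the original post.

```latex
% Object m arrives after the result set is full (m > k):
P(m \text{ in result}) 
  = \frac{k}{m} \prod_{i=m+1}^{n} \left[ \left(1 - \frac{k}{i}\right) + \frac{k}{i} \cdot \frac{k-1}{k} \right]
  = \frac{k}{m} \prod_{i=m+1}^{n} \frac{i-1}{i}
  = \frac{k}{m} \cdot \frac{m}{n}
  = \frac{k}{n}

% Object m is among the first k (m <= k), so it is selected with probability 1:
P(m \text{ in result})
  = 1 \cdot \prod_{i=k+1}^{n} \frac{i-1}{i}
  = \frac{k}{n}
```

In both cases the factor for object i is the chance that i is not selected, 1 - k/i, plus the chance that i is selected but lands on one of the other k-1 slots, (k/i) * (k-1)/k.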
