Introduction to Algorithms, Chapter 5: Probabilistic Analysis and Randomized Algorithms

Source: Internet
Author: User
Note: Formulas are hard to typeset here. In this article, C(n, k) denotes the number of combinations of k items chosen from n, P(n, k) denotes the number of arrangements (permutations) of k items chosen from n, and Σ [k = a ~ b] denotes a sum over k from a to b.

5.1 The hiring problem

Suppose you want to hire a new office assistant and use an employment agency that sends you one candidate per day, n candidates in total. You interview each candidate, and if the candidate is better than your current office assistant, you fire the current assistant and hire the new one. You pay the agency a fee for each interview and a further fee each time you hire.

Let the cost of an interview be c_i and the cost of a hire be c_h. If m candidates are hired over the whole process, the total cost is n*c_i + m*c_h. Since n is fixed, the variation in total cost depends only on m.

This scenario serves as a model for a common computing paradigm: to find the maximum or minimum of a sequence, we examine every member of the sequence and maintain a current winner. The hiring problem models how often the current winner is updated.
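The "current winner" scan described above can be sketched in Python. This is a minimal illustration, not code from the original text: the names count_hires and total_cost are hypothetical, and candidates are assumed to be given as distinct ranks from 1 to n, where a higher rank is better.

```python
def count_hires(ranks):
    """Return m, the number of times the current best candidate is replaced."""
    best = 0  # sentinel: worse than every real rank (ranks are assumed to be 1..n)
    hires = 0
    for rank in ranks:
        if rank > best:  # this candidate beats the current assistant
            best = rank
            hires += 1
    return hires


def total_cost(ranks, c_i, c_h):
    """Total cost n*c_i + m*c_h: n interviews plus m hires."""
    return len(ranks) * c_i + count_hires(ranks) * c_h
```

Candidates arriving in increasing order of rank force a hire on every day, while an order that starts with the best candidate causes exactly one hire.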

Worst case

In the worst case we hire every candidate, so m = n. This happens when the candidates arrive in increasing order of quality.

Probabilistic analysis

In practice we neither know the order in which candidates arrive nor control it, so we turn to probabilistic analysis, the use of probability in analyzing a problem. To apply it, we must know, or make assumptions about, the distribution of the inputs. We then analyze the algorithm to compute an expected running time, where the expectation is taken over the distribution of possible inputs.

For some problems we can reasonably make assumptions about the set of all possible inputs. For other problems no reasonable input distribution can be described, and probabilistic analysis cannot be used.

In the hiring problem we can assume that the candidates arrive in random order. Assume that any two candidates can be compared to decide which is better, that is, there is a total order on the candidates. We can then rank the candidates with unique numbers from 1 to n, writing rank(i) for the rank of candidate i. The ordered list <rank(1), rank(2), ..., rank(n)> is a permutation of <1, 2, ..., n>. Saying that the candidates arrive in random order means that this list is equally likely to be any of the n! permutations of the numbers 1 to n.

Randomized algorithms

In many cases we know little about the input distribution, and even when we know something about it, we may be unable to model it. By randomizing part of an algorithm's behavior, however, we can often still use probability and randomness as tools for algorithm design and analysis.

In the hiring problem, for example, suppose the employment agency gives us the full list of candidates in advance. Each day we pick at random which candidate to interview, which enforces a random arrival order.

In general, an algorithm is randomized if its behavior is determined not only by its input but also by values produced by a random-number generator.

Exercise 5.1.2

Assume that random(a, b) returns an integer between a and b, each with equal probability. Describe an implementation of random(a, b) that only makes calls to the existing random(0, 1). As a function of a and b, what is the expected running time of your procedure? Assume that a call to random(0, 1) takes constant time.

Solution 1:

Analysis: let k = b - a + 1. Then random(a, b) amounts to choosing one of k numbers uniformly at random, which we write as random(k). One round works as follows: call random(0, 1) k times to generate a sequence S = (r_1, r_2, ..., r_k). If r_1 = r_2 = ... = r_k, the round is void and is repeated. Otherwise every number i with r_i = 0 is eliminated, and the problem reduces to a uniform choice among fewer than k numbers. Assume that random(m) has already been implemented for all m < k.

A_i: number i is finally selected
B_im: number i and m-1 other numbers survive this round of elimination
C_im: number i is finally selected from this surviving subset of size m

Pr{A_i | B_im} = Pr{C_im} = 1/m, by random(m);
Pr{A_i, B_im} = Pr{B_im} * 1/m;
Pr{A_i} = Σ [m = 1 ~ k-1] Pr{A_i, B_im} = Σ [m = 1 ~ k-1] Pr{B_im} * 1/m;
Pr{B_im} = C(k-1, m-1) * 1/(2^k - 2), because the all-0 and all-1 sequences are discarded, so each remaining sequence has probability 1/(2^k - 2).

Since C(k-1, m-1)/m = C(k, m)/k and Σ [m = 1 ~ k-1] C(k, m) = 2^k - 2, the sum evaluates to Pr{A_i} = 1/k.
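As a sanity check on the sum above, the following sketch evaluates Σ Pr{B_im} * 1/m exactly with rational arithmetic. It assumes Pr{B_im} = C(k-1, m-1)/(2^k - 2), that is, every non-constant bit sequence is equally likely. The function name selection_probability is made up for this illustration.

```python
from fractions import Fraction
from math import comb


def selection_probability(k):
    """Sum over m = 1..k-1 of Pr{B_im} * (1/m), computed exactly."""
    total = Fraction(0)
    for m in range(1, k):
        pr_b_im = Fraction(comb(k - 1, m - 1), 2**k - 2)  # assumed form of Pr{B_im}
        total += pr_b_im * Fraction(1, m)
    return total
```

For every k tried in the test range the result equals 1/k, in agreement with the claimed value of Pr{A_i}.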

 

Since random(2) is available directly (it is random(0, 1)), mathematical induction shows that random(k) can be realized by the algorithm above.

Expected running time of random(k): a round reduces the problem size unless S is all 0s or all 1s, which happens with probability 2/2^k. By the standard result on Bernoulli trials, the expected number of rounds needed to reduce the problem size is s_k = 1/(1 - 2/2^k).

T(k) = (number of rounds) * k * T(2) + T(i), where i takes a value in 1 ~ k-1 with some probability.

Let the constant c = T(2).

E[T(k)] = s_k * k * c + E[T(1)] * C(k, 1) * 1/(2^k - 2) + ... + E[T(k-1)] * C(k, k-1) * 1/(2^k - 2)

No closed-form solution to this recurrence is derived here.

 

Simplified Algorithm

The algorithm above comes with a strict derivation, but the procedure can be simplified. If k is even, split the k numbers into two equal groups and use one call to random(0, 1) to choose which group to keep. If k is odd, split the numbers with the bit sequence S as before and eliminate the larger group. This algorithm is also correct. Each successful round then at least halves the problem size, so the expected running time satisfies E[T(k)] <= s_k * k * c + E[T(k/2)], which should be Θ(k).

Solution 2:

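Suppose m is a power of two, m = 2^j. Then random(m) is easy to build from j calls to random(0, 1). Choose m so that m >= k and m < 2k. Then random(k) can be designed as follows:

random(k)
    value = random(m)
    if value <= k
        return value
    else
        return random(k)

The algorithm is essentially a sequence of Bernoulli trials: each attempt succeeds with probability k/m > 1/2, so the expected number of attempts is less than 2 and the expected running time is Θ(lg k) = Θ(lg(b - a + 1)).

m can also be much larger than k, for example m >= nk for some integer n:

random(k)
    value = random(m)
    if value <= nk
        return ⌈value/n⌉
    else
        return random(k)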
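A possible Python rendering of Solution 2, as a sketch only: it takes m to be the smallest power of two that is at least k = b - a + 1, builds a value from fair bits, and retries on rejection. random_bit stands in for random(0, 1). Both it and random_range are names assumed here rather than taken from the text. Values are 0-based in this version, so the acceptance test is value < k.

```python
import random


def random_bit():
    """Stand-in for random(0, 1): returns 0 or 1 with equal probability."""
    return random.getrandbits(1)


def random_range(a, b, bit=random_bit):
    """Return an integer in [a, b], uniformly, using only calls to `bit`."""
    k = b - a + 1
    nbits = max(1, (k - 1).bit_length())  # smallest j >= 1 with 2**j >= k
    while True:
        value = 0
        for _ in range(nbits):
            value = (value << 1) | bit()
        if value < k:  # accept; otherwise reject and draw again
            return a + value
```

Each attempt is accepted with probability greater than 1/2 whenever k >= 2, so fewer than two attempts are expected, and each attempt uses about lg k bits.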

 

Exercise 5.1.3
Suppose you want to output 0 and 1, each with probability 1/2. You have at your disposal a procedure biased-random that outputs either 0 or 1. It outputs 1 with probability p and 0 with probability 1-p, where 0 < p < 1, but you do not know the value of p. Give an algorithm that uses biased-random as a subroutine and returns an unbiased result. What is the expected running time of your algorithm?

Analysis: the idea is to exploit symmetry. Run two sequences of Bernoulli trials a and b, both based on biased-random, each producing a sequence of 0-1 values. In each round, draw one value from a and one from b. If the round gives a_i > b_i (that is, a_i = 1, b_i = 0), the algorithm ends and returns 1. If a_i < b_i, it ends and returns 0. If a_i = b_i, the next round starts.

Since the rounds are independent, it suffices to prove that in any round, conditioned on the round producing a result, the results 1 and 0 are equally likely.

Event M_i: round i gives a_i != b_i;
Event A_i: the result of round i is 1;
Event B_i: the result of round i is 0;

Let q = 1-p.
Pr{A_i | M_i} = Pr{A_i, M_i} / Pr{M_i};
Pr{A_i, M_i} = Pr{a_i = 1, b_i = 0} = p * q;
Pr{B_i, M_i} = Pr{a_i = 0, b_i = 1} = q * p;

So Pr{A_i | M_i} = Pr{B_i | M_i} = 1/2.

The running time depends on the number of rounds needed to reach a_i != b_i. For any i, Pr{a_i = b_i} = p*p + q*q and Pr{a_i != b_i} = 2pq. This is again a sequence of Bernoulli trials, so the expected number of rounds is E[N] = 1/(2pq). Each round makes two calls to biased-random, so the expected running time is Θ(1/(pq)).
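The pairing trick can be sketched in Python as follows. The helper make_biased is a hypothetical stand-in for biased-random (the exercise does not specify how it is implemented), and unbiased is an assumed name for the procedure analyzed above.

```python
import random


def make_biased(p):
    """Hypothetical biased-random: returns 1 with probability p, else 0."""
    return lambda: 1 if random.random() < p else 0


def unbiased(biased):
    """Return 0 or 1 with probability 1/2 each, calling only `biased`."""
    while True:
        a, b = biased(), biased()
        if a != b:    # (1, 0) and (0, 1) each occur with probability p*q
            return a  # a == 1 means a > b, a == 0 means a < b
```

A call such as unbiased(make_biased(0.9)) still returns 0 and 1 equally often in the long run, only with more rounds on average as p moves away from 1/2.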

5.2 Indicator random variables

Definition:

To analyze many algorithms, including the hiring problem, we use indicator random variables, which provide a convenient way to convert between probabilities and expectations. Given a sample space S and an event A, the indicator random variable X_A associated with event A is defined as

X_A = 1 if A occurs
X_A = 0 if A does not occur

E[X_A] = Pr{A}

Using indicator random variables to analyze the hiring problem:

Assume that the candidates arrive in random order, and let X be the random variable equal to the number of times a new office assistant is hired. Then E[X] = Σ [x = 1 ~ n] x * Pr{X = x}, but this is cumbersome to compute directly.

Instead, we define n variables indicating whether each candidate is hired: X_i is the indicator random variable for the event that candidate i is hired, so X = X_1 + X_2 + ... + X_n.
E[X_i] = Pr{candidate i is hired} = 1/i: because the candidates arrive in random order, each of the first i candidates is equally likely to be the best of them, so the probability that candidate i is better than the previous i-1 candidates is 1/i.

Therefore, by linearity of expectation, E[X] = 1 + 1/2 + 1/3 + ... + 1/n = ln n + O(1).
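For small n the harmonic-sum result can be checked by brute force. The sketch below averages the number of hires over all n! arrival orders and compares it with 1 + 1/2 + ... + 1/n. The function names are illustrative, and ranks 1..n with "higher is better" are an assumption of this example.

```python
from fractions import Fraction
from itertools import permutations


def hires(order):
    """Number of hires for one arrival order of distinct ranks."""
    best, count = 0, 0
    for rank in order:
        if rank > best:
            best, count = rank, count + 1
    return count


def expected_hires(n):
    """Exact E[X], averaging over all n! equally likely arrival orders."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(hires(p) for p in perms), len(perms))


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))
```

The enumeration is only practical for small n because the number of orders grows as n!.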

Exercise 5.2.2:
In the hiring problem, assuming that the candidates arrive in random order, what is the probability that you hire exactly twice?

Analysis: the first candidate is always hired, and the best candidate is always hired. If the first candidate is the best, exactly one hire occurs, so this case is excluded. Hiring exactly twice therefore means that every candidate between the first candidate and the best candidate is worse than the first candidate. Because the candidates arrive in random order, every permutation is equally likely, so it suffices to count the permutations that satisfy this condition.

Let the candidates correspond to the set S = {1, 2, 3, ..., n}, where a larger number means a better candidate. Set aside the largest number n for the moment, and split the remaining numbers 1 ~ n-1 into two groups that form two sequences S1 and S2. S1 must be nonempty and its first number must be its largest, while S2 may be in any order. The permutation is then S1 n S2. Conversely, every permutation that satisfies the condition can be written in this form.

With k the length of S1, the number of such permutations is M = Σ [k = 1 ~ n-1] C(n-1, k) * (k-1)! * (n-1-k)! = Σ [k = 1 ~ n-1] (n-1)!/k
Probability: M/n! = (1/n) * Σ [k = 1 ~ n-1] 1/k
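The closed form (1/n) * Σ [k = 1 ~ n-1] 1/k can be compared against direct enumeration for small n. This is an illustrative check only, and the function names are assumptions of the sketch.

```python
from fractions import Fraction
from itertools import permutations


def hires(order):
    """Number of hires for one arrival order of distinct ranks."""
    best, count = 0, 0
    for rank in order:
        if rank > best:
            best, count = rank, count + 1
    return count


def prob_two_hires(n):
    """Exact probability of exactly two hires, by enumerating all n! orders."""
    perms = list(permutations(range(1, n + 1)))
    hits = sum(1 for p in perms if hires(p) == 2)
    return Fraction(hits, len(perms))


def closed_form(n):
    """(1/n) * (1 + 1/2 + ... + 1/(n-1))."""
    return Fraction(1, n) * sum(Fraction(1, k) for k in range(1, n))
```

For n = 4, for instance, 11 of the 24 orders lead to exactly two hires.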

 

Exercise 5.2.4:

Hat-check problem: each of n customers gives a hat to a hat-check person at a restaurant. The hat-check person gives the hats back to the customers in random order. What is the expected number of customers who get back their own hat?

Analysis: let the indicator random variable X_i correspond to the event "customer i gets back his or her own hat". Then E[X_i] = 1/n, so the expected number of customers who get back their own hats is Σ [i = 1 ~ n] E[X_i] = n * 1/n = 1.
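A brute-force check of this expectation for small n, written as a sketch: a return order is modeled as a permutation p of 0..n-1, where p[i] is the hat handed to customer i. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def expected_own_hats(n):
    """Average number of fixed points over all n! ways to return the hats."""
    perms = list(permutations(range(n)))
    total = sum(sum(1 for i, hat in enumerate(p) if i == hat) for p in perms)
    return Fraction(total, len(perms))
```

The average is exactly 1 regardless of n, even though the individual events "customer i gets the right hat" are not independent.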

 

Exercise 5.2.5:

Let A[1..n] be an array of n distinct numbers. If i < j and A[i] > A[j], the pair (i, j) is called an inversion of A. Suppose that the elements of A form a uniform random permutation of <1, 2, ..., n>. Use indicator random variables to compute the expected number of inversions in A.

Analysis: let the indicator random variable X_ij correspond to the event "A[i] > A[j]" for i < j. Then E[X_ij] = 1/2, so the expected number of inversions is Σ [1 <= i < j <= n] E[X_ij] = C(n, 2) * 1/2 = n(n-1)/4.
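The value n(n-1)/4 can likewise be confirmed by enumeration for small n. The quadratic inversion counter below is chosen for clarity rather than speed, and both function names are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import permutations


def inversions(a):
    """Count pairs (i, j) with i < j and a[i] > a[j]."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


def expected_inversions(n):
    """Exact expected inversion count over all n! permutations of 1..n."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(inversions(p) for p in perms), len(perms))
```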

5.3 Randomized algorithms

In the hiring problem, if the candidates arrive in random order, the expected number of times we hire a new office assistant is about ln n. The deterministic algorithm's behavior varies with the input: for a particular input it always produces the same number of hires, so there are expensive inputs, inexpensive inputs and moderate inputs. If we instead randomly permute the candidates first, the randomness lies in the algorithm rather than in the input distribution, and the execution of the algorithm depends on the random choices instead of on the input. This is the difference between randomized algorithms and probabilistic analysis.

Randomly permuting arrays

Many randomized algorithms randomize the input by permuting the given input array. Two methods are discussed here. Assume that array A contains the elements 1 to n. Our goal is to produce a random permutation of this array.

 

Permute-by-sorting

A common method is to assign each element A[i] a random priority P[i] and then sort the elements of A according to these priorities. This procedure is called permute-by-sorting.

Permute-by-sorting
1 n <- length[A]
2 for i <- 1 to n
3     do P[i] = random(1, n^3)
4 sort A, using P as sort keys
5 return A

Line 3 chooses a random number between 1 and n^3. This range is used to make it likely that all priorities in P are unique.

Claim: if all priorities are unique, permute-by-sorting produces a uniform random permutation of the input.

Proof: let k_i be the rank of the priority assigned to A[i]. Then k_1, k_2, ..., k_n is a permutation of 1 ~ n, and it determines the final arrangement of A. It suffices to show that k_1 k_2 ... k_n is a uniform random permutation of 1 ~ n. Take any fixed permutation s_1 ... s_n of 1 ~ n, and let X_i be the event k_i = s_i.
Then Pr{X_1 ∩ X_2 ∩ ... ∩ X_n} = Pr{X_1} * Pr{X_2 | X_1} * Pr{X_3 | X_2 ∩ X_1} * ... * Pr{X_n | X_n-1 ∩ X_n-2 ∩ ... ∩ X_1} = 1/n!,
because Pr{X_1} = 1/n, Pr{X_2 | X_1} = 1/(n-1), and so on.
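A direct Python sketch of permute-by-sorting follows. It returns a new list instead of sorting in place, and the rng parameter is an addition of this example so that a seeded generator can be passed in. Ties between priorities are simply left to the stable sort here, which is only approximately uniform when a tie occurs.

```python
import random


def permute_by_sorting(a, rng=random):
    """Return a new list with the elements of `a` ordered by random priorities."""
    n = len(a)
    # One priority per element, drawn from 1..n^3 so that ties are unlikely.
    priorities = [rng.randint(1, n**3) for _ in range(n)]
    order = sorted(range(n), key=lambda i: priorities[i])
    return [a[i] for i in order]
```

Usage: permute_by_sorting([1, 2, 3, 4]) returns the same four elements in a random order, at a cost dominated by the sort.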

Permute-in-place

 

A better method is to permute the given array in place.
Permute-in-place
n <- length[A]
for i <- 1 to n
    do swap A[i] <--> A[random(i, n)]

The proof uses a loop invariant: after the i-th iteration, A[1..i] contains each possible i-permutation of the elements of A with equal probability.

Initialization: the invariant holds trivially before the first iteration, since A[1..0] is empty.
Maintenance: suppose the i-th iteration leaves the elements X_1 X_2 ... X_i in A[1..i]. It suffices to show that for any specific values x_1, x_2, ..., x_i from A, the probability Pr{X_1 = x_1 ∩ X_2 = x_2 ∩ ... ∩ X_i = x_i} is the same. We have Pr{X_1 = x_1 ∩ ... ∩ X_i = x_i} = Pr{X_i = x_i | X_1 = x_1 ∩ ... ∩ X_i-1 = x_i-1} * Pr{X_1 = x_1 ∩ ... ∩ X_i-1 = x_i-1} = 1/(n-i+1) * Pr{X_1 = x_1 ∩ ... ∩ X_i-1 = x_i-1}. By the induction hypothesis, X_1 X_2 ... X_i-1 is a uniform random (i-1)-permutation, so the claim follows.
In fact, unrolling this product gives Pr{X_1 = x_1 ∩ X_2 = x_2 ∩ ... ∩ X_i = x_i} = 1/n * 1/(n-1) * ... * 1/(n-i+1) = (n-i)!/n!.
Termination: at i = n, every permutation of A has probability 1/n!.
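Permute-in-place translates almost line for line into Python with 0-based indices. The choose parameter is an assumption of this sketch: it defaults to random.randint (inclusive on both ends) and can be replaced by a deterministic function to enumerate every possible run.

```python
import random


def permute_in_place(a, choose=random.randint):
    """Shuffle list `a` in place and return it."""
    n = len(a)
    for i in range(n):
        j = choose(i, n - 1)     # uniform index in [i, n-1], inclusive
        a[i], a[j] = a[j], a[i]  # swap A[i] with the chosen element
    return a
```

For n = 3 there are 3 * 2 * 1 = 6 possible sequences of choices, and enumerating them shows that each of the 6 permutations is produced exactly once, which matches the 1/n! claim.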

Exercise 5.3.3

Suppose that instead of swapping element A[i] with a random element from the subarray A[i..n], we swap it with a random element from anywhere in the array:
Permute-with-all
n <- length[A]
for i <- 1 to n
    do swap A[i] <--> A[random(1, n)]
Does this code produce a uniform random permutation?

Analysis: no fully rigorous proof was found for this problem, but the following counting argument suggests the answer is no. Each swap step has n possible choices, so by the multiplication principle the whole procedure has n^n equally likely swap sequences, each of which produces exactly one final arrangement. There are n! possible arrangements. For n >= 3, n^n is not divisible by n!, so the arrangements cannot all be produced with equal probability.
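The counting argument can be made concrete for small n by enumerating all n^n swap sequences of permute-with-all and tallying the resulting arrangements. The function name outcome_counts is hypothetical, and indices are 0-based.

```python
from collections import Counter
from itertools import product


def outcome_counts(n):
    """Count how many of the n^n swap sequences lead to each final arrangement."""
    counts = Counter()
    for choices in product(range(n), repeat=n):
        a = list(range(n))
        for i, j in enumerate(choices):
            a[i], a[j] = a[j], a[i]  # swap A[i] with an element chosen from the whole array
        counts[tuple(a)] += 1
    return counts
```

For n = 3 the 27 sequences cannot be split evenly among 6 arrangements, so the tallies necessarily differ.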

 

Exercise 5.3.4

Prove that in the array P of permute-by-sorting, the probability that all elements are unique is at least 1 - 1/n.

Analysis: let the n values produced by random(1, n^3) be a_1, a_2, ..., a_n, and let event X_i denote that a_1 ~ a_i are pairwise distinct. Then X_n = {a_n differs from each of a_1, ..., a_n-1} ∩ X_n-1, so Pr{X_n} = Pr{a_n differs from each of a_1, ..., a_n-1 | X_n-1} * Pr{X_n-1} = ((n^3 - n + 1)/n^3) * Pr{X_n-1}.

Unrolling the recurrence, Pr{X_n} = Π [k = 1 ~ n] (n^3 - k + 1)/n^3 > (1 - 1/n^2)^n >= C(n, 0) * 1 - C(n, 1) * (1/n^2) = 1 - 1/n.
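The bound can be checked numerically with exact fractions. The sketch below computes the product of (n^3 - k + 1)/n^3 for k = 1 ~ n, which is the probability that n independent draws from 1..n^3 are all distinct. The function name is an assumption of the example.

```python
from fractions import Fraction


def prob_all_distinct(n):
    """Exact probability that n independent draws from 1..n^3 are pairwise distinct."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= Fraction(n**3 - k + 1, n**3)
    return result
```

For n = 2, for example, the product is (8/8) * (7/8) = 7/8, comfortably above 1 - 1/2.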

 

Exercise 5.3.6

Explain how to implement permute-by-sorting so that it handles the case in which two or more priorities are identical. That is, the algorithm should produce a uniform random permutation even if two or more priorities are identical.

Analysis: for the items that share the same priority, run another round of random priority assignment and sorting among just those items. If some of them tie again, run yet another round on those, and so on. The idea is to ensure that items with equal priority end up in uniformly random relative order.
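One way to realize this "re-randomize the ties" idea is a small recursive function, sketched below. The hi parameter (the upper end of the priority range) is an assumption added so that ties can be forced in a demonstration. It must be at least 2, otherwise every round ties and the recursion never ends. By default the range is n^3 for a group of n items.

```python
import random
from itertools import groupby


def permute_with_ties(items, hi=None, rng=random):
    """Permute `items` by random priorities, re-permuting any tied groups."""
    items = list(items)
    n = len(items)
    if n <= 1:
        return items
    top = hi if hi is not None else n**3  # priority range 1..top, top >= 2 required
    keyed = sorted(((rng.randint(1, top), x) for x in items), key=lambda t: t[0])
    result = []
    for _, group in groupby(keyed, key=lambda t: t[0]):
        tied = [x for _, x in group]
        # Items sharing a priority get a fresh round of priorities among themselves.
        result.extend(permute_with_ties(tied, hi, rng))
    return result
```

Calling permute_with_ties(range(10), hi=2) forces many ties and still returns all ten items, each tie having been broken by further rounds.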
 

5.4 Further uses of probabilistic analysis and indicator random variables

5.4.1 The birthday paradox
How many people must be in a room before the probability that at least two of them share a birthday exceeds 1/2?
Analysis 1:
Let y be the number of days in a year, and number the people in the room m_1, m_2, ..., m_n. Let event A_i = {the birthday of m_i differs from the birthdays of m_1 ~ m_i-1} and event B_i = {the birthdays of m_1 ~ m_i are all different}.
So B_k = ∩ [i = 1 ~ k] A_i,
Pr{B_k} = Pr{B_k-1 ∩ A_k} = Pr{B_k-1} * Pr{A_k | B_k-1} = Pr{B_k-1} * (y-k+1)/y.
Unrolling the recurrence gives Pr{B_n} = Π [i = 1 ~ n] (1 - (i-1)/y). The answer is the smallest n for which Pr{B_n} < 1/2. For y = 365 this is n = 23.
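The smallest such n can be found by multiplying the factors one at a time. The sketch below assumes a year of y = 365 equally likely birthdays by default. The function name is illustrative.

```python
def smallest_group(y=365):
    """Smallest n for which the probability of n distinct birthdays drops below 1/2."""
    prob_distinct = 1.0
    n = 0
    while prob_distinct >= 0.5:
        n += 1
        prob_distinct *= 1 - (n - 1) / y  # person n avoids the n-1 earlier birthdays
    return n
```

With the default of 365 days the loop stops at 23 people.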

 

Analysis 2: using indicator random variables
Let the indicator random variable X_ij correspond to the event {person i and person j have the same birthday}, for i < j among k people.
E[X_ij] = 1/y;
Σ [i < j] E[X_ij] = 1/y * k(k-1)/2;

The random variable X = Σ [i < j] X_ij is the number of pairs of people with the same birthday;

With y = 365 days, k >= 28 gives E[X] >= 1, so we can expect at least one pair of people to share a birthday.
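The threshold from the expectation argument is the smallest k with k(k-1)/(2y) >= 1. A short sketch, taking y = 365 as the assumed year length and using a hypothetical function name:

```python
def smallest_k_expecting_a_pair(y=365):
    """Smallest k with expected number of matching pairs k(k-1)/(2y) >= 1."""
    k = 1
    while k * (k - 1) < 2 * y:  # integer form of k(k-1)/(2y) < 1
        k += 1
    return k
```

For 365 days, 27 * 26 = 702 falls short of 730 while 28 * 27 = 756 exceeds it, so the function returns 28.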

Note: working backward from the expectation is a simple, approximate analysis and is not as precise as Analysis 1. However, although the two analyses give different thresholds, both are Θ(y^(1/2)).

 

 
