An algorithm may have a best case such as $O(n)$ and a worst case such as $O(n^2)$, yet most inputs are likely to fall somewhere in between, for example at $O(n \lg n)$. Whether to adopt the algorithm then depends on its average-case behavior, that is, on its expected cost, and computing that expectation is the job of probabilistic analysis.

The rest of this article describes this method of analysis in detail. It assumes you are already familiar with the basics of discrete probability: random variables and expectation.

**Illustration: the hiring problem**

Suppose you need to hire a new office assistant. Your previous attempts at hiring were unsuccessful, so you decide to use an employment agency. The agency sends you one candidate each day. You interview that person and then decide whether or not to hire them. You must pay the agency a small fee for each interview. Actually hiring a candidate is more expensive, because you must dismiss your current office assistant and pay the agency a substantial hiring fee. You are committed to having the best possible person in the job at all times. So, after interviewing each candidate, **if the candidate is better qualified than the current office assistant, you dismiss the current office assistant and hire the new candidate**. You are willing to pay the price of this strategy, but you want to estimate what that price will be.

Interviewing has a low cost, say $c_i$, while hiring has a higher cost, $c_h$. Let $n$ be the number of candidates and $m$ the number of people hired. The total cost of the algorithm is then $O(n c_i + m c_h)$. No matter how many people we hire, we always interview all $n$ candidates, so the interview cost is always $n c_i$. We therefore focus on analyzing $m c_h$, the hiring cost, which varies from one run of the algorithm to the next.
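The strategy described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `hire_assistant`, the numeric qualification scores, and the fee values are assumptions made for the example, not part of the original text.

```python
def hire_assistant(qualifications, interview_cost, hire_cost):
    """Interview every candidate in order; hire whenever one beats the current best.

    Returns (number_of_hires, total_cost).
    """
    best = float("-inf")      # stands in for "no assistant yet"
    hires = 0
    for q in qualifications:  # every candidate is interviewed
        if q > best:          # strictly better than the current assistant
            best = q
            hires += 1
    n = len(qualifications)
    total_cost = n * interview_cost + hires * hire_cost
    return hires, total_cost


if __name__ == "__main__":
    # Hypothetical scores and fees, chosen only for illustration.
    print(hire_assistant([3, 1, 4, 2, 5], interview_cost=1, hire_cost=10))
```

With these made-up inputs the candidates scoring 3, 4, and 5 are hired, so the call returns `(3, 35)`: five interviews at cost 1 plus three hires at cost 10. Feeding the same scores in increasing order would trigger a hire at every step.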

**Worst case analysis**

In the worst case, we hire every candidate we interview. This happens when the candidates arrive in strictly increasing order of qualification, in which case we hire $n$ times and the total hiring cost is $O(n c_h)$.

However, it is reasonable to expect that candidates do not always arrive in increasing order of qualification. In fact, we neither know the order in which they arrive nor have any control over it. So we ask instead what happens in the typical, or average, case.

**Probabilistic analysis**

Probabilistic analysis is the use of probability in the analysis of problems. Most commonly it is used to analyze the running time of an algorithm; here we use it to analyze the hiring cost.

**Indicator random variable**

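Indicator random variables provide a convenient way of converting between probabilities and expectations. Given a sample space $S$ and an event $A$, the indicator random variable $I\{A\}$ associated with event $A$ is defined as

$$
I\{A\} =
\begin{cases}
1 & \text{if } A \text{ occurs,} \\
0 & \text{if } A \text{ does not occur.}
\end{cases}
$$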

As a simple example, let us determine the expected number of heads when we toss a fair coin. The sample space is $S = \{H, T\}$. Define a random variable $Y$ that takes the values $H$ and $T$, **each with the same probability**, $1/2$. Next, define the indicator random variable $X_H$ associated with the coin coming up heads, the event $H$. This variable counts the number of heads obtained in the toss: it is 1 if the coin comes up heads and 0 otherwise. We write:

$$
X_H = I\{Y = H\} =
\begin{cases}
1 & \text{if } Y = H, \\
0 & \text{if } Y = T.
\end{cases}
$$

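The expected number of heads obtained in one toss of the coin is simply the expected value of the indicator variable $X_H$:

$$
\begin{aligned}
E[X_H] &= E[I\{Y = H\}] \\
       &= 1 \cdot \Pr\{Y = H\} + 0 \cdot \Pr\{Y = T\} \\
       &= 1 \cdot \tfrac{1}{2} + 0 \cdot \tfrac{1}{2} \\
       &= \tfrac{1}{2}.
\end{aligned}
$$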

Thus the expected number of heads obtained in one toss of a fair coin is $1/2$. As the following lemma shows, the expected value of the indicator random variable associated with an event $A$ equals the probability that $A$ occurs.

**Lemma 5.1.** Given a sample space $S$ and an event $A$ in $S$, let $X_A = I\{A\}$. Then $E[X_A] = \Pr\{A\}$.
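Lemma 5.1 follows directly from the definition of expected value. Writing $X_A = I\{A\}$ and $\bar{A}$ for the complement of $A$ (the event that $A$ does not occur), a short derivation is:

$$
\begin{aligned}
E[X_A] &= E[I\{A\}] \\
       &= 1 \cdot \Pr\{A\} + 0 \cdot \Pr\{\bar{A}\} \\
       &= \Pr\{A\}.
\end{aligned}
$$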

Now consider $n$ tosses. Let $X_i$ be the indicator random variable associated with the event that the $i$-th toss comes up heads, and let $Y_i$ be the random variable denoting the outcome of the $i$-th toss, so that $X_i = I\{Y_i = H\}$. Let the random variable $X$ denote the **total number** of heads in the $n$ coin tosses, so that

$$
X = \sum_{i=1}^{n} X_i.
$$

We want the expected number of heads, so we take the expectation of both sides of this equation:

$$
E[X] = E\left[\sum_{i=1}^{n} X_i\right].
$$

What we need, then, is **the expectation of a sum of $n$ random variables**. By Lemma 5.1, the expected value of each individual random variable is easy to compute. By linearity of expectation, $E[X + Y] = E[X] + E[Y]$, **the expectation of the sum is just as easy: it equals the sum of the expectations of the $n$ random variables**. Linearity of expectation is what makes indicator random variables such a powerful analytical technique, because it applies even when the random variables depend on one another. We can now compute the expected number of heads:

$$
\begin{aligned}
E[X] &= E\left[\sum_{i=1}^{n} X_i\right] \\
     &= \sum_{i=1}^{n} E[X_i] \\
     &= \sum_{i=1}^{n} \tfrac{1}{2} \\
     &= \tfrac{n}{2}.
\end{aligned}
$$

Indicator random variables greatly simplify the computational process.
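To make the coin-toss calculation concrete, the following sketch computes the expected number of heads exactly for small $n$ in two ways: directly from the definition of expectation, and by summing the expectations of the per-toss indicators. The function name `expected_heads` is an assumption for this example, and the brute-force enumeration of all $2^n$ outcomes is only practical for small $n$.

```python
from fractions import Fraction
from itertools import product


def expected_heads(n):
    """Exact expected number of heads in n fair tosses, computed two ways."""
    outcomes = list(product("HT", repeat=n))  # all 2^n equally likely sequences
    p = Fraction(1, len(outcomes))

    # Direct definition: weight each sequence's head count by its probability.
    direct = sum(p * seq.count("H") for seq in outcomes)

    # Linearity of expectation: sum E[X_i], where X_i = I{toss i is heads}.
    by_linearity = sum(
        sum(p * (1 if seq[i] == "H" else 0) for seq in outcomes)
        for i in range(n)
    )
    return direct, by_linearity
```

Both values agree and equal $n/2$; for instance, `expected_heads(5)` gives `Fraction(5, 2)` for each. The second computation never needs the distribution of the total, only the probability of heads on each individual toss.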

**Analysis of employment problems using indicator random variables**

We now want to compute the expected number of times we hire a new office assistant. To use probabilistic analysis, we assume that the candidates arrive in random order. Let $X$ be the random variable whose value equals the number of times we hire a new office assistant.

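To use indicator random variables, we do not compute $E[X]$ from a single variable that counts the number of hires. Instead, we define $n$ variables, one per candidate, each recording whether that particular candidate is hired. Specifically, let $X_i$ be the indicator random variable associated with the event that the $i$-th candidate is hired. So

$$
X_i = I\{\text{candidate } i \text{ is hired}\} =
\begin{cases}
1 & \text{if candidate } i \text{ is hired,} \\
0 & \text{if candidate } i \text{ is not hired,}
\end{cases}
$$

and

$$
X = X_1 + X_2 + \cdots + X_n.
$$

By Lemma 5.1, we have

$$
E[X_i] = \Pr\{\text{candidate } i \text{ is hired}\}.
$$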

Candidate $i$ is hired exactly when candidate $i$ is better qualified than each of candidates $1$ through $i-1$. Since we assume the candidates arrive in random order, the first $i$ candidates also appear in random order. **Any one of these first $i$ candidates is equally likely to be the best qualified so far. The probability that candidate $i$ is better qualified than candidates $1$ through $i-1$ is therefore $1/i$, so candidate $i$ is hired with probability $1/i$.** By Lemma 5.1, we conclude that

$$
E[X_i] = \frac{1}{i}.
$$
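The $1/i$ claim can be checked exhaustively for small $n$. The sketch below enumerates every arrival order of $n$ candidates with distinct ranks and counts how often the candidate in each position gets hired. The function name `hire_probabilities` and the use of integer ranks as qualifications are assumptions made for the example.

```python
from fractions import Fraction
from itertools import permutations


def hire_probabilities(n):
    """Exact Pr{candidate i is hired}, i = 1..n, under a uniformly random order."""
    hired_counts = [0] * n
    total = 0
    for order in permutations(range(n)):  # ranks 0..n-1, higher is better
        total += 1
        best = -1
        for i, rank in enumerate(order):
            if rank > best:               # beats everyone interviewed before
                best = rank
                hired_counts[i] += 1
    return [Fraction(c, total) for c in hired_counts]
```

For $n = 4$ this returns the fractions $1, 1/2, 1/3, 1/4$, matching the argument above. The enumeration visits all $n!$ orders, so it is only a sanity check for small $n$, not a practical way to compute the probabilities.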

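We can now compute $E[X]$:

$$
\begin{aligned}
E[X] &= E\left[\sum_{i=1}^{n} X_i\right] \\
     &= \sum_{i=1}^{n} E[X_i] \\
     &= \sum_{i=1}^{n} \frac{1}{i} \\
     &= \ln n + O(1).
\end{aligned}
$$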

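Even though we interview $n$ people, on average we actually hire only about $\ln n$ of them. The average-case hiring cost is therefore $O(c_h \ln n)$, a significant improvement over the worst-case hiring cost of $O(n c_h)$.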
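To get a feel for how small this is, the sketch below compares the harmonic sum $1 + 1/2 + \cdots + 1/n$ (the expected number of hires under the random-order assumption) with $\ln n$ for a few values of $n$. The function name `harmonic` and the sample values of $n$ are chosen only for illustration.

```python
import math


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, the expected number of hires for n candidates."""
    return sum(1.0 / i for i in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(f"n={n:>5}  H_n={harmonic(n):.3f}  ln n={math.log(n):.3f}")
```

The gap between the two columns stays bounded (it approaches roughly 0.577 as $n$ grows), which is the $O(1)$ term in the expected number of hires.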

Probabilistic analysis techniques for algorithms (from Introduction to Algorithms)