I. What is probability?

Probability is a quantitative measure of how likely an event is to occur. Its value lies between 0 and 1.

1.1 Subjective probability

A subjective estimate of how likely an event is, based on experience and knowledge. Subjective probability can be understood as a state of mind or an inclination.

1.2 Equally likely trials

Suppose an experiment has a finite number of possible outcomes $e_1, e_2, \dots, e_N$. If, after analyzing the conditions and procedure of the experiment, we can find no reason to believe that any one outcome, say $e_i$, is favored over any other outcome, say $e_j$ (that is, more likely to occur), then we must regard all the outcomes $e_1, e_2, \dots, e_N$ as having the same chance of occurring, namely $1/N$ each. Such outcomes are commonly called "equally likely".

1.3 Definition of classical probability

Suppose an experiment has $N$ equally likely outcomes, and the event $E$ consists of $M$ of them. The probability of $E$, written $P(E)$, is then defined as

$P(E) = M/N$

The classical definition above applies only when the number of outcomes is finite and the outcomes are equally likely. In some cases the concept can be extended to experiments with infinitely many outcomes.
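As a small illustration of the definition $P(E) = M/N$, the following Python sketch counts outcomes directly. The fair six-sided die, the event "the roll is even", and the helper name `classical_probability` are all assumptions chosen for this example; they do not come from the text above.

```python
from fractions import Fraction

def classical_probability(outcomes, event):
    """Return M/N: N equally likely outcomes, M of which belong to the event."""
    outcomes = list(outcomes)
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

# Assumed example: one roll of a fair die, E = "the roll is even".
die = range(1, 7)
p_even = classical_probability(die, lambda o: o % 2 == 0)
print(p_even)  # 1/2
```

Using `Fraction` keeps the result exact, which matches the counting nature of the definition.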

1.4 Geometric probability

A and B agree to meet at a certain place between two agreed times, with the understanding that whoever arrives first waits 10 minutes and then leaves. Suppose each of them arrives at a moment chosen at random within that interval. What is the probability of the event $E$ = {A and B meet}?

Represent every possible outcome as a point in a coordinate plane, where the $x$ axis is the moment A arrives and the $y$ axis is the moment B arrives (both in minutes). A and B meet exactly when

$|x - y| < 10$

The probability of $E$ is the area of the region satisfying this inequality divided by the area of the square containing all possible pairs of arrival times.
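The area computation can be sketched in Python as follows. Inside a square of side $T$, the region $|x - y| < w$ is the square minus two corner triangles with legs $T - w$, so the probability is $1 - ((T - w)/T)^2$. The length of the meeting window is left as the parameter `window`; the 60-minute value in the call is an assumption made only for illustration, and the function name is hypothetical.

```python
from fractions import Fraction

def meeting_probability(window, wait):
    """P(|x - y| < wait) for x, y uniform on [0, window], with wait <= window."""
    # The two corner triangles where they miss each other together cover
    # a fraction ((window - wait) / window)^2 of the square.
    miss = Fraction(window - wait, window) ** 2
    return 1 - miss

# Assumed 60-minute window; the 10-minute wait comes from the problem statement.
print(meeting_probability(60, 10))  # 11/36
```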

1.5 Frequency definition of probability

1) The random phenomenon associated with event $A$ can be repeated a large number of times.

2) In $n$ repeated trials, let $n(A)$ be the number of times event $A$ occurs; $n(A)$ is called the frequency count of event $A$, and $f_n(A) = \frac{n(A)}{n}$ is called the frequency of event $A$.

3) Long experience shows that as the number of repetitions $n$ increases, the frequency $f_n(A)$ settles near a constant $a$, called the stable value of the frequency. This stable value is the probability we are looking for.
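The stabilizing behavior of the frequency can be observed with a short simulation. This is only a sketch: the simulated fair die, the event $A$ = "roll a 1", the fixed seed, and the function name `relative_frequency` are assumptions introduced for the example.

```python
import random

def relative_frequency(n, seed=0):
    """Roll a simulated fair die n times; return f_n(A) for A = 'roll a 1'."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    count = sum(1 for _ in range(n) if rng.randint(1, 6) == 1)
    return count / n

# As n grows, the printed frequencies should drift toward 1/6.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

A simulation only suggests the stable value; it does not prove that the frequency converges.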

II. Calculating classical probability

2.1 Two principles

1) Multiplication principle

If a task is completed in $k$ steps, with $m_1$ ways to carry out the first step, $m_2$ ways to carry out the second step, ..., and $m_k$ ways to carry out step $k$, then there are $m_1 \times m_2 \times \dots \times m_k$ ways to complete the task.

2) Addition principle

If a task can be completed by any one of $k$ different classes of approach, with $m_1$ methods in the first class, $m_2$ methods in the second class, ..., and $m_k$ methods in the $k$-th class, then there are $m_1 + m_2 + \dots + m_k$ ways to complete the task.
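The two principles can be checked by direct enumeration. The menu and transport data below are made-up examples, not taken from the text.

```python
from itertools import product

# Multiplication principle: a meal is built in two steps,
# a starter (3 ways) followed by a main course (2 ways).
starters = ["soup", "salad", "bread"]
mains = ["fish", "pasta"]
meals = list(product(starters, mains))
print(len(meals))  # 6

# Addition principle: a trip is made by exactly one class of transport,
# either one of 3 trains or one of 2 buses.
trains = ["T1", "T2", "T3"]
buses = ["B1", "B2"]
ways_to_travel = len(trains) + len(buses)
print(ways_to_travel)  # 5
```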

2.2 Permutations and combinations

By the classical definition, computing a classical probability comes down to computing the two numbers $M$ and $N$. Most such computations involve permutations and combinations. The difference between the two is that a permutation takes order into account while a combination does not: AB and BA are different permutations but the same combination.

Definition 1: The number of different ordered arrangements (permutations) of $r$ items chosen from $n$ distinct items ($1 \le r \le n$) is

$P_r^n = n(n-1)(n-2) \dots (n-r+1)$

In particular, when $n = r$ we get $P_r^r = r(r-1) \dots 1 = r!$, which is called a full permutation of $r$ items.

Definition 2: The number of different combinations of $r$ items chosen from $n$ distinct items ($1 \le r \le n$) is

$C_r^n = P_r^n / r! = n! / (r!(n-r)!)$

Some books write $C_r^n$ as $C_n^r$. A more common notation for $C_r^n$ is $\begin{pmatrix} n \\ r \end{pmatrix}$, which we use from here on. It is easy to derive $\begin{pmatrix} n \\ 0 \end{pmatrix} = 1$ and

$\begin{pmatrix} n \\ r \end{pmatrix} = n(n-1) \dots (n-r+1) / r!$
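The two definitions can be compared against direct enumeration. The sketch below assumes Python 3.8 or later, where `math.perm` and `math.comb` are available; the five-letter set is an arbitrary example.

```python
import math
from itertools import combinations, permutations

items = "ABCDE"  # assumed example set, n = 5
n, r = len(items), 2

# Enumerate directly: AB and BA are two permutations but one combination.
num_perms = len(list(permutations(items, r)))
num_combs = len(list(combinations(items, r)))

print(num_perms, math.perm(n, r))  # 20 20
print(num_combs, math.comb(n, r))  # 10 10

# Relation between the two counts: combinations = permutations / r!
print(math.perm(n, r) // math.factorial(r))  # 10
```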

2.3 Relation to the binomial expansion

The combination coefficient $\begin{pmatrix} n \\ r \end{pmatrix}$ is also called the binomial coefficient, because it appears in the well-known binomial expansion formula:

$(a + b)^n = \sum_{i=0}^{n} \begin{pmatrix} n \\ i \end{pmatrix} a^i b^{n-i}$

The proof is simple. Write $(a + b)^n = (a + b) \dots (a + b)$. To produce the term $a^i b^{n-i}$, we must take $a$ from $i$ of the $n$ factors $(a + b)$ and $b$ from the remaining $n - i$ factors. The number of ways to choose $i$ factors out of $n$ is $\begin{pmatrix} n \\ i \end{pmatrix}$, which is therefore the coefficient of $a^i b^{n-i}$.
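The counting argument in the proof can be mirrored in code: choose "a" or "b" from each factor and group the choices by the number of times "a" was taken. The exponent 4 and the values 2 and 3 are arbitrary assumptions for the check, and `binomial_sum` is a hypothetical helper name.

```python
import math
from collections import Counter
from itertools import product

n = 4  # assumed small exponent

# Pick "a" or "b" from each of the n factors (a + b), then group the picks
# by how many times "a" was chosen.
ways = Counter(picks.count("a") for picks in product("ab", repeat=n))
coefficients = [ways[i] for i in range(n + 1)]
print(coefficients)                             # [1, 4, 6, 4, 1]
print([math.comb(n, i) for i in range(n + 1)])  # [1, 4, 6, 4, 1]

def binomial_sum(a, b, n):
    """Right-hand side of the binomial expansion."""
    return sum(math.comb(n, i) * a**i * b**(n - i) for i in range(n + 1))

print(binomial_sum(2, 3, n), (2 + 3) ** n)  # 625 625
```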

2.4 Dividing objects into piles

The number of ways to divide $n$ distinct objects into $k$ piles containing $r_1, r_2, \dots, r_k$ objects respectively is

$\frac{n!}{r_1! \dots r_k!}$

Here $r_1, r_2, \dots, r_k$ are non-negative integers whose sum is $n$.
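A minimal sketch of this formula in Python follows. The function name `pile_count` and the pile sizes in the calls are assumptions for illustration.

```python
import math

def pile_count(sizes):
    """n! / (r_1! ... r_k!) for pile sizes r_1, ..., r_k that sum to n."""
    if any(r < 0 for r in sizes):
        raise ValueError("pile sizes must be non-negative")
    result = math.factorial(sum(sizes))
    for r in sizes:
        result //= math.factorial(r)  # each division is exact
    return result

print(pile_count([2, 1, 1]))  # 12
print(pile_count([2, 3]))     # 10
```

With two piles the count reduces to the binomial coefficient, since choosing the first pile determines the second.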

III. Operations on events

3.1 Implication, inclusion, and equality of events

Let $A$ and $B$ be two events in the same experiment. If $B$ must occur whenever $A$ occurs, we say that $A$ implies $B$, or that $B$ contains $A$, written $A \subset B$. If $A$ and $B$ imply each other, that is, $A \subset B$ and $B \subset A$, then the events $A$ and $B$ are equal, written $A = B$.

Picture a square target in which region $A$ lies inside region $B$: a shot that hits $A$ must also have hit $B$. Hitting $A$ is harder than hitting $B$, so the probability of $A$ must be less than or equal to the probability of $B$.

3.2 Mutually exclusive and complementary events

If two events $A$ and $B$ cannot both occur in the same trial (although it is possible that neither occurs), they are called mutually exclusive. If any two events in a collection are mutually exclusive, the events are called pairwise mutually exclusive, or simply mutually exclusive.

For example, when rolling a die, the events "roll a 1" and "roll a 2" are mutually exclusive. They cannot both occur, but it is possible that neither occurs.

An important special case of mutual exclusion is the "complementary event". If $A$ is an event, the event $B = \{A \text{ does not occur}\}$ is called the complement of $A$, written $\bar{A}$ (also written $A^c$).

For example, when rolling a die, "roll an odd number" and "roll an even number" are complementary events.

3.3 Sum of events

Let $A$ and $B$ be two events. Define a new event $C$ as follows:

$C = \{A \text{ occurs, or } B \text{ occurs}\} = \{\text{at least one of } A, B \text{ occurs}\}$

The event $C$ defined in this way is called the sum of events $A$ and $B$, written $C = A + B$.

This extends to several events. Given events $A_1, A_2, \dots, A_n$, their sum $A$ is defined as the event

$A = \{A_1 \text{ occurs, or } A_2 \text{ occurs}, \dots, \text{or } A_n \text{ occurs}\} = \{\text{at least one of } A_1, A_2, \dots, A_n \text{ occurs}\}$

3.4 Addition theorem of probability

Theorem

The probability of the sum of several mutually exclusive events equals the sum of the probabilities of the individual events:

$P(A_1 + A_2 + \dots) = P(A_1) + P(A_2) + \dots$

Corollary

Let $\bar{A}$ denote the complement of $A$. Then

$P(\bar{A}) = 1 - P(A)$
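Both statements can be verified on a small sample space by treating events as sets of outcomes. The fair die and the particular events below are assumptions for the example; `prob` is a hypothetical helper that applies the classical definition.

```python
from fractions import Fraction

omega = set(range(1, 7))  # assumed sample space: one roll of a fair die

def prob(event):
    """Classical probability of an event given as a set of outcomes."""
    return Fraction(len(event), len(omega))

# Addition theorem: "roll a 1" and "roll a 2" share no outcome.
a1, a2 = {1}, {2}
print(a1 & a2 == set())                    # True
print(prob(a1 | a2), prob(a1) + prob(a2))  # 1/3 1/3

# Corollary: the complement of "odd" is "even".
odd = {1, 3, 5}
even = omega - odd
print(prob(even), 1 - prob(odd))           # 1/2 1/2
```

Here set union plays the role of the sum of events and set difference from `omega` plays the role of the complement.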

3.5 Product and difference of events

Let $A$ and $B$ be two events. The event $C$ defined by

$C = \{A \text{ and } B \text{ both occur}\}$

is called the product of $A$ and $B$, written $C = AB$.

The product of several events $A_1, A_2, \dots$ (finitely or infinitely many) is defined similarly: $A = \{A_1, A_2, \dots \text{ all occur}\}$, written $A = A_1 A_2 \dots$, or $\prod_{i=1}^{n} A_i$.

The difference of two events $A$ and $B$, written $A - B$, is defined as

$A - B = \{A \text{ occurs and } B \text{ does not occur}\} = A\bar{B}$

IV. Conditional probability and independence

4.1 Definition of conditional probability

Let $A$ and $B$ be two events with $P(B) \ne 0$. The "conditional probability of $A$ given $B$" is written $P(A|B)$ and is defined as

$P(A|B) = P(AB)/P(B)$
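The definition translates directly into code when events are sets of equally likely outcomes. The die and the events chosen here are assumptions for illustration, as are the helper names.

```python
from fractions import Fraction

omega = set(range(1, 7))  # assumed sample space: one roll of a fair die

def prob(event):
    return Fraction(len(event), len(omega))

def conditional(a, b):
    """P(A|B) = P(AB) / P(B); the product AB is the set intersection."""
    if prob(b) == 0:
        raise ValueError("P(B) must be nonzero")
    return prob(a & b) / prob(b)

a = {2}          # assumed event A: roll a 2
b = {2, 4, 6}    # assumed event B: roll an even number
print(prob(a), conditional(a, b))  # 1/6 1/3
```

Knowing that the roll is even raises the probability of a 2 from 1/6 to 1/3.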

4.2 Independence of events and the multiplication theorem of probability

Given two events $A$ and $B$, the unconditional probability $P(A)$ generally differs from the conditional probability $P(A|B)$ of $A$ given that $B$ occurs. This reflects some association between the two events. For example, if $P(A|B) > P(A)$, the occurrence of $B$ makes $A$ more likely: $B$ promotes the occurrence of $A$.

Conversely, if $P(A|B) = P(A)$, the occurrence of $B$ has no effect on the likelihood of $A$. In probability theory we then say that the events $A$ and $B$ are independent. From the definition of conditional probability we easily obtain

$P(AB) = P(A)P(B)$

Two events $A$ and $B$ that satisfy this formula are said to be independent. The formula is also the multiplication theorem of probability.

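In practice, whether events are independent is sometimes judged from the nature of the problem rather than by checking the formula above. The following example shows that such a judgment needs care.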

Suppose three dice are thrown, and define two events: $A$ = {at least one die shows 1} and $B$ = {at least two of the three dice show the same number}. Question: are $A$ and $B$ independent?

At first glance one might think that $A$ and $B$ are independent, because $A$ concerns which number is rolled, while $B$ concerns only whether numbers coincide (not which numbers they are). That is, whether or not a 1 is rolled seems to have no bearing on event $B$.

From another angle, consider the complement of $A$: if no die shows 1, each of the three dice shows a number in {2, 3, 4, 5, 6}. For event $B$, each die then has at most five possible results, one fewer than the original six. Clearly the probability of $B$ has changed, so $A$ and $B$ are not independent.
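The three-dice question can be settled by enumerating all 216 outcomes and comparing $P(AB)$ with $P(A)P(B)$. This is a brute-force sketch; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=3))  # 216 equally likely outcomes

def prob(pred):
    """Fraction of outcomes for which the predicate holds."""
    return Fraction(sum(1 for r in rolls if pred(r)), len(rolls))

def in_a(r):
    return 1 in r            # at least one die shows 1

def in_b(r):
    return len(set(r)) < 3   # at least two dice show the same number

p_a = prob(in_a)
p_b = prob(in_b)
p_ab = prob(lambda r: in_a(r) and in_b(r))

print(p_a, p_b, p_ab)     # 91/216 4/9 31/216
print(p_a * p_b)          # 91/486
print(p_ab == p_a * p_b)  # False, so A and B are not independent
```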

Let $A_1, A_2, \dots$ be finitely or infinitely many events. If, for any finite selection $A_{i_1}, A_{i_2}, \dots, A_{i_m}$ of them, we have

$P(A_{i_1} A_{i_2} \dots A_{i_m}) = P(A_{i_1}) P(A_{i_2}) \dots P(A_{i_m})$

then the events $A_1, A_2, \dots$ are said to be mutually independent. In other words, whether the other events occur or not has no effect on the probability that any given event occurs.

The probability of the product of several independent events $A_1, \dots, A_n$ equals the product of the probabilities of the individual events:

$P(A_1 \dots A_n) = P(A_1) \dots P(A_n)$

The multiplication theorem serves the same purpose as the addition theorem: it reduces computing the probability of a complex event to computing the probabilities of simpler events. Each comes with a condition, of course: for addition the events must be mutually exclusive, and for multiplication they must be independent.

4.3 Total probability formula and Bayes' formula

Total probability formula

Let $B_1, B_2, \dots$ be finitely or infinitely many events that are pairwise mutually exclusive and such that at least one of them occurs in every trial:

$B_i B_j = \varnothing \text{ (the impossible event) when } i \ne j, \qquad B_1 + B_2 + \dots = \Omega \text{ (the certain event)}$

A group of events with these properties is sometimes called a "complete event group". Note that any event $B$ and its complement $\bar{B}$ form a complete event group.

Now consider any event $A$. Because $\Omega$ is the certain event, $A = A\Omega = AB_1 + AB_2 + \dots$. Because $B_1, B_2, \dots$ are pairwise mutually exclusive, $AB_1, AB_2, \dots$ are clearly pairwise mutually exclusive as well. By the addition theorem,

$P(A) = P(AB_1) + P(AB_2) + \dots$

By the definition of conditional probability, $P(AB_i) = P(B_i)P(A|B_i)$. Substituting gives

$P(A) = P(B_1)P(A|B_1) + P(B_2)P(A|B_2) + \dots$

The formula above is the total probability formula.

In practice: in complex situations it is hard to compute $P(A)$ directly, but $A$ always occurs together with some $B_i$, so constructing a suitable group of $B_i$ can simplify the calculation.
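A minimal sketch of the total probability formula follows. The scenario (one of three boxes is picked, then $A$ = "a red ball is drawn") and all the numbers are hypothetical, chosen only so that the arithmetic is easy to follow; `total_probability` is an assumed helper name.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(A) = sum_i P(B_i) P(A|B_i) for a complete event group B_1, B_2, ..."""
    if sum(priors) != 1:
        raise ValueError("the P(B_i) of a complete event group must sum to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))

# Hypothetical numbers: P(B_i) is the chance of picking box i,
# P(A|B_i) is the chance of drawing a red ball from box i.
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]
p_a = total_probability(priors, likelihoods)
print(p_a)  # 11/30
```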

Bayes' formula

Under the assumptions of the total probability formula,

$P(B_i|A) = P(AB_i)/P(A) = \frac{P(B_i)P(A|B_i)}{\sum_j P(B_j)P(A|B_j)}$

This is the famous Bayes' formula.

Meaning: first consider $P(B_1), P(B_2), \dots$. These express what people know about the likelihood of the events $B_1, B_2, \dots$ when there is no further information (that is, without knowing whether event $A$ has occurred). With new information (knowing that $A$ has occurred), people have a new estimate of the likelihood of $B_1, B_2, \dots$.

If we regard event $A$ as a "result" and the events $B_1, B_2, \dots$ as the possible "causes" of that result, the total probability formula can be viewed as "inferring the result from the causes", while Bayes' formula does the opposite: it "infers the cause from the result". Given that "result $A$ has occurred", which of the many possible causes produced it? According to Bayes' formula, the likelihood of each cause is proportional to $P(B_i|A)$.
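The "cause from result" reading can be made concrete with a short sketch. The three causes and all probabilities below are hypothetical values picked for illustration, and `bayes_posteriors` is an assumed helper name.

```python
from fractions import Fraction

def bayes_posteriors(priors, likelihoods):
    """Return [P(B_i|A)] from P(B_i) and P(A|B_i) using Bayes' formula."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B_j) P(A|B_j)
    p_a = sum(joint)                                      # denominator: P(A)
    return [j / p_a for j in joint]

# Hypothetical causes B_1, B_2, B_3 with prior probabilities P(B_i),
# and the chance P(A|B_i) that each cause produces the result A.
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]
posteriors = bayes_posteriors(priors, likelihoods)
print([str(p) for p in posteriors])  # ['3/22', '5/11', '9/22']
```

In this made-up case the cause with the largest prior, $B_1$, ends up the least likely once $A$ is known to have occurred, because it rarely produces $A$.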