Bayesian inference and Internet applications (I)

Author: Ruan Yifeng

A year ago, I was translating Paul Graham's "Hackers & Painters".

That book is mostly about the philosophy of technology, but Chapter 8 addresses a very concrete technical question: how to use Bayesian inference to filter spam (English version)?

To be honest, I did not fully understand that chapter at the time. The submission deadline had already passed, and I had no time to chew through probability textbooks, so I gritted my teeth and translated it literally. After submission, the quality of the translation turned out to be acceptable, but I felt very uneasy and made up my mind to truly understand it.

A year later, I read some material on probability theory and found that Bayesian inference is not as difficult as I had thought. On the contrary, its principles are easy to understand and do not even require advanced mathematics.

Below are my study notes. I am not an expert in this field, and mathematics is actually my weak point, so your comments and corrections are welcome; let's learn and improve together.

============================================


I. What Is Bayesian Inference?

Bayesian inference is a statistical method used to estimate some property of a statistical quantity.

It is an application of Bayes' theorem, which the British mathematician Thomas Bayes first proposed in a paper published in 1763.

Bayesian inference differs sharply from other methods of statistical inference. It is built on subjective judgment: you may first estimate a value without any objective evidence, and then revise that estimate in light of observed results. Precisely because it is so subjective, it has been criticized by many statisticians.

Bayesian inference requires a great deal of computation, so for a long time in history it saw little use. Only after the birth of the computer did it receive real attention. People found that many quantities cannot be judged objectively in advance, and that the large datasets of the Internet age, combined with high-speed computing power, make it convenient to verify these quantities. This created the conditions for applying Bayesian inference, and its power has become increasingly apparent.

II. Bayes' Theorem

To understand Bayesian inference, we must first understand Bayes' theorem. The latter is really just the formula for computing the "conditional probability".

The so-called conditional probability is the probability of event A given that event B has occurred, written P(A|B).

From the Venn diagram, we can see clearly that once event B has occurred, the probability of event A is P(A∩B) divided by P(B).

Therefore,

P(A|B) = P(A∩B) / P(B)

Likewise,

P(B|A) = P(A∩B) / P(A)

So,

P(A|B) P(B) = P(A∩B) = P(B|A) P(A)

That is,

P(A|B) = P(B|A) P(A) / P(B)

This is the formula for calculating the conditional probability.
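As a sanity check, the definition can be verified by brute-force counting over a small discrete sample space. The two-dice events below are my own illustrative choices, not part of the original text.

```python
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

A = {o for o in outcomes if o[0] + o[1] == 8}  # event A: the sum is 8
B = {o for o in outcomes if o[0] == 3}         # event B: the first die shows 3

p_b = len(B) / len(outcomes)            # P(B) = 6/36
p_a_and_b = len(A & B) / len(outcomes)  # P(A∩B) = 1/36, only the roll (3, 5)
p_a_given_b = p_a_and_b / p_b           # P(A|B) = 1/6
```

Counting agrees with the formula: among the six rolls where the first die shows 3, exactly one has sum 8, so P(A|B) = 1/6.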

III. The Total Probability Formula

Besides the conditional probability formula, we also need to derive the total probability formula here, because it will be used later.

Assume that the sample space S is the union of two events A and A'.

In the figure, the red area is event A and the green area is event A'; together they constitute the sample space S.

In this case, event B can be divided into two parts.

That is,

P(B) = P(B∩A) + P(B∩A')

From the derivation in the previous section, we know that

P(B∩A) = P(B|A) P(A)

P(B∩A') = P(B|A') P(A')

So,

P(B) = P(B|A) P(A) + P(B|A') P(A')

This is the total probability formula. It says that if A and A' form a partition of the sample space, then the probability of event B equals the sum, over A and A', of the probability of each part multiplied by the corresponding conditional probability of B.

Substituting this formula into the conditional probability formula of the previous section gives another form of the conditional probability:

P(A|B) = P(B|A) P(A) / [ P(B|A) P(A) + P(B|A') P(A') ]
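Putting the two formulas together gives a small helper function; the function name and argument names here are my own, sketched for illustration.

```python
def bayes(p_a, p_b_given_a, p_b_given_not_a):
    """P(A|B), with P(B) expanded by the total probability formula.

    p_a             -- prior probability P(A); P(A') = 1 - p_a
    p_b_given_a     -- conditional probability P(B|A)
    p_b_given_not_a -- conditional probability P(B|A')
    """
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b
```

For instance, with P(A) = 0.2, P(B|A) = 0.9 and P(B|A') = 0.1, the posterior is 0.18 / 0.26, roughly 0.69.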

IV. The Meaning of Bayesian Inference

Rearranging the conditional probability formula gives the following form:

P(A|B) = P(A) × P(B|A) / P(B)

We call P(A) the "prior probability": our judgment of how likely event A is before event B occurs. P(A|B) is called the "posterior probability": our re-evaluation of how likely event A is after event B has occurred. P(B|A)/P(B) is called the "likelihood function"; it is an adjustment factor that brings the estimated probability closer to the true probability.

Therefore, the conditional probability formula can be read as:

Posterior probability = Prior probability × Adjustment factor

This is what Bayesian inference means. We first estimate a "prior probability", then factor in the result of an experiment to see whether the experiment strengthens or weakens that prior, and thereby obtain a "posterior probability" that is closer to the truth.

Here, if the "likelihood function" P(B|A)/P(B) > 1, the "prior probability" is strengthened and event A becomes more likely; if the likelihood function equals 1, event B gives no help in judging the likelihood of event A; if the likelihood function < 1, the "prior probability" is weakened and event A becomes less likely.
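The three cases can be checked numerically. The probabilities below are hypothetical, chosen only so that the adjustment factor comes out above, equal to, and below 1.

```python
def adjusted(prior, p_b_given_a, p_b):
    """Posterior = prior × adjustment factor P(B|A)/P(B)."""
    return prior * (p_b_given_a / p_b)

# Factor 0.8/0.5 = 1.6 > 1: the evidence strengthens the prior.
assert adjusted(0.3, 0.8, 0.5) > 0.3
# Factor 0.5/0.5 = 1: the evidence is uninformative; posterior equals prior.
assert adjusted(0.3, 0.5, 0.5) == 0.3
# Factor 0.2/0.5 = 0.4 < 1: the evidence weakens the prior.
assert adjusted(0.3, 0.2, 0.5) < 0.3
```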

V. [Example] The Fruit Candy Problem

To deepen our understanding of Bayesian inference, let's look at two examples.

The first example. There are two identical bowls: Bowl 1 contains 30 fruit candies and 10 chocolates, while Bowl 2 contains 20 fruit candies and 20 chocolates. Now we pick a bowl at random and draw one candy from it, which turns out to be a fruit candy. How likely is it that this fruit candy came from Bowl 1?

Let H1 denote Bowl 1 and H2 denote Bowl 2. Since the two bowls are identical, P(H1) = P(H2); that is, before any candy is drawn, the two bowls are equally likely to be chosen. So P(H1) = 0.5. We call this the "prior probability": before the experiment, the probability that the candy comes from Bowl 1 is 0.5.

Let E denote drawing a fruit candy. The question then becomes: given that E has occurred, how likely is it that the candy came from Bowl 1, i.e., what is P(H1|E)? We call this the "posterior probability": the revision of P(H1) made after event E occurs.

According to the conditional probability formula,

P(H1|E) = P(H1) × P(E|H1) / P(E)

We already know that P(H1) equals 0.5 and that P(E|H1), the probability of drawing a fruit candy from Bowl 1, equals 0.75. The remaining unknown, P(E), can be obtained from the total probability formula:

P(E) = P(E|H1) P(H1) + P(E|H2) P(H2)

So,

P(E) = 0.75 × 0.5 + 0.5 × 0.5 = 0.625

Substituting the numbers into the equation above gives

P(H1|E) = 0.5 × 0.75 / 0.625 = 0.6

This shows that the probability that the candy came from Bowl 1 is 0.6. In other words, after the fruit candy is drawn, the likelihood of H1 has been strengthened.
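The arithmetic above can be replayed in a few lines (the variable names are mine):

```python
p_h1 = 0.5              # prior: both bowls are equally likely
p_h2 = 0.5
p_e_given_h1 = 30 / 40  # P(fruit candy | Bowl 1) = 0.75
p_e_given_h2 = 20 / 40  # P(fruit candy | Bowl 2) = 0.5

# Total probability formula for P(E).
p_e = p_e_given_h1 * p_h1 + p_e_given_h2 * p_h2  # 0.625

# Bayes' theorem: posterior probability of Bowl 1.
p_h1_given_e = p_h1 * p_e_given_h1 / p_e         # 0.6
```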

VI. [Example] The False Positive Problem

The second example is a common medical problem that is closely related to real life.

It is known that the incidence of a certain disease is 0.001; that is, 1 person in 1000 has it. There is a reagent that can test whether a person has the disease. Its accuracy is 0.99: when the person really is ill, it shows positive 99% of the time. Its false positive rate is 5%: when the person is not ill, it still shows positive 5% of the time. Now a patient's test result is positive. How likely is it that the patient actually has the disease?

Let event A denote having the disease; then P(A) = 0.001. This is the "prior probability": our estimate of the incidence before the test is performed. Let event B denote testing positive; what we want to compute is P(A|B). This is the "posterior probability": our revised estimate of the probability of illness after the test.

According to the conditional probability formula,

P(A|B) = P(A) × P(B|A) / P(B)

Rewriting the denominator with the total probability formula,

P(A|B) = P(A) P(B|A) / [ P(B|A) P(A) + P(B|A') P(A') ]

Plugging in the numbers,

P(A|B) = 0.001 × 0.99 / (0.99 × 0.001 + 0.05 × 0.999) ≈ 0.019

We get a surprising result: P(A|B) is approximately 0.019. In other words, even with a positive test result, the probability that the patient has the disease rises only from 0.1% to about 2%. This is the so-called "false positive" problem: a positive result by itself is not sufficient evidence that the patient is ill.

Why is this? Why is a test whose accuracy is as high as 99% less than 2% reliable? The answer is its high false positive rate. (Exercise: if the false positive rate drops from 5% to 1%, what is the probability that a patient who tests positive is actually ill?)
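The computation, including the exercise's 1% variant, can be sketched as follows (the function name is mine):

```python
def p_ill_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(ill | positive) by Bayes' theorem, with the denominator
    expanded by the total probability formula."""
    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1 - prevalence))
    return sensitivity * prevalence / p_positive

print(p_ill_given_positive(0.001, 0.99, 0.05))  # ≈ 0.019, the ~2% in the text
print(p_ill_given_positive(0.001, 0.99, 0.01))  # exercise: ≈ 0.09
```

Even at a 1% false positive rate, a positive result raises the probability of illness only to about 9%, because the disease is so rare to begin with.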

If you are interested, you can also work out the "false negative" problem: how likely is a patient whose test result is negative to be ill nonetheless? Then ask yourself: which poses the greater risk in medical testing, the "false positive" or the "false negative"?

==========================================

That covers the principles of Bayesian inference for today. In the next installment, I will introduce how to use Bayesian inference to filter spam.

(To be continued)
