Summary of probability theory knowledge in Machine Learning

Source: Internet
Author: User

I. Introduction

Recently I have written many learning notes about machine learning, and they often involve probability theory. Here I summarize and review the relevant knowledge for convenient reference and share it with other bloggers; I hope that with the help of this post you will read machine learning documents more comfortably! I will only cover the frequently used probability theory knowledge points, mainly the basic concepts, since the probability theory in machine learning is difficult to follow without them. The post will be updated occasionally in the future; I hope you will leave comments.

II. Bayes Formula

Generally, the probability P(A) of event A, as estimated before the experiment, is called the prior probability. If another event B is related to event A, that is, A and B are not independent of each other, then once B is known to have occurred, the probability of A should be re-estimated as P(A | B). This is called the conditional probability, or the probability assumed after the experiment, that is, the posterior probability.

Formula 1:

P(A | B) = P(AB) / P(B)

Then introduce the total probability formula: suppose events B1, B2, ..., Bn are mutually incompatible (that is, no two of them can occur simultaneously), one of them must occur, and P(Bi) > 0 (i = 1, 2, ..., n). If the probabilities P(Bi) and the conditional probabilities P(A | Bi) of event A are known, then the probability of event A is:

P(A) = P(B1)P(A | B1) + P(B2)P(A | B2) + ... + P(Bn)P(A | Bn)

This is the full probability formula.

According to the probability multiplication theorem:

P(ABi) = P(Bi)P(A | Bi) = P(A)P(Bi | A)

We can get:

P(Bi | A) = P(ABi) / P(A)

So:

P(Bi | A) = P(Bi)P(A | Bi) / P(A)

Substituting the total probability formula described above for P(A), we get the legendary Bayesian formula:

P(Bi | A) = P(Bi)P(A | Bi) / [P(B1)P(A | B1) + P(B2)P(A | B2) + ... + P(Bn)P(A | Bn)]

These formulas and theorems are basic and important throughout machine learning!
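As a quick illustration, here is a minimal Python sketch of the total probability and Bayes formulas. The disease/test numbers are made up for illustration only:

```python
# Hypothetical example: D = "has disease", T = "test positive".
# All numbers below are illustrative assumptions, not real data.
p_d = 0.01              # prior P(D)
p_t_given_d = 0.99      # P(T | D): true positive rate
p_t_given_not_d = 0.05  # P(T | not D): false positive rate

# Total probability formula: P(T) = P(D)P(T|D) + P(not D)P(T|not D)
p_t = p_d * p_t_given_d + (1 - p_d) * p_t_given_not_d

# Bayes formula: posterior P(D | T) = P(D)P(T|D) / P(T)
p_d_given_t = p_d * p_t_given_d / p_t
print(p_d_given_t)  # the posterior is much smaller than P(T|D)
```

Note how a rare event (prior 1%) stays fairly unlikely even after a positive test; this is exactly what re-estimating the prior via Bayes captures.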

III. Common Discrete Distributions

  1. "0-1" distribution: if the random variable X can take only the two values 0 and 1, with probability function P(X = k) = p^k (1 - p)^(1 - k), k = 0, 1, the distribution is called the "0-1" distribution or two-point distribution, and p is the distribution parameter.
  2. Binomial distribution: if the possible values of the random variable X are 0, 1, 2, ..., n, and the probability function is

     P(X = k) = C(n, k) p^k (1 - p)^(n - k), k = 0, 1, ..., n

     this distribution is called the binomial distribution. It contains two parameters, n and p; a variable X following the binomial distribution is recorded as X ~ B(n, p).

  3. Poisson distribution: if the possible values of the random variable X are all non-negative integers, and the probability function is

     P(X = k) = λ^k e^(-λ) / k!, k = 0, 1, 2, ...

     where λ > 0 is a constant, this distribution is called the Poisson distribution. It contains one parameter λ and is recorded as P(λ); a random variable X following the Poisson distribution is recorded as X ~ P(λ).
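The two probability functions above can be written directly from their formulas. A small sketch using only the standard library (the parameter values in the check are arbitrary):

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    # P(X = k) = C(n, k) p^k (1 - p)^(n - k)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    # P(X = k) = lam^k e^(-lam) / k!
    return lam ** k * exp(-lam) / factorial(k)

# A probability function must sum to 1 over the support
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
print(total)
```

The same distributions are available ready-made as `scipy.stats.binom` and `scipy.stats.poisson` if SciPy is installed.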

IV. Distribution Functions of Random Variables

Let x be any real number. The probability that the random variable X takes a value not greater than x, that is, the probability of the event X ≤ x, recorded as F(x) = P(X ≤ x), is called the probability distribution function (or distribution function) of X. Note that it is different from the probability function mentioned above.

If the distribution function F(x) of the random variable X is known, the probability that X falls into the half-open interval (x1, x2] is: P(x1 < X ≤ x2) = F(x2) − F(x1).
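A toy example of this interval formula, using a fair six-sided die (chosen here purely for illustration):

```python
# Distribution function of a fair die: F(x) = P(X <= x)
def die_cdf(x):
    # count the faces 1..6 that are <= x, each with probability 1/6
    return sum(1 for face in range(1, 7) if face <= x) / 6

# P(2 < X <= 4) = F(4) - F(2), i.e. the probability of rolling 3 or 4
p = die_cdf(4) - die_cdf(2)
print(p)  # 2 of the 6 faces fall in (2, 4]
```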

V. Probability Density of Continuous Random Variables

The probability density of a continuous random variable is the derivative of its distribution function: f(x) = F′(x). Conversely, F(x) is the integral of f(t) over (−∞, x].
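This derivative relation is easy to check numerically. The sketch below uses the standard normal distribution (introduced formally in section IX), whose distribution function can be written with the error function:

```python
from math import erf, exp, pi, sqrt

def norm_cdf(x):
    # standard normal distribution function F(x), via the error function
    return 0.5 * (1 + erf(x / sqrt(2)))

def norm_pdf(x):
    # standard normal density f(x)
    return exp(-x * x / 2) / sqrt(2 * pi)

# f(x) should match the numerical derivative of F(x)
h = 1e-6
x = 0.7
approx = (norm_cdf(x + h) - norm_cdf(x - h)) / (2 * h)
print(approx, norm_pdf(x))  # the two values agree closely
```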

VI. Mathematical Expectation of Random Variables

If the random variable X can take only finitely many values:

x1, x2, ..., xn

with probabilities:

p1, p2, ..., pn

then the mathematical expectation is:

E(X) = x1 p1 + x2 p2 + ... + xn pn

If the probability density of the continuous random variable X is f(x), the mathematical expectation of the continuous random variable is:

E(X) = ∫ x f(x) dx, integrated over (−∞, +∞)

The mathematical expectation of a constant is equal to the constant itself.

Theorem: the mathematical expectation of the product of two independent random variables equals the product of their mathematical expectations. The proof is as follows:

For discrete random variables X and Y:

E(XY) = Σi Σj xi yj P(X = xi) P(Y = yj) = [Σi xi P(X = xi)] [Σj yj P(Y = yj)] = E(X) E(Y)

For continuous random variables X and Y with densities f(x) and g(y):

E(XY) = ∫∫ x y f(x) g(y) dx dy = [∫ x f(x) dx] [∫ y g(y) dy] = E(X) E(Y)
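The discrete case of the theorem can be verified exactly for any pair of independent distributions. The two distributions below are arbitrary illustrative choices:

```python
# (value, probability) tables for two independent discrete variables
xs = [(0, 0.3), (1, 0.5), (2, 0.2)]  # distribution of X
ys = [(1, 0.6), (3, 0.4)]            # distribution of Y

e_x = sum(x * p for x, p in xs)
e_y = sum(y * q for y, q in ys)

# Independence: P(X = x, Y = y) = P(X = x) * P(Y = y)
e_xy = sum(x * y * p * q for x, p in xs for y, q in ys)

print(e_xy, e_x * e_y)  # equal, as the theorem states
```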

VII. Variance and Standard Deviation

The variance of random variable X is written D(X) and defined:

D(X) = E{[X − E(X)]²}

The following is a useful formula (derived using the property that the mathematical expectation of a constant equals the constant itself):

D(X) = E(X²) − [E(X)]²

In short, the variance of a random variable equals the expectation of its square minus the square of its expectation.

Standard deviation is the arithmetic square root of variance.

The variance of the constant is 0.
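A quick check that the definition and the shortcut formula agree, on an arbitrary made-up discrete distribution:

```python
# (value, probability) table for an illustrative discrete variable
values = [(1, 0.2), (2, 0.5), (4, 0.3)]

e_x = sum(x * p for x, p in values)
e_x2 = sum(x * x * p for x, p in values)

# Definition: D(X) = E{[X - E(X)]^2}
var_def = sum((x - e_x) ** 2 * p for x, p in values)
# Shortcut: D(X) = E(X^2) - [E(X)]^2
var_short = e_x2 - e_x ** 2

print(var_def, var_short)  # same number by both routes
```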

 

VIII. Covariance and Correlation Coefficient

The covariance of random variable X and random variable Y is recorded as Cov(X, Y) and defined:

Cov(X, Y) = E{[X − E(X)][Y − E(Y)]}

Further derivation gives:

Cov(X, Y) = E(XY) − E(X)E(Y)

Because the expectation of the product of two independent random variables equals the product of their expectations, when two random variables are independent it is easy to see that their covariance is 0.

The correlation coefficient of two random variables X and Y is:

ρ = Cov(X, Y) / [√D(X) √D(Y)]

The absolute values of correlation coefficients of two random variables are not greater than 1.

If and only if there is an exact linear relationship between the random variables Y and X:

Y = aX + b (a ≠ 0)

does the absolute value of the correlation coefficient equal 1, with ρ = 1 when a > 0 and ρ = −1 when a < 0.
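A sketch of these definitions, using an exactly linear Y = aX + b (with a < 0, so we expect ρ = −1; the distribution of X is an arbitrary illustrative choice):

```python
# (value, probability) table for X, uniform over four values
xs = [(1, 0.25), (2, 0.25), (3, 0.25), (4, 0.25)]
a, b = -2, 5  # Y = aX + b, an exact linear relationship

e_x = sum(x * p for x, p in xs)
e_y = sum((a * x + b) * p for x, p in xs)

# Cov(X, Y) = E{[X - E(X)][Y - E(Y)]}
cov = sum((x - e_x) * (a * x + b - e_y) * p for x, p in xs)
d_x = sum((x - e_x) ** 2 * p for x, p in xs)
d_y = sum((a * x + b - e_y) ** 2 * p for x, p in xs)

rho = cov / (d_x ** 0.5 * d_y ** 0.5)
print(rho)  # -1 for a negative linear relationship
```

For sample data, `numpy.cov` and `numpy.corrcoef` compute the same quantities.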

IX. Normal Distribution

A normal distribution is also called a Gaussian distribution. Suppose the probability density of a continuous random variable X is:

f(x) = 1 / (√(2π) σ) · exp(−(x − μ)² / (2σ²))

where μ and σ > 0 are constants; such a distribution is a normal distribution.

The normal distribution contains two parameters, μ and σ > 0. μ equals the mathematical expectation of the normal distribution, and σ equals its standard deviation; a random variable X following the normal distribution is recorded:

X ~ N(μ, σ²)

Theorem: if the random variable X follows a normal distribution N(μ, σ²), then the linear function Y = a + bX (b ≠ 0) of X also follows a normal distribution, and Y ~ N(a + bμ, b²σ²).
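The theorem can be checked at the density level: by change of variables, the density of Y = a + bX is f_X((y − a)/b) / |b|, and this equals the N(a + bμ, b²σ²) density. A sketch with arbitrary illustrative parameters:

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    # f(x) = 1 / (sqrt(2 pi) sigma) * exp(-(x - mu)^2 / (2 sigma^2))
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)

mu, sigma = 1.0, 2.0   # X ~ N(mu, sigma^2)
a, b = 3.0, -0.5       # Y = a + b X, b != 0

y = 2.2  # arbitrary evaluation point
# density of Y via change of variables from the density of X
lhs = normal_pdf((y - a) / b, mu, sigma) / abs(b)
# density of N(a + b*mu, (b*sigma)^2) evaluated directly
rhs = normal_pdf(y, a + b * mu, abs(b) * sigma)
print(lhs, rhs)  # identical: Y is indeed normal
```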

 

Those are the important probability theory knowledge points I have summarized so far; they will be further supplemented in the future!
