People ask: what direction are you studying?
I answer quietly: data mining.
They respond with a knowing: oh...
They must have seen that I am a complete beginner.
Since I clearly am a beginner, this blog exists simply so that I can take better notes, and hopefully attract guidance from the experts so that I can improve faster.
So, let's begin:
Data mining... which means statistics and probability, matrices, machine learning, and so on.
There is a mountain of material to read. Ever since I first came into contact with data mining, the formulas have seemed endless; the one that feels most familiar is Bayes' formula.
So let's start with the simplest thing: probability.
First, two terms need to be explained clearly: prior probability and posterior probability.
The prior probability is the probability obtained from past experience and analysis, for example via the total probability formula; it often plays the role of the "cause" in problems that reason from cause to effect.
The posterior probability is the probability re-estimated after information about the "effect" has been obtained; it is the "cause" in problems that reason from the effect back to the cause.
In my own understanding, the prior probability can be computed from known information, while the posterior probability is a correction of the prior: Bayes' formula is what revises the prior probability.
That gives a rough impression. Next, let's do some math.
Prior probability
In Bayesian statistics, the prior probability distribution of an uncertain quantity p expresses the uncertainty about p before any new information or evidence is obtained. For example, p could be the probability of successfully snatching a train ticket before the ticket rush begins. The prior is a quantification of uncertainty rather than of randomness, and p may be a parameter or a latent variable.
A prior probability often relies on subjective, experience-based estimation, that is, on inference from existing knowledge. When Bayes' theorem is applied, the prior probability is multiplied by the likelihood function and then normalized to obtain the posterior probability distribution, which is the conditional distribution of the uncertain quantity given the observed data.
Likelihood function
The likelihood is a function of the parameters of a statistical model: the argument of this function is the model parameter. Given an observed outcome x, the likelihood of the parameter θ is the probability of observing that outcome under the given parameter value: L(θ|x) = P(x|θ). In other words, the likelihood takes the conditional distribution of the observation x and views it as a function of the parameter.
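As a concrete illustration (the coin-flip model and the numbers here are my own assumptions, not from the original), a small Python snippet can evaluate L(θ|x) = P(x|θ) at several values of θ:

```python
from math import comb

def likelihood(theta, x=7, n=10):
    """L(theta | x) = P(x | theta): probability of x heads in n coin flips."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# Evaluate the likelihood at a few candidate parameter values.
for theta in (0.3, 0.5, 0.7, 0.9):
    print(theta, round(likelihood(theta), 4))
# theta = 0.7 (= x/n) gives the largest value: the maximum-likelihood estimate.
```

Note that the data x is fixed here and θ varies; that is exactly what distinguishes a likelihood from an ordinary probability distribution.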
Posterior probability
The posterior probability of a random event, or of an uncertain proposition, is its conditional probability after the relevant evidence or background has been taken into account. The posterior probability distribution treats the unknown quantity as a random variable and gives its conditional distribution based on the information obtained from an experiment or survey. "Posterior" here means that the relevant evidence has already been examined and is available.
The posterior probability is the probability of the parameter θ given the evidence x: P(θ|x). Compare this with the likelihood function, which is the probability distribution of the evidence x given the parameter: P(x|θ).
The two have the following relationship:
Let P(θ) denote the prior probability distribution of θ, and let P(x|θ) denote the likelihood of the observation x. The posterior probability is then defined as:
P(θ|x) = P(x|θ) P(θ) / P(x)
Since the denominator P(x) is the same for every θ, this reduces to: posterior probability ∝ likelihood × prior probability.
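To see the update mechanically, here is a minimal Python sketch of "posterior ∝ likelihood × prior" on a discrete grid of θ values (the uniform prior and the coin-flip data are my own illustrative assumptions, continuing the snippet above):

```python
from math import comb

# Candidate parameter values theta on a discrete grid (illustrative).
thetas = [i / 100 for i in range(1, 100)]

# Prior P(theta): uniform over the grid (an assumption of this sketch).
prior = [1 / len(thetas)] * len(thetas)

# Likelihood P(x | theta): 7 heads in 10 flips, as before.
x, n = 7, 10
likelihood = [comb(n, x) * t**x * (1 - t)**(n - x) for t in thetas]

# Posterior = likelihood * prior, normalized so it sums to 1;
# the normalizing constant plays the role of P(x).
unnormalized = [l * p for l, p in zip(likelihood, prior)]
evidence = sum(unnormalized)
posterior = [u / evidence for u in unnormalized]

# The posterior peaks near theta = 0.7.
best = max(zip(posterior, thetas))
print("posterior mode:", best[1])
```

Because the prior is uniform, the posterior here is simply the normalized likelihood; a non-uniform prior would pull the peak toward whatever θ the prior favors.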
Next, let's use an example to illustrate.
A bag contains 3 red balls and 2 white balls, and balls are drawn without replacement. Find: (1) the probability that the first draw is red (event A); (2) the probability that the second draw is red (event B); (3) given that the second draw is red, the probability that the first draw was red.
(1) This is just a prior probability calculation: P(A) = 3/5.
(2) By the total probability formula: P(B) = P(AB) + P(ĀB) = P(B|A)P(A) + P(B|Ā)P(Ā) = (2/4)(3/5) + (3/4)(2/5) = 3/5.
(3) This asks for P(A|B), a typical posterior probability: P(A|B) = P(AB)/P(B) = P(B|A)P(A)/P(B) = ((2/4)(3/5))/(3/5) = 1/2.
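As a sanity check on all three answers, here is a short simulation sketch of my own (not part of the textbook solution) that draws two balls without replacement many times:

```python
import random

TRIALS = 100_000
first_red = second_red = both_red = 0

for _ in range(TRIALS):
    bag = ["red"] * 3 + ["white"] * 2
    random.shuffle(bag)          # drawing without replacement:
    a = bag[0] == "red"          # first ball drawn
    b = bag[1] == "red"          # second ball drawn
    first_red += a
    second_red += b
    both_red += a and b

print("P(A)   ≈", first_red / TRIALS)      # expect 3/5 = 0.6
print("P(B)   ≈", second_red / TRIALS)     # expect 3/5 = 0.6
print("P(A|B) ≈", both_red / second_red)   # expect 1/2 = 0.5
```

The conditional estimate P(A|B) is just the fraction of "second draw red" trials in which the first draw was also red, which matches the 1/2 computed above.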
For this first post, I'll just write down these things I read today.