The learning algorithms in the previous lessons all follow the same idea: given a data set, the learning algorithm determines the best hypothesis h for that data, and h is then used to produce results for new data. This lesson introduces a different way of thinking. Its core idea is to model each class directly: fit the parameters of a probability distribution for each class, use the fitted distributions to compute the probability of a new example under each class, and assign the example to the class with the highest probability.
The essence of a generative learning algorithm is the use of conditional probabilities. For binary classification, this approach is also known as discriminant analysis. From the given data set we estimate p(y) and p(x|y), and then obtain the posterior by Bayes' theorem:
$$p(y \mid x) = \frac{p(x \mid y)\, p(y)}{p(x)},$$
where
$$p(x) = p(x \mid y=1)\, p(y=1) + p(x \mid y=0)\, p(y=0).$$
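To find the class with the highest probability for a given x, note that p(x) does not depend on y, so
$$\arg\max_y p(y \mid x) = \arg\max_y \frac{p(x \mid y)\, p(y)}{p(x)} = \arg\max_y p(x \mid y)\, p(y).$$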
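The Bayes' theorem step can be made concrete with a small sketch. It assumes one real-valued feature and Gaussian class-conditional densities with a shared standard deviation; the function names and the numbers in the usage note are illustrative and do not come from the lesson.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def class_joints(x, prior_y1, mean0, mean1, std):
    """Return (p(x|y=0) p(y=0), p(x|y=1) p(y=1))."""
    joint0 = gaussian_pdf(x, mean0, std) * (1.0 - prior_y1)
    joint1 = gaussian_pdf(x, mean1, std) * prior_y1
    return joint0, joint1

def posterior_y1(x, prior_y1, mean0, mean1, std):
    """p(y=1 | x) by Bayes' theorem; the denominator is p(x)."""
    joint0, joint1 = class_joints(x, prior_y1, mean0, mean1, std)
    return joint1 / (joint0 + joint1)

def predict(x, prior_y1, mean0, mean1, std):
    """Pick the class maximizing p(x|y) p(y); p(x) is common to both classes."""
    joint0, joint1 = class_joints(x, prior_y1, mean0, mean1, std)
    return 1 if joint1 > joint0 else 0
```

For example, with equal priors, class means 0.0 and 2.0 and a standard deviation of 1.0, a point at x = 1.0 lies halfway between the two means, so `posterior_y1(1.0, 0.5, 0.0, 2.0, 1.0)` comes out at one half.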
1. Gaussian discriminant analysis (GDA)
The density of the multivariate normal distribution is not expanded in detail here; a separate blog series on machine learning will cover it in the future. GDA is used when the features x are continuous variables. The GDA model is:
$$y \sim \mathrm{Bernoulli}(\phi)$$
$$x \mid y=0 \sim \mathcal{N}(\mu_0, \Sigma)$$
$$x \mid y=1 \sim \mathcal{N}(\mu_1, \Sigma)$$
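So their probability distributions are (with n the dimension of x):
$$p(y) = \phi^{y} (1-\phi)^{1-y}$$
$$p(x \mid y=0) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_0)^T \Sigma^{-1} (x-\mu_0)\right)$$
$$p(x \mid y=1) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_1)^T \Sigma^{-1} (x-\mu_1)\right)$$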
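The parameters φ, Σ, μ0 and μ1 of these distributions can be estimated by maximum likelihood. For m training examples, the log-likelihood function is
$$\ell(\phi, \mu_0, \mu_1, \Sigma) = \log \prod_{i=1}^{m} p(x^{(i)}, y^{(i)}; \phi, \mu_0, \mu_1, \Sigma) = \log \prod_{i=1}^{m} p(x^{(i)} \mid y^{(i)}; \mu_0, \mu_1, \Sigma)\, p(y^{(i)}; \phi).$$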
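Maximizing the likelihood gives the following parameter estimates, where m is the number of training examples and 1{·} equals 1 when its argument is true and 0 otherwise:
$$\phi = \frac{1}{m} \sum_{i=1}^{m} 1\{y^{(i)} = 1\}$$
$$\mu_0 = \frac{\sum_{i=1}^{m} 1\{y^{(i)} = 0\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{y^{(i)} = 0\}}$$
$$\mu_1 = \frac{\sum_{i=1}^{m} 1\{y^{(i)} = 1\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{y^{(i)} = 1\}}$$
$$\Sigma = \frac{1}{m} \sum_{i=1}^{m} \left(x^{(i)} - \mu_{y^{(i)}}\right)\left(x^{(i)} - \mu_{y^{(i)}}\right)^T$$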
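The maximum likelihood estimates for GDA have a closed form, so fitting takes only a few lines. The following is a minimal NumPy sketch, assuming 0/1 labels, at least one example per class, and an invertible estimated covariance matrix; the names `fit_gda` and `predict_gda` are hypothetical and not taken from the lesson.

```python
import numpy as np

def fit_gda(X, y):
    """Closed-form maximum likelihood estimates for GDA with a shared covariance.

    X: (m, n) array of real-valued features; y: (m,) array of 0/1 labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).astype(int)
    m = X.shape[0]
    phi = y.mean()                    # fraction of examples with y = 1
    mu0 = X[y == 0].mean(axis=0)      # mean of the class-0 examples
    mu1 = X[y == 1].mean(axis=0)      # mean of the class-1 examples
    # Subtract each example's own class mean, then average the outer products.
    centered = X - np.where(y[:, None] == 1, mu1, mu0)
    sigma = centered.T @ centered / m
    return phi, mu0, mu1, sigma

def predict_gda(X, phi, mu0, mu1, sigma):
    """Assign each row of X to the class with the larger p(x|y) p(y)."""
    X = np.asarray(X, dtype=float)
    sigma_inv = np.linalg.inv(sigma)

    def log_joint(mu, prior):
        d = X - mu
        # The Gaussian normalizing constant is shared by both classes and dropped.
        return -0.5 * np.einsum("ij,jk,ik->i", d, sigma_inv, d) + np.log(prior)

    return (log_joint(mu1, phi) > log_joint(mu0, 1.0 - phi)).astype(int)
```

Working with log probabilities avoids numerical underflow when x has many dimensions, and sharing one covariance matrix between the two classes is what makes the decision boundary linear.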
2. Comparing GDA with logistic regression
Both algorithms address classification problems. If p(x|y) follows a multivariate Gaussian distribution, then p(y|x) can be shown to take the form of a logistic function; the converse does not hold. This shows that GDA makes stronger modeling assumptions, and when those assumptions hold it needs less training data. When p(x|y) really is Gaussian, GDA is at least as effective as logistic regression on large data sets, and it is generally expected to do better on small data sets as well. Logistic regression is more robust: when the data clearly does not follow a Gaussian distribution, it performs better than GDA. For this reason, logistic regression is used more often in practice.
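The claim that a Gaussian p(x|y) with a shared covariance leads to a logistic p(y|x) can be checked numerically. Expanding the log-odds of the two classes gives a linear function of x with θ = Σ⁻¹(μ1 − μ0) and θ0 = ½(μ0ᵀΣ⁻¹μ0 − μ1ᵀΣ⁻¹μ1) + log(φ/(1 − φ)). The sketch below compares the posterior computed directly from Bayes' theorem with the sigmoid of that linear function; the function names and parameter values are illustrative assumptions.

```python
import numpy as np

def gda_to_logistic(phi, mu0, mu1, sigma):
    """Logistic parameters (theta, theta0) implied by the GDA parameters."""
    sigma_inv = np.linalg.inv(sigma)
    theta = sigma_inv @ (mu1 - mu0)
    theta0 = (0.5 * (mu0 @ sigma_inv @ mu0 - mu1 @ sigma_inv @ mu1)
              + np.log(phi / (1.0 - phi)))
    return theta, theta0

def posterior_bayes(x, phi, mu0, mu1, sigma):
    """p(y=1 | x) computed directly from Bayes' theorem."""
    sigma_inv = np.linalg.inv(sigma)

    def unnormalized(mu):
        d = x - mu
        # The shared Gaussian normalizing constant cancels in the ratio.
        return np.exp(-0.5 * d @ sigma_inv @ d)

    joint1 = unnormalized(mu1) * phi
    joint0 = unnormalized(mu0) * (1.0 - phi)
    return joint1 / (joint1 + joint0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative parameter values, not taken from the lesson.
phi = 0.3
mu0 = np.array([0.0, 0.0])
mu1 = np.array([1.0, 2.0])
sigma = np.array([[1.0, 0.2], [0.2, 2.0]])
x = np.array([0.5, -0.3])

theta, theta0 = gda_to_logistic(phi, mu0, mu1, sigma)
difference = abs(posterior_bayes(x, phi, mu0, mu1, sigma) - sigmoid(theta @ x + theta0))
```

Here `difference` is zero up to floating-point rounding, which is the forward direction of the statement. The reverse direction fails because many other choices of p(x|y) give the same logistic form.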
In addition, when x|y=0 ~ Poisson(λ0) and x|y=1 ~ Poisson(λ1) (both belong to the exponential family), p(y|x) also takes the logistic regression form.
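The Poisson case can be worked out in a few lines. The derivation below is a sketch that assumes a single count-valued feature x and writes φ for p(y=1), as in the GDA model; the symbols θ1 and θ0 are introduced here only for illustration.

```latex
\begin{aligned}
p(y=1 \mid x)
  &= \frac{\phi\, \lambda_1^{x} e^{-\lambda_1} / x!}
          {\phi\, \lambda_1^{x} e^{-\lambda_1} / x! + (1-\phi)\, \lambda_0^{x} e^{-\lambda_0} / x!} \\
  &= \frac{1}{1 + \dfrac{1-\phi}{\phi} \left(\dfrac{\lambda_0}{\lambda_1}\right)^{x} e^{\lambda_1 - \lambda_0}} \\
  &= \frac{1}{1 + \exp\left(-(\theta_1 x + \theta_0)\right)},
\qquad
\theta_1 = \log\frac{\lambda_1}{\lambda_0},
\quad
\theta_0 = \lambda_0 - \lambda_1 + \log\frac{\phi}{1-\phi}.
\end{aligned}
```

The x! factors cancel, and what remains in the exponent is linear in x, which is exactly the logistic form.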
Stanford Machine Learning, Lesson 5 notes (1): generative learning algorithms