Stanford "Machine Learning" Lesson 5, Part 1: Generative learning algorithms

Source: Internet
Author: User

In the previous lessons, the learning algorithms worked as follows: given a data set, the algorithm determines the best hypothesis function h for that data, and h is then used to predict results for new data. This lesson introduces a different way of thinking. Its core idea is to model each class directly: fit the parameters of a probability distribution for each class, use the fitted distributions to compute the probability of a new example under each class, and select the class with the highest probability.

The essence of a generative learning algorithm is the use of conditional probabilities. In the binary classification case it is also referred to as discriminant analysis. First estimate p(y) and p(x|y) from the given data set, then apply Bayes' theorem to get

$$p(y \mid x) = \frac{p(x \mid y)\, p(y)}{p(x)}$$

where p(x) = p(x|y=1) p(y=1) + p(x|y=0) p(y=0).

To pick the class with the highest probability, note that p(x) does not depend on y, so

$$\arg\max_y p(y \mid x) = \arg\max_y \frac{p(x \mid y)\, p(y)}{p(x)} = \arg\max_y p(x \mid y)\, p(y)$$
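The decision rule above can be illustrated with a small Python sketch. The function names `posterior` and `predict`, and the numbers used, are made up for illustration and do not come from the lecture; the likelihood values are assumed to be already evaluated at the observed x.

```python
def posterior(prior, likelihood):
    """Return p(y | x) for each class y via Bayes' rule.

    prior[y]      : p(y)
    likelihood[y] : p(x | y) evaluated at the observed x
    """
    # Joint p(x, y) = p(x | y) p(y) for every class.
    joint = {y: likelihood[y] * prior[y] for y in prior}
    # Evidence p(x) = sum over y of p(x | y) p(y).
    evidence = sum(joint.values())
    return {y: joint[y] / evidence for y in joint}


def predict(prior, likelihood):
    """Pick argmax_y p(x | y) p(y); p(x) is not needed for the argmax."""
    return max(prior, key=lambda y: likelihood[y] * prior[y])


# Hypothetical numbers, for illustration only.
prior = {0: 0.7, 1: 0.3}
likelihood = {0: 0.1, 1: 0.5}
post = posterior(prior, likelihood)
print(post, predict(prior, likelihood))
```

Even though class 0 has the larger prior here, the larger likelihood of class 1 makes it the predicted class.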

1, Gaussian discriminant analysis (GDA)

The properties of the multivariate normal distribution are not covered in detail here; a separate blog series on machine learning will be opened for that in the future. Gaussian discriminant analysis applies to continuous-valued features x. The GDA model is:

y ~ Bernoulli(φ)

x|y=0 ~ N(μ0, Σ)

x|y=1 ~ N(μ1, Σ)

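So their probability distributions are (n is the dimension of x):

$$p(y) = \phi^{y} (1 - \phi)^{1 - y}$$

$$p(x \mid y = 0) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu_0)^T \Sigma^{-1} (x - \mu_0)\right)$$

$$p(x \mid y = 1) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu_1)^T \Sigma^{-1} (x - \mu_1)\right)$$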

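The parameters φ, Σ, μ0 and μ1 of these distributions can be estimated by maximum likelihood. For m training examples (x^(i), y^(i)), the log of the likelihood function is

$$\ell(\phi, \mu_0, \mu_1, \Sigma) = \log \prod_{i=1}^{m} p(x^{(i)}, y^{(i)}; \phi, \mu_0, \mu_1, \Sigma) = \log \prod_{i=1}^{m} p(x^{(i)} \mid y^{(i)}; \mu_0, \mu_1, \Sigma)\, p(y^{(i)}; \phi)$$

Maximizing the likelihood gives the following estimates:

$$\phi = \frac{1}{m} \sum_{i=1}^{m} 1\{y^{(i)} = 1\}$$

$$\mu_0 = \frac{\sum_{i=1}^{m} 1\{y^{(i)} = 0\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{y^{(i)} = 0\}}, \qquad \mu_1 = \frac{\sum_{i=1}^{m} 1\{y^{(i)} = 1\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{y^{(i)} = 1\}}$$

$$\Sigma = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)} - \mu_{y^{(i)}})(x^{(i)} - \mu_{y^{(i)}})^T$$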
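As a rough illustration of maximum likelihood fitting for this model, here is a minimal Python sketch. It assumes a scalar feature x, so the shared covariance Σ reduces to a single shared variance; the names `fit_gda_1d`, `gaussian_pdf` and `predict_gda_1d` and the toy data are hypothetical, not taken from the lecture.

```python
import math


def fit_gda_1d(xs, ys):
    """Maximum likelihood estimates for 1-D GDA with a shared variance."""
    m = len(xs)
    n1 = sum(ys)          # number of examples with y = 1
    n0 = m - n1           # number of examples with y = 0
    phi = n1 / m
    mu0 = sum(x for x, y in zip(xs, ys) if y == 0) / n0
    mu1 = sum(x for x, y in zip(xs, ys) if y == 1) / n1
    # Shared variance: mean squared deviation of each example
    # from the mean of its own class.
    var = sum((x - (mu1 if y == 1 else mu0)) ** 2
              for x, y in zip(xs, ys)) / m
    return phi, mu0, mu1, var


def gaussian_pdf(x, mu, var):
    """Density of a 1-D normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def predict_gda_1d(x, params):
    """Return the class y that maximizes p(x | y) p(y)."""
    phi, mu0, mu1, var = params
    score0 = gaussian_pdf(x, mu0, var) * (1 - phi)
    score1 = gaussian_pdf(x, mu1, var) * phi
    return 1 if score1 > score0 else 0


# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 1, 1, 1]
params = fit_gda_1d(xs, ys)
print(params)
print(predict_gda_1d(2.5, params), predict_gda_1d(7.5, params))
```

For vector-valued x the same estimates apply with class mean vectors and a covariance matrix, which in practice would be computed with a linear algebra library.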

2, Comparison of GDA and logistic regression

Both are used for classification problems. If p(x|y) follows a multivariate Gaussian distribution, it can be shown that p(y|x) takes the form of a logistic function. The converse does not hold: p(y|x) being logistic does not imply that p(x|y) is Gaussian. This shows that GDA makes stronger modeling assumptions, and so needs less training data when those assumptions hold. In that case GDA is more effective than logistic regression when the data set is large, and it is generally expected to do better when the data set is small as well. Logistic regression is more robust: when the data clearly does not follow a Gaussian distribution, logistic regression performs better than GDA. For this reason, logistic regression is used more often in practice.
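The claim that a Gaussian p(x|y) leads to a logistic p(y|x) can be checked numerically. The sketch below assumes a scalar x with a shared variance, and the parameter values are arbitrary; the function names are hypothetical. The coefficients `theta1` and `theta0` come from expanding the log odds log[p(x|y=1) p(y=1)] - log[p(x|y=0) p(y=0)], in which the quadratic terms in x cancel.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a 1-D normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior_bayes(x, phi, mu0, mu1, var):
    """p(y=1 | x) computed directly from Bayes' rule."""
    p1 = gaussian_pdf(x, mu1, var) * phi
    p0 = gaussian_pdf(x, mu0, var) * (1 - phi)
    return p1 / (p0 + p1)


def posterior_logistic(x, phi, mu0, mu1, var):
    """The same posterior written as a sigmoid of a linear function of x."""
    theta1 = (mu1 - mu0) / var
    theta0 = (mu0 ** 2 - mu1 ** 2) / (2 * var) + math.log(phi / (1 - phi))
    return 1.0 / (1.0 + math.exp(-(theta1 * x + theta0)))


# Arbitrary parameter values, for illustration only.
phi, mu0, mu1, var = 0.3, -1.0, 2.0, 1.5
for x in [-3.0, 0.0, 0.5, 4.0]:
    print(x, posterior_bayes(x, phi, mu0, mu1, var),
          posterior_logistic(x, phi, mu0, mu1, var))
```

The two columns agree up to floating-point error, which is the 1-D version of the statement that GDA implies a logistic posterior.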

In addition, when x|y=0 ~ Poisson(λ0) and x|y=1 ~ Poisson(λ1) (both belong to the exponential family), p(y|x) also takes the logistic form.
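A short derivation makes the Poisson case concrete. It is a sketch that assumes a scalar count x and writes the class prior as p(y=1) = φ, as in the GDA model; θ1 and θ0 are names introduced here for illustration. The x! factors of the two Poisson densities cancel in the ratio.

```latex
\log \frac{p(y=1 \mid x)}{p(y=0 \mid x)}
  = \log \frac{\lambda_1^{x} e^{-\lambda_1}\, \phi}{\lambda_0^{x} e^{-\lambda_0}\, (1-\phi)}
  = x \log \frac{\lambda_1}{\lambda_0} - (\lambda_1 - \lambda_0) + \log \frac{\phi}{1-\phi}

p(y=1 \mid x) = \frac{1}{1 + \exp\left(-(\theta_1 x + \theta_0)\right)},
\qquad \theta_1 = \log \frac{\lambda_1}{\lambda_0},
\qquad \theta_0 = \lambda_0 - \lambda_1 + \log \frac{\phi}{1-\phi}
```

The log odds are linear in x, so the posterior is a logistic function of x, just as in the Gaussian case.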
