"Cs229-lecture5" Generation Learning algorithm: 1) Gaussian discriminant analysis (GDA); 2) Naive Bayes (NB)

Reference:

CS229 handout, Machine Learning (1): Generative Learning Algorithms: http://www.cnblogs.com/zjgtan/archive/2013/06/08/3127490.html

First, a simple comparison of the difference between the discriminative learning algorithm mentioned in the previous lesson and the generative learning algorithm.

E.g.: Consider a classification problem in which we want to learn to distinguish between elephants (y = 1) and dogs (y = 0), based on some features of an animal.

Discriminative learning algorithm: (a DLA learns P(y|x) directly, by establishing a mapping from the input space X to the output labels {1, 0})

Given a training set, an algorithm like logistic regression or the perceptron algorithm (basically) tries to find a straight line, that is, a decision boundary, that separates the elephants and dogs. Then, to classify a new animal as either an elephant or a dog, it checks on which side of the decision boundary it falls, and makes its prediction accordingly.

Generative learning algorithm: (a GLA first models P(x|y) and P(y); the posterior distribution is then obtained by Bayes' rule, p(y|x) = p(x|y)p(y)/p(x), and the prediction is made by the maximum a posteriori criterion)

First, looking at elephants, we can build a model of what elephants look like. Then, looking at dogs, we can build a separate model of what dogs look like. Finally, to classify a new animal, we can match the new animal against the elephant model, and match it against the dog model, to see whether the new animal looks more like the elephants or more like the dogs we had seen in the training set.

(PS: prior probability vs. posterior probability.

If an event has not yet happened, the probability that it will happen is the prior probability.

If an event has already happened, the probability that it was caused by a particular factor is the posterior probability.)

Generative Learning Algorithms

First, review the relevant background on the Gaussian distribution:

Gaussian Distribution

The Gaussian distribution is also called the normal distribution. A Gaussian distribution with mathematical expectation μ and variance σ² is usually recorded as N(μ, σ²); its density is p(x) = (1/(√(2π) σ)) exp( −(x−μ)² / (2σ²) ).

Standard Normal Distribution

The standard normal distribution is the normal distribution whose mathematical expectation is 0 and whose variance is 1, recorded as N(0, 1). For a random variable X that obeys the normal distribution N(μ, σ²), the linear transformation Z = (X − μ)/σ yields a random variable that obeys the standard normal distribution.
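As a small numerical sketch (not part of the original post), the standardization transformation can be checked by sampling; the names mu, sigma, and z below are illustrative only:

```python
import numpy as np

# If X ~ N(mu, sigma^2), then Z = (X - mu) / sigma ~ N(0, 1).
rng = np.random.default_rng(0)
mu, sigma = 5.0, 2.0
x = rng.normal(mu, sigma, size=100_000)  # samples from N(5, 4)

z = (x - mu) / sigma  # linear transformation to the standard normal

print(round(z.mean(), 2), round(z.std(), 2))  # close to 0 and 1
```

The sample mean and standard deviation of z come out near 0 and 1, as the transformation predicts.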

Bivariate Normal Distribution

The bivariate normal distribution is the joint probability distribution of two random variables that each obey the normal distribution. Its joint probability density function is

f(x, y) = 1/(2π σ₁ σ₂ √(1−ρ²)) · exp( −1/(2(1−ρ²)) [ (x−μ₁)²/σ₁² − 2ρ(x−μ₁)(y−μ₂)/(σ₁σ₂) + (y−μ₂)²/σ₂² ] )

where μ₁, μ₂, σ₁, σ₂, ρ are the parameters of the distribution. This bivariate normal distribution is recorded as N(μ₁, μ₂; σ₁², σ₂²; ρ).

Its characteristic function is

φ(t₁, t₂) = exp( i(μ₁t₁ + μ₂t₂) − ½(σ₁²t₁² + 2ρσ₁σ₂t₁t₂ + σ₂²t₂²) )

Multivariate Normal Distribution

The multivariate normal distribution is the joint probability distribution of a random vector X whose components are multiple random variables that obey the normal distribution. A random vector X with mathematical expectation (mean vector) μ and covariance matrix Σ that obeys the multivariate normal distribution has the joint probability density function

p(x) = 1/((2π)^(n/2) |Σ|^(1/2)) · exp( −½ (x−μ)ᵀ Σ⁻¹ (x−μ) )

and is recorded as X ~ N(μ, Σ).

If μ = 0 and Σ = I (the identity matrix), then X obeys the standard multivariate normal distribution.

We can see that the multivariate normal distribution is determined by two quantities: the mean vector and the covariance matrix. Plotting the density while varying the values of these two quantities shows how each changes the shape of the distribution (the original post illustrates this with figures, omitted here).
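The density formula above can be evaluated directly; here is a minimal sketch (the function name mvn_pdf is illustrative, not from the notes):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at point x, following the formula above."""
    n = mu.shape[0]
    diff = x - mu
    # Normalizing constant: (2*pi)^(n/2) * |Sigma|^(1/2)
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), via a linear solve
    quad = diff @ np.linalg.solve(Sigma, diff)
    return np.exp(-0.5 * quad) / norm

# With mu = 0 and Sigma = I in two dimensions, the density at the
# origin is exactly 1 / (2*pi).
mu = np.zeros(2)
Sigma = np.eye(2)
print(mvn_pdf(np.zeros(2), mu, Sigma))  # 1/(2*pi) ≈ 0.1592
```

Using a linear solve instead of explicitly inverting Σ is the usual numerically safer choice.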

1. Gaussian Discriminant Analysis (GDA):

A. The model assumes that p(x|y) follows a normal distribution:

In this model, we'll assume that p(x|y) is distributed according to a multivariate normal distribution. Concretely, y ~ Bernoulli(φ), x|y=0 ~ N(μ₀, Σ), and x|y=1 ~ N(μ₁, Σ), with the two classes sharing a single covariance matrix Σ.

B. The training samples are fitted by maximum likelihood, and the corresponding parameter estimates (φ, μ₀, μ₁, Σ) are obtained.
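Step B can be sketched in code. This is a minimal illustration of the closed-form maximum-likelihood estimates from the CS229 notes (class means, class prior, pooled covariance); the synthetic data and function name fit_gda are assumptions for the example:

```python
import numpy as np

def fit_gda(X, y):
    """Maximum-likelihood estimates for GDA with a shared covariance.

    X: (m, n) feature matrix; y: (m,) labels in {0, 1}.
    Returns phi, mu0, mu1, Sigma.
    """
    m = X.shape[0]
    phi = y.mean()                      # estimate of P(y = 1)
    mu0 = X[y == 0].mean(axis=0)        # mean of class-0 samples
    mu1 = X[y == 1].mean(axis=0)        # mean of class-1 samples
    # Subtract each sample's own class mean, then pool the covariance.
    mus = np.where(y[:, None] == 1, mu1, mu0)
    diff = X - mus
    Sigma = diff.T @ diff / m
    return phi, mu0, mu1, Sigma

# Tiny synthetic check: two well-separated Gaussian classes.
rng = np.random.default_rng(0)
X0 = rng.normal([0, 0], 1.0, size=(500, 2))
X1 = rng.normal([3, 3], 1.0, size=(500, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 500 + [1] * 500)
phi, mu0, mu1, Sigma = fit_gda(X, y)
print(phi, mu0.round(1), mu1.round(1))
```

The recovered means land near the true class centers (0, 0) and (3, 3), and phi near 0.5, as expected for balanced classes.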

Finally, compare GDA and logistic regression:

GDA: if the distributional assumptions actually hold for the data, a good model can be obtained with only a small number of samples.

Logistic regression: the logistic regression model is more robust to incorrect modeling assumptions.

Summary:

GDA makes stronger modeling assumptions, and is more data efficient (i.e., requires less training data to learn "well") when the modeling assumptions are correct or at least approximately correct.

Logistic regression makes weaker assumptions, and is significantly more robust to deviations from the modeling assumptions.

Specifically, when the data is indeed non-Gaussian, then in the limit of large datasets, logistic regression will almost always do better than GDA. For this reason, in practice logistic regression is used more often than GDA. (Some related considerations about discriminative vs. generative models also apply for the Naive Bayes algorithm that we discuss next, but the Naive Bayes algorithm is still considered a very good, and is certainly also a very popular, classification algorithm.)

2. Naive Bayes (NB):

Take text classification as an example; the model is based on the conditional independence assumption. In actual language there are dependencies between some words, but naive Bayes still shows very good performance.

Because of the conditional independence assumption,

p(x₁, …, xₙ | y) = ∏ᵢ p(xᵢ | y)

This yields the joint likelihood:

L(φ_y, φ_{i|y=0}, φ_{i|y=1}) = ∏ⱼ p(x⁽ʲ⁾, y⁽ʲ⁾)

After getting the maximum-likelihood estimates of these parameters, given a new message we can compute, by the Bayesian formula,

p(y = 1 | x) = p(x | y = 1) p(y = 1) / ( p(x | y = 1) p(y = 1) + p(x | y = 0) p(y = 0) )

Laplace Smoothing

When a word that never appeared in the training set is encountered in a new message, the estimate becomes 0/0. In essence, the dimensionality of the input sample's feature space has grown, and the old model cannot provide valid classification information.

When this occurs, smoothing can be performed by adding 1 to each count in the numerator and adding the number of possible feature values (2 for binary features) to the denominator:

φ_{i|y=1} = ( ∑ⱼ 1{xᵢ⁽ʲ⁾ = 1 ∧ y⁽ʲ⁾ = 1} ) / ( ∑ⱼ 1{y⁽ʲ⁾ = 1} )   ==============>   φ_{i|y=1} = ( ∑ⱼ 1{xᵢ⁽ʲ⁾ = 1 ∧ y⁽ʲ⁾ = 1} + 1 ) / ( ∑ⱼ 1{y⁽ʲ⁾ = 1} + 2 )
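The smoothed estimates and the Bayes-rule prediction can be sketched together as a toy Bernoulli naive Bayes. The 3-word vocabulary and the function names fit_nb / predict_nb are illustrative assumptions, not from the original post:

```python
import numpy as np

def fit_nb(X, y):
    """Bernoulli naive Bayes with Laplace smoothing, a minimal sketch.

    X: (m, n) binary word-occurrence matrix; y: (m,) labels in {0, 1}.
    Add-1 in the numerator and add-2 in the denominator (two possible
    values per binary feature) keep every estimate strictly positive.
    """
    phi_y = y.mean()
    phi1 = (X[y == 1].sum(axis=0) + 1) / ((y == 1).sum() + 2)  # P(x_i=1|y=1)
    phi0 = (X[y == 0].sum(axis=0) + 1) / ((y == 0).sum() + 2)  # P(x_i=1|y=0)
    return phi_y, phi0, phi1

def predict_nb(x, phi_y, phi0, phi1):
    """Posterior P(y=1|x) by Bayes' rule, computed in log space."""
    log1 = np.log(phi_y) + np.sum(np.where(x == 1, np.log(phi1), np.log(1 - phi1)))
    log0 = np.log(1 - phi_y) + np.sum(np.where(x == 1, np.log(phi0), np.log(1 - phi0)))
    return 1 / (1 + np.exp(log0 - log1))

# Toy vocabulary of 3 words; class 1 ("spam") tends to contain word 0.
X = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 1], [0, 1, 1], [0, 0, 1], [0, 1, 0]])
y = np.array([1, 1, 1, 0, 0, 0])
params = fit_nb(X, y)
print(predict_nb(np.array([1, 0, 0]), *params))  # 0.9: looks like class 1
```

Because of the smoothing, a word with zero training count still gets a small nonzero probability, so the 0/0 case above never arises.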

"Cs229-lecture5" Generation Learning algorithm: 1) Gaussian discriminant analysis (GDA); 2) Naive Bayes (NB)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.