Machine Learning Algorithms Summary (10)--Naive Bayes


1. Definition of the model

Naive Bayes is a classification method based on Bayes' theorem and the assumption of conditional independence between features. First, let us review Bayes' theorem and the model we want to build. We are given a training data set

$$T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$$

Suppose the output category is y_i ∈ {c_1, c_2, ..., c_K}. Naive Bayes learns the joint probability distribution P(X, Y) from the training set. Since the joint distribution is generally hard to estimate directly, we instead learn the prior probability distribution and the conditional probability distribution, which together determine it. The prior probability distribution is

$$P(Y = c_k), \qquad k = 1, 2, \dots, K$$

By the law of large numbers, the prior probability can be estimated as the proportion of each class in the training sample; for example, if 30 of 100 training samples belong to class c_1, then P(Y = c_1) ≈ 0.3. The conditional probability distribution is

$$P(X = x \mid Y = c_k) = P(X^{(1)} = x^{(1)}, \dots, X^{(n)} = x^{(n)} \mid Y = c_k), \qquad k = 1, 2, \dots, K$$

From the prior and the conditional probability we can recover the joint probability, since P(X, Y) = P(Y) P(X | Y). The conditional probability, however, is hard to estimate directly: the number of parameters it requires is on the order of the product of the numbers of values each feature can take, multiplied by the number of classes, which grows exponentially with the number of features. This is why the naive Bayes idea is introduced.
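To make the blow-up concrete, here is a quick worked count (the numbers are purely illustrative). With K classes and n features, where feature j takes S_j values, the full conditional table requires

$$K\left(\prod_{j=1}^{n} S_j - 1\right) \quad \text{parameters, versus} \quad K \sum_{j=1}^{n} (S_j - 1) \quad \text{under the independence assumption introduced next.}$$

For example, with K = 2 classes and n = 30 binary features, the full table needs 2(2^30 - 1) ≈ 2.1 × 10^9 parameters, while the factorized form needs only 2 × 30 = 60.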

The naive Bayes method assumes that the features are mutually independent conditioned on the class. Under this assumption the conditional probability factorizes into a product over the features:

$$P(X = x \mid Y = c_k) = \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)$$

The naive Bayes method is thus a process of estimating the joint probability distribution and then deriving the posterior probability (itself a conditional probability) from it, so this classifier is a generative model. In contrast stand the discriminative models, such as decision trees, logistic regression, and SVMs, which produce the result directly (a decision function f(X) or the conditional probability P(Y | X)). Having seen how the prior and conditional probabilities are computed, let us look at how Bayes' theorem yields the posterior probability:

$$P(Y = c_k \mid X = x) = \frac{P(X = x \mid Y = c_k)\, P(Y = c_k)}{\sum_{k} P(X = x \mid Y = c_k)\, P(Y = c_k)}$$

Introducing the naive assumption that the features are conditionally independent, this becomes

$$P(Y = c_k \mid X = x) = \frac{P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}{\sum_{k} P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}$$

This is the basic formula for naive Bayes classification, so the model can be written as

$$y = f(x) = \arg\max_{c_k} \frac{P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}{\sum_{k} P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}$$

The denominator on the right-hand side does not depend on the class: it is the same for every c_k. Since we only need the class with the maximum probability, dropping the denominator does not change the result (scaling all candidates by the same factor does not affect which one is largest). The final formula can therefore be written as

$$y = \arg\max_{c_k} P(Y = c_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)$$
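As a minimal sketch of this decision rule (the probability tables below are made-up numbers purely for illustration; in practice they come from the estimation described in section 3):

```python
import math

# Hypothetical learned parameters for a toy 2-class, 2-feature problem.
# prior[c] = P(Y = c); cond[c][j][v] = P(X^(j) = v | Y = c).
prior = {"spam": 0.4, "ham": 0.6}
cond = {
    "spam": [{0: 0.2, 1: 0.8}, {0: 0.7, 1: 0.3}],
    "ham":  [{0: 0.9, 1: 0.1}, {0: 0.4, 1: 0.6}],
}

def predict(x):
    """Return argmax_c P(c) * prod_j P(x_j | c), computed in log space."""
    best_class, best_score = None, float("-inf")
    for c in prior:
        score = math.log(prior[c]) + sum(
            math.log(cond[c][j][v]) for j, v in enumerate(x)
        )
        if score > best_score:
            best_class, best_score = c, score
    return best_class

print(predict([1, 0]))  # -> "spam" for these made-up tables
```

Working in log space is the usual design choice: a product of many small probabilities underflows quickly, and taking logs does not change the argmax.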

2. Maximizing the posterior probability

First, consider the 0-1 loss function:

$$L(Y, f(X)) = \begin{cases} 1, & Y \neq f(X) \\ 0, & Y = f(X) \end{cases}$$

The expected risk is then defined as follows; when optimizing the model, the aim is to minimize this expected loss:

$$R_{\exp}(f) = E\,[L(Y, f(X))]$$

For the naive Bayes model, the expected loss can be expressed as

$$R_{\exp}(f) = E_X \sum_{k=1}^{K} L(c_k, f(X))\, P(c_k \mid X)$$

This differs from the plain 0-1 loss: it can be read as the 0-1 loss weighted by the posterior probability of each class. Among the K terms, L is 0 for exactly one class (the predicted one) and 1 for all the others, so to make the overall expected loss smallest we want the class for which L is 0 to be the one whose posterior P(c_k | x) is largest. The derivation is as follows:

$$\begin{aligned} f(x) &= \arg\min_{y \in \mathcal{Y}} \sum_{k=1}^{K} L(c_k, y)\, P(c_k \mid X = x) \\ &= \arg\min_{y \in \mathcal{Y}} \sum_{k=1}^{K} P(y \neq c_k \mid X = x) \\ &= \arg\min_{y \in \mathcal{Y}} \left(1 - P(y = c_k \mid X = x)\right) \\ &= \arg\max_{y \in \mathcal{Y}} P(y = c_k \mid X = x) \end{aligned}$$

That is, minimizing the expected risk is equivalent to maximizing the posterior probability.
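As a quick numerical check of this equivalence (with made-up posterior values), the class that minimizes the expected 0-1 loss is exactly the class with the largest posterior:

```python
# For a fixed x, suppose the posterior over three classes is:
posterior = {"c1": 0.5, "c2": 0.3, "c3": 0.2}

# Expected 0-1 loss of predicting y is sum_k L(c_k, y) P(c_k | x) = 1 - P(y | x).
expected_loss = {y: 1.0 - p for y, p in posterior.items()}

print(expected_loss)                              # {'c1': 0.5, 'c2': 0.7, 'c3': 0.8}
print(min(expected_loss, key=expected_loss.get))  # 'c1' (minimum expected loss)
print(max(posterior, key=posterior.get))          # 'c1' (maximum posterior): the same class
```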

3. Estimation of the parameters of naive Bayes

We can use maximum likelihood estimation to obtain the prior and conditional probabilities. The maximum likelihood estimate of the prior probability is

$$P(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k)}{N}, \qquad k = 1, 2, \dots, K$$

The maximum likelihood estimate of the conditional probability is

$$P(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl},\, y_i = c_k)}{\sum_{i=1}^{N} I(y_i = c_k)}$$

where a_jl is the l-th possible value of the j-th feature.

However, maximum likelihood estimation may produce probability estimates that are exactly 0. This corrupts the posterior computation: since the posterior is a product over the features, a single zero factor makes the whole product 0. To avoid this we use Bayesian estimation instead. The Bayesian estimate of the prior probability is

$$P_\lambda(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k) + \lambda}{N + K\lambda}$$

The Bayesian estimate of the conditional probability is

$$P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl},\, y_i = c_k) + \lambda}{\sum_{i=1}^{N} I(y_i = c_k) + S_j \lambda}$$

where S_j is the number of distinct values the j-th feature can take; λ = 1 gives Laplace smoothing.
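Here is a minimal counting sketch of both estimators (lam = 0 recovers the maximum likelihood estimates; lam = 1 gives Laplace smoothing; the function and variable names are my own):

```python
from collections import Counter, defaultdict

def fit_naive_bayes(X, y, lam=1.0):
    """Estimate smoothed priors and per-feature conditionals by counting.

    X: list of samples, each a list of discrete feature values.
    y: list of class labels. lam=0 gives the plain MLE.
    """
    N, n = len(X), len(X[0])
    classes = sorted(set(y))
    K = len(classes)
    # values[j]: distinct values feature j takes in the training set; S[j] = |values[j]|.
    values = [{x[j] for x in X} for j in range(n)]
    S = [len(v) for v in values]

    class_count = Counter(y)
    prior = {c: (class_count[c] + lam) / (N + K * lam) for c in classes}

    # pair_count[(c, j)][v] = number of samples with y = c and x^(j) = v.
    pair_count = defaultdict(Counter)
    for x, c in zip(X, y):
        for j, v in enumerate(x):
            pair_count[(c, j)][v] += 1

    # cond[c][j][v] = P_lambda(X^(j) = v | Y = c)
    cond = {
        c: [{v: (pair_count[(c, j)][v] + lam) / (class_count[c] + S[j] * lam)
             for v in values[j]}
            for j in range(n)]
        for c in classes
    }
    return prior, cond

# e.g. prior, cond = fit_naive_bayes([[0, 1], [0, 0], [1, 1]], ["ham", "ham", "spam"])
```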

So training a naive Bayes model only amounts to computing parameters on the training set, namely the prior probabilities and the frequency of each feature value within each class (from which the conditional probabilities follow); prediction then uses these learned values:

$$y = \arg\max_{c_k} P_\lambda(Y = c_k) \prod_{j=1}^{n} P_\lambda(X^{(j)} = x^{(j)} \mid Y = c_k)$$
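For reference, scikit-learn (version 0.22 or later) ships this discrete model as CategoricalNB, where the alpha parameter plays the role of λ above; a sketch with made-up toy data:

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Toy data: rows are samples, columns are integer-encoded categorical features.
X = np.array([[0, 1], [0, 0], [1, 1], [1, 1]])
y = np.array([0, 0, 1, 1])

clf = CategoricalNB(alpha=1.0)  # alpha = 1.0 corresponds to Laplace smoothing
clf.fit(X, y)

print(clf.predict(np.array([[1, 0]])))        # predicted class
print(clf.predict_proba(np.array([[1, 0]])))  # posterior P(c_k | x)
```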

4. Naive Bayes summary

The advantages of Naive Bayes:

1) The naive Bayes model is simple and has stable classification efficiency

2) It performs very well on small-scale data sets, handles multi-class problems, and is suited to incremental training; in particular, when the data set does not fit in memory, it can be trained batch by batch

3) It is not very sensitive to missing data, the algorithm is simple, and it is often used for text classification (a short sketch follows this list)
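To illustrate the text classification use mentioned in point 3, here is a minimal sketch with a made-up corpus, pairing bag-of-words counts with multinomial naive Bayes (scikit-learn's CountVectorizer and MultinomialNB):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus, purely for illustration.
docs = ["cheap pills buy now", "meeting at noon",
        "buy cheap now", "lunch meeting today"]
labels = ["spam", "ham", "spam", "ham"]

# Bag-of-words counts + multinomial naive Bayes: the classic text pairing.
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(docs, labels)

print(model.predict(["cheap pills meeting"]))
```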

The shortcomings of naive Bayes:

1) In theory, naive Bayes should have the lowest error rate among classifiers, but in practice this does not always hold, because of the assumption that the features are mutually independent given the class. When the features are strongly correlated, naive Bayes performs only moderately well; when the features are close to independent, it performs very well

2) The classification is decided by the prior and by posterior probabilities estimated from the data, so the classification decision carries a certain error rate

3) It is sensitive to the representation of the input data
