Machine Learning Algorithms Summary (10)--Naive Bayes


1. Definition of the model

Naive Bayes is a classification method based on Bayes' theorem and the assumption of conditional independence between features. First, let us review Bayes' theorem and the model we want to build. We are given a training data set

$$T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$$

Suppose the output category is y_i ∈ {c_1, c_2, ..., c_K}. Naive Bayes learns the joint probability distribution P(X, Y) from the training set. Since the joint distribution is generally hard to estimate directly, we instead learn the prior probability distribution and the conditional probability distribution, which together determine it. The prior probability distribution is

$$P(Y = c_k), \qquad k = 1, 2, \dots, K$$

By the law of large numbers, the prior probability can be estimated as the proportion of each class in the training sample; for example, if 30 of 100 training samples belong to class c_1, then P(Y = c_1) ≈ 0.3. The conditional probability distribution is

$$P(X = x \mid Y = c_k) = P(X^{(1)} = x^{(1)}, \dots, X^{(n)} = x^{(n)} \mid Y = c_k), \qquad k = 1, 2, \dots, K$$

From the prior and the conditional probability we can recover the joint probability, since P(X, Y) = P(Y) P(X | Y). The conditional probability, however, is hard to estimate directly: the number of parameters it requires is on the order of the product of the numbers of values each feature can take, multiplied by the number of classes, which grows exponentially with the number of features. This is why the naive Bayes idea is introduced.
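To make the blow-up concrete, here is a quick worked count (the numbers are purely illustrative). With K classes and n features, where feature j takes S_j values, the full conditional table requires

$$K\left(\prod_{j=1}^{n} S_j - 1\right) \quad \text{parameters, versus} \quad K \sum_{j=1}^{n} (S_j - 1) \quad \text{under the independence assumption introduced next.}$$

For example, with K = 2 classes and n = 30 binary features, the full table needs 2(2^30 - 1) ≈ 2.1 × 10^9 parameters, while the factorized form needs only 2 × 30 = 60.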

The naive Bayes method assumes that the features are mutually independent conditioned on the class. Under this assumption the conditional probability factorizes into a product over the features:

$$P(X = x \mid Y = c_k) = \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)$$

The naive Bayes method is thus a process of estimating the joint probability distribution and then deriving the posterior probability (itself a conditional probability) from it, so this classifier is a generative model. In contrast stand the discriminative models, such as decision trees, logistic regression, and SVMs, which produce the result directly (a decision function f(X) or the conditional probability P(Y | X)). Having seen how the prior and conditional probabilities are computed, let us look at how Bayes' theorem yields the posterior probability:

$$P(Y = c_k \mid X = x) = \frac{P(X = x \mid Y = c_k)\, P(Y = c_k)}{\sum_{k} P(X = x \mid Y = c_k)\, P(Y = c_k)}$$

Introducing the naive assumption that the features are conditionally independent, this becomes

$$P(Y = c_k \mid X = x) = \frac{P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}{\sum_{k} P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}$$

This is the basic formula for naive Bayes classification, so the model can be written as

$$y = f(x) = \arg\max_{c_k} \frac{P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}{\sum_{k} P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}$$

The denominator on the right-hand side does not depend on the class: it is the same for every c_k. Since we only need the class with the maximum probability, dropping the denominator does not change the result (scaling all candidates by the same factor does not affect which one is largest). The final formula can therefore be written as

$$y = \arg\max_{c_k} P(Y = c_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)$$
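As a minimal sketch of this decision rule (the probability tables below are made-up numbers purely for illustration; in practice they come from the estimation described in section 3):

```python
import math

# Hypothetical learned parameters for a toy 2-class, 2-feature problem.
# prior[c] = P(Y = c); cond[c][j][v] = P(X^(j) = v | Y = c).
prior = {"spam": 0.4, "ham": 0.6}
cond = {
    "spam": [{0: 0.2, 1: 0.8}, {0: 0.7, 1: 0.3}],
    "ham":  [{0: 0.9, 1: 0.1}, {0: 0.4, 1: 0.6}],
}

def predict(x):
    """Return argmax_c P(c) * prod_j P(x_j | c), computed in log space."""
    best_class, best_score = None, float("-inf")
    for c in prior:
        score = math.log(prior[c]) + sum(
            math.log(cond[c][j][v]) for j, v in enumerate(x)
        )
        if score > best_score:
            best_class, best_score = c, score
    return best_class

print(predict([1, 0]))  # -> "spam" for these made-up tables
```

Working in log space is the usual design choice: a product of many small probabilities underflows quickly, and taking logs does not change the argmax.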

2. Maximizing the posterior probability

First, consider the 0-1 loss function:

$$L(Y, f(X)) = \begin{cases} 1, & Y \neq f(X) \\ 0, & Y = f(X) \end{cases}$$

The expected risk is then defined as follows; when optimizing the model, the aim is to minimize this expected loss:

$$R_{\exp}(f) = E\,[L(Y, f(X))]$$

For the naive Bayes model, the expected loss can be expressed as

$$R_{\exp}(f) = E_X \sum_{k=1}^{K} L(c_k, f(X))\, P(c_k \mid X)$$

This differs from the plain 0-1 loss: it can be read as the 0-1 loss weighted by the posterior probability of each class. Among the K terms, L is 0 for exactly one class (the predicted one) and 1 for all the others, so to make the overall expected loss smallest we want the class for which L is 0 to be the one whose posterior P(c_k | x) is largest. The derivation is as follows:

$$\begin{aligned} f(x) &= \arg\min_{y \in \mathcal{Y}} \sum_{k=1}^{K} L(c_k, y)\, P(c_k \mid X = x) \\ &= \arg\min_{y \in \mathcal{Y}} \sum_{k=1}^{K} P(y \neq c_k \mid X = x) \\ &= \arg\min_{y \in \mathcal{Y}} \left(1 - P(y = c_k \mid X = x)\right) \\ &= \arg\max_{y \in \mathcal{Y}} P(y = c_k \mid X = x) \end{aligned}$$

That is, minimizing the expected risk is equivalent to maximizing the posterior probability.
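As a quick numerical check of this equivalence (with made-up posterior values), the class that minimizes the expected 0-1 loss is exactly the class with the largest posterior:

```python
# For a fixed x, suppose the posterior over three classes is:
posterior = {"c1": 0.5, "c2": 0.3, "c3": 0.2}

# Expected 0-1 loss of predicting y is sum_k L(c_k, y) P(c_k | x) = 1 - P(y | x).
expected_loss = {y: 1.0 - p for y, p in posterior.items()}

print(expected_loss)                              # {'c1': 0.5, 'c2': 0.7, 'c3': 0.8}
print(min(expected_loss, key=expected_loss.get))  # 'c1' (minimum expected loss)
print(max(posterior, key=posterior.get))          # 'c1' (maximum posterior): the same class
```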

3. Estimation of the parameters of naive Bayes

We can use maximum likelihood estimation to obtain the prior and conditional probabilities. The maximum likelihood estimate of the prior probability is

$$P(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k)}{N}, \qquad k = 1, 2, \dots, K$$

The maximum likelihood estimate of the conditional probability is

$$P(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl},\, y_i = c_k)}{\sum_{i=1}^{N} I(y_i = c_k)}$$

where a_jl is the l-th possible value of the j-th feature.

However, maximum likelihood estimation may produce probability estimates that are exactly 0. This corrupts the posterior computation: since the posterior is a product over the features, a single zero factor makes the whole product 0. To avoid this we use Bayesian estimation instead. The Bayesian estimate of the prior probability is

$$P_\lambda(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k) + \lambda}{N + K\lambda}$$

The Bayesian estimate of the conditional probability is

$$P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl},\, y_i = c_k) + \lambda}{\sum_{i=1}^{N} I(y_i = c_k) + S_j \lambda}$$

where S_j is the number of distinct values the j-th feature can take; λ = 1 gives Laplace smoothing.
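Here is a minimal counting sketch of both estimators (lam = 0 recovers the maximum likelihood estimates; lam = 1 gives Laplace smoothing; the function and variable names are my own):

```python
from collections import Counter, defaultdict

def fit_naive_bayes(X, y, lam=1.0):
    """Estimate smoothed priors and per-feature conditionals by counting.

    X: list of samples, each a list of discrete feature values.
    y: list of class labels. lam=0 gives the plain MLE.
    """
    N, n = len(X), len(X[0])
    classes = sorted(set(y))
    K = len(classes)
    # values[j]: distinct values feature j takes in the training set; S[j] = |values[j]|.
    values = [{x[j] for x in X} for j in range(n)]
    S = [len(v) for v in values]

    class_count = Counter(y)
    prior = {c: (class_count[c] + lam) / (N + K * lam) for c in classes}

    # pair_count[(c, j)][v] = number of samples with y = c and x^(j) = v.
    pair_count = defaultdict(Counter)
    for x, c in zip(X, y):
        for j, v in enumerate(x):
            pair_count[(c, j)][v] += 1

    # cond[c][j][v] = P_lambda(X^(j) = v | Y = c)
    cond = {
        c: [{v: (pair_count[(c, j)][v] + lam) / (class_count[c] + S[j] * lam)
             for v in values[j]}
            for j in range(n)]
        for c in classes
    }
    return prior, cond

# e.g. prior, cond = fit_naive_bayes([[0, 1], [0, 0], [1, 1]], ["ham", "ham", "spam"])
```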

So training a naive Bayes model only amounts to computing parameters on the training set, namely the prior probabilities and the frequency of each feature value within each class (from which the conditional probabilities follow); prediction then uses these learned values:

$$y = \arg\max_{c_k} P_\lambda(Y = c_k) \prod_{j=1}^{n} P_\lambda(X^{(j)} = x^{(j)} \mid Y = c_k)$$
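For reference, scikit-learn (version 0.22 or later) ships this discrete model as CategoricalNB, where the alpha parameter plays the role of λ above; a sketch with made-up toy data:

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Toy data: rows are samples, columns are integer-encoded categorical features.
X = np.array([[0, 1], [0, 0], [1, 1], [1, 1]])
y = np.array([0, 0, 1, 1])

clf = CategoricalNB(alpha=1.0)  # alpha = 1.0 corresponds to Laplace smoothing
clf.fit(X, y)

print(clf.predict(np.array([[1, 0]])))        # predicted class
print(clf.predict_proba(np.array([[1, 0]])))  # posterior P(c_k | x)
```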

4. Naive Bayes summary

The advantages of Naive Bayes:

1) The naive Bayes model is simple and has stable classification efficiency

2) It performs very well on small-scale data sets, handles multi-class problems, and is suited to incremental training; in particular, when the data set does not fit in memory, it can be trained batch by batch

3) It is not very sensitive to missing data, the algorithm is simple, and it is often used for text classification (a short sketch follows this list)
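To illustrate the text classification use mentioned in point 3, here is a minimal sketch with a made-up corpus, pairing bag-of-words counts with multinomial naive Bayes (scikit-learn's CountVectorizer and MultinomialNB):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus, purely for illustration.
docs = ["cheap pills buy now", "meeting at noon",
        "buy cheap now", "lunch meeting today"]
labels = ["spam", "ham", "spam", "ham"]

# Bag-of-words counts + multinomial naive Bayes: the classic text pairing.
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(docs, labels)

print(model.predict(["cheap pills meeting"]))
```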

The shortcomings of naive Bayes:

1) In theory, naive Bayes should have the lowest error rate among classifiers, but in practice this does not always hold, because of the assumption that the features are mutually independent given the class. When the features are strongly correlated, naive Bayes performs only moderately well; when the features are close to independent, it performs very well

2) The classification is decided by the prior and by posterior probabilities estimated from the data, so the classification decision carries a certain error rate

3) It is sensitive to the representation of the input data
