Naive Bayesian algorithm based on statistical learning method

Source: Internet
Author: User

The first thing to know is when the naive Bayes algorithm applies: the features must be conditionally independent given the class.

The main process is to learn the joint probability distribution of input and output.

When predicting, the output y with the largest posterior probability is chosen for the given input.

Prior probability: the probability of each output class before any input is seen, P(Y). The posterior probability reverses the conditioning: the probability of the output given the input, P(Y|X).

Put simply, after learning from the samples, naive Bayes outputs at decision time the y with the largest probability given x. It is thus a typical posterior-probability model. The derivation of the model, however, also relies on prior probabilities. Note: the final naive Bayes model is a posterior-probability model for P(y|x).
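The decision rule above can be sketched with Bayes' rule on toy numbers (the class names and probabilities here are made up for illustration):

```python
# Toy illustration (hypothetical numbers): Bayes' rule turns a prior P(y)
# and a likelihood P(x|y) into a posterior P(y|x).
prior = {"spam": 0.3, "ham": 0.7}        # P(y)
likelihood = {"spam": 0.8, "ham": 0.1}   # P(x|y) for one observed x

# Unnormalized posterior: P(y) * P(x|y)
joint = {y: prior[y] * likelihood[y] for y in prior}
evidence = sum(joint.values())           # P(x)
posterior = {y: joint[y] / evidence for y in joint}

# Decision: pick the y with the largest posterior probability
prediction = max(posterior, key=posterior.get)
print(prediction)   # "spam", since 0.3 * 0.8 = 0.24 > 0.7 * 0.1 = 0.07
```

Note that the normalizing constant P(x) does not affect the argmax, which is why implementations often skip it.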

The posterior-probability model has one benefit: maximizing the posterior is equivalent to minimizing the expected risk (under 0-1 loss). This is easy to understand: the y chosen is the most probable of all possibilities, so the risk of picking the wrong one is naturally minimized.


Parameter estimation

Next comes parameter estimation.

Estimate process:

1> estimate the prior probability of each value of Y

2> estimate the conditional probability P(x|y) for each value of Y

3> combine them with the posterior probability (Bayes') formula
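The three steps above can be sketched with maximum-likelihood (counting) estimates. The toy dataset and feature names below are made up for illustration:

```python
from collections import Counter, defaultdict

# Tiny hypothetical dataset of (features, label) pairs.
data = [
    ({"outlook": "sunny", "windy": "no"},  "play"),
    ({"outlook": "sunny", "windy": "no"},  "play"),
    ({"outlook": "rain",  "windy": "yes"}, "stay"),
    ({"outlook": "rain",  "windy": "no"},  "play"),
]

# 1> prior P(Y = y): relative frequency of each label
label_counts = Counter(y for _, y in data)
n = len(data)
prior = {y: c / n for y, c in label_counts.items()}

# 2> conditional P(x_j = v | Y = y): counts within each class
cond_counts = defaultdict(Counter)   # (feature, label) -> Counter over values
for x, y in data:
    for feat, val in x.items():
        cond_counts[(feat, y)][val] += 1

def p_cond(feat, val, y):
    return cond_counts[(feat, y)][val] / label_counts[y]

# 3> posterior up to normalization: P(y) * prod_j P(x_j | y); argmax decides
def predict(x):
    def score(y):
        s = prior[y]
        for feat, val in x.items():
            s *= p_cond(feat, val, y)
        return s
    return max(prior, key=score)

print(predict({"outlook": "sunny", "windy": "no"}))   # "play"
```

This is the unsmoothed estimate; the zero-probability problem it suffers from is addressed below.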

Laplace Smoothing:

P(Y) and P(x|y) should both be Laplace smoothed when computed. Because naive Bayes multiplies independent per-feature terms, if any single P(xj|y) is 0, the value of the whole product becomes 0. This is extremely problematic, so Laplace smoothing is very necessary.
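The smoothed conditional estimate can be sketched as follows (the function name and arguments are illustrative; `alpha=1` is standard add-one smoothing):

```python
# Laplace (add-one) smoothing for the conditional estimate: without it, an
# unseen (value, class) pair gives P = 0 and zeroes out the whole product.
# count_xy: times value v occurred with class y; count_y: size of class y;
# n_values: number of distinct values the feature can take.
def smoothed_cond(count_xy, count_y, n_values, alpha=1.0):
    return (count_xy + alpha) / (count_y + alpha * n_values)

# An unseen value no longer gets probability zero:
print(smoothed_cond(0, 10, 3))   # 1/13 ≈ 0.0769 instead of 0
```

With `alpha = 0` this reduces to the plain maximum-likelihood estimate, and larger `alpha` pulls the estimates toward the uniform distribution.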

Advantages:

Works well on small-scale data, is suitable for multi-class tasks, and supports incremental training.

Disadvantages:

The number of features it can handle is relatively small: as the feature count grows, it becomes harder to ensure the conditional-independence assumption holds. It is also particularly sensitive to the input data; when the number of samples is small, a single data point can change the result.

