Principles and practice of the naive Bayes classification algorithm

Today we introduce the naive Bayes classification algorithm: first the basic principles, and then some practice with text classification.

A simple example

Naive Bayes is a classic statistical learning method whose main theoretical basis is Bayes' formula, defined as follows:
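P(y_k \mid x) = \frac{P(x \mid y_k)\, P(y_k)}{\sum_i P(x \mid y_i)\, P(y_i)}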

The formula looks simple, but it can both summarize history and predict the future. The right-hand side of the formula summarizes history; the left-hand side predicts the future. If y denotes a category and x a feature, then P(y_k | x) is the probability of class y_k given the observed feature x, and the right-hand side expresses P(y_k | x) entirely in terms of the feature distribution of class y_k, namely the class-conditional likelihood P(x | y_k) together with the class prior P(y_k).

For example: in college, a boy often went to the library at night for self-study and found that the girl he liked also often went to the same study room. Secretly pleased, he bought some snacks every day and waited for her there. But the girl did not necessarily come every day, and the weather was getting hotter and the library had no air conditioning, so if the girl was not going to the study room, the boy did not want to go either. Every time he worked up the courage to ask, "Hey, will you come tomorrow?" the answer was "Hmm, I don't know, it depends." So the boy began recording, each day, whether the girl went to the study room, together with some related conditions. Let y indicate whether the girl goes to the study room, i.e. y = {go, not go}, and let x be a condition related to going, such as which course was taught that day. After collecting statistics for a while, suppose the course today is ordinary differential equations and the boy wants to predict whether she will show up. He computes P(y = go | ordinary differential equations) and P(y = not go | ordinary differential equations) and sees which probability is larger: if P(y = go | ordinary differential equations) > P(y = not go | ordinary differential equations), then no matter how hot it is, the boy happily trots off to the study room; otherwise he skips it and spares himself the suffering. By Bayes' formula, computing P(y = go | ordinary differential equations) can be converted to computing P(ordinary differential equations | y = go) P(y = go), i.e. among the days she went, the probability that the course that day was ordinary differential equations. Note that the denominator on the right-hand side of the formula is the same for each category (go / not go), so the calculation can ignore the denominator; although the resulting values are then no longer probabilities between 0 and 1, their relative sizes still determine the choice.
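In symbols, dropping the shared denominator, the comparison reduces to

P(\text{go} \mid \text{ODE}) \propto P(\text{ODE} \mid \text{go})\, P(\text{go}) \quad \text{vs.} \quad P(\text{not go} \mid \text{ODE}) \propto P(\text{ODE} \mid \text{not go})\, P(\text{not go})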

Later he found that some other conditions could also be mined, such as the day of the week, the weather that day, and the atmosphere the last time he was in the study room with her. After collecting statistics for a while longer, the boy did the math and found it was hopeless, because the formula that summarizes history is:
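P(y \mid x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)}) \propto P(x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)} \mid y)\, P(y)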

Here n = 4: x^(1) represents the course, x^(2) the weather, x^(3) the day of the week, and x^(4) the atmosphere, and y is still {go, not go}. There are 8 courses, 3 kinds of weather (sunny, rainy, overcast), and 5 atmosphere grades (A+, A, B+, B, C), so the total number of parameters to estimate is 8 * 3 * 7 * 5 * 2 = 1680. Only one data point can be collected per day, so by the time all 1680 parameters were estimated, the girl would already have graduated from university. That was no good, so the boy made an independence assumption: he assumed that the conditions influencing whether she goes to the study room are mutually independent, so that
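P(x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)} \mid y) = \prod_{i=1}^{4} P(x^{(i)} \mid y)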

With this independence assumption, the number of parameters to estimate becomes (8 + 3 + 7 + 5) * 2 = 46, and each day's data point now contributes to the estimates of 4 of these parameters (one per feature), so the boy was soon predicting quite accurately.

The naive Bayes classifier

Having told the little story above, we come to the formal representation of the naive Bayes classifier:
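y = f(x) = \arg\max_{c_k} P(c_k \mid x) = \arg\max_{c_k} \frac{P(c_k) \prod_{i=1}^{n} P(x^{(i)} \mid c_k)}{\sum_j P(c_j) \prod_{i=1}^{n} P(x^{(i)} \mid c_j)}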

Given a feature x, we compute the conditional probability of every category and choose the category with the largest conditional probability as the predicted class. Because the denominator of the formula above is the same for every category, the calculation can ignore the denominator, i.e.
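y = \arg\max_{c_k} P(c_k) \prod_{i=1}^{n} P(x^{(i)} \mid c_k)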

The "naivety" of naive Bayes lies in its assumption that the individual conditions are independent of one another given the class, and this independence assumption greatly reduces the number of parameters to estimate.

Application to text classification

Text classification has many applications: spam-email and junk-message filtering is a binary classification problem, and news categorization and text sentiment analysis can also be treated as text classification problems. A classification problem consists of two steps, training and prediction, and to build a classification model you need at least a training data set. The Bayesian model applies naturally to text classification: given a document d, determine which category c_k it belongs to by simply finding the category to which d most probably belongs:
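c = \arg\max_{c_k} P(c_k \mid d)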

In classification problems we do not use all the features. For a document d, we use only some of its terms <t_1, t_2, ..., t_{n_d}> (n_d is the number of terms in d) as features, because many words have no value for classification; for instance, stop words such as "the", "is", and "in" appear in every category, and such terms only blur the decision surface between classes. I have covered the selection of feature terms in a previous article. When a document is represented by its feature terms, computing document d's category becomes:
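P(c_k \mid d) \propto P(c_k) \prod_{j=1}^{n_d} P(t_j \mid c_k)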

Note that P(c_k | d) is only proportional to the expression above; the complete calculation has a denominator, but as discussed earlier the denominator is the same for every category, so classification only requires computing the numerator. In the actual computation, the product of many probabilities P(t_j | c_k) very easily underflows to 0, so we convert to logarithms, and the product becomes a sum:
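c = \arg\max_{c_k} \left[ \log P(c_k) + \sum_{j=1}^{n_d} \log P(t_j \mid c_k) \right]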

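To make the practice part concrete, here is a minimal sketch of the classifier described above: a multinomial naive Bayes text classifier trained by counting term frequencies, with Laplace smoothing and log-space scoring. The class name, the toy training data, and the smoothing constant are illustrative choices, not from the original article.

import math
from collections import Counter

class NaiveBayesTextClassifier:
    def __init__(self, alpha=1.0):
        self.alpha = alpha            # Laplace smoothing constant (illustrative default)
        self.log_prior = {}           # log P(c_k), estimated from class frequencies
        self.term_counts = {}         # per-class term-frequency tables
        self.class_totals = {}        # total term count per class
        self.vocab = set()            # vocabulary across all classes

    def train(self, documents):
        # documents: list of (list_of_terms, label) pairs
        doc_counts = Counter(label for _, label in documents)
        n_docs = len(documents)
        for label, count in doc_counts.items():
            self.log_prior[label] = math.log(count / n_docs)
            self.term_counts[label] = Counter()
        for terms, label in documents:
            self.term_counts[label].update(terms)
            self.vocab.update(terms)
        self.class_totals = {c: sum(tc.values()) for c, tc in self.term_counts.items()}

    def _log_likelihood(self, term, label):
        # Laplace-smoothed log P(t_j | c_k); an unseen term gets a small
        # nonzero probability instead of zeroing out the whole product.
        count = self.term_counts[label][term]
        return math.log((count + self.alpha) /
                        (self.class_totals[label] + self.alpha * len(self.vocab)))

    def predict(self, terms):
        # argmax over classes of log P(c_k) + sum_j log P(t_j | c_k)
        scores = {label: self.log_prior[label] +
                         sum(self._log_likelihood(t, label) for t in terms)
                  for label in self.log_prior}
        return max(scores, key=scores.get)

# Toy usage on made-up data: spam vs. ham.
train_data = [
    ("buy cheap pills now".split(), "spam"),
    ("cheap offer buy now".split(), "spam"),
    ("meeting schedule for monday".split(), "ham"),
    ("project meeting notes attached".split(), "ham"),
]
clf = NaiveBayesTextClassifier()
clf.train(train_data)
print(clf.predict("cheap pills offer".split()))       # spam
print(clf.predict("monday project meeting".split()))  # ham

Running the sketch prints "spam" then "ham": the log-space sum avoids underflow, and Laplace smoothing keeps a term never seen in a class's training data from driving that class's score to negative infinity.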