Principles and practice of the naive Bayes classification algorithm

Today we introduce the naive Bayes classification algorithm: first the basic principles, and then some practice with text classification.

A simple example

Naive Bayes is a classic statistical learning method whose main theoretical basis is Bayes' formula, defined as follows:
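P(y_k \mid x) = \frac{P(x \mid y_k)\, P(y_k)}{\sum_i P(x \mid y_i)\, P(y_i)}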

The formula looks simple, but it can both summarize history and predict the future. The right-hand side of the formula summarizes history; the left-hand side predicts the future. If y denotes a category and x a feature, then P(y_k | x) is the probability of class y_k given the observed feature x, and the right-hand side expresses P(y_k | x) entirely in terms of the feature distribution of class y_k, namely the class-conditional likelihood P(x | y_k) together with the class prior P(y_k).

For example: in college, a boy often went to the library at night for self-study and found that the girl he liked also often went to the same study room. Secretly pleased, he bought some snacks every day and waited for her there. But the girl did not necessarily come every day, and the weather was getting hotter and the library had no air conditioning, so if the girl was not going to the study room, the boy did not want to go either. Every time he worked up the courage to ask, "Hey, will you come tomorrow?" the answer was "Hmm, I don't know, it depends." So the boy began recording, each day, whether the girl went to the study room, together with some related conditions. Let y indicate whether the girl goes to the study room, i.e. y = {go, not go}, and let x be a condition related to going, such as which course was taught that day. After collecting statistics for a while, suppose the course today is ordinary differential equations and the boy wants to predict whether she will show up. He computes P(y = go | ordinary differential equations) and P(y = not go | ordinary differential equations) and sees which probability is larger: if P(y = go | ordinary differential equations) > P(y = not go | ordinary differential equations), then no matter how hot it is, the boy happily trots off to the study room; otherwise he skips it and spares himself the suffering. By Bayes' formula, computing P(y = go | ordinary differential equations) can be converted to computing P(ordinary differential equations | y = go) P(y = go), i.e. among the days she went, the probability that the course that day was ordinary differential equations. Note that the denominator on the right-hand side of the formula is the same for each category (go / not go), so the calculation can ignore the denominator; although the resulting values are then no longer probabilities between 0 and 1, their relative sizes still determine the choice.
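In symbols, dropping the shared denominator, the comparison reduces to

P(\text{go} \mid \text{ODE}) \propto P(\text{ODE} \mid \text{go})\, P(\text{go}) \quad \text{vs.} \quad P(\text{not go} \mid \text{ODE}) \propto P(\text{ODE} \mid \text{not go})\, P(\text{not go})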

Later he found that some other conditions could also be mined, such as the day of the week, the weather that day, and the atmosphere the last time he was in the study room with her. After collecting statistics for a while longer, the boy did the math and found it was hopeless, because the formula that summarizes history is:
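P(y \mid x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)}) \propto P(x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)} \mid y)\, P(y)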

Here n = 4: x^(1) represents the course, x^(2) the weather, x^(3) the day of the week, and x^(4) the atmosphere, and y is still {go, not go}. There are 8 courses, 3 kinds of weather (sunny, rainy, overcast), and 5 atmosphere grades (A+, A, B+, B, C), so the total number of parameters to estimate is 8 * 3 * 7 * 5 * 2 = 1680. Only one data point can be collected per day, so by the time all 1680 parameters were estimated, the girl would already have graduated from university. That was no good, so the boy made an independence assumption: he assumed that the conditions influencing whether she goes to the study room are mutually independent, so that
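P(x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)} \mid y) = \prod_{i=1}^{4} P(x^{(i)} \mid y)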

With this independence assumption, the number of parameters to estimate becomes (8 + 3 + 7 + 5) * 2 = 46, and each day's data point now contributes to the estimates of 4 of these parameters (one per feature), so the boy was soon predicting quite accurately.

The naive Bayes classifier

Having told the little story above, we come to the formal representation of the naive Bayes classifier:
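y = f(x) = \arg\max_{c_k} P(c_k \mid x) = \arg\max_{c_k} \frac{P(c_k) \prod_{i=1}^{n} P(x^{(i)} \mid c_k)}{\sum_j P(c_j) \prod_{i=1}^{n} P(x^{(i)} \mid c_j)}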

Given a feature x, we compute the conditional probability of every category and choose the category with the largest conditional probability as the predicted class. Because the denominator of the formula above is the same for every category, the calculation can ignore the denominator, i.e.
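y = \arg\max_{c_k} P(c_k) \prod_{i=1}^{n} P(x^{(i)} \mid c_k)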

The "naivety" of naive Bayes lies in its assumption that the individual conditions are independent of one another given the class, and this independence assumption greatly reduces the number of parameters to estimate.

Application to text classification

Text classification has many applications: spam-email and junk-message filtering is a binary classification problem, and news categorization and text sentiment analysis can also be treated as text classification problems. A classification problem consists of two steps, training and prediction, and to build a classification model you need at least a training data set. The Bayesian model applies naturally to text classification: given a document d, determine which category c_k it belongs to by simply finding the category to which d most probably belongs:
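c = \arg\max_{c_k} P(c_k \mid d)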

In classification problems we do not use all the features. For a document d, we use only some of its terms <t_1, t_2, ..., t_{n_d}> (n_d is the number of terms in d) as features, because many words have no value for classification; for instance, stop words such as "the", "is", and "in" appear in every category, and such terms only blur the decision surface between classes. I have covered the selection of feature terms in a previous article. When a document is represented by its feature terms, computing document d's category becomes:
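P(c_k \mid d) \propto P(c_k) \prod_{j=1}^{n_d} P(t_j \mid c_k)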

Note that P(c_k | d) is only proportional to the expression above; the complete calculation has a denominator, but as discussed earlier the denominator is the same for every category, so classification only requires computing the numerator. In the actual computation, the product of many probabilities P(t_j | c_k) very easily underflows to 0, so we convert to logarithms, and the product becomes a sum:
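c = \arg\max_{c_k} \left[ \log P(c_k) + \sum_{j=1}^{n_d} \log P(t_j \mid c_k) \right]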

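To make the practice part concrete, here is a minimal sketch of the classifier described above: a multinomial naive Bayes text classifier trained by counting term frequencies, with Laplace smoothing and log-space scoring. The class name, the toy training data, and the smoothing constant are illustrative choices, not from the original article.

import math
from collections import Counter

class NaiveBayesTextClassifier:
    def __init__(self, alpha=1.0):
        self.alpha = alpha            # Laplace smoothing constant (illustrative default)
        self.log_prior = {}           # log P(c_k), estimated from class frequencies
        self.term_counts = {}         # per-class term-frequency tables
        self.class_totals = {}        # total term count per class
        self.vocab = set()            # vocabulary across all classes

    def train(self, documents):
        # documents: list of (list_of_terms, label) pairs
        doc_counts = Counter(label for _, label in documents)
        n_docs = len(documents)
        for label, count in doc_counts.items():
            self.log_prior[label] = math.log(count / n_docs)
            self.term_counts[label] = Counter()
        for terms, label in documents:
            self.term_counts[label].update(terms)
            self.vocab.update(terms)
        self.class_totals = {c: sum(tc.values()) for c, tc in self.term_counts.items()}

    def _log_likelihood(self, term, label):
        # Laplace-smoothed log P(t_j | c_k); an unseen term gets a small
        # nonzero probability instead of zeroing out the whole product.
        count = self.term_counts[label][term]
        return math.log((count + self.alpha) /
                        (self.class_totals[label] + self.alpha * len(self.vocab)))

    def predict(self, terms):
        # argmax over classes of log P(c_k) + sum_j log P(t_j | c_k)
        scores = {label: self.log_prior[label] +
                         sum(self._log_likelihood(t, label) for t in terms)
                  for label in self.log_prior}
        return max(scores, key=scores.get)

# Toy usage on made-up data: spam vs. ham.
train_data = [
    ("buy cheap pills now".split(), "spam"),
    ("cheap offer buy now".split(), "spam"),
    ("meeting schedule for monday".split(), "ham"),
    ("project meeting notes attached".split(), "ham"),
]
clf = NaiveBayesTextClassifier()
clf.train(train_data)
print(clf.predict("cheap pills offer".split()))       # spam
print(clf.predict("monday project meeting".split()))  # ham

Running the sketch prints "spam" then "ham": the log-space sum avoids underflow, and Laplace smoothing keeps a term never seen in a class's training data from driving that class's score to negative infinity.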