Application of the Naive Bayes algorithm in spam filtering (Bayesian spam filtering)

Source: Internet
Author: User

I have recently been writing a paper on big data classification (speaking of spam: my advisor reminds me about it every day), so I borrowed several books on big data from the library. Today I read the section on spam in "New Internet Big Data Mining" (worth a look if you are interested). It reminded me of an interview question from a well-known company that I saw in the 1280 community yesterday: "In real-time in-game chat, how do you filter out advertisements?" At the time I thought of keyword filtering, but I did not think it through any further.

In fact, the algorithm most commonly used for both spam filtering and ad filtering is Naive Bayes.

Bayes' theorem is a theorem about the conditional probabilities (and marginal probabilities) of two random events A and B.

(See Wikipedia: http://zh.wikipedia.org/wiki/%E8%B4%9D%E5%8F%B6%E6%96%AF%E5%AE%9A%E7%90%86)
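For reference, the theorem in its usual form is written out below, for events A and B with P(B) > 0. The second line is only an illustrative reading for the spam case: the symbols S ("the message is spam") and W ("the message contains a given word") are assumptions introduced here, not notation from the original post.

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
\]
\[
P(S \mid W) = \frac{P(W \mid S)\, P(S)}{P(W \mid S)\, P(S) + P(W \mid \neg S)\, P(\neg S)}
\]
```

The denominator on the second line is P(W) expanded by the law of total probability over the two cases, spam and not spam.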


By learning from a large number of emails that have already been labeled as spam or normal, we can estimate how likely a new email is to be spam by comparing how often the same words occur in the two classes of email. The advantage is high accuracy; the disadvantage is that a large amount of historical data is required.
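The idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method from the book: the function names (`tokenize`, `train`, `spam_probability`), the tiny `TRAINING` set, whitespace tokenization, and add-one (Laplace) smoothing are all choices made here for the example. It also assumes both classes appear at least once in the training data.

```python
import math
from collections import Counter

# Hypothetical toy training set: (text, label) pairs. Real filters need far more data.
TRAINING = [
    ("win free gold now", "spam"),
    ("cheap gold free delivery", "spam"),
    ("team meeting at noon", "ham"),
    ("see you at lunch", "ham"),
]

def tokenize(text):
    """Split on whitespace and lowercase; a deliberately naive tokenizer."""
    return text.lower().split()

def train(messages):
    """Count words per class and messages per class from labeled examples."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def spam_probability(text, model):
    """Return the estimated P(spam | words), treating words as independent."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # Start from the class prior, then add one log-likelihood per known word.
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the product.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Convert log scores back to a normalized probability.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

model = train(TRAINING)
print(spam_probability("free gold", model))
print(spam_probability("meeting at lunch", model))
```

Working in log space avoids numeric underflow when a message contains many words. With a training set this small the numbers are only illustrative, which is exactly the drawback noted above: the estimates become trustworthy only with a large amount of labeled history.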



Questions about the Naive Bayes algorithm

Are you going to write software with this? My suggestion, and this was also my graduation project: do the computation in Excel, which is more convenient than writing a program, and then use VB to interact with the Excel file. There are not many concrete applications in everyday life.

I am using the Naive Bayes algorithm for information filtering and need a training set. I hope someone can provide one.

June 4 has already passed, and no one is going to answer you. LZ (the original poster), it would be better to share it with me ~
