Data Mining Algorithm Learning (3): The Naive Bayes Algorithm

Algorithm Overview

The naive Bayes classifier (NBC) is one of the most widely used classification algorithms. The naive Bayes model originates from classical probability theory, so it has a solid mathematical foundation and stable classification performance. The model also needs few parameters to be estimated, is not very sensitive to missing data, and is relatively simple to implement.

Algorithm Assumption

Given the target value (the class), the attributes are conditionally independent of one another.
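
Written as a formula, the assumption says that the class-conditional probability of an instance factorizes over its attributes. The notation here is assumed for illustration and is not taken from the original: x^{(j)} is the j-th of N attributes and c_k is a class label.

```latex
P\bigl(X = x \mid Y = c_k\bigr)
  = \prod_{j=1}^{N} P\bigl(X^{(j)} = x^{(j)} \mid Y = c_k\bigr)
```

This is what makes the model "naive": each factor can be estimated separately from simple counts, instead of estimating one joint distribution over all attribute combinations.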

Algorithm Input

Training data: T = {(x1, y1), (x2, y2), ..., (xn, yn)}
Data to be classified: x0 = (x0(1), x0(2), ..., x0(N))^T

Algorithm Output

The classification result y0 for the data to be classified x0, where y0 ∈ {c1, c2, ..., cK}

Algorithm Idea
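
In its standard form, naive Bayes assigns x0 to the class ck that maximizes P(Y = ck) multiplied by the product over j of P(X(j) = x0(j) | Y = ck), with each probability estimated by counting in the training data T. The Java sketch below illustrates that rule under assumptions that are not in the original: all attributes are nominal, Laplace (add-one) smoothing is used, and the class name NaiveBayesSketch and the toy data are invented for illustration. It is not Weka's implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class for illustration; not part of Weka.
public class NaiveBayesSketch {
    // Class label -> number of training rows with that label
    private final Map<String, Integer> classCounts = new HashMap<>();
    // Key "j|value|class" -> number of rows of that class with X(j) = value
    private final Map<String, Integer> featureCounts = new HashMap<>();
    // Attribute index j -> distinct values seen (needed for smoothing)
    private final Map<Integer, Set<String>> valuesPerAttribute = new HashMap<>();
    private int total = 0;

    public void train(String[][] x, String[] y) {
        for (int i = 0; i < x.length; i++) {
            classCounts.merge(y[i], 1, Integer::sum);
            total++;
            for (int j = 0; j < x[i].length; j++) {
                featureCounts.merge(j + "|" + x[i][j] + "|" + y[i], 1, Integer::sum);
                valuesPerAttribute.computeIfAbsent(j, k -> new HashSet<>()).add(x[i][j]);
            }
        }
    }

    // log of P(Y = c) * prod_j P(X(j) = x0(j) | Y = c), with add-one smoothing.
    public double logScore(String[] x0, String c) {
        int nc = classCounts.getOrDefault(c, 0);
        double score = Math.log((nc + 1.0) / (total + classCounts.size()));
        for (int j = 0; j < x0.length; j++) {
            int count = featureCounts.getOrDefault(j + "|" + x0[j] + "|" + c, 0);
            int distinct = valuesPerAttribute.getOrDefault(j, new HashSet<>()).size();
            score += Math.log((count + 1.0) / (nc + distinct));
        }
        return score;
    }

    // Return the class seen in training with the highest log score.
    public String predict(String[] x0) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String c : classCounts.keySet()) {
            double s = logScore(x0, c);
            if (s > bestScore) {
                bestScore = s;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy data invented for illustration: attributes are (outlook, windy).
        String[][] x = {
            {"sunny", "false"}, {"sunny", "true"}, {"rainy", "true"},
            {"overcast", "false"}, {"rainy", "false"}, {"overcast", "true"}
        };
        String[] y = {"no", "no", "no", "yes", "yes", "yes"};
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.train(x, y);
        System.out.println(nb.predict(new String[] {"sunny", "true"})); // prints "no"
    }
}
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing keeps an attribute value that never occurred with a class from forcing that class's score to zero.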



Running Weka
The result of running on weather.nominal.arff is as follows:


The results show that there are two classes, so a 2×2 confusion matrix is generated.
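
To show how a 2×2 confusion matrix relates to accuracy, here is a small Java sketch. Rows are taken as the actual class and columns as the predicted class, which is how Weka prints its matrix. The class name ConfusionMatrixSketch and the counts are made up for illustration; they are not the Weka output referred to above.

```java
// Hypothetical helper class for illustration; not part of Weka.
public class ConfusionMatrixSketch {
    // Accuracy = correctly classified instances (the diagonal) / all instances.
    public static double accuracy(int[][] m) {
        int correct = 0;
        int total = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                total += m[i][j];
                if (i == j) {
                    correct += m[i][j];
                }
            }
        }
        return (double) correct / total;
    }

    public static void main(String[] args) {
        // Made-up counts: rows = actual class, columns = predicted class.
        int[][] m = {
            {7, 2},
            {1, 4}
        };
        // 11 of the 14 instances lie on the diagonal.
        System.out.println("accuracy: " + accuracy(m));
    }
}
```

The off-diagonal cells are the misclassifications, so the error rate is simply one minus this value.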

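Function Call Code

import java.io.File;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

// Class name chosen for illustration only.
public class NaiveBayesDemo {
    public static void main(String[] args) throws Exception {
        // Read the sample
        File file = new File("F:\\Program Files (x86)\\Weka-3-7\\data\\weather.nominal.arff");
        ArffLoader loader = new ArffLoader();
        loader.setFile(file);
        Instances ins = loader.getDataSet();
        // The last attribute is the class attribute
        ins.setClassIndex(ins.numAttributes() - 1);

        // Initialize and train the classifier
        Classifier cfs = (Classifier) Class.forName("weka.classifiers.bayes.NaiveBayes").newInstance();
        cfs.buildClassifier(ins);

        // Obtain the classifier result.
        // Assumption: the original does not show where testInst comes from,
        // so this loop simply re-uses the training instances as test instances.
        Evaluation testingEvaluation = new Evaluation(ins);
        for (int i = 0; i < ins.numInstances(); i++) {
            Instance testInst = ins.instance(i);
            testingEvaluation.evaluateModelOnceAndRecordPrediction(cfs, testInst);
        }

        // Print the classification result
        System.out.println("classifier accuracy rate: " + (1 - testingEvaluation.errorRate()));
    }
}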


The running result is as follows:

Classifier accuracy: 0.9583333333333334


Algorithm Application

• Spam filtering
• Web page classification
• Text classification

For spam filtering, see the paper by Zhou Weicheng, Ma Suxia, and Qi Linhai, "A machine learning-based intelligent spam filtering method."


This is an original article. Please credit the source when reposting. Thank you.


