This series of articles is by Cloud Twilight Edition. If you reproduce it, please indicate the source:
http://blog.csdn.net/lyunduanmuxue/article/details/20068781
Thank you for your cooperation.
Basic Introduction
Today we introduce a simple and efficient classifier: the naive Bayes classifier (Naive Bayes Classifier).
Readers who have studied probability theory should be familiar with the name Bayes, because an important formula in probability theory is named after him. This is Bayes' formula:

P(A|B) = P(B|A) * P(A) / P(B)
The Bayesian classifier is developed from this formula. The word "naive" is added because the classifier makes a strong assumption about the data: the features of a sample are assumed to be conditionally independent of one another given its class. This assumption is very strong, but it does not seriously hurt the applicability of the naive Bayes classifier. In 1997, Domingos and Pazzani proved that the classifier can still exhibit good performance even when its assumption does not hold. One explanation for this phenomenon is that the classifier has fewer parameters to train, so it is less prone to overfitting.
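As a quick numerical check of Bayes' formula, here is a small sketch in Python (the article's own code is MATLAB; this disease-test scenario and all its numbers are hypothetical, chosen only for illustration):

```python
# Bayes' formula: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical example: a test with 99% sensitivity and 95% specificity,
# for a disease that affects 1% of the population.
p_disease = 0.01
p_pos_given_disease = 0.99   # sensitivity, P(B|A)
p_pos_given_healthy = 0.05   # false positive rate, 1 - specificity

# Total probability of a positive test: the denominator P(B)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior probability of disease given a positive test: P(A|B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)  # about 0.1667: most positives are false positives
```

Even with an accurate test, the low prior keeps the posterior small, which is exactly the interplay of prior and likelihood that the classifier below exploits.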
Implementation Notes
Below we implement a naive Bayes classifier step by step.
The training of classifiers is divided into two steps:
Calculate the prior probabilities; calculate the likelihood functions.
The application process then simply calculates the posterior probability from the prior probabilities and likelihood functions obtained during training.
The so-called prior probability is simply the probability of each class occurring. This is a basic statistical problem: it is the proportion of the training data set that belongs to each class.
Training the likelihood function is similar: for each feature, count how often each of its values appears within each class.
As for the posterior probability, it is generally not computed exactly; only the numerator on the right-hand side of Bayes' formula is evaluated, because the denominator is a normalizing factor that is a constant for a given problem.
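The two training steps and the prediction step above can be sketched compactly. The article's implementation is in MATLAB; the following is an equivalent Python sketch on a tiny made-up data set (features, values, and labels are all hypothetical):

```python
# Minimal naive Bayes sketch: priors by class frequency, likelihoods by
# per-class value counts, prediction by the numerator of Bayes' formula.
from collections import Counter, defaultdict

# Toy training data: each row is a tuple of categorical feature values.
X = [('sunny', 'hot'), ('sunny', 'cool'), ('rainy', 'cool'), ('rainy', 'hot')]
y = ['no', 'yes', 'yes', 'no']

# Step 1: priors -- the fraction of training samples in each class.
n = len(y)
priors = {c: cnt / n for c, cnt in Counter(y).items()}

# Step 2: likelihoods -- P(feature i = value | class), by counting.
counts = defaultdict(Counter)   # (feature index, class) -> value counts
for xs, c in zip(X, y):
    for i, v in enumerate(xs):
        counts[(i, c)][v] += 1

def likelihood(i, v, c):
    return counts[(i, c)][v] / sum(counts[(i, c)].values())

# Prediction: score each class by prior * product of likelihoods
# (the numerator of Bayes' formula; the denominator is constant).
def predict(xs):
    scores = {}
    for c in priors:
        score = priors[c]
        for i, v in enumerate(xs):
            score *= likelihood(i, v, c)
        scores[c] = score
    return max(scores, key=scores.get)

print(predict(('sunny', 'hot')))   # -> 'no'
```

Note this sketch uses raw counts; the MATLAB code below additionally supports add-one (Laplace) smoothing so that unseen feature values do not zero out the whole product.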
Code Example
Now that we have a basic understanding of the naive Bayes classifier, let us try to implement one in MATLAB.
First, the prior probabilities are computed:
function priors = nbc_priors(training)
%NBC_PRIORS calculates the priors for each class using the training data set.
%
% priors = nbc_priors(training)
% Input:
%   training          - a struct representing the training data set
%   training.class    - the class of each record
%   training.features - the features of each record
% Output:
%   priors       - a struct representing the priors of each class
%   priors.class - the class labels
%   priors.value - the priors of the corresponding classes
%
% Run nbc_mushroom for an example.
%
% Edited by X. Sun
% Homepage: http://pamixsun.github.io/

% Check the input arguments
if nargin < 1
    error(message('MATLAB:UNIQUE:NotEnoughInputs'));
end

% Extract the class labels
priors.class = unique(training.class);

% Initialize priors.value
priors.value = zeros(1, length(priors.class));

% Calculate the priors as class frequencies
for i = 1 : length(priors.class)
    priors.value(i) = ...
        sum(training.class == priors.class(i)) / length(training.class);
end

% Check the results: the priors must sum to 1
if sum(priors.value) ~= 1
    error('Prior error');
end
end
Next, train the complete naive Bayes classifier:
function [likelihood, priors] = train_nbc(training, featurevalues, addone)
%TRAIN_NBC trains a naive Bayes classifier using the training data set.
%
% [likelihood, priors] = train_nbc(training, featurevalues, addone)
% Input:
%   training          - a struct representing the training data set
%   training.class    - the class of each record
%   training.features - the features of each record
%   featurevalues     - a cell array containing the possible values of each feature
%   addone            - whether to use add-one (Laplace) smoothing:
%                       1 indicates yes, 0 otherwise
% Output:
%   likelihood                - a struct representing the likelihood
%   likelihood.matrixcolnames - the feature values
%   likelihood.matrixrownames - the class labels
%   likelihood.matrix         - the likelihood values
%   priors       - a struct representing the priors of each class
%   priors.class - the class labels
%   priors.value - the priors of the corresponding classes
%
% Run nbc_mushroom for an example.
%
% Edited by X. Sun
% Homepage: http://pamixsun.github.io/

% Check the input arguments
if nargin < 2
    error(message('MATLAB:UNIQUE:NotEnoughInputs'));
end

% Set the default value for addone if it is not given
if nargin == 2
    addone = 0;
end

% Calculate the priors
priors = nbc_priors(training);

% Learn the features by calculating the likelihoods
for i = 1 : size(training.features, 2)
    uniquefeaturevalues = featurevalues{i};
    trainingfeaturevalues = training.features(:, i);
    likelihood.matrixcolnames{i} = uniquefeaturevalues;
    likelihood.matrixrownames{i} = priors.class;
    likelihood.matrix{i} = zeros(length(priors.class), length(uniquefeaturevalues));
    for j = 1 : length(uniquefeaturevalues)
        item = uniquefeaturevalues(j);
        for k = 1 : length(priors.class)
            class = priors.class(k);
            featurevaluesinclass = trainingfeaturevalues(training.class == class);
            % Count the matches, optionally with add-one smoothing
            likelihood.matrix{i}(k, j) = ...
                (length(featurevaluesinclass(featurevaluesinclass == item)) + 1 * addone) ...
                / (length(featurevaluesinclass) + addone * length(uniquefeaturevalues));
        end
    end
end
end
Finally, we use the classifier we have trained to make predictions:
function [predictive, posterior] = predict_nbc(test, priors, likelihood)
%PREDICT_NBC uses a naive Bayes classifier to predict the class labels of
%the test data set.
%
% [predictive, posterior] = predict_nbc(test, priors, likelihood)
% Input:
%   test          - a struct representing the test data set
%   test.class    - the class of each record
%   test.features - the features of each record
%   priors        - a struct representing the priors of each class
%   priors.class  - the class labels
%   priors.value  - the priors of the corresponding classes
%   likelihood                - a struct representing the likelihood
%   likelihood.matrixcolnames - the feature values
%   likelihood.matrixrownames - the class labels
%   likelihood.matrix         - the likelihood values
% Output:
%   predictive       - the predictive results of the test data set
%   predictive.class - the predicted class of each record
%   posterior        - a struct representing the posteriors of each class
%   posterior.class  - the class labels
%   posterior.value  - the posteriors of the corresponding classes
%
% Run nbc_mushroom for an example.
%
% Edited by X. Sun
% Homepage: http://pamixsun.github.io/

% Check the input arguments
if nargin < 3
    error(message('MATLAB:UNIQUE:NotEnoughInputs'));
end

posterior.class = priors.class;

% Calculate the posteriors for each test record
predictive.class = zeros(size(test.features, 1), 1);
posterior.value = zeros(size(test.features, 1), length(priors.class));
for i = 1 : size(test.features, 1)
    record = test.features(i, :);
    % Calculate the posterior for each possible class of this record
    for j = 1 : length(priors.class)
        class = priors.class(j);
        % Initialize the posterior with the prior value of that class
        posteriorvalue = priors.value(priors.class == class);
        for k = 1 : length(record)
            item = record(k);
            likelihoodvalue = ...
                likelihood.matrix{k}(j, likelihood.matrixcolnames{k}(:) == item);
            posteriorvalue = posteriorvalue * likelihoodvalue;
        end
        % Store the (unnormalized) posterior
        posterior.value(i, j) = posteriorvalue;
    end
    % Get the predictive class: the one with the largest posterior
    predictive.class(i) = ...
        posterior.class(posterior.value(i, :) == max(posterior.value(i, :)));
end
end