Implementing a Bayesian Classifier in MATLAB


The classification principle of a Bayesian classifier is this: starting from the prior probabilities of the classes, use Bayes' formula to compute the posterior probability of an object, that is, the probability that the object belongs to each class, and then assign the object to the class with the maximum posterior probability. In other words, the Bayesian classifier is optimal in the minimum-error-rate sense; it follows the basic principle that the most probable class wins.
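As a minimal sketch of this rule in MATLAB (the likelihoods and priors below are made-up illustrative numbers, not values from the experiments later in this article):

% Minimum-error-rate decision: compute the posteriors with Bayes'
% formula and pick the class with the maximum posterior.
likelihood = [0.20 0.05 0.10];            % p(x|w_i) for classes i = 1..3 (illustrative)
prior      = [1/3  1/3  1/3];             % P(w_i)
posterior  = likelihood .* prior;         % proportional to P(w_i|x)
posterior  = posterior / sum(posterior);  % normalize by the evidence p(x)
[~, bestClass] = max(posterior);          % class with the maximum posterior
disp(['x is assigned to class ', num2str(bestClass)]);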

First, the basic concept of the classifier

After a period of studying pattern recognition, one has a basic understanding of the concepts of patterns and pattern classes, and has tried to generate some pattern classes in MATLAB. How to classify these patterns then becomes the next focus of study, and this requires a classifier.
There are many ways to express a pattern classifier. The most common is a set of discriminant functions g_i(x), i = 1, 2, ..., c, one per class: if for all j ≠ i we have

g_i(x) > g_j(x)

then the classifier assigns the feature vector x to class ω_i. The classifier can therefore be visualized as a network or machine that computes c discriminant functions and selects the class corresponding to the maximum discriminant value; this computation can be drawn as a simple feed-forward network structure.

Second, Bayesian classifier

A Bayesian classifier can be expressed simply and naturally by exactly this network structure. Given complete statistical knowledge of the patterns, an optimal classifier can be designed according to Bayesian decision theory. A classifier is a software or hardware device that assigns a category label to each input pattern; among all classifiers, the Bayesian classifier is the one that minimizes the probability of classification error, or the average risk under predetermined costs. Its design method is one of the most basic statistical classification methods.

For a Bayesian classifier, the choice of discriminant functions is not unique: we can multiply all discriminant functions by the same positive constant, or add the same constant to each of them, without affecting the decision. More generally, if each g_i(x) is replaced by f(g_i(x)), where f(·) is a monotonically increasing function, the classification result remains the same. In particular, for minimum-error-rate classification, any of the following discriminant functions gives the same result, though some are easier to compute than others:

g_i(x) = P(ω_i | x) = p(x | ω_i) P(ω_i) / Σ_j p(x | ω_j) P(ω_j)

g_i(x) = p(x | ω_i) P(ω_i)

g_i(x) = ln p(x | ω_i) + ln P(ω_i)
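A quick illustration in MATLAB (made-up numbers): the product form and the logarithmic form pick the same class, while the logarithmic form avoids multiplying very small densities:

% Monotone transforms leave the decision unchanged: g_i and ln(g_i)
% reach their maximum at the same class.
likelihood = [0.20 0.05 0.10];           % p(x|w_i), illustrative
prior      = [0.5  0.3  0.2];            % P(w_i), illustrative
g    = likelihood .* prior;              % g_i(x) = p(x|w_i) P(w_i)
gLog = log(likelihood) + log(prior);     % g_i(x) = ln p(x|w_i) + ln P(w_i)
[~, c1] = max(g);
[~, c2] = max(gLog);
disp([c1 c2])                            % both pick class 1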

A typical pattern recognition system consists of two stages, feature extraction and pattern classification, and the performance of the pattern classifier directly affects the performance of the whole recognition system. It is therefore necessary to study how to evaluate the performance of classifiers, which has been a long-term area of exploration. For classifier performance evaluation methods, see: http://blog.csdn.net/liyuefeilong/article/details/44604001

Third, implementing the basic Bayes classifier

In MATLAB, a Bayesian classifier can be built to classify samples from two pattern classes, assuming both class distributions are Gaussian. The mean vector of pattern class 1 is m1 = (1, 3) and its covariance matrix is S1 = (1.5, 0; 0, 1); the mean vector of pattern class 2 is m2 = (3, 1) and its covariance matrix is S2 = (1, 0.5; 0.5, 2); the prior probabilities of the two classes are P1 = P2 = 1/2. The experiment consists of the following four steps:

1. First, write a function that generates a specified number of random samples for several pattern classes. Here, 100 random samples are generated for each of the two pattern classes, and a two-dimensional scatter plot of these samples is drawn in one figure;

2. Since each random sample has two feature components, first use only the first feature component of the patterns as the classification feature, classify the 200 samples from step 1, compute the percentage correctly classified, and mark the correctly and incorrectly classified samples with different colors on the two-dimensional plot. (Note: in the plots produced by the code below, red dots are class-1 samples and green dots are class-2 samples; a black square marks a point assigned to class 1 and a green circle marks a point assigned to class 2, so a point whose marker does not match its true class is a misclassified sample.)

3. Repeat step 2 using only the second feature component of the patterns as the classification feature;

4. Finally, use both components of the patterns as classification features, classify the 200 samples, compute the percentage correctly classified, and again mark the correctly and incorrectly classified samples with different colors on the two-dimensional plot.

Looking at the resulting correct rates, the error rate is higher when a single classification feature is used alone (repeated experiments do not yield a good classification result), so increasing the number of classification features is an effective way to improve the correct rate; of course, this brings additional time cost to the algorithm.
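To make this concrete, here is a rough Monte Carlo sketch (an addition for illustration, not part of the original experiment) that estimates the accuracy of the three feature choices on a large fresh sample of class-1 points, using the class parameters defined above and equal priors:

% Estimate the accuracy of each feature choice on fresh class-1 samples.
n  = 1e5;
X1 = mvnrnd([1 3], [1.5 0; 0 1], n);   % fresh class-1 samples
% feature 1 only: the class marginals are N(1, 1.5) and N(3, 1)
acc1 = mean(normpdf(X1(:,1), 1, sqrt(1.5)) > normpdf(X1(:,1), 3, 1));
% feature 2 only: the class marginals are N(3, 1) and N(1, 2)
acc2 = mean(normpdf(X1(:,2), 3, 1) > normpdf(X1(:,2), 1, sqrt(2)));
% both features: the full bivariate densities
accB = mean(mvnpdf(X1, [1 3], [1.5 0; 0 1]) > mvnpdf(X1, [3 1], [1 0.5; 0.5 2]));
disp([acc1 acc2 accB])   % the two-feature accuracy is typically the highest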

Fourth, a further Bayes classifier

If the classification data satisfy Gaussian distributions, a discriminant classifier can be designed for them; the purpose of this experiment is to gain a preliminary understanding of how such a classifier is designed.

1. Write a Gaussian Bayes discriminant function GuassianBayesModel(mu,sigma,p,x), whose inputs are the means mu of the given normal distributions, the covariance matrices sigma, the prior probabilities p, and a pattern sample vector x, and which outputs the values of the discriminant functions. The code is given in the complete listing at the end of this article.
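As a usage sketch (the class parameters here are illustrative placeholders, not the experiment's data), the function defined in the listing below would be called like this:

% Three illustrative 3-D Gaussian classes with identity covariances.
mu(:,:,1) = [0 0 0]; sigma(:,:,1) = eye(3);
mu(:,:,2) = [1 1 1]; sigma(:,:,2) = eye(3);
mu(:,:,3) = [2 2 2]; sigma(:,:,3) = eye(3);
p = [1/3 1/3 1/3];                    % equal prior probabilities
x = [0.5 0.5 0.5];                    % pattern sample vector to classify
GuassianBayesModel(mu, sigma, p, x);  % prints g_i(x), distances, decision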

2. The data listed in the code below (the matrices w(:,:,1), w(:,:,2) and w(:,:,3)) give 10 sample points for each of three classes. Assume each class is normally distributed and that the three classes have equal prior probabilities, P(w1) = P(w2) = P(w3) = 1/3. Compute the mean vector and covariance matrix of each class of samples, and design a classifier for the three classes; the estimation step is sketched right after this paragraph.
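A minimal sketch of the estimation step (w1 is a random placeholder for one 10 x 3 class matrix from the listing); MATLAB's mean and cov give the required estimates, and the full listing computes the mean as sum(w)./10, which is equivalent:

w1 = randn(10, 3);   % placeholder for the 10 samples of one class (rows are samples)
mu1 = mean(w1);      % 1x3 mean vector, identical to sum(w1)./10
sigma1 = cov(w1);    % 3x3 sample covariance matrix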

3. Use the classifier designed in step 2 to classify the following test points: (1,2,1), (5,3,2), (0,0,0), and compute the Mahalanobis distance between each test point and each class mean using the formula below (the original post quoted the Baidu Encyclopedia entry on the Mahalanobis distance).

The Mahalanobis distance is calculated as:

r = sqrt( (x − μ)ᵀ Σ⁻¹ (x − μ) )

where μ is the mean vector of the class and Σ is its covariance matrix.
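In MATLAB this is a direct transcription (illustrative mean and covariance; with an identity covariance the Mahalanobis distance reduces to the Euclidean distance):

x     = [1 2 1];    % one of the test points from step 3
mu    = [0 0 0];    % illustrative class mean
sigma = eye(3);     % illustrative covariance matrix
r = sqrt((x - mu) * inv(sigma) * (x - mu)');   % matches the formula above
disp(r)             % here equals norm(x - mu)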

More specifically, see: http://baike.baidu.com/link?url=pcos75ou28q7iukueepcnqf8n7xzifuxotrwzewpjulgvrrnytb9gji6iehezlk6q4etlvx45tajdxvd7lnn2q

4. Set P(w1) = 0.8 and P(w2) = P(w3) = 0.1, and repeat steps 2 and 3. The results of the experiment are as follows:

First, the mean vectors and covariance matrices of the three classes of sample points are computed:

When the prior probabilities of the three classes are equal, P(w1) = P(w2) = P(w3) = 1/3, the functions are used to classify the test points and to report the Mahalanobis distance between each test point and each class mean.

Then verify the case where the prior probabilities of the three classes are unequal: the same function is used to classify the test points and to report the Mahalanobis distance between each test point and each class mean.

It can be seen that when the Mahalanobis distances to the class means are close, different prior probabilities have a great influence on the classification result of the Gaussian Bayes classifier; in fact, the optimal decision is biased toward the class with the higher prior probability.
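A one-dimensional sketch of this effect (illustrative numbers only): when the two class-conditional densities at a point are nearly equal, changing the priors flips the decision:

x = 0.6;                                  % point between the two class means
L = [normpdf(x, 0, 1), normpdf(x, 1, 1)]; % nearly equal likelihoods
[~, cEqual]  = max(L .* [0.5 0.5]);       % equal priors -> class 2
[~, cBiased] = max(L .* [0.8 0.2]);       % biased priors -> class 1
disp([cEqual cBiased])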

The complete code consists of two functions and the main script:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Pattern-class generation function
% N:     number of scatter points to generate
% C:     number of classes
% D:     dimension of the scatter points
% mu:    mean matrices of the classes
% sigma: covariance matrices of the classes
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function result = MixGaussian(N, C, D, mu, sigma)
color = {'r.', 'g.', 'm.', 'b.', 'k.', 'y.'};  % colors used to draw the different classes
if nargin <= 3 || N < 0 || C < 1 || D < 1
    error('Too few parameters or parameter errors');
end
if D == 1
    for i = 1:C
        for j = 1:N/C
            R(j,i) = sqrt(sigma(1,i)) * randn() + mu(1,i);
        end
        X = round(mu(1,i) - 5);
        Y = round(mu(1,i) + sqrt(sigma(1,i)) + 5);
        B = hist(R(:,i), X:Y);
        subplot(1, C, i), bar(X:Y, B, 'b');
        title('Distribution histograms of the classes of one-dimensional random points');
        grid on
    end
elseif D == 2
    for i = 1:C
        R(:,:,i) = mvnrnd(mu(:,:,i), sigma(:,:,i), round(N/C));
        plot(R(:,1,i), R(:,2,i), char(color(i)));
        hold on;
    end
elseif D == 3
    for i = 1:C
        R(:,:,i) = mvnrnd(mu(:,:,i), sigma(:,:,i), round(N/C));
        plot3(R(:,1,i), R(:,2,i), R(:,3,i), char(color(i)));
        hold on;
    end
else
    disp('Dimension can only be set to 1, 2 or 3');
end
result = R;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Gaussian Bayes discriminant function
% mu:    means of the normal distributions
% sigma: covariance matrices of the normal distributions
% p:     prior probabilities of the classes
% x:     input sample vector
% Outputs the discriminant function values, the Mahalanobis
% distances and the classification result.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function GuassianBayesModel(mu, sigma, p, x)
% evaluate the discriminant function and the Mahalanobis
% distance from the point to each class
for i = 1:3
    P(i) = mvnpdf(x, mu(:,:,i), sigma(:,:,i)) * p(i);
    r(i) = sqrt((x - mu(:,:,i)) * inv(sigma(:,:,i)) * (x - mu(:,:,i))');
end
% decide which class the sample belongs to and display the results
maxP = max(P);
style = find(P == maxP);
disp(['The discriminant function value of point [', num2str(x), '] for class 1 is: ', num2str(P(1))]);
disp(['The discriminant function value of point [', num2str(x), '] for class 2 is: ', num2str(P(2))]);
disp(['The discriminant function value of point [', num2str(x), '] for class 3 is: ', num2str(P(3))]);
disp(['The Mahalanobis distance of point [', num2str(x), '] to class 1 is: ', num2str(r(1))]);
disp(['The Mahalanobis distance of point [', num2str(x), '] to class 2 is: ', num2str(r(2))]);
disp(['The Mahalanobis distance of point [', num2str(x), '] to class 3 is: ', num2str(r(3))]);
disp(['Point [', num2str(x), '] belongs to class ', num2str(style)]);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Main script of the Bayesian classifier experiment
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% parameters of the two pattern classes
mu(:,:,1) = [1 3];
sigma(:,:,1) = [1.5 0; 0 1];
p1 = 1/2;
mu(:,:,2) = [3 1];
sigma(:,:,2) = [1 0.5; 0.5 2];
p2 = 1/2;
% generate 200 two-dimensional scatter points, evenly split into two classes of 100
AA = MixGaussian(200, 2, 2, mu, sigma);
title('Gaussian scatter points of the two classes');

% classification using only the x component as the feature
figure;
right1 = 0; right2 = 0;           % numbers of correctly classified points
rightRate1 = 0; rightRate2 = 0;   % correct rates
plot(AA(:,1,1), AA(:,2,1), 'r.'); hold on;   % class-1 samples
plot(AA(:,1,2), AA(:,2,2), 'g.'); hold on;   % class-2 samples
for i = 1:100
    x = AA(i,1,1);
    p1 = normpdf(x, 1, sqrt(1.5));
    p2 = normpdf(x, 3, sqrt(1));
    if p1 > p2
        plot(AA(i,1,1), AA(i,2,1), 'ks'); hold on;
        right1 = right1 + 1;      % count the correct classifications
    elseif p1 < p2
        plot(AA(i,1,1), AA(i,2,1), 'go'); hold on;
    end
end
rightRate1 = right1 / 100;        % correct rate of class 1
for i = 1:100
    x = AA(i,1,2);
    p1 = normpdf(x, 1, sqrt(1.5));
    p2 = normpdf(x, 3, sqrt(1));
    if p1 > p2
        plot(AA(i,1,2), AA(i,2,2), 'ks'); hold on;
    elseif p1 < p2
        plot(AA(i,1,2), AA(i,2,2), 'go'); hold on;
        right2 = right2 + 1;      % count the correct classifications
    end
end
rightRate2 = right2 / 100;        % correct rate of class 2
title('Classification results using the first feature only');
disp(['Using only the first feature, the correct rate for class 1 is: ', num2str(rightRate1*100), '%']);
disp(['Using only the first feature, the correct rate for class 2 is: ', num2str(rightRate2*100), '%']);

% classification using only the y component as the feature
figure;
right1 = 0; right2 = 0;
rightRate1 = 0; rightRate2 = 0;
plot(AA(:,1,1), AA(:,2,1), 'r.'); hold on;
plot(AA(:,1,2), AA(:,2,2), 'g.'); hold on;
for i = 1:100
    y = AA(i,2,1);
    p1 = normpdf(y, 3, sqrt(1));
    p2 = normpdf(y, 1, sqrt(2));
    if p1 > p2
        plot(AA(i,1,1), AA(i,2,1), 'ks'); hold on;
        right1 = right1 + 1;
    elseif p1 < p2
        plot(AA(i,1,1), AA(i,2,1), 'go'); hold on;
    end
end
rightRate1 = right1 / 100;
for i = 1:100
    y = AA(i,2,2);
    p1 = normpdf(y, 3, sqrt(1));
    p2 = normpdf(y, 1, sqrt(2));
    if p1 > p2
        plot(AA(i,1,2), AA(i,2,2), 'ks'); hold on;
    elseif p1 < p2
        plot(AA(i,1,2), AA(i,2,2), 'go'); hold on;
        right2 = right2 + 1;
    end
end
rightRate2 = right2 / 100;
title('Classification results using the second feature only');
disp(['Using only the second feature, the correct rate for class 1 is: ', num2str(rightRate1*100), '%']);
disp(['Using only the second feature, the correct rate for class 2 is: ', num2str(rightRate2*100), '%']);

% classification using both features
figure;
right1 = 0; right2 = 0;
rightRate1 = 0; rightRate2 = 0;
plot(AA(:,1,1), AA(:,2,1), 'r.'); hold on;
plot(AA(:,1,2), AA(:,2,2), 'g.'); hold on;
for i = 1:100
    x = AA(i,1,1);
    y = AA(i,2,1);
    p1 = mvnpdf([x, y], mu(:,:,1), sigma(:,:,1));
    p2 = mvnpdf([x, y], mu(:,:,2), sigma(:,:,2));
    if p1 > p2
        plot(AA(i,1,1), AA(i,2,1), 'ks'); hold on;
        right1 = right1 + 1;
    elseif p1 < p2
        plot(AA(i,1,1), AA(i,2,1), 'go'); hold on;
    end
end
rightRate1 = right1 / 100;
for i = 1:100
    x = AA(i,1,2);
    y = AA(i,2,2);
    p1 = mvnpdf([x, y], mu(:,:,1), sigma(:,:,1));
    p2 = mvnpdf([x, y], mu(:,:,2), sigma(:,:,2));
    if p1 > p2
        plot(AA(i,1,2), AA(i,2,2), 'ks'); hold on;
    elseif p1 < p2
        plot(AA(i,1,2), AA(i,2,2), 'go'); hold on;
        right2 = right2 + 1;
    end
end
rightRate2 = right2 / 100;
title('Classification results using both features');
disp(['Using both features, the correct rate for class 1 is: ', num2str(rightRate1*100), '%']);
disp(['Using both features, the correct rate for class 2 is: ', num2str(rightRate2*100), '%']);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% A further Bayes classifier
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% scatter points of the three classes w1, w2, w3
w = zeros(10,3,3);
w(:,:,1) = [-5.01 -8.12 -3.68;
            -5.43 -3.48 -3.54;
             1.08 -5.52  1.66;
             0.86 -3.78 -4.11;
            -2.67  0.63  7.39;
             4.94  3.29  2.08;
            -2.51  2.09 -2.59;
            -2.25 -2.13 -6.94;
             5.56  2.86 -2.26;
             1.03 -3.33  4.33];
w(:,:,2) = [-0.91 -0.18 -0.05;
             1.30 -2.06 -3.53;
            -7.75 -4.54 -0.95;
            -5.47  0.50  3.92;
             6.14  5.72 -4.85;
             3.60  1.26  4.36;
             5.37 -4.63 -3.65;
             7.18  1.46 -6.66;
            -7.39  1.17  6.30;
            -7.50 -6.32 -0.31];
w(:,:,3) = [ 5.35  2.26  8.13;
             5.12  3.22 -2.66;
            -1.34 -5.31 -9.87;
             4.48  3.42  5.19;
             7.11  2.39  9.21;
             7.17  4.33 -0.98;
             5.75  3.97  6.65;
             0.77  0.27  2.41;
             0.90 -0.43 -8.71;
             3.52 -0.36  6.43];
% class means
mu1(:,:,1) = sum(w(:,:,1)) ./ 10;
mu1(:,:,2) = sum(w(:,:,2)) ./ 10;
mu1(:,:,3) = sum(w(:,:,3)) ./ 10;
% covariance matrices
sigma1(:,:,1) = cov(w(:,:,1));
sigma1(:,:,2) = cov(w(:,:,2));
sigma1(:,:,3) = cov(w(:,:,3));
% prior probabilities of the classes
% equal priors for steps 2 and 3:
% p(1) = 1/3; p(2) = 1/3; p(3) = 1/3;
% unequal priors for step 4:
p(1) = 0.8;
p(2) = 0.1;
p(3) = 0.1;
% sample vector to classify
x = [1 0 0];
% call the Gaussian Bayes discriminant function, which outputs the
% discriminant function values, Mahalanobis distances and the decision
GuassianBayesModel(mu1, sigma1, p, x);
