Naive Bayes classification (NBC) is one of the most basic classification methods in machine learning, and it is a common baseline against which the classification performance of many other algorithms is evaluated. More broadly, the ideas of Bayesian statistics appear throughout machine learning methods.
Naive Bayes is built on Bayes' theorem and the assumption of conditional independence among features. It first learns the joint distribution $P(X, Y)$ of the input $X$ and output $Y$ under this conditional independence assumption, which amounts to estimating the prior probability $P(Y)$ and the class-conditional probabilities. Then, for a given input $x$, it applies Bayes' theorem to compute the posterior probability $P(Y \mid X = x)$ for each class and outputs the class with the maximum posterior probability. The algorithm is simple to implement, and both learning and prediction are highly efficient. The basic definitions are as follows.
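The procedure above (estimate $P(Y)$ and the per-feature conditionals $P(X^{(d)} \mid Y)$ from data, then take the class with the maximum posterior) can be sketched in Python. This is a minimal illustration, not an authoritative implementation: the function names and the toy data set are hypothetical, and the probabilities are raw frequency estimates without smoothing.

```python
from collections import Counter, defaultdict

def train_nb(samples, labels):
    """Estimate the prior P(y) and per-class feature-value counts by counting.

    samples: list of feature tuples (x^(1), ..., x^(n)); labels: list of classes.
    No smoothing is applied, so unseen feature values get probability 0.
    """
    n = len(labels)
    class_count = Counter(labels)
    prior = {c: cnt / n for c, cnt in class_count.items()}
    # cond[(class, feature index)] is a Counter of feature values seen for
    # that class, used later to form P(x^(d) | class) as count / class size.
    cond = defaultdict(Counter)
    for x, y in zip(samples, labels):
        for d, v in enumerate(x):
            cond[(y, d)][v] += 1
    return prior, cond, class_count

def predict_nb(x, prior, cond, class_count):
    """Return argmax over classes c of P(c) * prod_d P(x^(d) | c)."""
    best_class, best_score = None, -1.0
    for c, p_c in prior.items():
        score = p_c
        for d, v in enumerate(x):
            score *= cond[(c, d)][v] / class_count[c]
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

With discrete features, training is a single counting pass over the data, and prediction multiplies the prior by one conditional probability per feature, which is why both steps are so cheap.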
The input space $R^n$ is a set of $n$-dimensional feature vector samples, and the output space is the category set $Y = \{c_1, c_2, \ldots, c_K\}$, where $K$ is the number of categories. Any sample $x_i$ can be represented as follows:

$x_i = \{x_i^{(1)}, x_i^{(2)}, x_i^{(3)}, \ldots, x_i^{(n)}\}$

Each $x_i^{(d)}$ is the value of the $d$-th feature component of the $i$-th sample, where $d = 1, 2, 3, \ldots, n$ and $n$ is the number of features.
The entire sample set can be represented as follows:
$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$