Naive Bayes
Naive Bayes is a Bayesian classifier. Bayesian classification is a statistical classification method that uses probability theory to assign class labels. The principle is to use Bayes' formula to compute, from the prior probability of each class, the posterior probability that a sample belongs to that class, and then assign the sample to the class with the largest posterior probability. Naive Bayesian classification rests on probability theory, so it has a solid mathematical foundation and stable classification performance; its advantages are that the algorithm is simple and remains accurate even with little training data. In theory naive Bayesian classification attains the minimum error rate, but in practice the naive assumption that the features are conditionally independent given the class often does not hold, which degrades classification accuracy.

1. Bayesian theory
In Bayesian theory, suppose x and y are two events:

P(x) is the prior probability of x.

P(y|x) is the probability that y occurs given that x has occurred; it is also called the posterior probability of y.
In Bayesian theory, the following formulas are commonly used:

Multiplication formula: P(xyz) = P(z|xy) P(y|x) P(x)

Total probability formula: P(x) = \sum_k P(x|y_k) P(y_k)

Bayes' formula: P(y_i|x) = \frac{P(x|y_i) P(y_i)}{P(x)}

2. Learning and classification with the naive Bayes method
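The three formulas above can be checked numerically. The sketch below uses hypothetical numbers (the priors and likelihoods are made up for illustration): the total probability formula gives the denominator P(x), and Bayes' formula then turns priors and likelihoods into posteriors.

```python
# Hypothetical two-class example of Bayes' formula.
# Priors P(y_1), P(y_2) and likelihoods P(x|y_1), P(x|y_2) are assumed values.
p_y = [0.6, 0.4]           # P(y_i): prior probability of each class
p_x_given_y = [0.2, 0.5]   # P(x|y_i): likelihood of the observation under each class

# Total probability formula: P(x) = sum_k P(x|y_k) P(y_k)
p_x = sum(l * p for l, p in zip(p_x_given_y, p_y))

# Bayes' formula: P(y_i|x) = P(x|y_i) P(y_i) / P(x)
posteriors = [l * p / p_x for l, p in zip(p_x_given_y, p_y)]

print(p_x)        # 0.32
print(posteriors) # [0.375, 0.625] -- posteriors sum to 1
```

Note how the class with the smaller prior (0.4) ends up with the larger posterior (0.625) because its likelihood is higher; this is exactly the prior-to-posterior update that Bayes' formula describes.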
Let the input space \mathcal{X} \subseteq R^n be a set of n-dimensional feature vectors, and let the output space be the set of class labels \mathcal{Y} = \{c_1, c_2, \ldots, c_K\}. The input of a naive Bayes classifier is a feature vector x \in \mathcal{X}, and the output is the predicted class label y \in \mathcal{Y}.
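The learning and classification procedure described above can be sketched as follows. This is a minimal categorical naive Bayes classifier, not a reference implementation: the toy dataset, feature names, and function names are invented for illustration, the priors and conditional probabilities are plain frequency estimates, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def train_nb(samples, labels):
    """Estimate the prior P(y) and the conditionals P(x_j | y) by counting."""
    n = len(labels)
    class_count = Counter(labels)
    prior = {c: cnt / n for c, cnt in class_count.items()}
    cond = defaultdict(Counter)  # (class, feature index) -> value counts
    for x, y in zip(samples, labels):
        for j, v in enumerate(x):
            cond[(y, j)][v] += 1
    return prior, cond, class_count

def predict_nb(x, prior, cond, class_count):
    """Return the class maximizing P(y) * prod_j P(x_j | y) (the MAP class)."""
    best, best_score = None, -1.0
    for c, p in prior.items():
        score = p
        for j, v in enumerate(x):
            score *= cond[(c, j)][v] / class_count[c]
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical toy data: features are (weather, temperature), label is play yes/no.
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
y = ["no", "no", "yes", "yes"]

prior, cond, cc = train_nb(X, y)
print(predict_nb(("rainy", "mild"), prior, cond, cc))  # prints "yes"
```

The product over features is exactly where the naive conditional-independence assumption enters: each P(x_j | y) is estimated separately and simply multiplied together.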