Contents: classification algorithm; notes; starting the algorithm; finding the prior probability; conditional probability for discrete feature parameters; continuous feature parameters; finding the mean and standard deviation; conditional probability based on the mean and standard deviation
Naive Bayes is a classification algorithm.
Using a few feature parameters of a thing, a model is trained on a large amount of data, and the model is then used to predict which class the thing belongs to.
Note
Before starting, be clear about whether each feature parameter is discrete or continuous. Discrete means values such as temperature high, medium, or low, wind large or small, good or bad. Continuous means values such as the amount of rainfall or the temperature in degrees. This affects how the conditional probability is found.
Each class corresponds to a posterior probability, and the class with the largest posterior probability is the predicted result.
Posterior probability = prior probability * conditional probability. (Strictly, the posterior is proportional to this product; the normalizing constant is the same for every class, so it does not change which class is largest.)
Starting the algorithm
Finding the prior probability
First we get the training data set, which is a two-dimensional data table.
For example, to predict whether it will rain, the features are temperature (high, medium, low), humidity (high, medium, low), and wind (large, small), and there are 10 records. The number of records classified as rain divided by 10 is the prior probability of rain, for example 0.3. The number of records classified as no rain divided by 10 is the prior probability of no rain, for example 0.7. This gives the prior probability of each prediction result. Note that the prior probabilities sum to 1.
Conditional probability
Discrete feature parameters
For example, there are 3 records with rain. Within these 3 records we find the proportions of temperature high, medium, and low, for example 0.2, 0.1, 0.7; then the proportions of humidity high, medium, and low, for example 0.3, 0.5, 0.2; then the proportions of wind large and small, for example 0.6, 0.4. (The numbers are only illustrative.)
In fact, given rain, the proportions of the values of each feature parameter sum to 1.
Then do the same given no rain. There are 7 records with no rain, and within these 7 records we find the proportions of temperature high, medium, and low, for example 0.4, 0.5, 0.1.
Then the proportions of humidity high, medium, and low, for example 0.1, 0.2, 0.7, and the proportions of wind large and small, for example 0.3, 0.7.
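The counting described above can be sketched in a few lines of Python. This is only a minimal illustration: the six rows in `records` are invented, and the names `records`, `priors`, and `conditionals` are assumptions, not part of the original example.

```python
from collections import Counter, defaultdict

# Hypothetical training table: (temperature, humidity, wind, label).
# These rows are invented for illustration; they are not the article's data.
records = [
    ("high", "high", "small", "rain"),
    ("low", "high", "large", "rain"),
    ("medium", "low", "small", "no rain"),
    ("high", "low", "small", "no rain"),
    ("medium", "medium", "large", "no rain"),
    ("low", "low", "small", "no rain"),
]
feature_names = ["temperature", "humidity", "wind"]

# Prior probability: the share of records that fall in each class.
class_counts = Counter(row[-1] for row in records)
priors = {label: count / len(records) for label, count in class_counts.items()}

# Conditional probability: within each class, the share of each value
# of each feature. Keys are (label, feature name).
value_counts = defaultdict(Counter)
for row in records:
    label = row[-1]
    for name, value in zip(feature_names, row[:-1]):
        value_counts[(label, name)][value] += 1

conditionals = {
    key: {value: count / class_counts[key[0]] for value, count in counter.items()}
    for key, counter in value_counts.items()
}

print(priors)
print(conditionals[("rain", "temperature")])
```

A value that never appears in a class gets no entry here, so a real implementation would need to decide how to handle unseen values.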
Now the prior probabilities of rain and no rain and all the conditional probabilities are known, and we can predict. For example, given a sample with temperature medium, humidity high, and wind small, the probability of rain (the posterior probability of rain) is 0.3 * 0.1 * 0.3 * 0.4 = 0.0036, and the probability of no rain is 0.7 * 0.5 * 0.1 * 0.7 = 0.0245.
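As a check on the arithmetic, here is a small sketch that looks the factors up in the proportions listed earlier. It assumes the three humidity numbers are in the order high, medium, low, and that the sample is temperature medium, humidity high, wind small; with those readings the humidity factor for rain is 0.3. The dictionary layout is an assumption made for the example.

```python
import math

# Proportions quoted in the text (illustrative values from the example).
priors = {"rain": 0.3, "no rain": 0.7}
conditionals = {
    "rain": {
        "temperature": {"high": 0.2, "medium": 0.1, "low": 0.7},
        "humidity": {"high": 0.3, "medium": 0.5, "low": 0.2},
        "wind": {"large": 0.6, "small": 0.4},
    },
    "no rain": {
        "temperature": {"high": 0.4, "medium": 0.5, "low": 0.1},
        "humidity": {"high": 0.1, "medium": 0.2, "low": 0.7},
        "wind": {"large": 0.3, "small": 0.7},
    },
}

# Assumed sample: medium temperature, high humidity, small wind.
sample = {"temperature": "medium", "humidity": "high", "wind": "small"}

# Score of each class = prior * product of the matching conditional proportions.
scores = {}
for label, prior in priors.items():
    likelihood = math.prod(conditionals[label][f][v] for f, v in sample.items())
    scores[label] = prior * likelihood

prediction = max(scores, key=scores.get)
print(scores, prediction)
```

With these numbers the no rain score is the larger of the two, so the sketch predicts no rain.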
Then compare the probability of rain with the probability of no rain; the larger one is the prediction. Note that this only picks the more probable class, so the prediction is not guaranteed to be correct.
Continuous feature parameters
The prior probability is found in the same way as above.
Because the feature is continuous, the conditional probability cannot be found by counting proportions as before, so the conditional probability of each feature parameter is found with a formula.
First, assume by default that the feature follows a Gaussian distribution, which gives the formula
$$P(x) = \frac{1}{\sqrt{2\pi}\,\mathrm{rou}} \exp\left(-\frac{(x - n)^2}{2\,\mathrm{rou}^2}\right)$$
Here n is the mean, rou is the standard deviation, and x is the value of the validation sample's feature.
Finding the mean and standard deviation for the conditional probability
Meaning of mean and standard deviation
The mean needs no explanation.
Applied to investment, the standard deviation can be used as an indicator of how stable returns are. The larger the standard deviation, the farther returns stray from their past average, the less stable the returns, and the higher the risk. Conversely, the smaller the standard deviation, the more stable the returns and the lower the risk.
Therefore, the mean must be found first, because the standard deviation is computed from it.
The mean of a feature parameter given rain is the sum of all its values in the rain records divided by the number of rain records. The mean given no rain is found the same way.
The result is a two-dimensional array: the mean of each feature parameter when it rains, and the mean of each feature parameter when it does not rain.
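A minimal sketch of building that two-dimensional array of means, using only the standard library. The five records are invented, and the array name `n` follows the text's name for the mean; the layout `n[classification][attribute]` is an assumption.

```python
from statistics import mean

# Hypothetical continuous records:
# (temperature in degrees, humidity in percent, label).
records = [
    (18.0, 90.0, "rain"),
    (20.0, 80.0, "rain"),
    (22.0, 85.0, "rain"),
    (30.0, 40.0, "no rain"),
    (28.0, 50.0, "no rain"),
]
classes = ["rain", "no rain"]
num_features = 2

# n[c][j] = mean of feature j over the records whose label is classes[c].
n = [
    [mean(row[j] for row in records if row[-1] == label) for j in range(num_features)]
    for label in classes
]
print(n)
```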
Standard deviation
Here n is the number of records in each class (not the mean n used in the Gaussian formula); it can be taken directly from the counts found earlier for the prior probability. Note that si is a two-dimensional array, si[classification][attribute].
Each feature parameter has one standard deviation per class.
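The per-class standard deviations can be sketched the same way. The records are invented, and this sketch assumes the population form of the standard deviation (dividing by the number of records in the class), which is what `statistics.pstdev` computes; `statistics.stdev` would divide by one less instead. The text does not say which form is intended.

```python
from statistics import pstdev

# Hypothetical continuous records:
# (temperature in degrees, humidity in percent, label).
records = [
    (18.0, 90.0, "rain"),
    (20.0, 80.0, "rain"),
    (22.0, 85.0, "rain"),
    (30.0, 40.0, "no rain"),
    (28.0, 50.0, "no rain"),
]
classes = ["rain", "no rain"]
num_features = 2

# si[c][j] = standard deviation of feature j within class classes[c].
# Assumption: population standard deviation (pstdev), not the sample form.
si = [
    [pstdev([row[j] for row in records if row[-1] == label]) for j in range(num_features)]
    for label in classes
]
print(si)
```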
Gaussian distribution
Because the attribute values are continuous, the probability of each attribute of the validation data cannot be found by comparing against and counting the training data. So by default we assume the values of each attribute follow a Gaussian distribution, which is bell shaped. The specific shape of the Gaussian curve is determined by the mean n and the standard deviation rou. We find the mean and standard deviation of each attribute in the training data, which fixes one Gaussian curve per attribute, and then substitute the mean, the standard deviation, and the value of the corresponding attribute of the validation data into the formula.
This gives the probability (strictly, the probability density) of that attribute value appearing in the training data.
Conditional probability based on the mean and standard deviation
Take a validation record xn, stored as x[attribute], and substitute it into the formula together with the training data means n[classification, attribute] and standard deviations rou[classification, attribute]. The result is a two-dimensional array tiaojian[attribute, classification], which holds the probability of each attribute value under each class. The prior probability qianyan[classification], found earlier, is the probability of each class appearing in the training data.
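A small sketch of this substitution step, using the text's array names x, n, rou, and tiaojian. The numeric values and the helper name `gaussian` are assumptions made for illustration, and the arrays are nested lists indexed as shown in the comments.

```python
import math


def gaussian(x, mean, std):
    """Gaussian density with the given mean and standard deviation, evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)


classes = ["rain", "no rain"]
# Hypothetical per-class statistics, indexed [classification][attribute].
n = [[20.0, 85.0], [29.0, 45.0]]   # means
rou = [[2.0, 5.0], [1.0, 5.0]]     # standard deviations
x = [21.0, 80.0]                   # one validation record, x[attribute]

# tiaojian[attribute][classification] = density of x[attribute] under that class.
tiaojian = [
    [gaussian(x[j], n[c][j], rou[c][j]) for c in range(len(classes))]
    for j in range(len(x))
]
print(tiaojian)
```

A standard deviation of zero would divide by zero here, so a feature that is constant within a class needs separate handling.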
Finally, the posterior probability combines all the attributes into one probability for each class. Because the attributes are conditionally independent, they can simply be multiplied together. Combining all the attributes means multiplying the rows of p[attribute, classification] together, entry by entry, which leaves a single list with one value per class. This list is multiplied by qianyan[classification] to get the posterior probability, and the class with the maximum is taken as the classification result.
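The multiplication and the final comparison can be sketched as follows. The values in `p` and `qianyan` are invented for illustration, and the nested-list layout is an assumption.

```python
import math

classes = ["rain", "no rain"]
# Hypothetical values: p[attribute][classification] and qianyan[classification].
p = [
    [0.20, 0.05],
    [0.10, 0.02],
]
qianyan = [0.3, 0.7]

# For each class, multiply its entries across all attributes, then by the prior.
posterior = [
    qianyan[c] * math.prod(row[c] for row in p)
    for c in range(len(classes))
]

# The class with the largest posterior is the prediction.
best = max(range(len(classes)), key=lambda c: posterior[c])
print(posterior, classes[best])
```

With many attributes the product can become very small, so implementations often add logarithms of the factors instead of multiplying them; the ranking of the classes is the same.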
In fact, the discrete case earlier does the same thing: for a validation record xn it finds the probability of each attribute value under each class, and then eliminates the attribute dimension by multiplying the rows together. What remains is a list of probabilities, one per class, which serves as the conditional probability in the posterior probability.
After finding the means and standard deviations, substitute them into the formula together with the validation sample's feature values. This gives the conditional probability of each feature parameter given rain and given no rain.
Then multiply together the conditional probabilities of the feature parameters given rain, and multiply the result by the prior probability of rain.
Multiply together the conditional probabilities of the feature parameters given no rain, and multiply the result by the prior probability of no rain.
Compare the two results; whichever is larger tells us whether rain or no rain is more probable.