Background: Why do we need smoothing?
The zero-probability problem: when computing the probability of an instance, if some feature value X never appears in the observed samples (the training set), the estimated probability of the whole instance becomes 0. In text classification, if a word never appears in the training samples, its conditional probability is 0, and because the probability of a document is the product of its word probabilities, the probability of the whole document also becomes 0. This is unreasonable: we cannot conclude that an event is impossible merely because we have not yet observed it.
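A minimal sketch of the problem, using made-up word counts rather than any real corpus, shows how a single unseen word drives the whole product to zero:

    # Hypothetical word counts for one class; "galaxy" never appears in training.
    counts = {"price": 30, "free": 15, "offer": 5}
    total = sum(counts.values())

    def word_prob(word):
        # Maximum-likelihood estimate, no smoothing.
        return counts.get(word, 0) / total

    document = ["free", "offer", "galaxy"]
    p = 1.0
    for w in document:
        p *= word_prob(w)
    print(p)  # 0.0, because the unseen word "galaxy" zeroes out the entire product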
The theoretical basis of Laplace smoothing
To solve the zero-probability problem, the French mathematician Laplace first proposed adding 1 to the count of every event, including events never observed, when estimating probabilities. Add-one smoothing is therefore also called Laplace smoothing.
When the training sample is large, the change in the estimated probabilities caused by adding 1 to each count is negligible, yet it conveniently and effectively avoids the zero-probability problem.
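In the notation of the example below (K possible categories with observed counts n_1, ..., n_K and total N = n_1 + ... + n_K), add-one smoothing replaces the maximum-likelihood estimate n_i / N with

    p_i = (n_i + 1) / (N + K)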
Application example
Suppose a text classification task has three classes C1, C2, C3, and in the training samples a word K1 is observed 0, 990, and 10 times in these classes respectively, so its raw probability estimates are 0, 0.99, and 0.01. With Laplace smoothing, the three estimates become:
(0+1)/1003 ≈ 0.001, (990+1)/1003 ≈ 0.988, (10+1)/1003 ≈ 0.011, where the denominator 1003 is the total count 1000 plus the 3 classes.
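The arithmetic can be checked with a short sketch; the counts 0, 990, 10 and the three classes are taken from the example above, and the function name is only illustrative.

    def laplace_smooth(counts):
        # Add 1 to every count; the denominator grows by the number of categories.
        total = sum(counts)
        k = len(counts)
        return [(c + 1) / (total + k) for c in counts]

    print(laplace_smooth([0, 990, 10]))
    # Values close to 0.001, 0.988 and 0.011, matching the figures above.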
In practice, an additive constant lambda (0 ≤ lambda ≤ 1) is often used instead of the simple 1. If lambda is added to each of the K counts, remember to add K*lambda to the denominator as well.
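A small variation of the sketch above covers this general case; the default lam=1 recovers plain add-one smoothing, and the parameter name is only illustrative.

    def additive_smooth(counts, lam=1.0):
        # Add lambda to each of the K counts, so the denominator grows by K * lambda.
        total = sum(counts)
        k = len(counts)
        return [(c + lam) / (total + k * lam) for c in counts]

    print(additive_smooth([0, 990, 10], lam=0.5))
    # A smaller lambda keeps the estimates closer to the raw frequencies 0, 0.99, 0.01.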