Part 1 Naive Bayes
For the spam e-mail classification problem mentioned in the last lesson, Naive Bayes comes in two kinds of event models:
1.1. Multivariate Bernoulli Event Model
This is the model from the last lesson.
Maintain a long dictionary (the vocabulary).
For a sample $(x, y)$: $x_{i} \in \{0, 1\}$ indicates whether the $i$-th word of the dictionary appears in the message, and $y \in \{0, 1\}$ indicates whether the message is spam.
In this model each $x_{i}$ takes only the values 0 or 1, so $x_{i} \mid y$ is a Bernoulli distribution.
$ANS = p\left(y\right) \prod ^{n}_{i=1} p\left(x_{i} \mid y\right)$
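The Bernoulli model can be sketched in a few lines of Python. This is a minimal illustration with made-up function names and toy data, not code from the lesson; add-one smoothing is included only to keep the logarithms finite:

```python
import numpy as np

def train_bernoulli_nb(X, y):
    """Fit the multivariate Bernoulli event model.
    X[i, j] = 1 iff dictionary word j appears in message i; y[i] = 1 iff spam."""
    phi_y = y.mean()                                   # p(y = 1)
    # p(x_j = 1 | y = c); the +1/+2 smoothing avoids log(0) below
    phi1 = (X[y == 1].sum(axis=0) + 1) / ((y == 1).sum() + 2)
    phi0 = (X[y == 0].sum(axis=0) + 1) / ((y == 0).sum() + 2)
    return phi_y, phi1, phi0

def predict_bernoulli(x, phi_y, phi1, phi0):
    """Pick the class maximizing p(y) * prod_i p(x_i | y), in log space."""
    lp1 = np.log(phi_y) + np.sum(x * np.log(phi1) + (1 - x) * np.log(1 - phi1))
    lp0 = np.log(1 - phi_y) + np.sum(x * np.log(phi0) + (1 - x) * np.log(1 - phi0))
    return int(lp1 > lp0)

# Toy data: a 3-word dictionary and 4 training messages
X = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]])
y = np.array([1, 1, 0, 0])
params = train_bernoulli_nb(X, y)
print(predict_bernoulli(np.array([1, 1, 0]), *params))  # → 1
```

Working in log space instead of multiplying the probabilities directly avoids numeric underflow when the dictionary is large.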
1.2. Multinomial Event Model
Still maintain a long dictionary (suppose the dictionary has 50,000 words).
For a sample $(x, y)$, $x_{j} \in [1, 50000]$ is the index of the dictionary word that appears at position $j$ of the message.
Note the convention: $x^{\left(i\right)}_{j}$ denotes the word at position $j$ of the $i$-th sample.
In this model each $x_{j}$ can take 50,000 values, so $x_{j} \mid y$ is a multinomial distribution.
$ANS = p\left(y\right) \prod ^{n}_{j=1} p\left(x_{j} \mid y\right)$, where $n$ is the length of the message.
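To make the representation concrete, here is a tiny hand-worked sketch: a dictionary of 5 words instead of 50,000, with made-up conditional tables, and 0-based word indices rather than the $[1, 50000]$ range used in the text:

```python
import numpy as np

# Assumed toy parameters: phi1[k] = p(x_j = k | y = 1), phi0[k] = p(x_j = k | y = 0)
phi1 = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
phi0 = np.array([0.1, 0.1, 0.2, 0.3, 0.3])
phi_y = 0.5                                  # p(y = 1)

# A message is a list of word indices: x[j] is the word at position j
x = [0, 1, 0]

# Decision rule p(y) * prod_j p(x_j | y), computed in log space
lp1 = np.log(phi_y) + sum(np.log(phi1[k]) for k in x)
lp0 = np.log(1 - phi_y) + sum(np.log(phi0[k]) for k in x)
spam = int(lp1 > lp0)
print(spam)  # → 1
```

Note that word 0 appears twice in `x` and contributes the factor $p(x_{j}=0 \mid y)$ twice: only the counts of the words matter, not their positions.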
The parameters of this model are:
$\phi_{y}=p\left (y=1\right) $
$\phi_{i|y=1}=p\left (x_{j}=i | y=1\right) $
$\phi_{i|y=0}=p\left (x_{j}=i | y=0\right) $
NOTE: the positions and order in which the words occur do not affect the result.
Given a training set of $m$ samples, write down the joint likelihood of the parameters and maximize it to obtain the maximum-likelihood estimates.
Laplace smoothing can also be applied to these estimates, giving
$\phi_{k|y=1}=\dfrac{1+\sum ^{m}_{i=1}\sum ^{n_{i}}_{j=1}1\left\{x^{\left (i\right)}_{j}=k\wedge y^{\left (i\right)}=1\right\}}{\left|V\right|+\sum ^{m}_{i=1}1\left\{y^{\left (i\right)}=1\right\}n_{i}}$
where $\left|V\right|$ is the number of words in the dictionary (the 50,000 above) and $n_{i}$ is the length of the $i$-th sample.
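The maximum-likelihood fit with Laplace smoothing can be sketched as follows. This is my own illustration under the conventions above, not the lecture's code; `docs` holds 0-based word indices and `V` plays the role of $\left|V\right|$:

```python
import numpy as np

def train_multinomial_nb(docs, y, V):
    """Multinomial event model, fit by maximum likelihood with Laplace smoothing.
    docs: list of messages, each a list of word indices in range(V).
    y: 0/1 labels; V: dictionary size (|V|, 50,000 in the notes)."""
    y = np.asarray(y)
    counts = {0: np.zeros(V), 1: np.zeros(V)}   # per-class word-occurrence counts
    totals = {0: 0, 1: 0}                       # per-class total word positions
    for doc, label in zip(docs, y):
        for k in doc:
            counts[label][k] += 1
        totals[label] += len(doc)
    # Laplace smoothing: +1 in every numerator, +|V| in the denominator,
    # so no word ever gets probability zero
    phi1 = (counts[1] + 1) / (totals[1] + V)
    phi0 = (counts[0] + 1) / (totals[0] + V)
    return y.mean(), phi1, phi0

# Toy corpus over a 4-word dictionary
docs = [[0, 1], [0, 0], [2, 3], [3]]
y = [1, 1, 0, 0]
phi_y, phi1, phi0 = train_multinomial_nb(docs, y, 4)
```

Thanks to the smoothing, each of `phi1` and `phi0` still sums to 1 while assigning nonzero mass to words never seen in that class.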
Naive Bayes, neural network preliminary, SVM