Data Mining Algorithm Study Notes: Summary
Data Mining Algorithms (1): k-Nearest Neighbors (KNN)
Data Mining Algorithms (2): Decision Trees
Data Mining Algorithms (3): Logistic Regression
Before introducing logistic regression, it helps to review a few basic concepts that the later discussion relies on.
Basic mathematical background
1. Log-likelihood function
Suppose the population $X$ is discrete, with probability distribution $P(X = x) = p(x, \theta)$, where $\theta$ is an unknown parameter. Let $(X_1, X_2, \ldots, X_n)$ be a sample of size $n$ drawn from the population. The joint probability distribution of $(X_1, X_2, \ldots, X_n)$ is then $\prod_{i=1}^{n} p(x_i, \theta)$.
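The product formula above can be made concrete with a small sketch. The Bernoulli distribution, the helper names `bernoulli_pmf` and `joint_probability`, and the sample values are all assumptions chosen for illustration; the text itself does not fix a particular $p(x, \theta)$. The sketch also assumes the sample is drawn independently, which is what lets the joint probability factor into a product.

```python
import math


def bernoulli_pmf(x, theta):
    """p(x, theta) for a Bernoulli variable: theta if x == 1, else 1 - theta.

    The Bernoulli choice is an assumption made only for this example.
    """
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return theta if x == 1 else 1.0 - theta


def joint_probability(sample, theta):
    """Product of p(x_i, theta) over an independently drawn sample."""
    return math.prod(bernoulli_pmf(x, theta) for x in sample)


# Hypothetical observed sample of size n = 4.
sample = [1, 0, 1, 1]

# The same sample is more or less probable depending on theta.
print(joint_probability(sample, 0.5))   # 0.0625
print(joint_probability(sample, 0.75))  # 0.10546875
```

The same observations receive different joint probabilities under different values of $\theta$; treating that product as a function of $\theta$ is what the likelihood function introduced next does.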
Given observed values $(x_1, x_2, \ldots, x_n)$ of the sample $(X_1, X_2, \ldots, X_n)$, the probability that $X_1, X_2, \ldots, X_n$ take the values $x_1, x_2, \ldots, x_n$ is L(θ) = L(x1, x2