At present, most algorithms in machine learning and other popular AI fields are statistical methods. Li Hang's "Statistical Learning Methods" is a very good introductory book on the subject. While reading it, I am writing down notes for my own future reference.
Before getting to statistical learning methods, I remember a friend once asking me, "So the algorithms in artificial intelligence today are all based on probability?"
At the time I thought that was more or less right, since most of the algorithms are based on statistics. Later I realized that "statistics" is not the same thing as "probability".
What is the difference between statistics and probability?
First, a picture (a picture is worth a thousand words).
In short, "probability" starts from a known model and predicts the outcome of new data, while "statistics" starts from known data and infers a model.
To give a vivid example: in a biology exam, one question asks you to look at an animal's feet and guess the animal's name. One student could not answer it, tore up the paper in anger, and walked out. The teacher saw this, grabbed him, and said loudly, "Which class are you in, being so arrogant?" The student said, "You guess, you guess."
Statistics is like being given a black box containing cats and dogs where you can only see their legs. You collect all of the animals' legs (that is, the past data) and then summarize the characteristics of those legs. When a leg later appears in a picture, you can use that summary to decide whether it is a cat's leg.
Probability is when we already have such a summary (the model) and are handed a new animal's leg: by observing a series of its features, we judge what kind of animal it belongs to.
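The two directions described above can be sketched with a biased coin instead of animal legs. This is only an illustrative sketch: the function names `simulate_flips` and `estimate_p_heads`, the bias 0.7, and the sample size are all assumptions made for the example. The first function goes from a known model to data (probability), and the second goes from known data back to a model parameter (statistics).

```python
import random


def simulate_flips(p_heads, n, seed=0):
    """Probability direction: the model (p_heads) is known, so generate data from it."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [1 if rng.random() < p_heads else 0 for _ in range(n)]


def estimate_p_heads(flips):
    """Statistics direction: the data is known, so infer the model parameter from it."""
    return sum(flips) / len(flips)


# Known model -> data (hypothetical bias of 0.7).
flips = simulate_flips(p_heads=0.7, n=10_000)

# Known data -> estimated model.
p_hat = estimate_p_heads(flips)
print(f"estimated p_heads = {p_hat:.3f}")
```

With enough flips, the estimate should land close to the bias that generated the data, which is the sense in which statistics "induces" the model.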
Now back to the question, "So the algorithms in artificial intelligence today are all based on probability?" The answer is no. Statistical learning methods include not only probabilistic models but also non-probabilistic models (such as decision functions). This brings us to the first element of statistical learning methods: the model. (A forced segue into the main topic :) )
The three elements of statistical learning methods
The three elements of statistical learning are the model, the strategy, and the algorithm.
Model:
A model is either a generative model or a discriminative model. The two differ in what they learn: a generative model learns the joint probability distribution P(X, Y) of the data, while a discriminative model learns the conditional probability P(Y|X) or a decision function f(X) directly.
For details on generative and discriminative models, see http://blog.csdn.net/qq_33414271/article/details/79092438
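As a rough illustration of the two kinds of model, the sketch below uses a made-up table of (leg length, animal) pairs. All names and data here are hypothetical. The generative route estimates the joint distribution P(X, Y) by counting and then derives P(Y|X) from it; the discriminative route estimates P(Y|X) directly without modeling how X itself is distributed.

```python
from collections import Counter

# Hypothetical toy data: (leg_length, animal) pairs.
data = [
    ("short", "cat"), ("short", "cat"), ("short", "cat"), ("short", "dog"),
    ("long", "dog"), ("long", "dog"), ("long", "dog"), ("long", "cat"),
]


def fit_joint(pairs):
    """Generative view: estimate the joint distribution P(X, Y) by counting."""
    counts = Counter(pairs)
    total = len(pairs)
    return {xy: c / total for xy, c in counts.items()}


def posterior_from_joint(joint, x):
    """Derive P(Y | X = x) from the joint: P(x, y) divided by the marginal P(x)."""
    marginal = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / marginal for (xi, y), p in joint.items() if xi == x}


def fit_conditional(pairs):
    """Discriminative view: estimate P(Y | X) directly, never modeling P(X)."""
    by_x = {}
    for x, y in pairs:
        by_x.setdefault(x, Counter())[y] += 1
    return {
        x: {y: c / sum(cnt.values()) for y, c in cnt.items()}
        for x, cnt in by_x.items()
    }


joint = fit_joint(data)
generative_posterior = posterior_from_joint(joint, "short")
discriminative_posterior = fit_conditional(data)["short"]
print(generative_posterior)
print(discriminative_posterior)
```

On a table this small the two routes agree on P(Y|X). The difference is in what was modeled: only the joint estimate also describes how often each leg length occurs.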
Strategy:
Select an appropriate loss function or risk function, that is, choose the objective function to be optimized.
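To make the strategy step concrete, here is a minimal sketch, assuming squared loss and taking the risk to be the average loss over the training sample (the empirical risk). The data and the two candidate models `model_a` and `model_b` are invented for illustration; the strategy simply prefers the candidate with the lower risk.

```python
def squared_loss(y_true, y_pred):
    """Loss for a single example: squared difference between target and prediction."""
    return (y_true - y_pred) ** 2


def empirical_risk(loss, model, xs, ys):
    """Average loss of `model` over the training sample."""
    return sum(loss(y, model(x)) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def model_a(x):
    """Candidate that matches the data exactly."""
    return 2.0 * x + 1.0


def model_b(x):
    """Candidate with a slope that is too small."""
    return 1.5 * x + 1.0


risk_a = empirical_risk(squared_loss, model_a, xs, ys)
risk_b = empirical_risk(squared_loss, model_b, xs, ys)
print(risk_a, risk_b)
```

Swapping `squared_loss` for another loss function changes the objective without touching the rest of the code, which is why the choice of loss is treated as its own element.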
Algorithm:
This refers to the optimization algorithm, including classical ones such as gradient descent, Newton's method and quasi-Newton methods, and the Lagrange method. (Once a statistical learning problem is given a concrete form, it becomes an optimization problem.)
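As one example of such an optimization algorithm, the sketch below runs plain gradient descent on the mean squared error of a linear model y = w*x + b. The function name, learning rate, step count, and data are all assumptions chosen for the illustration, not values from the book.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Minimize the mean squared error of y = w*x + b by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical training data generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = gradient_descent(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

The learning rate here was picked small enough for this particular data to converge; a different dataset may need a different value.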
Together, the three elements above make up a method, that is, a statistical learning method.
Here is a summary of the 10 most common statistical learning methods.
You can also compare the three elements used by each of these methods to deepen your understanding. For example, naive Bayes is a typical generative model, and logistic regression is a typical discriminative model.
References:
https://www.douban.com/group/topic/105567510/
https://betterexplained.com/articles/a-brief-introduction-to-probability-statistics/
http://blog.csdn.net/qq_33414271/article/details/79092438
Li Hang, "Statistical Learning Methods"