Machine Learning Study Notes 1: Zhou Zhihua, "Machine Learning" (Flyu6)
Time:2016-6-12
- Basic Concepts of learning
- Learning Style (Learning type)
- Supervised (supervised learning)
- Unsupervised (unsupervised learning)
- Hypothesis space
- Induction (induction)
- Deduction (deduction)
- Inductive preference
- No free lunch (NFL: No Free Lunch theorem)
Basic Concepts of Learning
Learning Style (learning type)
Supervised (supervised learning)
-
Supervised learning
-
In supervised learning, the data set D consists of pairs (x, y): every sample x comes with a definite target value, or label, y. By studying the relationship between x and y we can fit a model of that relationship, and during learning we repeatedly adjust the model's parameters using the true y values.
Supervised learning comes in two forms, classification and regression:
-
Classification
If we predict a discrete value, such as a category label, the learning task is called classification.
Regression
If we predict a continuous value, the learning task is called regression. For example: given the trend in housing prices, how much will a sun-facing apartment with three bedrooms and one living room cost?
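As a rough illustration of regression, the sketch below fits a straight line to a few house prices by ordinary least squares. The data (floor area against price), the single feature, and the function name `fit_line` are all assumptions made for this example; they do not come from the book.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = w * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

# Hypothetical data: floor area (square metres) -> price (arbitrary units).
areas = [50.0, 70.0, 90.0, 110.0]
prices = [100.0, 140.0, 180.0, 220.0]

w, b = fit_line(areas, prices)
predicted = w * 100.0 + b  # continuous prediction for an unseen 100 m^2 home
print(w, b, predicted)
```

The output is a real number rather than a category, which is what separates regression from classification.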
Both cases involve two processes. The process of finding a model is called training, and the data set it uses is called the training set. The process of checking the accuracy of a trained model is called testing, and the data set it uses is called the test set.
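The two processes can be sketched with a very small classifier. The example below is only an illustration under assumptions: the nearest-centroid method, the 2-D features, and the labels "small" and "large" are all made up here. Training computes one centroid per label from the training set, and testing measures accuracy on a separate test set.

```python
def train_centroids(samples):
    """Training: compute the mean feature vector (centroid) of each label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Predict the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

def accuracy(centroids, samples):
    """Testing: fraction of samples whose predicted label equals the true label."""
    correct = sum(predict(centroids, x) == y for x, y in samples)
    return correct / len(samples)

# Hypothetical training and test sets of (features, label) pairs.
train_set = [((1.0, 1.0), "small"), ((2.0, 1.0), "small"),
             ((8.0, 9.0), "large"), ((9.0, 8.0), "large")]
test_set = [((1.5, 2.0), "small"), ((8.5, 8.0), "large"),
            ((4.0, 4.0), "small"), ((6.0, 5.0), "small")]

model = train_centroids(train_set)
train_accuracy = accuracy(model, train_set)
test_accuracy = accuracy(model, test_set)
print(train_accuracy, test_accuracy)
```

With this toy data the model is perfect on the training set but misclassifies one of the four test samples, which is why accuracy is measured on data the model has not seen during training.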
Unsupervised (unsupervised learning)
-
Unsupervised learning
-
The most direct difference between unsupervised and supervised learning is the data set format: (x, y) for supervised learning, and (x) alone for unsupervised learning. Put plainly, unsupervised learning has no target value. Its main goal is to learn the "intrinsic" structure of the data from the samples x themselves.
In unsupervised learning, the most practical and representative method is clustering.
For example, take a group of people of the same ethnicity, each described by some data (accent, dietary preferences, and so on). From these features we can get a rough idea of the different clusters in the group. The clusters are formed automatically by the learning process and may correspond to some underlying concepts. In this example, we might infer from the data whether a person comes from the north or the south, or even from which province.
This is clustering: using only the data itself, we put samples with similar structure into the same cluster.
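One common way to form clusters is k-means, sketched below. This is an assumption for illustration (the notes do not name a particular algorithm), and the 2-D points and starting centroids are hypothetical. Note that no labels y are used anywhere.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate an assignment step and a centroid update step."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled samples with two numeric features each.
points = [(1.0, 2.0), (1.5, 1.8), (1.0, 1.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(clusters)
```

The two groups emerge from the features alone; deciding what each cluster means (for instance "north" or "south") is left to whoever interprets the result.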
In practice there is also semi-supervised learning, which sits between supervised learning and unsupervised learning.
In testing we want the accuracy to be as high as possible, but we also need the learned model to have good "generalization ability". In other words, the model should predict well not only on the training set and the test set but also on new, previously unseen data; this is called generalization.
Hypothesis Space
Induction (induction)
Induction and deduction are two basic means of scientific reasoning.
-
Induction
-
Induction is the process of "generalization" from the specific to the general, that is, from specific facts to general laws.
For example, learning a model from samples is a process of induction, which is why it is also known as "inductive learning".
Deduction (deduction)
-
Deduction
-
Deduction is the process of "specialization" from the general to the specific, that is, deriving concrete cases from basic principles.
For example, in a mathematical axiom system, a theorem is derived from a set of axioms and inference rules; this is deduction.
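Deriving theorems from axioms and inference rules can be sketched as forward chaining. The propositions `p`, `q`, `r`, `s`, `t`, `u` and the rules below are hypothetical placeholders, and a real axiom system is far richer than this toy.

```python
def deduce(axioms, rules):
    """Forward chaining: apply rules (premises -> conclusion) until nothing new follows."""
    known = set(axioms)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical axioms and inference rules over propositional symbols.
axioms = {"p", "q"}
rules = [
    (("p", "q"), "r"),  # from p and q, conclude r
    (("r",), "s"),      # from r, conclude s
    (("t",), "u"),      # never fires, because t cannot be derived
]

theorems = deduce(axioms, rules)
print(sorted(theorems))
```

Everything in the result follows from what was already given: deduction makes general knowledge specific, whereas induction goes the other way.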
Inductive preference
Inductive preference is the preference a learning algorithm has for certain kinds of hypotheses. In practice it is closely tied to the problems of overfitting and underfitting, and it determines how well the model we have trained adapts to new data.
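A small sketch of overfitting, under stated assumptions: the data are hypothetical (y = 2x with hand-picked noise of +1 or -1 on the training points), and the two models, a memorizing 1-nearest-neighbour rule and a least-squares line, are chosen only to contrast two different preferences.

```python
def fit_line_model(points):
    """Least-squares line: prefers a simple, smooth hypothesis."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in points)
         / sum((x - mean_x) ** 2 for x, _ in points))
    b = mean_y - w * mean_x
    return lambda x: w * x + b

def fit_memorizer(points):
    """1-nearest-neighbour: returns the y of the closest training x, noise included."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, points):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Hypothetical data: y = 2x, with noise +1/-1 added to the training targets only.
train = [(0.0, 1.0), (2.0, 3.0), (4.0, 9.0), (6.0, 11.0), (8.0, 17.0)]
test = [(1.0, 2.0), (3.0, 6.0), (5.0, 10.0), (7.0, 14.0)]

line = fit_line_model(train)
memorizer = fit_memorizer(train)
for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name, mse(model, train), mse(model, test))
```

The memorizer reaches zero training error by reproducing the noise, yet does much worse than the line on the test points. The line accepts some training error and generalizes better on this data.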
-
Occam's razor
-
If more than one hypothesis is consistent with the observations, choose the simplest one.
No free lunch (NFL: No Free Lunch theorem)
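The idea can be sketched over a tiny hypothesis space. Everything specific here is an assumption: the two attributes and their values are hypothetical, a hypothesis is a conjunction of attribute values where `*` means "any value", and "simplest" is taken to mean "fewest constrained attributes", which is only one possible reading of simplicity.

```python
from itertools import product

WILDCARD = "*"

def matches(hypothesis, sample):
    """A hypothesis covers a sample if every attribute is a wildcard or equal."""
    return all(h == WILDCARD or h == v for h, v in zip(hypothesis, sample))

def consistent(hypothesis, examples):
    """Consistent means: covers every positive example and no negative example."""
    return all(matches(hypothesis, x) == label for x, label in examples)

def complexity(hypothesis):
    # Assumption: "simpler" means fewer constrained (non-wildcard) attributes.
    return sum(h != WILDCARD for h in hypothesis)

# Hypothetical attributes (color, sound) and labeled observations.
values = [("green", "black"), ("dull", "crisp")]
examples = [(("green", "dull"), True), (("green", "crisp"), False)]

hypothesis_space = list(product(*[vals + (WILDCARD,) for vals in values]))
candidates = [h for h in hypothesis_space if consistent(h, examples)]
simplest = min(candidates, key=complexity)
print(candidates, simplest)
```

Two hypotheses fit the observations here, `("green", "dull")` and `("*", "dull")`. Under the stated notion of simplicity, Occam's razor picks the second.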
Proof 1
Proof 2
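The claim can be checked numerically on a toy problem. This is not a proof; it is a sketch under assumptions: a hypothetical domain of four points with binary labels, two training points, two off-training points, and every possible target function counted as equally likely. The two learners are arbitrary choices for the comparison.

```python
from itertools import product

points = [0, 1, 2, 3]
train_x, test_x = points[:2], points[2:]

def always_zero(train_labels):
    """Learner A: ignores the training data and always predicts 0."""
    return [0 for _ in test_x]

def majority_vote(train_labels):
    """Learner B: predicts the majority training label (ties count as 1)."""
    guess = 1 if 2 * sum(train_labels) >= len(train_labels) else 0
    return [guess for _ in test_x]

def mean_off_training_error(learner):
    """Average off-training-set error over every target function f: points -> {0, 1}."""
    targets = list(product([0, 1], repeat=len(points)))
    total = 0.0
    for f in targets:
        train_labels = [f[x] for x in train_x]
        predictions = learner(train_labels)
        wrong = sum(p != f[x] for p, x in zip(predictions, test_x))
        total += wrong / len(test_x)
    return total / len(targets)

error_a = mean_off_training_error(always_zero)
error_b = mean_off_training_error(majority_vote)
print(error_a, error_b)
```

Averaged over all 16 target functions, both learners have the same off-training error. A learner only looks better than another once some targets are assumed to be more likely than others.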