Objective
Machine learning is commonly divided into supervised learning, unsupervised learning, semi-supervised learning (some, following Hinton, also add reinforcement learning), and so on.
Here the focus is on understanding supervised and unsupervised learning.
Supervised learning
A function (a model with parameters) is learned from a given training dataset; when new data arrive, results can be predicted with this function. The training set for supervised learning must contain both inputs and outputs, i.e. features and targets, and the targets are labeled by humans. The most common supervised learning task is classification (note the distinction from clustering): from existing training samples (known inputs with their corresponding outputs) an optimal model is trained, where "optimal" means best under some evaluation criterion over a family of functions. The model maps every input to a corresponding output, and classification is done by judging that output, so the model can also classify previously unseen data. The goal of supervised learning is usually to have the computer learn a classification system (model) that we have already defined.
Supervised learning is the usual way to train neural networks and decision trees. Both techniques depend heavily on the information provided by a predetermined classification system: for neural networks, the labels are used to measure the network's error and to adjust the network parameters iteratively; for decision trees, the labels are used to determine which attributes provide the most information.
Common supervised learning methods include regression analysis and statistical classification; the most typical algorithms are KNN and SVM.
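As a minimal illustration of supervised classification, the sketch below trains a KNN classifier on labeled data; it assumes scikit-learn is installed and uses the bundled iris dataset purely as an example.

```python
# Minimal sketch of supervised classification with KNN (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)          # features and human-provided labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)   # the 5 nearest neighbours vote for the label
knn.fit(X_train, y_train)                   # learn from the labeled training set
print("test accuracy:", knn.score(X_test, y_test))
```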
The two most common supervised learning tasks are regression and classification.
Regression: y is real-valued (a vector of real numbers). The regression problem is to fit a curve to the pairs (x, y) so that a cost function L is minimized.
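A minimal regression sketch, assuming NumPy is available; the quadratic model and the noise level are illustrative choices, and the squared error plays the role of the cost function L.

```python
# Minimal regression sketch: fit a curve to (x, y) by minimizing a squared-error cost.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 0.5 * x**2 - x + 1 + rng.normal(scale=0.3, size=x.shape)   # noisy real-valued targets

coeffs = np.polyfit(x, y, deg=2)            # least squares = minimizing the cost L
y_hat = np.polyval(coeffs, x)
print("cost L =", np.mean((y - y_hat) ** 2))
```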
Classification: y takes a finite number of values, which can be regarded as class labels. A classification problem first requires labeled data to train a classifier, so it belongs to supervised learning. The cost function L(x, y) in classification is the negative log-probability that x belongs to class y:
L(x, y) = -log f_y(x), where f_i(x) = p(y = i | x).
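A minimal sketch of this cost, using NumPy only; the class scores are made up for illustration, and a softmax turns them into the probabilities f_i(x).

```python
# Minimal sketch of the classification cost L(x, y) = -log f_y(x),
# where f_i(x) = p(y = i | x) comes from a softmax over illustrative class scores.
import numpy as np

scores = np.array([2.0, 0.5, -1.0])         # model scores for 3 classes given some x
f = np.exp(scores) / np.exp(scores).sum()   # f_i(x) = p(y = i | x)

true_class = 0
L = -np.log(f[true_class])                  # negative log-probability of the true class
print("p(y = 0 | x) =", f[true_class], " L(x, y) =", L)
```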
Unsupervised learning
The input data are not labeled and there is no known outcome. Since the class of each sample is unknown, the sample set must be grouped according to the similarity between samples (clustering), so that differences within a class are minimized and differences between classes are maximized. Put simply, in many practical applications the labels of the samples cannot be known in advance, i.e. there are no training samples with known categories, so the classifier has to be learned directly from a sample set without labels.
The goal of unsupervised learning is not to tell the computer how to do something, but to let it learn how to do it on its own. There are two approaches. The first is not to give the agent explicit classes but to use some form of reward system to signal success. Note that this kind of training is usually framed as a decision problem, because the goal is not to produce a classification system but to make decisions that maximize reward. This is a good summary of the real world: the agent can be rewarded for correct behavior and punished for wrong behavior.
Unsupervised learning methods are divided into two main categories:
(1) Direct methods based on probability density estimation: the distribution parameters of each class are estimated in feature space, and samples are then assigned to classes accordingly.
(2) Simpler clustering methods based on a similarity measure between samples: initial centers (cores) are set for the different classes, and samples are then grouped into categories according to their similarity to these centers.
Clustering results can be used to extract hidden information from data and to classify or forecast future data; this is applied in data mining, pattern recognition, image processing, and so on. A minimal clustering sketch is given below.
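The sketch illustrates the second kind of method (grouping by similarity to cluster centers) with k-means; it assumes scikit-learn is installed and the two synthetic blobs are purely illustrative.

```python
# Minimal unsupervised clustering sketch with k-means (assumes scikit-learn is installed).
# No labels are used: samples are grouped purely by similarity to the cluster centres.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=0.0, size=(50, 2)),
               rng.normal(loc=5.0, size=(50, 2))])   # two unlabeled blobs

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster assignments:", kmeans.labels_[:10])
print("cluster centres:\n", kmeans.cluster_centers_)
```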
PCA and many deep learning algorithms are unsupervised.
Differences between supervised and unsupervised learning
1. Supervised learning requires a training set and test samples: regularities are found in the training set and then applied to the test samples. Unsupervised learning has no training set, only a single dataset, and regularities are found within that dataset itself.
2. The goal of supervised learning is to recognize things, and the result of recognition is a label attached to the data being recognized, so the training set must consist of labeled samples. Unsupervised learning has only the dataset to be analyzed, with no labels in advance. If the data are found to exhibit some aggregation, they can be grouped by natural clustering, but the purpose is not to match the data to predefined labels.
3. Unsupervised learning looks for regularities in a dataset, and these regularities do not necessarily have to divide the dataset, that is, they do not necessarily amount to "classification."
In this sense, unsupervised learning is broader in applicability than supervised learning. For example, analyzing the principal components of a pile of data, or analyzing the characteristics of a dataset, belongs to unsupervised learning.
4. Finding the principal components of a dataset with an unsupervised learning method is different from finding them with the K-L (Karhunen-Loève) transform. The latter is not a learning method, so the principal components obtained by the K-L transform do not come from unsupervised learning; what characterizes a learning method is that the regularity is found through learning. Finding principal components with an artificial neural network, by contrast, is an unsupervised learning method. A minimal principal-component sketch follows.
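The sketch below extracts principal components with PCA as an example of an unsupervised method that finds structure without dividing the data into classes; it assumes scikit-learn is installed, and the synthetic data with one correlated feature are purely illustrative.

```python
# Minimal principal-component sketch (assumes scikit-learn is installed).
# The principal directions are learned from unlabeled data alone.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))               # unlabeled data with 5 features
X[:, 1] = 2 * X[:, 0] + 0.1 * X[:, 1]       # introduce a correlated direction

pca = PCA(n_components=2).fit(X)            # find the directions of largest variance
print("explained variance ratio:", pca.explained_variance_ratio_)
```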
When to use which method
The simple answer follows from the definitions: if training samples are available, consider supervised learning; if there are no training samples, supervised learning cannot be used. In practice, however, even without a training set we can inspect the data to be classified with our own eyes, manually label some samples, and use them as training samples, thereby creating the conditions for supervised learning. For a given scenario, if the distribution of positive and negative samples is shifted (the shift may be large or small), supervised learning may perform worse than unsupervised learning.