Stanford UFLDL Tutorial: Self-Taught Learning

Contents: 1 Overview, 2 Feature learning, 3 Data preprocessing, 4 Terminology of unsupervised feature learning, 5 Chinese-English glossary

Overview

If you already have a sufficiently powerful machine learning algorithm, one of the most reliable ways to get better performance is to give the algorithm more data. This has even given rise to a saying in the machine learning community: "Sometimes it's not who has the best algorithm that wins; it's who has the most data."


One can always try to obtain more labeled data, but doing so is often expensive. For example, researchers have put considerable effort into using tools such as AMT (Amazon Mechanical Turk) to obtain larger training datasets. Having people hand-label data through crowdsourcing is a step forward compared with having large numbers of researchers hand-engineer features, but we can do better. Specifically, if our algorithm can learn from unlabeled data, then we can easily obtain large amounts of data without annotations and learn from them. Self-taught learning and unsupervised feature learning are exactly such algorithms. Although a single unlabeled example carries less information than a labeled one, if we can obtain a huge amount of unlabeled data (for example, by downloading random, unlabeled images, audio clips, or text from the Internet), and the algorithm can exploit it effectively, it will often achieve better performance than large-scale hand-engineering of features.


In self-taught learning and unsupervised feature learning, we give the algorithm a large amount of unlabeled data from which it learns a good feature representation. When we then want to solve a specific classification task, we apply supervised learning on top of these learned feature representations, using whatever (possibly small amount of) labeled data we have.


These ideas are probably most effective in settings with a large amount of unlabeled data and a modest amount of labeled data. Even in cases where all of the data is labeled, they can still work very well (in that case we usually just ignore the class labels of the training data during feature learning).


Feature learning

We have already seen how an autoencoder can be used to learn features from unlabeled data. Concretely, suppose we have an unlabeled training set {x_u^(1), x_u^(2), ..., x_u^(m_u)} with m_u examples (the subscript u stands for "unlabeled"). We can use this data to train a sparse autoencoder (perhaps after whitening or other appropriate preprocessing of the data).


Having trained the model parameters, given any input x we can compute the corresponding vector of hidden-unit activations a. As noted earlier, a often provides a better feature representation than the raw input. The figure in the original tutorial depicts the neural network that performs this computation of the features (activations).


This network is simply the sparse autoencoder trained earlier, with its final (reconstruction) layer removed.
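To make the feature computation concrete, here is a minimal NumPy sketch (not the tutorial's own code; its exercises are written in MATLAB). It assumes W1 and b1 are the input-to-hidden weights and biases of an already-trained sparse autoencoder with sigmoid hidden units; feed_forward_autoencoder is a name chosen here for illustration.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation used by the sparse autoencoder's hidden units.
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward_autoencoder(W1, b1, X):
    """Compute hidden-unit activations a = sigmoid(W1 x + b1) for each
    example (one example per column of X). This is the trained sparse
    autoencoder with its final (reconstruction) layer removed."""
    return sigmoid(W1 @ X + b1[:, None])
```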


Given a labeled training set {(x_l^(1), y^(1)), ..., (x_l^(m_l), y^(m_l))} of m_l examples (the subscript l stands for "labeled"), we can find a better feature representation for the inputs. For example, feeding an input x_l^(1) into the sparse autoencoder yields the hidden-unit activations a_l^(1). We can then use a_l^(1) directly in place of the original data ("replacement" representation), or combine the two, using the new vector (x_l^(1), a_l^(1)) in place of the original data ("concatenation" representation).


After this transformation, the training set becomes {(a_l^(i), y^(i))} or {((x_l^(i), a_l^(i)), y^(i))}, depending on whether replacement or concatenation is used. In practice, concatenation usually performs better; however, given memory and computational costs, the replacement representation may also be used.
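The two representations can be sketched as follows, reusing feed_forward_autoencoder from the snippet above. The sizes and random weights here are stand-ins for parameters actually learned on unlabeled data.

```python
import numpy as np

n_input, n_hidden, m_l = 64, 25, 100                   # illustrative sizes
rng = np.random.default_rng(0)
W1 = 0.01 * rng.standard_normal((n_hidden, n_input))   # stand-in for learned weights
b1 = np.zeros(n_hidden)
X_l = rng.standard_normal((n_input, m_l))              # labeled inputs, one per column

A_l = feed_forward_autoencoder(W1, b1, X_l)   # activations a_l^(i)
X_replace = A_l                               # "replacement": a_l^(i) instead of x_l^(i)
X_concat = np.vstack([X_l, A_l])              # "concatenation": (x_l^(i), a_l^(i))
```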


Finally, we can train a supervised learning algorithm (such as an SVM or logistic regression) on the transformed training set to obtain a function that predicts the label y. At prediction time, given a test example x_test, we repeat the same process: feed x_test into the sparse autoencoder to obtain a_test, then feed a_test (or (x_test, a_test)) into the classifier to get the predicted value.
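A minimal end-to-end sketch of this supervised stage, continuing from the snippets above. The choice of scikit-learn's LogisticRegression is ours, not the tutorial's; any logistic-regression or SVM implementation would serve, and y_l here is a stand-in label vector.

```python
from sklearn.linear_model import LogisticRegression

y_l = rng.integers(0, 2, size=m_l)            # stand-in labels for the columns of X_l

clf = LogisticRegression(max_iter=1000)
clf.fit(X_concat.T, y_l)                      # scikit-learn expects one example per row

# Prediction: push the test input through the same autoencoder, then classify.
x_test = rng.standard_normal((n_input, 1))
a_test = feed_forward_autoencoder(W1, b1, x_test)
y_pred = clf.predict(np.vstack([x_test, a_test]).T)
```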


Data preprocessing

During the feature learning stage, we learn from the unlabeled training set, and various data preprocessing parameters may be computed in the process. For example, we may compute the data mean and subtract it (mean normalization), or apply PCA to the raw data to obtain a matrix U and represent the data as U^T x (or apply PCA whitening or ZCA whitening). If so, it is essential to save these parameters and use the same parameters in the subsequent training and testing stages, to ensure that the data always undergoes the same transformation before entering the sparse autoencoder network. For instance, if PCA was applied to the unlabeled data, the resulting matrix U must be saved and applied to the labeled training set and the test set; the labeled training set must not be used to estimate a different matrix U (nor to recompute the mean and redo mean normalization). Otherwise we could end up with a very different preprocessing transformation, so that the distribution of the data fed into the autoencoder would differ from the distribution of the data the autoencoder was trained on.
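A minimal sketch of this discipline: the preprocessing parameters (the mean and the PCA matrix U) are estimated once, on the unlabeled data, and then reused unchanged for the labeled training set and the test set. The function names are ours, for illustration.

```python
import numpy as np

def fit_preprocessing(X_u):
    """Estimate the mean and PCA basis on the UNLABELED data only
    (one example per column of X_u)."""
    mu = X_u.mean(axis=1, keepdims=True)
    sigma = (X_u - mu) @ (X_u - mu).T / X_u.shape[1]   # covariance matrix
    U, _, _ = np.linalg.svd(sigma)                     # PCA basis
    return mu, U

def apply_preprocessing(X, mu, U):
    # The identical transformation for every dataset: mean-normalize, then rotate.
    return U.T @ (X - mu)

# mu, U = fit_preprocessing(X_u)                  # fit once, on unlabeled data
# X_l_pre = apply_preprocessing(X_l, mu, U)       # reuse; never re-estimate on X_l
# x_test_pre = apply_preprocessing(x_test, mu, U)
```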


Terminology of unsupervised feature learning

There are two common formulations of unsupervised feature learning, which differ in what is assumed about the unlabeled data. Self-taught learning is the more general and more powerful setting: it does not require that the unlabeled data x_u and the labeled data x_l come from the same distribution. The more restrictive setting, in which x_u and x_l obey the same distribution, is sometimes called semi-supervised learning. The following example illustrates the difference.


Suppose we have a computer vision task whose goal is to distinguish images of cars from images of motorcycles; that is, each training example is either a car image or a motorcycle image. Where can we get large amounts of unlabeled data? The simplest way may be to download a random collection of images from the Internet and train a sparse autoencoder on them to derive useful features. In this case the unlabeled data comes from a different distribution than the labeled data (among the unlabeled images, perhaps a few contain cars or motorcycles, but not every image is of one or the other). This setting is called self-taught learning.


Conversely, suppose we have a large amount of unlabeled image data, each image being either a car or a motorcycle, but simply missing its class label (no one has recorded whether each picture shows a car or a motorcycle). We can also use such unlabeled data to learn features. This setting, which requires the unlabeled examples and the labeled examples to obey the same distribution, is sometimes called semi-supervised learning. In practice it is often hard to find unlabeled data that meets this requirement (where would one find an image database in which every image is either a car or a motorcycle, only with the class labels missing?). Therefore, self-taught learning is much more widely applicable for learning features from unlabeled data.



Chinese-English glossary

自我学习/自学习 self-taught learning
无监督特征学习 unsupervised feature learning
自编码器 autoencoder
白化 whitening
激活量 activation
稀疏自编码器 sparse autoencoder
半监督学习 semi-supervised learning

From: http://ufldl.stanford.edu/wiki/index.php/%e8%87%aa%e6%88%91%e5%ad%a6%e4%b9%a0
