Deep Learning 11: UFLDL Tutorial, Data Preprocessing (Stanford Deep Learning Tutorial)

Source: Internet
Author: User

Theoretical background: the UFLDL "Data Preprocessing" page and http://www.cnblogs.com/tornadomeet/archive/2013/04/20/3033149.html

Data preprocessing is a very important step in deep learning. If acquiring the raw data is the most important step in deep learning, then preprocessing that raw data is an essential part of it.

1. Methods of data preprocessing:

① Data normalization:

Simple rescaling: rescale the values of each dimension of the data so that they lie within the range [0, 1] or [−1, 1].

Per-sample mean subtraction: subtract from each sample its own mean value. This is appropriate for stationary data, i.e., data whose statistics are the same in every dimension. For images, it is generally used only on grayscale images.

Feature standardization: give each dimension of the data zero mean and unit variance. First compute the mean of each dimension over the entire dataset and subtract it from that dimension; then divide each dimension by the standard deviation of the data in that dimension. This is the most commonly used method.
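The three normalization methods above can be sketched in NumPy as follows. This is only a minimal illustration, assuming the data is stored as a 2-D array with one sample per row and one dimension per column (the UFLDL MATLAB exercises use the transposed layout); the function names are hypothetical and do not come from the tutorial.

```python
import numpy as np

def simple_rescale(X, low=0.0, high=1.0):
    """Rescale every dimension (column) of X into the range [low, high]."""
    X = np.asarray(X, dtype=float)
    mins = X.min(axis=0)
    span = X.max(axis=0) - mins
    span[span == 0] = 1.0  # constant columns: avoid division by zero
    return low + (X - mins) / span * (high - low)

def per_sample_mean_subtraction(X):
    """Subtract from every sample (row) the mean of that sample."""
    X = np.asarray(X, dtype=float)
    return X - X.mean(axis=1, keepdims=True)

def standardize_features(X):
    """Give every dimension (column) zero mean and unit variance.

    The mean and standard deviation are returned as well, so that the
    same transform can later be applied to new data.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns: avoid division by zero
    return (X - mean) / std, mean, std

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0],
              [7.0, 10.0, 13.0]])
X_scaled = simple_rescale(X)               # each column now spans [0, 1]
X_centered = per_sample_mean_subtraction(X)  # each row now has zero mean
X_std, mean, std = standardize_features(X)   # each column: mean 0, variance 1
```

Returning the mean and standard deviation from the standardization step is a design choice of this sketch: test data should normally be transformed with the statistics computed on the training data.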

② Whitening: PCA whitening and ZCA whitening. The key point is the choice of the epsilon term.

If the epsilon value is too low, the whitened data will look very noisy; conversely, if the epsilon value is too high, the whitened data will look too blurred compared with the original data.
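To make the role of epsilon concrete, here is a minimal NumPy sketch of PCA and ZCA whitening. It assumes one sample per row; the function name `whiten` and the default `epsilon=1e-5` are illustrative assumptions, not values taken from the tutorial. Epsilon is added to each eigenvalue before the square root, which is why a small epsilon amplifies low-variance (noisy) directions and a large epsilon suppresses them.

```python
import numpy as np

def whiten(X, epsilon=1e-5, zca=True):
    """PCA- or ZCA-whiten X (rows are samples, columns are dimensions)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)              # make every dimension zero-mean
    sigma = Xc.T @ Xc / Xc.shape[0]      # covariance matrix of the data
    eigvals, U = np.linalg.eigh(sigma)   # eigen-decomposition (sigma is symmetric)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    X_rot = Xc @ U                       # rotate the data into the PCA basis
    X_pca = X_rot / np.sqrt(eigvals + epsilon)  # rescale each component
    # ZCA whitening rotates the PCA-whitened data back to the original basis.
    return X_pca @ U.T if zca else X_pca

rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0, 0.0],
                   [0.0, 1.0, 0.3, 0.0],
                   [0.0, 0.0, 1.5, 0.2],
                   [0.0, 0.0, 0.0, 1.0]])
X = rng.normal(size=(500, 4)) @ mixing   # correlated toy data
X_zca = whiten(X, epsilon=1e-8)
cov_after = X_zca.T @ X_zca / X_zca.shape[0]  # close to the identity matrix
```

With a very small epsilon the covariance of the whitened data is close to the identity; increasing epsilon shrinks the variance of the low-eigenvalue directions below one.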

How to choose epsilon:

a. Plot the eigenvalues of the data; b. choose an epsilon that is larger than most of the small eigenvalues that reflect noise in the data.

2. How exactly should epsilon be tuned in practice? I am not sure; working through an exercise would make this clearer.
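The two steps above can be sketched as follows. The first function computes the eigenvalue spectrum that would be plotted. The second is only one possible heuristic and is an assumption of this sketch, not a rule from the tutorial: it treats the eigenvalues not needed to retain a given fraction of the variance (here 99%) as noise and returns the largest of them as a candidate epsilon. Both function names are hypothetical.

```python
import numpy as np

def eigenvalue_spectrum(X):
    """Return the covariance eigenvalues of X in descending order."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    sigma = Xc.T @ Xc / Xc.shape[0]
    eigvals = np.linalg.eigvalsh(sigma)   # ascending order for symmetric input
    return np.clip(eigvals, 0.0, None)[::-1]

def suggest_epsilon(eigvals, retain=0.99):
    """Heuristic (assumption): largest eigenvalue in the discarded tail."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Number of leading components needed to retain the requested variance.
    k = int(np.searchsorted(cumulative, retain)) + 1
    tail = eigvals[k:]                    # eigenvalues treated as noise
    return float(tail[0]) if tail.size else 0.0

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 0.1, 0.05, 0.01])
spectrum = eigenvalue_spectrum(X)
epsilon = suggest_epsilon(spectrum)
```

For a visual check, the spectrum can be drawn on a logarithmic axis (for example with matplotlib's `semilogy`) to see where the long tail of small eigenvalues begins; the suggested value is a starting point to be inspected, not a definitive choice.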

3. During preprocessing, when should per-sample mean subtraction be used (i.e., making each sample zero-mean on its own, rather than making each dimension zero-mean across all samples)?

When the data is stationary, i.e., the statistics of every dimension follow the same distribution. For images, we usually care about the content of the image rather than its illumination, so removing the average pixel value of each data point is meaningful, and per-sample mean subtraction can be applied. It is generally suitable only for grayscale images.

Note: color images cannot use per-sample mean subtraction. Instead, they must be preprocessed with the method from "Deep Learning 9: UFLDL Tutorial, Linear Decoder exercise (Stanford Deep Learning Tutorial)", namely making each dimension zero-mean across all samples.
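As a minimal sketch of the per-dimension approach for color images, the following NumPy code flattens a batch of RGB patches and subtracts the mean patch computed over all samples. The array layout `(n_patches, height, width, 3)` and the function name are assumptions made for illustration only.

```python
import numpy as np

def zero_mean_per_dimension(patches):
    """Flatten color patches and subtract the mean over all patches.

    patches: array of shape (n_patches, height, width, 3) (assumed layout).
    Returns the centered data of shape (n_patches, height * width * 3)
    and the mean patch that was subtracted.
    """
    patches = np.asarray(patches, dtype=float)
    flat = patches.reshape(patches.shape[0], -1)  # one row per patch
    mean_patch = flat.mean(axis=0)                # mean of every pixel/channel
    return flat - mean_patch, mean_patch

rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(10, 8, 8, 3))  # toy RGB patches
centered, mean_patch = zero_mean_per_dimension(patches)
```

Each pixel-and-channel dimension is centered across the whole dataset, so the differences between color channels within a single patch are preserved.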
