First, mapminmax
Process matrices by mapping row minimum and maximum values to [-1 1]
This means that each row of the matrix is scaled to the interval [-1, 1]. For pattern recognition or other statistics, the data should be arranged so that each column is a sample and each row holds the same dimension (feature) across the samples. That is, for an m*n matrix, the sample dimension is m and the number of samples is n: n columns, n samples.
The main invocation forms are:
1. [Y,PS] = mapminmax(X,YMIN,YMAX)
2. [Y,PS] = mapminmax(X,FP)
3. Y = mapminmax('apply',X,PS)
4. X = mapminmax('reverse',Y,PS)
5. dx_dy = mapminmax('dx_dy',X,Y,PS)
For forms 1 and 2, X is the data to be preprocessed, ymin and ymax are the desired minimum and maximum of each row, and FP is a struct whose main fields are fp.ymin and fp.ymax. This struct can take the place of ymin and ymax; forms 1 and 2 differ only in how the parameters are passed in.
Code:
x=[2,3,4,5,6;7,8,9,10,11];
mapminmax(x,0,1)
fp.ymin=0;
fp.ymax=1;
mapminmax(x,fp)
For form 3, in pattern recognition or statistics, PS is the mapping obtained from the training samples; that is, PS contains the maximum and minimum values of the training data. Here X is the test data. Test samples must be preprocessed in the same way as the training samples, so the maximum and minimum used should be those of the training set. Assuming y is the test data, with two test samples in total, the code is as follows:
x=[2,3,4,5,6;7,8,9,10,11];
y=[2,3;4,5];
[xx,ps]=mapminmax(x,0,1);
mapminmax('apply',y,ps)
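To make the point about reusing the training extremes concrete, here is a minimal sketch that computes by hand what form 3 is described as doing, using the min-max formula y = (ymax-ymin)*(x-xmin)/(xmax-xmin) + ymin. It uses only base functions; the variable names xmin, xmax, scale and yy are chosen here for illustration and are not part of the toolbox.

```matlab
% Training data: 2 dimensions (rows), 5 samples (columns)
x = [2,3,4,5,6; 7,8,9,10,11];
% Test data: 2 dimensions, 2 samples
y = [2,3; 4,5];
ymin = 0; ymax = 1;

% Row-wise extremes taken from the TRAINING data only
xmin = min(x,[],2);          % [2; 7]
xmax = max(x,[],2);          % [6; 11]

% Map the test samples with the training extremes
scale = (ymax - ymin) ./ (xmax - xmin);
yy = bsxfun(@plus, bsxfun(@times, bsxfun(@minus, y, xmin), scale), ymin);
disp(yy)                     % [0 0.25; -0.75 -0.5]
```

The second row of the result is negative because the test values 4 and 5 lie below the training minimum 7 of that row, so test data mapped this way is not guaranteed to stay inside [ymin, ymax].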
For form 4, the preprocessed data is mapped back to recover the original data.
x=[2,3,4,5,6;7,8,9,10,11];
y=[2,3;4,5];
[xx,ps]=mapminmax(x,0,1);
yy=mapminmax('apply',y,ps);
mapminmax('reverse',yy,ps)
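The reversal can be sketched by hand as the algebraic inverse of the min-max formula, x = (y-ymin)*(xmax-xmin)/(ymax-ymin) + xmin. The helper handles fwd and rev below are hypothetical names used only for this illustration, and the sketch assumes ymax differs from ymin and that no row of x is constant.

```matlab
x = [2,3,4,5,6; 7,8,9,10,11];
ymin = 0; ymax = 1;
xmin = min(x,[],2);
xmax = max(x,[],2);
range = xmax - xmin;

% Forward mapping (what 'apply' is described as doing)
fwd = @(v) bsxfun(@plus, bsxfun(@times, bsxfun(@minus, v, xmin), (ymax-ymin)./range), ymin);
% Inverse mapping (what 'reverse' is described as doing)
rev = @(w) bsxfun(@plus, bsxfun(@times, w - ymin, range./(ymax-ymin)), xmin);

y = [2,3; 4,5];
yy = fwd(y);
yback = rev(yy);
disp(yback)                  % recovers y: [2 3; 4 5]
```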
For form 5, the reverse derivative is computed from the given matrix X, the normalized matrix Y and the mapping PS. If the given X and Y are matrices with m rows and n columns, the result dx_dy is a 1*n array of structures, each of which is an m*n diagonal matrix. This form is rarely used, so no example is given here.
Second, the principle of mapminmax and its implementation
The mathematical formula for mapminmax is y = (ymax-ymin) * (x-xmin)/(xmax-xmin) + ymin. If all the data in a row are the same, then xmax = xmin and the divisor is 0; in that case the data is left unchanged.
The MATLAB implementation is:
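function [out]=mymapminmax(x,ymin,ymax)
% Scale each row of x to [ymin, ymax]; each column of x is one sample.
out=(ymax-ymin).*(x-repmat(min(x,[],2),1,size(x,2)))./repmat((max(x,[],2)-min(x,[],2)),1,size(x,2))+ymin;
% A constant row gives 0/0 = NaN; restore the original values there.
index=isnan(out);
out(index)=x(index);
end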
Note that the above code assumes that each sample in the data x is a column vector.
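As a quick check of the constant-row case, the following sketch inlines the same expression as mymapminmax so that it runs as a plain script. The input matrix is made up for illustration, and the toolbox function itself may treat constant rows differently from this hand-written version.

```matlab
x = [2,3,4,5,6; 7,7,7,7,7];      % second row is constant
ymin = -1; ymax = 1;
xmin = repmat(min(x,[],2),1,size(x,2));
xmax = repmat(max(x,[],2),1,size(x,2));

out = (ymax-ymin).*(x-xmin)./(xmax-xmin)+ymin;   % row 2 becomes NaN (0/0)
index = isnan(out);
out(index) = x(index);           % restore the original values of that row
disp(out)                        % [-1 -0.5 0 0.5 1; 7 7 7 7 7]
```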
Third, mapstd standardization
Process matrices by mapping each row's means to 0 and deviations to 1: each row of the matrix is mapped to data with mean 0 and variance 1.
The main invocation forms are:
1. [Y,PS] = mapstd(X,YMEAN,YSTD)
2. [Y,PS] = mapstd(X,FP)
3. Y = mapstd('apply',X,PS)
4. X = mapstd('reverse',Y,PS)
5. dx_dy = mapstd('dx_dy',X,Y,PS)
Similar to mapminmax, forms 1 and 2 standardize the data X, where ymean and ystd are the desired mean and standard deviation of each row of data. Likewise, a struct containing ymean and ystd can be passed in instead.
x=[2,3,4,5,6;7,8,9,10,11];
y=[2,3;4,5];
[xx,ps]=mapstd(x,0,1)
fp.ymean=0;
fp.ystd=1;
[xx,ps]=mapstd(x,fp)
Form 3 preprocesses the test data using the mean and standard deviation of the training data; form 4 reverses the preprocessing to recover the original data.
x=[2,3,4,5,6;7,8,9,10,11];
y=[2,3;4,5];
[xx,ps]=mapstd(x,0,1);
yy=mapstd('apply',y,ps);
mapstd('reverse',yy,ps)
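For comparison, here is a hand-written sketch of what forms 3 and 4 are described as doing, using y = (x-xmean)*(ystd/xstd) + ymean and its algebraic inverse. It relies only on base functions; xmean, xstd, yy and yback are illustrative names, and the sketch assumes ystd is nonzero and no row of x is constant.

```matlab
x = [2,3,4,5,6; 7,8,9,10,11];   % training data
y = [2,3; 4,5];                 % test data
ymean = 0; ystd = 1;
xmean = mean(x,2);              % training row means: [4; 9]
xstd  = std(x,0,2);             % training row standard deviations: sqrt(2.5) each

% 'apply': standardize the test data with the TRAINING statistics
yy = bsxfun(@plus, bsxfun(@times, bsxfun(@minus, y, xmean), ystd./xstd), ymean);
% 'reverse': undo the mapping
yback = bsxfun(@plus, bsxfun(@times, yy - ymean, xstd./ystd), xmean);

disp(yy)                        % equals [-2 -1; -5 -4] / sqrt(2.5)
disp(yback)                     % recovers y up to rounding
```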
Fourth, implementation of mapstd standardization
The formula is y = (x-xmean) * (ystd/xstd) + ymean. Setting ystd=0, or a row whose data are all the same (xstd=0 at this point), are special cases.
function [out] = mymapstd(x,ymean,ystd)
% Standardize each row of x to mean ymean and standard deviation ystd.
out=(x-repmat(mean(x,2),1,size(x,2)))./repmat(std(x,0,2),1,size(x,2)).*ystd+ymean;
end
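The standardization expression above can be checked with a short script: after mapping, each row should have mean 0 and standard deviation 1. The sketch also shows what this hand-written expression does with a constant row, where xstd = 0 leads to 0/0; this describes the expression only, not necessarily the toolbox function. The constant row c is made up for illustration.

```matlab
x = [2,3,4,5,6; 7,8,9,10,11];
ymean = 0; ystd = 1;
xmean = mean(x,2);               % [4; 9]
xstd  = std(x,0,2);              % [sqrt(2.5); sqrt(2.5)]

out = bsxfun(@rdivide, bsxfun(@minus, x, xmean), xstd) .* ystd + ymean;
disp(mean(out,2))                % approximately [0; 0]
disp(std(out,0,2))               % approximately [1; 1]

% Constant row: xstd = 0, so (x - xmean)/xstd is 0/0 = NaN
c = [7,7,7,7,7];
cout = (c - mean(c,2)) ./ std(c,0,2) .* ystd + ymean;
disp(cout)                       % NaN NaN NaN NaN NaN
```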
Fifth, notes on functions such as mean and std
By default, mean averages each column; mean(x,2) averages each row. By default, the std function returns the standard deviation based on the unbiased variance estimate (normalized by N-1). It has three usages: s = std(x), s = std(x,flag), s = std(x,flag,dim).
Here flag selects the normalization: flag=0 is the unbiased estimate (this is the default) and flag=1 is the biased estimate (normalized by N). dim is the dimension along which the standard deviation is computed; std(x,0,2) computes the unbiased standard deviation estimate for each row of x.
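A small sketch of these calls on the example matrix used throughout; the variable names are chosen here for illustration.

```matlab
x = [2,3,4,5,6; 7,8,9,10,11];

col_mean   = mean(x);        % column means: [4.5 5.5 6.5 7.5 8.5]
row_mean   = mean(x,2);      % row means: [4; 9]

s_unbiased = std(x,0,2);     % divides by N-1: [sqrt(2.5); sqrt(2.5)]
s_biased   = std(x,1,2);     % divides by N:   [sqrt(2); sqrt(2)]

disp(s_unbiased)
disp(s_biased)
```

Each row has squared deviations 4+1+0+1+4 = 10, so the unbiased estimate is sqrt(10/4) and the biased estimate is sqrt(10/5).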