A summary of normalization methods

Last Update:2018-07-25 Source: Internet

Author: User


REF:

http://blog.csdn.net/zbc1090549839/article/details/44103801

http://my.oschina.net/sanji/blog/215364

==============================

Normalization methods
1. Mapping numbers to decimals in the range (0, 1)
This is done mainly for convenience in data processing: mapping the data into the 0~1 range makes processing simpler and faster. It belongs to the field of digital signal processing.
2. Transforming a dimensional expression into a dimensionless one
Normalization is a way of simplifying computation: an expression with physical dimensions is transformed into a dimensionless expression and becomes a pure number.
For example, a complex impedance can be written in normalized form as Z = R + jωL = R (1 + jωL/R), so the complex factor in parentheses becomes a pure, dimensionless quantity.
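As a rough numerical check of the impedance example, the sketch below builds Z = R + jωL for a series resistor and inductor and divides it by R. The function name and the component values (50 ohm, 1 mH, 1 kHz) are illustrative assumptions, not taken from the original.

```python
import math

def normalized_impedance(resistance_ohm, inductance_h, freq_hz):
    """Return Z / R = 1 + j*omega*L/R, which is a dimensionless complex number."""
    omega = 2 * math.pi * freq_hz                      # angular frequency in rad/s
    z = complex(resistance_ohm, omega * inductance_h)  # Z = R + j*omega*L, in ohms
    return z / resistance_ohm

# Hypothetical component values, chosen only for illustration.
z_norm = normalized_impedance(resistance_ohm=50.0, inductance_h=1e-3, freq_hz=1000.0)
print(z_norm)
```

The real part of the result is 1 and the imaginary part is ωL/R, so the value no longer carries a unit.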
In addition, many computations in microwave engineering (circuit analysis, signals and systems, electromagnetic wave transmission) can be handled the same way, which not only keeps the computation convenient but also brings out the essential physical meaning of the quantities.
Standardization methods
Standardizing data means scaling it proportionally so that it falls into a small, specific interval. For indicators to take part in an evaluation, they must first be standardized, that is, mapped into a given numerical range by a function transformation.
(1) Min-max normalization applies a linear transformation to the original data. Let max_A and min_A be the maximum and minimum values of attribute A. Min-max normalization maps a value v of attribute A to v' in the interval [a, b]:
v' = (v - min_A) / (max_A - min_A) * (b - a) + a
Min-max normalization is typically used for credit indicator data, commonly in the following two forms:
A) Membership function for a benefit indicator (the larger the better):
v' = (v - min_A) / (max_A - min_A)
B) Membership function for a cost indicator (the smaller the better):
v' = (max_A - v) / (max_A - min_A)
(2) Z-score normalization, also called zero-mean normalization, normalizes a value v of attribute A using the mean and standard deviation of A: v' = (v - mean_A) / std_A. (See item 2 in ref:http://blog.csdn.net/zbc1090549839/article/details/44103801)
(3) Decimal scaling normalization works by moving the decimal point of the values of attribute A. The number of places moved depends on the maximum absolute value of A. (ref:http://wenku.baidu.com/link?url=mpuuobm4wgiislfegxi10jszyara-tb2zybl__qtj9vouwcoeaykdiep9cj5jpk5dm6otzq_ DN4LRZQ0KAHNWHNELAU4XT_MMWEBWN3OXB3)
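The three methods above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function names are made up for the example, the z-score uses the population standard deviation (the original does not say which variant is meant), and decimal scaling picks the smallest power of ten that brings every absolute value below 1.

```python
import math

def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly map values onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def z_score(values):
    """Shift to mean 0 and scale to standard deviation 1 (population std assumed)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    return [(v - mean) / std for v in values]

def decimal_scaling(values):
    """Divide by the smallest power of 10 that makes every |value| less than 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]

print(min_max([10, 20, 30]))
print(z_score([2, 4, 4, 4, 5, 5, 7, 9]))
print(decimal_scaling([-986, 917]))
```

A cost-type indicator can be obtained from the same code by computing 1 minus the benefit-type result on [0, 1].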

==========================================================================================================

OK! The following is just for reference. The main content ends here.

///////////////////////////////////////////////////////////////////////////////////////////////////

Here we mainly discuss two normalization methods:

1. Linear function normalization (min-max scaling)

A linear function maps the original data linearly onto the range [0, 1]. The normalization formula is:
Xnorm = (X - Xmin) / (Xmax - Xmin)
This method scales the raw data proportionally, where Xnorm is the normalized data, X is the original data, and Xmax and Xmin are the maximum and minimum values of the original dataset, respectively.

2. Zero-mean normalization (z-score standardization)

Zero-mean normalization transforms the original dataset into one with mean 0 and variance 1. The normalization formula is:
z = (x - μ) / σ
where μ and σ are the mean and standard deviation of the original dataset, respectively. This method requires the distribution of the original data to be approximately Gaussian; otherwise the normalization works less well.

These are two of the most commonly used normalization techniques. So when is the first method the better choice, and when the second? A brief summary:

1. In classification and clustering algorithms that use distance to measure similarity, or when PCA is used for dimensionality reduction, the second method (z-score standardization) performs better.
2. When no distance measure or covariance computation is involved, or when the data does not follow a normal distribution, the first method or another normalization method can be used. For example, in image processing, after an RGB image is converted to grayscale, its values are limited to the range [0, 255].
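To illustrate why scaling matters for distance-based methods, the sketch below compares Euclidean distances before and after z-score standardization. The three samples (height, income) are invented for the example, and the helper names are hypothetical.

```python
import math

def z_score_columns(rows):
    """Standardize each column (feature) to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
            for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical samples: [height in metres, income in currency units].
raw = [[1.60, 30000.0], [1.90, 30500.0], [1.61, 30600.0]]
scaled = z_score_columns(raw)

# Which of samples 1 and 2 is closest to sample 0?
raw_nearest = min((1, 2), key=lambda i: euclidean(raw[0], raw[i]))
scaled_nearest = min((1, 2), key=lambda i: euclidean(scaled[0], scaled[i]))
print(raw_nearest, scaled_nearest)
```

On the raw data the income column dominates the distance, so sample 1 looks closest even though its height is very different. After standardization both features contribute on a comparable scale and sample 2 becomes the nearest neighbour.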

///////////////////////////////////////////////////////////////////////////////////////////////////
A collection of normalization methods for neural networks
Because the collected data come in inconsistent units, the data need to be normalized to [-1, 1]. The main normalization methods are listed below for reference: (by James)
1. Linear function conversion:
y = (x - MinValue) / (MaxValue - MinValue)
Description: x and y are the values before and after conversion; MaxValue and MinValue are the maximum and minimum values of the samples.
2. Logarithmic function conversion:
y = log10(x)
Description: a logarithmic transformation with base 10.
3. Arctangent function conversion:
y = atan(x) * 2 / pi
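The logarithmic and arctangent conversions can be sketched as below. The function names are illustrative. Note that the logarithm only compresses the scale and is defined for positive inputs only, while the arctangent form maps any real number into the open interval (-1, 1).

```python
import math

def log10_transform(x):
    """Base-10 logarithmic conversion; only defined for x > 0."""
    if x <= 0:
        raise ValueError("log10 conversion requires a positive input")
    return math.log10(x)

def atan_transform(x):
    """Arctangent conversion: maps any real x into the open interval (-1, 1)."""
    return math.atan(x) * 2 / math.pi

for value in (0.5, 1.0, 1000.0):
    print(value, log10_transform(value), atan_transform(value))
```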
Normalization is done to speed up convergence when training the network; it is also possible to train without it.
The specific purpose of normalization is to unify the statistical distribution of the samples. Normalizing to 0-1 gives a statistical probability distribution, while normalizing to -1 to +1 gives a statistical coordinate distribution. Normalization means making things uniform and consistent. Whether for modeling or for computation, the basic units of measurement must first be the same. A neural network is trained on, and predicts from, the statistical probability of samples occurring in events, and normalizing to 0-1 puts the samples on the same statistical probability distribution.
When all sample input signals are positive, the weights connected to the neurons of the first hidden layer can only all increase or all decrease together, which makes learning slow. To avoid this and speed up learning, the input signals can be normalized so that the mean of each input over all samples is close to 0, or small compared with its standard deviation.
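The point about all-positive inputs can be seen from the gradient of a single neuron: the gradient with respect to each incoming weight is the neuron's error term times the corresponding input, so if every input is positive, every weight gradient has the same sign. The sketch below uses made-up sample values and an arbitrary error term `delta`; all names are hypothetical.

```python
def weight_gradients(inputs, delta):
    """Gradient of the error w.r.t. each incoming weight of one neuron: delta * x_i."""
    return [delta * x for x in inputs]

def center_features(samples):
    """Subtract the per-feature mean (computed over all samples) from every sample."""
    means = [sum(col) / len(col) for col in zip(*samples)]
    return [[v - m for v, m in zip(row, means)] for row in samples]

# Hypothetical training samples with strictly positive features.
samples = [[0.2, 1.5, 3.5], [0.8, 0.5, 2.0], [0.5, 1.0, 3.5]]
centered = center_features(samples)

delta = -0.4  # arbitrary error term for the neuron, for illustration only
grads_raw = weight_gradients(samples[0], delta)
grads_centered = weight_gradients(centered[0], delta)
print(grads_raw)       # all components share the sign of delta
print(grads_centered)  # components have mixed signs
```

With centered inputs, individual weights can move in different directions within a single update.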
Normalization is also applied because the sigmoid function takes values between 0 and 1, and so does the output of the network's final node, so the sample outputs are often normalized as well. For classification problems, it is therefore better to use targets such as [0.9 0.1 0.1] than [1 0 0].
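One way to see why targets such as 0.9 and 0.1 are easier for a sigmoid output than 1 and 0: the sigmoid only approaches 1 and 0 as its input grows without bound, whereas 0.9 and 0.1 correspond to small, finite inputs. The sketch below computes those inputs with the inverse of the sigmoid; the helper names are illustrative.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, with values strictly between 0 and 1 for moderate z."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the sigmoid: the input z for which sigmoid(z) == p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

print(logit(0.9), logit(0.1))  # finite inputs that reach the targets 0.9 and 0.1
print(sigmoid(10.0))           # still below 1 even for a fairly large input
```

A target of exactly 1 or 0 has no finite input under the sigmoid, so training towards it keeps pushing the weights to larger magnitudes.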
However, normalization is not always appropriate. Depending on the distribution of the output values, other statistical transformations, such as standardization, can sometimes work better.
About normalization with the premnmx function:
The syntax of premnmx is: [pn,minp,maxp,tn,mint,maxt] = premnmx(p,t)
where p and t are the raw input and output (target) data, minp and maxp are the minimum and maximum values in p, and mint and maxt are the minimum and maximum values in t.
The premnmx function normalizes the input or output data of the network so that the normalized data fall within the interval [-1, 1].
If the network is trained on normalized sample data, then any new data fed to the network later must receive the same preprocessing as the sample data. This is what tramnmx is for.
The tramnmx function:
[pn] = tramnmx(p,minp,maxp)
where p and pn are the input data before and after the transformation, and minp and maxp are the minimum and maximum values found by premnmx.
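The fit-then-reuse pattern described above can be sketched in Python. This is not the MATLAB implementation: the function names `premnmx_like` and `tramnmx_like` are made up, the layout (one row per variable, one column per sample) is an assumption, and rows with identical minimum and maximum are not handled.

```python
def tramnmx_like(p, minp, maxp):
    """Map each row of p to [-1, 1] using previously stored minima and maxima."""
    return [[2 * (v - lo) / (hi - lo) - 1 for v in row]
            for row, lo, hi in zip(p, minp, maxp)]

def premnmx_like(p):
    """Compute per-row minima and maxima and return the data scaled to [-1, 1]."""
    minp = [min(row) for row in p]
    maxp = [max(row) for row in p]
    return tramnmx_like(p, minp, maxp), minp, maxp

# Hypothetical training data: 2 variables (rows), 3 samples (columns).
train = [[0.0, 5.0, 10.0], [100.0, 200.0, 300.0]]
train_n, minp, maxp = premnmx_like(train)

# New data must be scaled with the minima and maxima from the training data.
new = [[2.5], [400.0]]
new_n = tramnmx_like(new, minp, maxp)
print(train_n, new_n)
```

Because the stored training minima and maxima are reused, new values outside the training range map outside [-1, 1].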
(by terry2008)
There are three ways to handle normalization in MATLAB:
1. premnmx, postmnmx, tramnmx
2. prestd, poststd, trastd
3. Writing your own code
Which one to use depends on your specific problem.
(by Happy)
pm = max(abs(p(i,:))); p(i,:) = p(i,:)/pm;
and
for i = 1:27
p(i,:) = (p(i,:) - min(p(i,:))) / (max(p(i,:)) - min(p(i,:)));
end
The loop above normalizes each row to [0, 1].
0.1 + (x - min)/(max - min) * (0.9 - 0.1), where max and min are the sample maximum and minimum values.
This normalizes to the range 0.1-0.9.
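For readers without MATLAB, the row-wise snippets above translate roughly to the following Python sketch. The function names are illustrative, and rows that are constant or all zero are not handled.

```python
def max_abs_scale(row):
    """Divide a row by its largest absolute value, as in p(i,:)/pm."""
    pm = max(abs(v) for v in row)
    return [v / pm for v in row]

def scale_to_range(row, low=0.1, high=0.9):
    """Map a row linearly onto [low, high]: low + (x - min)/(max - min) * (high - low)."""
    lo, hi = min(row), max(row)
    return [low + (v - lo) / (hi - lo) * (high - low) for v in row]

# Hypothetical matrix with one variable per row.
p = [[-4.0, 2.0, 1.0], [0.0, 5.0, 10.0]]
print([max_abs_scale(row) for row in p])
print([scale_to_range(row) for row in p])
```

Max-abs scaling keeps the sign of each value and yields results in [-1, 1], while the second function keeps every value away from the endpoints 0 and 1.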
=============
Converting between data types
A conversion can occur in an arithmetic expression, in an assignment expression, and on output. There are two kinds of conversion: automatic conversion and forced conversion (casting).
========
Automatic conversion
Automatic conversion is performed by the compiler, which converts data of one type into data of another type without any explicit request.
1) Conversion in arithmetic operations
If the two operands of an operator have different types, C automatically converts them to a common type before evaluating the expression. The lower type is promoted to the higher type so that both operands have the same type (their values are unchanged), and the result has the higher type. Automatic conversion follows the principle of "type promotion": conversions go from lower to higher types so that precision is not reduced. Roughly speaking, how high a type ranks is determined by the storage it occupies: the more space, the higher the type, and vice versa. For example, in the arithmetic expression x + y, if x and y are both int, the result of x + y is int. If x is short and y is int, x is first converted to int and then added to y, and the result of the expression is int.
2) Conversion in assignment
In an assignment, if the types on the two sides of the assignment operator differ, the value of the right-hand expression is converted to the type of the variable on the left. That is, once the expression to the right of "=" has been evaluated, its value is converted to the type of the variable on the left, whatever its original type, and then assigned.
For example: float a;
a = 10; /* result: a = 10.0 (value widened) */
int a;
a = 15.5; /* result: a = 15 (fractional part truncated) */
When a conversion happens on assignment, make sure the value does not overflow: it must lie within the range of the target type. If the type on the right is longer than the type on the left, part of the data is lost and precision is reduced.
3) Conversion on output
On output, data is converted to the type required by the format specifier, and data loss or overflow can occur here as well. The conversions are: a character converted to an integer gives the character's ASCII value; an integer converted to a character keeps only its low 8 bits; a floating-point value converted to an integer loses its fractional part; an integer converted to a floating-point value keeps its value but is stored as a floating-point number; and a double converted to float is rounded.
========
Forced conversion
Data type conversions are usually performed automatically by the compiler without any intervention from the programmer, which is why they are called implicit conversions. If, however, a program needs a value of one type to be converted to another type explicitly, the programmer must write a forced type conversion (a cast), also known as an explicit conversion. The purpose of a cast is to change the type of a value so that operations between different data types can proceed.
The syntax is:
(type name) expression
It converts the value of the expression to the type given in parentheses.
For example: (int) 4.2 gives 4.
Another example: int x;
(float) x; the value of x is converted to a floating-point type, but the type of x itself does not change and remains int. The conversion applies only to the value used in the computation.

======

A linear function can map a series of data onto a chosen interval, for example mapping all data onto 1~100.

The following function can be used:

y = ((x - min) / (max - min)) * (100 - 1) + 1

The result lies in the range 1-100.

min is the minimum value in the dataset and max is the maximum value.

///////////////////////////////////////////////////////////////////////////////////////////////////

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
