Talking about Feature Scaling


Definition: Feature Scaling is a method used to standardize the range of independent variables or features of data. In data processing, it is also known as data normalization and is generally performed during the data preprocessing step. (From Wikipedia)

In simple terms, it maps all feature values into a common range such as (0, 1), (-1, 1), or (-0.5, 0.5).

Feature Scaling (data normalization) is a common step in data mining and machine learning, and it can sometimes have a huge impact on the efficiency and accuracy of an algorithm.

Impact on accuracy: The necessity of this step depends on the characteristics of the data. If there are two or more features and their value ranges differ substantially, feature scaling is usually necessary. For example, in credit card fraud detection, if we use only the user's income as a learning feature, there is no need for this step. But if we use both the user's income and the user's age, applying this step before modeling is very likely to improve detection accuracy. The reason is that income may range over [50000, 60000] or more, while age ranges over roughly [20, 100]. If we then run the k-nearest-neighbor method, income will dominate the similarity computation far more than age, even though the two features may be equally important for fraud detection. Normalizing both features beforehand lets the detector truly treat them equally, as the sketch below illustrates.
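To make this concrete, here is a minimal sketch in Python (NumPy only). All income and age values, and the assumed feature ranges, are invented for illustration; they show how income swamps age in a Euclidean distance until both features are min-max scaled.

```python
import numpy as np

# Hypothetical users: [annual income, age]. All values invented for illustration.
a = np.array([52000.0, 25.0])
b = np.array([52500.0, 80.0])  # income close to a's, but 55 years older
c = np.array([58000.0, 26.0])  # nearly the same age as a, higher income

def euclidean(u, v):
    return np.sqrt(np.sum((u - v) ** 2))

# Without scaling, income dominates the distance entirely:
print(euclidean(a, b))  # ~503: a and b look very "similar" despite the age gap
print(euclidean(a, c))  # ~6000: a and c look very "different" despite similar ages

# Min-max scale each feature to [0, 1], using assumed minima and maxima.
lo = np.array([50000.0, 20.0])
hi = np.array([60000.0, 100.0])

def scale(u):
    return (u - lo) / (hi - lo)

# After scaling, the two features carry comparable weight:
print(euclidean(scale(a), scale(b)))  # ~0.69: the age gap now matters
print(euclidean(scale(a), scale(c)))  # ~0.60
```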

Impact on efficiency: Consider another example, from Professor Andrew Ng's machine learning course.

(Figure from the course: contours of the cost function before and after feature normalization.)

In this example, we use linear regression to predict housing prices from the size of the house and the number of bedrooms, optimizing with batch gradient descent. If these two features are not normalized when we build the model, the contours of the cost function are highly elongated ellipses (the skewed region in the figure), and batch gradient descent converges very slowly. After normalizing them, the contours become roughly circular, and batch gradient descent converges much faster, as the sketch below demonstrates.
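The following minimal sketch uses synthetic housing data and an assumed learning rate; the exact numbers are invented. The point is that raw features force a tiny learning rate and slow convergence, while standardized features tolerate a much larger one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic housing data (values invented): size in square feet, bedroom count.
size = rng.uniform(500, 3500, 100)
beds = rng.integers(1, 6, 100).astype(float)
X = np.column_stack([size, beds])
y = 100.0 * size + 5000.0 * beds + rng.normal(0, 1000, 100)

def batch_gd(X, y, lr, steps):
    """Run batch gradient descent on mean squared error; return the final cost."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    theta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        grad = Xb.T @ (Xb @ theta - y) / len(y)
        theta -= lr * grad
    return np.mean((Xb @ theta - y) ** 2)

# Raw features: the size^2 terms make the cost surface extremely elongated, so
# the learning rate must be tiny to avoid divergence, and convergence is slow.
print(batch_gd(X, y, lr=1e-7, steps=1000))

# Standardized features (z-score): the surface is nearly round, a learning rate
# six orders of magnitude larger is stable, and the cost drops far faster.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
print(batch_gd(Xs, y, lr=0.1, steps=1000))
```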

Practice:

Common feature scaling methods share the following form:

xi' = (xi - a) / b

where a can be the mean of feature xi (or its minimum), and b can be the maximum of xi, the range (maximum minus minimum), or the standard deviation. For example, taking a as the minimum and b as the range gives min-max scaling onto [0, 1]; taking a as the mean and b as the range gives mean normalization onto roughly (-0.5, 0.5); and taking a as the mean and b as the standard deviation gives z-score standardization.
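As a sketch, here is how these three instantiations of the formula might look in NumPy (the sample income values are hypothetical):

```python
import numpy as np

def min_max_scale(x):
    """a = min(x), b = max(x) - min(x): maps x onto [0, 1]."""
    return (x - x.min()) / (x.max() - x.min())

def mean_normalize(x):
    """a = mean(x), b = max(x) - min(x): maps x roughly onto (-0.5, 0.5)."""
    return (x - x.mean()) / (x.max() - x.min())

def z_score(x):
    """a = mean(x), b = std(x): zero mean, unit variance."""
    return (x - x.mean()) / x.std()

incomes = np.array([52000.0, 55000.0, 58000.0, 60000.0])  # hypothetical values
print(min_max_scale(incomes))   # [0.    0.375 0.75  1.   ]
print(mean_normalize(incomes))  # centered near 0, spread within about one unit
print(z_score(incomes))
```

In practice, libraries such as scikit-learn provide these transforms as MinMaxScaler and StandardScaler, which also remember the training-set statistics so the same transform can be applied to new data.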

Summary:

The principles and methods of this step are simple, but omitting it from a data mining or machine learning pipeline can sometimes have a huge impact on learning efficiency and accuracy. So before building a model, consider whether feature scaling is needed.

Reference resources:

http://en.wikipedia.org/wiki/Feature_scaling

https://class.coursera.org/ml/
