Centering and standardization of data, and the scale function in R (RPM)

Source: Internet
Author: User

1. Centering of data

Centering a data set means subtracting the mean of the data set from every value in it.
For example, the data set 1, 2, 3, 6, 3 has a mean of 3, so the centered data set is 1-3, 2-3, 3-3, 6-3, 3-3, that is: -2, -1, 0, 3, 0
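The subtraction above can be reproduced directly in R. This is only a minimal sketch; the variable names x and centered are illustrative and do not come from the original text.

```r
# Example data set from the text
x <- c(1, 2, 3, 6, 3)

# Centering: subtract the mean of the data set from every value
centered <- x - mean(x)

print(mean(x))    # 3
print(centered)   # -2 -1  0  3  0
```

After centering, the mean of the data is 0, which can be checked with mean(centered).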

2. Standardization of data
Standardizing a data set means dividing the centered data by the standard deviation of the data set; that is, each value has the mean of the data set subtracted from it and is then divided by the standard deviation of the data set.
For example, the data set 1, 2, 3, 6, 3 has a mean of 3 and a (sample, n-1) standard deviation of 1.87, so the standardized data set is (1-3)/1.87, (2-3)/1.87, (3-3)/1.87, (6-3)/1.87, (3-3)/1.87, that is: -1.069, -0.535, 0, 1.604, 0
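The same arithmetic can be written out by hand in R. The sketch below assumes the standard deviation meant in the text is the sample standard deviation (divisor n - 1), which is what R's sd() computes and what yields 1.87 for this data; the variable names are illustrative.

```r
# Example data set from the text
x <- c(1, 2, 3, 6, 3)

# sd() computes the sample standard deviation (divisor n - 1)
s <- sd(x)

# Standardization: subtract the mean, then divide by the standard deviation
standardized <- (x - mean(x)) / s

print(round(s, 2))             # 1.87
print(round(standardized, 3))  # -1.069 -0.535  0.000  1.604  0.000
```

The standardized values have mean 0 and standard deviation 1.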

Centering and standardization serve the same purpose: to remove the influence of dimension (the units and scale of measurement) on the structure of the data.
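One way to see what removing the influence of units means is to standardize the same measurements recorded in two different units. The heights below are made-up values and standardize is a hypothetical helper written for this illustration, not a built-in function.

```r
# Hypothetical heights, recorded in centimetres and then in metres
height_cm <- c(150, 160, 170, 180, 190)
height_m  <- height_cm / 100

# Hypothetical helper: subtract the mean, divide by the sample standard deviation
standardize <- function(v) (v - mean(v)) / sd(v)

z_cm <- standardize(height_cm)
z_m  <- standardize(height_m)

# The raw values differ by a factor of 100, but the standardized values agree
print(all.equal(z_cm, z_m))   # TRUE
```

Centering alone would not give this result, because centered values still carry the original unit; it is the division by the standard deviation that makes the values unit-free.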

In R, the scale function can be used to center and standardize data:

# Limit the number of digits shown in printed output to 3
> options(digits=3)
> data <- c(1, 2, 3, 6, 3)
# Center the data
> scale(data, center=TRUE, scale=FALSE)
     [,1]
[1,]   -2
[2,]   -1
[3,]    0
[4,]    3
[5,]    0
attr(,"scaled:center")
[1] 3
# Standardize the data
> scale(data, center=TRUE, scale=TRUE)
         [,1]
[1,] -1.06904
[2,] -0.53452
[3,]  0.00000
[4,]  1.60357
[5,]  0.00000
attr(,"scaled:center")
[1] 3
attr(,"scaled:scale")
[1] 1.8708

The two parameters of the scale function, center and scale, work as follows:
1. center and scale both default to TRUE (which can also be written T)
2. center=TRUE means the data is centered
3. scale=TRUE means the data is standardized (when center is also TRUE, the centered data is divided by its standard deviation)
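The effect of the two parameters can be checked against the manual calculation. This is a minimal sketch using the example data from the text; the variable names are illustrative.

```r
# Example data set from the text
x <- c(1, 2, 3, 6, 3)

# Defaults: center = TRUE and scale = TRUE, so this call standardizes
z_default <- scale(x)
z_manual  <- (x - mean(x)) / sd(x)
print(all.equal(as.vector(z_default), z_manual))   # TRUE

# center = TRUE, scale = FALSE: only subtract the mean
centered <- scale(x, center = TRUE, scale = FALSE)
print(as.vector(centered))                          # -2 -1  0  3  0

# The mean and standard deviation that were used are stored as attributes
print(attr(z_default, "scaled:center"))             # 3
print(attr(z_default, "scaled:scale"))              # about 1.87
```

scale returns a matrix, which is why as.vector is used here before comparing with the plain vectors.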
