R Language Learning Note (12): Principal component analysis and factor analysis

Source: Internet
Author: User

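# Principal component analysis
par(mfrow = c(1, 1))  # the original arguments were lost in extraction; a single panel is assumed
library(psych)
head(USJudgeRatings, 5)
head(USJudgeRatings[, -1], 5)
fa.parallel(USJudgeRatings[, -1], fa = "pc", n.iter = 100, show.legend = FALSE, main = "Scree plot with parallel analysis")
# For this test data, the parallel analysis suggests retaining one principal component.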
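
The eigenvalues shown in a scree plot can also be inspected directly. The following is a minimal base-R sketch, not part of the original note. It assumes the built-in `USJudgeRatings` data and uses only `cor()` and `eigen()`; the names `ratings`, `ev` and `n_above_one` are illustrative.

```r
# Eigenvalues of the correlation matrix of the 11 rating variables
# (the first column is dropped, as in the note).
ratings <- USJudgeRatings[, -1]
ev <- eigen(cor(ratings), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues in decreasing order; these are the heights plotted in a scree plot.
print(round(ev, 2))

# Rule of thumb (Kaiser criterion): count the eigenvalues greater than 1.
n_above_one <- sum(ev > 1)
print(n_above_one)
```

Counting eigenvalues above 1 is only a rough cross-check on the parallel analysis; the two criteria do not always agree.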

# Extract the principal component
pc <- principal(USJudgeRatings[, -1], nfactors = 1)
pc

Principal Components Analysis
Call: principal(r = USJudgeRatings[, -1], nfactors = 1)
Standardized loadings (pattern matrix) based upon correlation matrix
      PC1   h2     u2 com
INTG 0.92 0.84 0.1565   1
DMNR 0.91 0.83 0.1663   1
DILG 0.97 0.94 0.0613   1
CFMG 0.96 0.93 0.0720   1
DECI 0.96 0.92 0.0763   1
PREP 0.98 0.97 0.0299   1
FAMI 0.98 0.95 0.0469   1
ORAL 1.00 0.99 0.0091   1
WRIT 0.99 0.98 0.0196   1
PHYS 0.89 0.80 0.2013   1
RTEN 0.99 0.97 0.0275   1

                 PC1
SS loadings    10.13
Proportion Var  0.92

Mean Item complexity = 1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is 0.04
With the empirical chi-square 6.2 with prob < 1

Fit based upon off diagonal values = 1
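
The columns of the table above are linked by simple arithmetic: with a single component, the communality `h2` is the squared loading, the uniqueness `u2` is `1 - h2`, and "SS loadings" is the sum of the squared loadings. A hedged base-R sketch of those relations follows; it rebuilds the loadings with `eigen()` rather than `psych`, so the variable names are illustrative and small numerical differences from the printed output are possible.

```r
R <- cor(USJudgeRatings[, -1])
e <- eigen(R, symmetric = TRUE)

# Loadings of the first component: eigenvector scaled by sqrt(eigenvalue).
loadings1 <- e$vectors[, 1] * sqrt(e$values[1])
# The sign of an eigenvector is arbitrary; flip it so the loadings are positive.
if (sum(loadings1) < 0) loadings1 <- -loadings1
names(loadings1) <- colnames(R)

h2 <- loadings1^2                 # communality: variance explained by PC1
u2 <- 1 - h2                      # uniqueness: variance left unexplained
ss_loadings <- sum(h2)            # equals the first eigenvalue
prop_var <- ss_loadings / ncol(R) # share of total variance explained

print(round(cbind(PC1 = loadings1, h2 = h2, u2 = u2), 2))
print(round(c(ss_loadings = ss_loadings, prop_var = prop_var), 2))
```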

# Example: principal component analysis of body measurements
library(psych)
fa.parallel(Harman23.cor$cov, n.obs = 302, fa = "pc", n.iter = 100, show.legend = FALSE, main = "Scree plot with parallel analysis")

pc <- principal(Harman23.cor$cov, nfactors = 2, rotate = "none")
pc

Principal Components Analysis
Call: principal(r = Harman23.cor$cov, nfactors = 2, rotate = "none")
Standardized loadings (pattern matrix) based upon correlation matrix
                PC1   PC2   h2    u2 com
height         0.86 -0.37 0.88 0.123 1.4
arm.span       0.84 -0.44 0.90 0.097 1.5
forearm        0.81 -0.46 0.87 0.128 1.6
lower.leg      0.84 -0.40 0.86 0.139 1.4
weight         0.76  0.52 0.85 0.150 1.8
bitro.diameter 0.67  0.53 0.74 0.261 1.9
chest.girth    0.62  0.58 0.72 0.283 2.0
chest.width    0.67  0.42 0.62 0.375 1.7

                       PC1  PC2
SS loadings           4.67 1.77
Proportion Var        0.58 0.22
Cumulative Var        0.58 0.81
Proportion explained  0.73 0.27
Cumulative proportion 0.73 1.00

Mean Item complexity = 1.7
Test of the hypothesis that 2 components is sufficient.

The root mean square of the residuals (RMSR) is 0.05

Fit based upon off diagonal values = 0.99

# Rotate the principal components
rc <- principal(Harman23.cor$cov, nfactors = 2, rotate = "varimax")
rc

Principal Components Analysis
Call: principal(r = Harman23.cor$cov, nfactors = 2, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
                RC1  RC2   h2    u2 com
height         0.90 0.25 0.88 0.123 1.2
arm.span       0.93 0.19 0.90 0.097 1.1
forearm        0.92 0.16 0.87 0.128 1.1
lower.leg      0.90 0.22 0.86 0.139 1.1
weight         0.26 0.88 0.85 0.150 1.2
bitro.diameter 0.19 0.84 0.74 0.261 1.1
chest.girth    0.11 0.84 0.72 0.283 1.0
chest.width    0.26 0.75 0.62 0.375 1.2

                       RC1  RC2
SS loadings           3.52 2.92
Proportion Var        0.44 0.37
Cumulative Var        0.44 0.81
Proportion explained  0.55 0.45
Cumulative proportion 0.55 1.00

Mean Item Complexity = 1.1
Test of the hypothesis that 2 components is sufficient.

The root mean square of the residuals (RMSR) is 0.05

Fit based upon off diagonal values = 0.99
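
Comparing the rotated and unrotated output shows that the `h2` column and the cumulative variance (0.81) do not change; an orthogonal rotation only redistributes the explained variance between the components. The sketch below illustrates this with base R's `stats::varimax()` applied to loadings rebuilt from `eigen()`. It is an illustration under those assumptions, not a reproduction of what `principal()` does internally, and the variable names are hypothetical.

```r
R <- Harman23.cor$cov
e <- eigen(R, symmetric = TRUE)

# Unrotated loadings of the first two components.
unrotated <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))
rownames(unrotated) <- rownames(R)

# Orthogonal varimax rotation from the stats package.
rotated <- unclass(varimax(unrotated)$loadings)

# Communalities are the row sums of squared loadings.
h2_before <- rowSums(unrotated^2)
h2_after <- rowSums(rotated^2)
print(round(cbind(h2_before, h2_after), 2))

# Column sums of squared loadings change, but their total does not.
print(round(colSums(unrotated^2), 2))
print(round(colSums(rotated^2), 2))
```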

# Obtain the principal component score of each observation (here, each judge)
pc <- principal(USJudgeRatings[, -1], nfactors = 1, scores = TRUE)
head(pc$scores)

                 PC1
AARONSON,L.H.  -0.19
ALEXANDER,J.M.  0.75
ARMENTANO,A.J.  0.07
BERDON,R.I.     1.14
BRACKEN,J.J.   -2.16
BURNS,E.B.      0.77
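
For a single component, a score is a weighted sum of the standardized variables. The following base-R sketch shows one way to compute such scores by hand; it assumes the weights are the first eigenvector divided by the square root of its eigenvalue, which yields scores with mean 0 and standard deviation 1. The variable names are illustrative, and the sign convention may differ from the one `psych` chooses.

```r
ratings <- USJudgeRatings[, -1]
z <- scale(ratings)                          # standardize each variable
e <- eigen(cor(ratings), symmetric = TRUE)

v <- e$vectors[, 1]
if (sum(v) < 0) v <- -v                      # eigenvector sign is arbitrary

weights <- v / sqrt(e$values[1])             # scoring coefficients for one component
scores <- drop(z %*% weights)                # one score per judge

print(round(head(scores), 2))
```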

# Obtain the principal component scoring coefficients
rc <- principal(Harman23.cor$cov, nfactors = 2, rotate = "varimax")
round(unclass(rc$weights), 2)

                 RC1   RC2
height          0.28 -0.05
arm.span        0.30 -0.08
forearm         0.30 -0.09
lower.leg       0.28 -0.06
weight         -0.06  0.33
bitro.diameter -0.08  0.32
chest.girth    -0.10  0.34
chest.width    -0.04  0.27

# Exploratory factor analysis
# Prepare the test data
options(digits = 2)
covariances <- ability.cov$cov
correlations <- cov2cor(covariances)
correlations

        general picture blocks maze reading vocab
general    1.00    0.47   0.55 0.34    0.58  0.51
picture    0.47    1.00   0.57 0.19    0.26  0.24
blocks     0.55    0.57   1.00 0.45    0.35  0.36
maze       0.34    0.19   0.45 1.00    0.18  0.22
reading    0.58    0.26   0.35 0.18    1.00  0.79
vocab      0.51    0.24   0.36 0.22    0.79  1.00
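
`cov2cor()` rescales a covariance matrix by dividing each entry by the product of the two corresponding standard deviations. A minimal sketch of the same conversion done by hand, using only base R (the names `V`, `s` and `manual` are illustrative):

```r
V <- ability.cov$cov

# Standard deviations are the square roots of the diagonal (the variances).
s <- sqrt(diag(V))

# Correlation r_ij = cov_ij / (s_i * s_j).
manual <- V / outer(s, s)
print(round(manual, 2))

# Compare with the built-in conversion.
same <- isTRUE(all.equal(manual, cov2cor(V), check.attributes = FALSE))
print(same)
```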

# Determine how many common factors to extract; in this example the parallel analysis suggests two factors

fa.parallel(correlations, n.obs = 112, fa = "both", n.iter = 100, main = "Scree plots with parallel analysis")
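
The idea behind parallel analysis is to compare the observed eigenvalues with eigenvalues obtained from random data of the same size. The sketch below is a simplified, component-based version in base R. It is an assumption-laden illustration (uncorrelated normal data, mean of the simulated eigenvalues as the threshold, a fixed seed) and not the algorithm `fa.parallel()` implements, so its suggestion may differ from the plot.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
R <- cov2cor(ability.cov$cov)
n_obs <- 112                      # sample size given in the note
p <- ncol(R)
n_iter <- 100

observed <- eigen(R, symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues of correlation matrices of random, uncorrelated data.
simulated <- replicate(n_iter, {
  x <- matrix(rnorm(n_obs * p), nrow = n_obs)
  eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values
})
random_mean <- rowMeans(simulated)

# A component is a candidate for retention if it beats the random baseline.
print(data.frame(component = 1:p,
                 observed = round(observed, 2),
                 random_mean = round(random_mean, 2),
                 above_random = observed > random_mean))
```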

# Extract the common factors
fa <- fa(correlations, nfactors = 2, rotate = "none", fm = "pa")  # nfactors sets the number of factors to extract
fa

Factor Analysis using method = pa
Call: fa(r = correlations, nfactors = 2, rotate = "none", fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
         PA1   PA2   h2    u2 com
general 0.75  0.07 0.57 0.432 1.0
picture 0.52  0.32 0.38 0.623 1.7
blocks  0.75  0.52 0.83 0.166 1.8
maze    0.39  0.22 0.20 0.798 1.6
reading 0.81 -0.51 0.91 0.089 1.7
vocab   0.73 -0.39 0.69 0.313 1.5

                       PA1  PA2
SS loadings           2.75 0.83
Proportion Var        0.46 0.14
Cumulative Var        0.46 0.60
Proportion explained  0.77 0.23
Cumulative proportion 0.77 1.00

Mean Item complexity = 1.5
Test of the hypothesis that 2 factors is sufficient.

The degrees of freedom for the null model is and the objective function was 2.5
The degrees of freedom for the model is 4 and the objective function was 0.07

The root mean square of the residuals (RMSR) is 0.03
The DF corrected root mean square of the residuals is 0.06

Fit based upon off diagonal values = 0.99
Measures of factor score adequacy
PA1 PA2
Correlation of scores with factors 0.96 0.92
Multiple R square of scores with factors 0.93 0.84
Minimum correlation of possible factor scores 0.86 0.68
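
The principal axis method (`fm = "pa"`) replaces the diagonal of the correlation matrix with communality estimates and repeats an eigen-decomposition until those estimates settle. The following base-R sketch shows that iteration in a simplified form. The starting values (squared multiple correlations), the stopping tolerance and the iteration cap are assumptions chosen for illustration, so the loadings need not match the `psych` output exactly.

```r
R <- cov2cor(ability.cov$cov)
nf <- 2                                   # number of factors to extract

# Initial communalities: squared multiple correlations.
h2 <- 1 - 1 / diag(solve(R))

for (i in 1:50) {
  reduced <- R
  diag(reduced) <- h2                     # reduced correlation matrix
  e <- eigen(reduced, symmetric = TRUE)
  # Loadings: leading eigenvectors scaled by sqrt(eigenvalue).
  L <- e$vectors[, 1:nf, drop = FALSE] %*%
    diag(sqrt(pmax(e$values[1:nf], 0)), nf)
  h2_new <- rowSums(L^2)
  converged <- sum(abs(h2_new - h2)) < 1e-3
  h2 <- h2_new
  if (converged) break
}

dimnames(L) <- list(rownames(R), paste0("PA", 1:nf))
print(round(cbind(L, h2 = h2, u2 = 1 - h2), 2))
```

Unlike principal components, only the shared variance (the communalities) is analyzed, which is why the diagonal is replaced before each decomposition.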

# Factor rotation
# Orthogonal rotation
fa.varimax <- fa(correlations, nfactors = 2, rotate = "varimax", fm = "pa")
fa.varimax

Factor Analysis using method = pa
Call: fa(r = correlations, nfactors = 2, rotate = "varimax", fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
         PA1  PA2   h2    u2 com
general 0.49 0.57 0.57 0.432 2.0
picture 0.16 0.59 0.38 0.623 1.1
blocks  0.18 0.89 0.83 0.166 1.1
maze    0.13 0.43 0.20 0.798 1.2
reading 0.93 0.20 0.91 0.089 1.1
vocab   0.80 0.23 0.69 0.313 1.2

                       PA1  PA2
SS loadings           1.83 1.75
Proportion Var        0.30 0.29
Cumulative Var        0.30 0.60
Proportion explained  0.51 0.49
Cumulative proportion 0.51 1.00

Mean Item complexity = 1.3
Test of the hypothesis that 2 factors is sufficient.

The degrees of freedom for the null model is and the objective function was 2.5
The degrees of freedom for the model is 4 and the objective function was 0.07

The root mean square of the residuals (RMSR) is 0.03
The DF corrected root mean square of the residuals is 0.06

Fit based upon off diagonal values = 0.99
Measures of factor score adequacy
PA1 PA2
Correlation of scores with factors 0.96 0.92
Multiple R square of scores with factors 0.91 0.85
Minimum correlation of possible factor scores 0.82 0.71

# Oblique rotation
install.packages("GPArotation")
library(GPArotation)
fa.promax <- fa(correlations, nfactors = 2, rotate = "promax", fm = "pa")
fa.promax

Factor Analysis using method = pa
Call: fa(r = correlations, nfactors = 2, rotate = "promax", fm = "pa")

Warning: A Heywood case was detected.
Standardized loadings (pattern matrix) based upon correlation matrix
          PA1   PA2   h2    u2 com
general  0.37  0.48 0.57 0.432 1.9
picture -0.03  0.63 0.38 0.623 1.0
blocks  -0.10  0.97 0.83 0.166 1.0
maze     0.00  0.45 0.20 0.798 1.0
reading  1.00 -0.09 0.91 0.089 1.0
vocab    0.84 -0.01 0.69 0.313 1.0

                       PA1  PA2
SS loadings           1.83 1.75
Proportion Var        0.30 0.29
Cumulative Var        0.30 0.60
Proportion explained  0.51 0.49
Cumulative proportion 0.51 1.00

With factor correlations of
     PA1  PA2
PA1 1.00 0.55
PA2 0.55 1.00

Mean Item complexity = 1.2
Test of the hypothesis that 2 factors is sufficient.

The degrees of freedom for the null model is and the objective function was 2.5
The degrees of freedom for the model is 4 and the objective function was 0.07

The root mean square of the residuals (RMSR) is 0.03
The DF corrected root mean square of the residuals is 0.06

Fit based upon off diagonal values = 0.99
Measures of factor score adequacy
PA1 PA2
Correlation of scores with factors 0.97 0.94
Multiple R square of scores with factors 0.93 0.88
Minimum correlation of possible factor scores 0.86 0.77

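# Display the correlations between the variables and the factors (the factor structure matrix)

fsm <- function(oblique) {
  # An oblique solution stores the factor correlations in $Phi
  if (inherits(oblique, "fa") && is.null(oblique$Phi)) {
    warning("Object doesn't look like oblique EFA")
  } else {
    P <- unclass(oblique$loadings)  # pattern matrix
    F <- P %*% oblique$Phi          # structure matrix = pattern matrix times factor correlations
    colnames(F) <- c("PA1", "PA2")
    return(F)
  }
}

fsm(fa.promax)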

         PA1  PA2
general 0.64 0.69
picture 0.32 0.61
blocks  0.43 0.91
maze    0.25 0.45
reading 0.95 0.46
vocab   0.83 0.45
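
The matrix above is the pattern matrix multiplied by the factor correlation matrix. As a quick check, the sketch below repeats that multiplication in base R using the rounded values printed earlier in this note (pattern loadings and the factor correlation of 0.55). Because the inputs are rounded to two decimals, the result agrees with the printed matrix only to within rounding error.

```r
vars <- c("general", "picture", "blocks", "maze", "reading", "vocab")
facs <- c("PA1", "PA2")

# Pattern loadings copied from the promax output (rounded to 2 decimals).
pattern <- matrix(c( 0.37,  0.48,
                    -0.03,  0.63,
                    -0.10,  0.97,
                     0.00,  0.45,
                     1.00, -0.09,
                     0.84, -0.01),
                  ncol = 2, byrow = TRUE, dimnames = list(vars, facs))

# Factor correlation matrix from the same output.
phi <- matrix(c(1, 0.55, 0.55, 1), nrow = 2, dimnames = list(facs, facs))

# Structure matrix = pattern matrix %*% factor correlations.
structure_mat <- pattern %*% phi

# Values reported by fsm(fa.promax) in the note.
reported <- matrix(c(0.64, 0.69, 0.32, 0.61, 0.43, 0.91,
                     0.25, 0.45, 0.95, 0.46, 0.83, 0.45),
                   ncol = 2, byrow = TRUE, dimnames = list(vars, facs))

print(round(structure_mat, 2))
max_diff <- max(abs(structure_mat - reported))
print(max_diff)
```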

# Plot the oblique solution
factor.plot(fa.promax, labels = rownames(fa.promax$loadings))

# Factor diagram
fa.diagram(fa.promax, simple = FALSE)

# Factor score coefficients (weights)
fa.promax$weights

          PA1   PA2
general 0.078 0.211
picture 0.020 0.090
blocks  0.037 0.702
maze    0.027 0.035
reading 0.743 0.030
vocab   0.177 0.036

In general, both principal component analysis and common factor analysis are exploratory techniques: they help identify which components or factors are the best choice for building a model.
