An R implementation of the modeling steps of partial least squares (PLS) regression (test data for 20 members of a rehabilitation club), plus a complementary implementation of the PLS regression coefficient matrix algorithm


kf = read.csv("D:/kf.csv")  # read the rehabilitation-club data
kf
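Since the CSV itself is not shown, a quick hedged sanity check may help (my addition; the row count comes from the post's title and the column names from comments further down):
stopifnot(nrow(kf) == 20, ncol(kf) >= 6)  # assumed layout: 20 members, physiological columns first
head(kf)  # expected columns: weight, waist, pulse, chins, situps, jumps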
sl = as.matrix(kf[, 1:3])  # matrix of physiological indicators (weight, waist, pulse)
xl = as.matrix(kf[, 4:6])  # matrix of training (exercise) indicators (chins, situps, jumps)
x = sl
x
y = xl
y
x0 = scale(x)  # standardize x (column means 0, standard deviations 1)
x0
y0 = scale(y)  # standardize y
y0
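As a minimal check (my addition), scale() should leave every column with mean 0 and standard deviation 1:
round(colMeans(x0), 12)  # numerically zero column means after centering
apply(x0, 2, sd)         # exactly 1 after scaling
round(colMeans(y0), 12)
apply(y0, 2, sd)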
m = t(x0) %*% y0 %*% t(y0) %*% x0
m
eigen(m)
w1 = eigen(m)$vectors[, 1]  # first axis for x: leading eigenvector of m
v1 = t(y0) %*% x0 %*% w1 / sqrt(eigen(m)$values[1])  # first axis for y
v1
t1 = x0 %*% w1  # score vector of the first latent variable for x
t1  # the above is step (1): extract the first pair of latent variables from the two variable groups so that they are maximally correlated
u1 = y0 %*% v1
u1  # score vector of the first latent variable for y
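The eigen construction can be verified directly: the inner product of the two score vectors, t(t1) %*% u1 = w1' X0' Y0 v1, should equal the square root of the leading eigenvalue of m. A minimal sketch (my addition):
as.numeric(t(t1) %*% u1)  # covariance-type inner product of the first score pair
sqrt(eigen(m)$values[1])  # should agree up to rounding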
library("pracma")
alpha1 = inv(t(t1) %*% t1) %*% t(t1) %*% x0  # can also be obtained (transposed) as t(x0) %*% t1 / norm(t1, "2")^2; alpha1 is called the model effect loading in PLS
beta1 = inv(t(t1) %*% t1) %*% t(t1) %*% y0   # can also be obtained (transposed) as t(y0) %*% t1 / norm(t1, "2")^2
t(x0) %*% t1 / norm(t1, "2")^2  # norm(t1, "2") is the largest singular value of svd(t1), i.e. the Euclidean length of t1, which can also be obtained as sqrt(t(t1) %*% t1)
t(y0) %*% t1 / norm(t1, "2")^2  # the above is step (2): regress x0 on t1 and y0 on t1
alpha1
beta1
lm(x0 ~ t1)  # verify that alpha1 equals the least-squares coefficients with x0 as response and t1 as predictor (one each for weight, waist, and pulse; 3 in total)
lm(y0 ~ t1)  # verify that beta1 equals the least-squares coefficients with y0 as response and t1 as predictor (one each for chins, situps, and jumps; 3 in total)
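The verification can also be done programmatically; a small hedged check (my addition), using the objects defined above (the intercepts vanish because t1, x0 and y0 are all centered):
fit_x = lm(x0 ~ t1)
all.equal(as.vector(coef(fit_x)[2, ]), as.vector(alpha1))  # TRUE: the slopes reproduce alpha1
fit_y = lm(y0 ~ t1)
all.equal(as.vector(coef(fit_y)[2, ]), as.vector(beta1))   # TRUE: the slopes reproduce beta1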
b = t(x0) %*% u1 %*% inv(t(t1) %*% x0 %*% t(x0) %*% u1) %*% t(t1) %*% y0  # PLS regression coefficient matrix of the standardized predictors x and standardized responses y when the first pair of latent variables is retained (for the matrix formula see "Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space", p. 102)

b
library("pls")
pls1 = plsr(y0 ~ x0, ncomp = 1, validation = "LOO", jackknife = TRUE)
coef(pls1)  # solving for b above is equivalent to the one-component result of R's pls package; the coefficients are standardized regression coefficients, and they can be converted back to regression coefficients of the original predictors x and responses y by reversing the standardization. A concrete back-transformation is carried out below for the two-component fit.
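To make the equivalence concrete, a hedged check (my addition): drop() reduces the predictors x responses x components coefficient array to a 3 x 3 matrix, which should match b up to numerical tolerance.
all.equal(as.vector(b), as.vector(drop(coef(pls1))))  # TRUE if the explicit formula matches the package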
px0 = t1 %*% alpha1  # predicted-value matrix for x0
e1 = x0 - px0        # residual matrix for x0
py0 = t1 %*% beta1   # predicted-value matrix for y0
f1 = y0 - py0        # residual matrix for y0
m2 = t(e1) %*% f1 %*% t(f1) %*% e1  # step (3): repeat the steps above with the residual matrices e1 and f1 in place of x0 and y0
eigen(m2)
w2 = eigen(m2)$vectors[, 1]
w2
v2 = t(f1) %*% e1 %*% w2 / sqrt(eigen(m2)$values[1])
v2
t2 = e1 %*% w2
t2
u2 = f1 %*% v2
u2
alpha2 = inv(t(t2) %*% t2) %*% t(t2) %*% e1  # can also be obtained (transposed) as t(e1) %*% t2 / norm(t2, "2")^2
beta2 = inv(t(t2) %*% t2) %*% t(t2) %*% f1   # can also be obtained (transposed) as t(f1) %*% t2 / norm(t2, "2")^2
alpha2
beta2
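The two extractions follow the identical pattern, so they generalize naturally. Below is a hedged sketch (my addition, not from the original post) of the same algorithm as a loop over an arbitrary number of components; the function name pls_extract and its return structure are my own choices:
pls_extract = function(x0, y0, ncomp) {
  # from-scratch PLS extraction mirroring the manual steps above
  e = x0
  f = y0
  tt    = matrix(0, nrow(x0), ncomp)   # score vectors t1, t2, ... as columns
  xload = matrix(0, ncomp, ncol(x0))   # model effect loadings alpha1, alpha2, ... as rows
  yload = matrix(0, ncomp, ncol(y0))   # y loadings beta1, beta2, ... as rows
  for (h in 1:ncomp) {
    w  = eigen(t(e) %*% f %*% t(f) %*% e)$vectors[, 1]  # leading eigenvector
    th = e %*% w                                        # score vector of component h
    tt[, h]    = th
    xload[h, ] = t(e) %*% th / sum(th^2)
    yload[h, ] = t(f) %*% th / sum(th^2)
    e = e - th %*% xload[h, , drop = FALSE]             # deflate the x residuals
    f = f - th %*% yload[h, , drop = FALSE]             # deflate the y residuals
  }
  list(scores = tt, xload = xload, yload = yload)
}
pls_extract(x0, y0, 2)$scores  # should reproduce cbind(t1, t2), up to the sign of each eigenvector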
library("pls")
pls1 = plsr(y0 ~ x0, ncomp = 2, validation = "LOO", jackknife = TRUE)  # what follows is the output of R's pls package, showing the regression results (including the prediction error, PRESS, and the variance explained), to be compared with and to supplement the from-scratch results above
summary(pls1)  # the proportion of the total variation of the responses y explained by the first latent variable t1 is chins (23.26%), situps (35.06%), and jumps (4.14%); the combined value 20.9447 that SAS reports for y is approximately mean(23.26%, 35.06%, 4.14%), up to rounding. The "2 comps" column shows the proportion of the total variation of each response explained once the second explanatory latent variable has been introduced.
coef(pls1)  # taking the response situps as an example, the regression equation of situps on the predictors (standardized variables marked with *) is situps* = -0.13846688 weight* - 0.52444579 waist* - 0.08542029 pulse*. The standardized equation can be converted into the regression equation of the original variables y and x: (situps - mean(situps))/sd(situps) = -0.13846688 (weight - mean(weight))/sd(weight) - 0.52444579 (waist - mean(waist))/sd(waist) - 0.08542029 (pulse - mean(pulse))/sd(pulse), hence situps = sd(situps) * [-0.13846688 (weight - mean(weight))/sd(weight) - 0.52444579 (waist - mean(waist))/sd(waist) - 0.08542029 (pulse - mean(pulse))/sd(pulse)] + mean(situps)
sd(y[, 2]) * -0.1384668393 / sd(x[, 1])  # regression coefficient of weight
sd(y[, 2]) * -0.52444579 / sd(x[, 2])    # regression coefficient of waist
sd(y[, 2]) * -0.08542029 / sd(x[, 3])    # regression coefficient of pulse
sd(y[, 2]) * (-0.13846688 * -mean(x[, 1]) / sd(x[, 1]) + -0.52444579 * -mean(x[, 2]) / sd(x[, 2]) + -0.08542029 * -mean(x[, 3]) / sd(x[, 3])) + mean(y[, 2])  # intercept of the regression equation of the original variables y and x
model = "situps = 612.56712 - 0.35088 weight - 10.24768 waist - 0.74122 pulse -- yes! Exactly the same results as those given by SAS."
model
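The same back-transformation can be done for all nine coefficients at once. A hedged matrix version (my addition), using the standardization identities above:
bstar = drop(coef(pls1))  # standardized coefficients, 3 predictors x 3 responses
borig = diag(1 / apply(x, 2, sd)) %*% bstar %*% diag(apply(y, 2, sd))  # original-scale slopes
dimnames(borig) = list(colnames(x), colnames(y))
intercepts = colMeans(y) - colMeans(x) %*% borig  # one intercept per response
borig       # the situps column should match the three coefficients computed above
intercepts  # the situps entry should match 612.56712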
jack.test(pls1)  # hypothesis tests for the coefficients produced by coef(pls1)
scores(pls1)     # the score vectors: t1 = x0 %*% w1 for the first latent variable and t2 = e1 %*% w2 for the second
loadings(pls1)   # i.e. alpha1 (and alpha2)
plot(pls1)
validationplot(pls1)  # validationplot() draws the model's RMSEP (root mean squared error of prediction, computed here by leave-one-out cross-validation) for different numbers of components
predict(pls1)    # i.e. the predicted values py0 = t1 %*% beta1, plus the second component's contribution t2 %*% beta2
# the algorithm for the coefficient of determination still needs further study
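One possible starting point for that open question (my addition, hedged): the pls package itself ships an R2() function that reports the coefficient of determination per response and per number of components.
R2(pls1)                      # cross-validated R2 (LOO here, since validation = "LOO" above)
R2(pls1, estimate = "train")  # apparent (training) R2, the usual coefficient of determination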

Transferred from: http://my.oschina.net/u/1272414/blog/214881
