Correlation analysis is a useful first step that informs further analysis, and the relationship between two variables can be inspected visually with a simple scatter plot.
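As a minimal sketch of such a scatter plot, the step below uses PROC SGPLOT on the renmin.fitness data set that appears in the PROC CORR example of these notes. It assumes that runtime and oxygen are numeric variables in that data set; the choice of which variable goes on which axis is only an illustration.

```sas
* scatter plot of two continuous variables before computing any coefficient;
* assumption: renmin.fitness contains the numeric variables runtime and oxygen;
proc sgplot data=renmin.fitness;
   scatter x=runtime y=oxygen;
run;
```

A roughly linear cloud of points suggests that the Pearson coefficient is a reasonable summary, while a monotone but curved pattern is usually better described by the Spearman coefficient.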
1: PROC CORR measures the relationship between two continuous variables
2: Contingency table analysis studies whether discrete (qualitative) variables are associated, and is carried out with PROC FREQ
2.1: Whether two qualitative variables (at least one of them unordered) are associated is tested with the chi-square test
2.2: Whether two ordered qualitative variables are associated can be tested with a trend test
/***********************************************proc corr****************************************************** **********/
PROC CORR <options>;
   BY variables;
   FREQ variable;
   ID variables;
   PARTIAL variables;
   VAR variables;
   WEIGHT variable;
   WITH variables;
proc corr data=renmin.fitness pearson spearman nosimple; * compute the Pearson and Spearman coefficients and use NOSIMPLE to suppress the descriptive statistics;
   var weight oxygen runtime; * without a WITH statement, correlations and hypothesis tests are produced for every pair of these variables;
   with var1-var10; * correlates each VAR variable with each of VAR1-VAR10, that is, one group of variables against another;
run;
* Partial correlation: suppose we want the correlation between X and Y, and Z stands for all the other variables. The partial correlation coefficient of X and Y is the simple (Pearson) correlation between the residual Rx from the linear regression of X on Z and the residual Ry from the linear regression of Y on Z;
proc corr data=corr_eg;
   var height weight;
   partial age; * relationship between height and weight after removing the effect of the PARTIAL variable age;
run;
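The residual definition of partial correlation can be checked by hand. The sketch below regresses height and weight on age separately, keeps the residuals, and then correlates them. It reuses corr_eg, height, weight and age from the example above; the data set names res_h and res_hw and the residual names r_height and r_weight are hypothetical, and the sketch assumes there are no missing values in the three variables.

```sas
* step 1: residual of height after linear regression on age;
proc reg data=corr_eg noprint;
   model height = age;
   output out=res_h r=r_height;   * res_h and r_height are hypothetical names;
run;
quit;

* step 2: residual of weight after linear regression on age;
proc reg data=res_h noprint;
   model weight = age;
   output out=res_hw r=r_weight;  * res_hw and r_weight are hypothetical names;
run;
quit;

* step 3: Pearson correlation of the two residuals;
proc corr data=res_hw pearson nosimple;
   var r_height r_weight;
run;
```

The coefficient from step 3 should match the one reported with the PARTIAL statement. The p-values can differ slightly, because the PARTIAL statement accounts for the degrees of freedom used by the variables that are partialled out.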
/***********************************************proc freq****************************************************** **********/
PROC FREQ is used to study the association between two discrete variables
PROC FREQ <options>;
TABLES requests </options>;
TEST options;
WEIGHT variable </ option>; * when the data are already tabulated and we do not want to enter one row per observation, the WEIGHT variable gives the count for each row, and PROC FREQ shows it as the frequency of the corresponding cell;
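Example of the WEIGHT statement
* the counts shown as ... are unreadable in the source and must be supplied before this step is run;
data test;
   input group $ outcome $ count;
   datalines;
Drug Alive ...
Drug Dead 10
Placebo Alive ...
Placebo Dead ...
;
run;
proc freq data=test;
   table group*outcome / chisq norow nocol nopercent;
   weight count; * without the WEIGHT statement, every cell of the table would have a count of 1;
run;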
proc format;
   value purfmt 1="$ - +"
                0="<$ -";
run;
* analysis of two unordered qualitative variables;
proc freq data=double.b_sales_inc;
   tables gender*purchase / chisq expected cellchi2 nocol nopercent; * CHISQ is the key option because it requests the chi-square statistics;
   format purchase purfmt.;
   title1 'Association between GENDER and PURCHASE';
run;
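For reference, the Pearson chi-square statistic requested by CHISQ is conventionally written as below. The symbols are notation introduced here, not taken from the original notes: $O_{ij}$ is the observed count in row $i$ and column $j$, $R_i$ and $C_j$ are the row and column totals, $N$ is the overall total, and $E_{ij}$ is the expected count under independence (the value printed by EXPECTED). Each summand is the cell contribution printed by CELLCHI2.

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}},
\qquad
\mathrm{df} = (r-1)(c-1)
```

For the 2-by-2 table of gender by purchase, this gives one degree of freedom.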
data double.b_sales_inc;
   set double.b_sales;
   * recode the character variable income into an ordered numeric level;
   inclevel = 1*(income='Low') + 2*(income='Medium') + 3*(income='High');
run;
proc format;
   value purfmt 1="$ - +"
                0="<$ -";
run;
proc format;
   value incfmt 1='Low Income'
                2='Medium Income'
                3='High Income';
run;
* analysis of two ordered qualitative variables;
proc freq data=double.b_sales_inc;
   tables inclevel*purchase / chisq trend measures cl; * look at the Mantel-Haenszel chi-square first, then at the trend test;
   format inclevel incfmt. purchase purfmt.;
   title1 'Ordinal Association between Inclevel and PURCHASE?';
run;