1. Get the r value and the p value between two columns of the dataset:
from scipy.stats import pearsonr
r_fta_pts, p_fta_pts = pearsonr(nba_stats["pts"], nba_stats["fta"])
r_stl_pf, p_stl_pf = pearsonr(nba_stats["stl"], nba_stats["pf"])  # each call returns the r value and the p value
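To see what the two return values look like without the full dataset, here is a minimal sketch on made-up numbers. The lists `x` and `y` are hypothetical toy data, not columns from nba_stats.

```python
from scipy.stats import pearsonr

# Hypothetical toy data (assumption: not taken from nba_stats).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# pearsonr returns the correlation coefficient and a two-sided p value.
r, p = pearsonr(x, y)
print(f"r = {r:.4f}, p = {p:.4f}")
```

An r close to 1 or -1 means a strong linear relationship, and a small p value means an r this far from 0 would be unlikely if the two columns were actually uncorrelated.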
2. Compute the covariance of two columns. Covariance measures how much two variables vary together: if one tends to get bigger when the other gets bigger, the covariance is positive and the two variables are related. The formula is cov(x, y) = sum((x_i - mean_x) * (y_i - mean_y)) / n, and here is a function that computes it:
def conv_compute(x, y):  # define a function to calculate the covariance
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)  # calculate the mean of each column
    x_diff = [i - mean_x for i in x]
    y_diff = [n - mean_y for n in y]  # difference from the mean for both columns; a list comprehension is easier here than an explicit for loop
    sum_diff = [x_diff[i] * y_diff[i] for i in range(len(x))]  # use range(len(x)) to multiply the paired differences by position
    return sum(sum_diff) / len(sum_diff)  # average of the products (divides by n)

cov_stl_pf = conv_compute(nba_stats["stl"], nba_stats["pf"])
cov_fta_pts = conv_compute(nba_stats["fta"], nba_stats["pts"])
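One thing worth checking is the denominator. The function above divides by n, while numpy's `cov` divides by n - 1 unless told otherwise. The sketch below compares the two on hypothetical toy data; `population_cov` is an illustrative helper written for this comparison, not part of the original notes.

```python
import numpy as np

def population_cov(x, y):
    """Covariance with an n denominator, like the hand-written function."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)

# Hypothetical toy data (assumption: not taken from nba_stats).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

manual = population_cov(x, y)
numpy_population = np.cov(x, y, bias=True)[0, 1]  # bias=True divides by n
numpy_sample = np.cov(x, y)[0, 1]                 # default divides by n - 1

print(manual, numpy_population, numpy_sample)
```

On a large dataset the two versions are nearly identical, but on small samples the difference is noticeable.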
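3. Calculate the correlation coefficient. The formula is r = cov(x, y) / (std_x * std_y):
from numpy import cov
cov_1 = cov(nba_stats["fta"], nba_stats["blk"])[0, 1]  # cov returns a 2x2 matrix; [0, 1] is the covariance between the two columns
std_1 = nba_stats["fta"].std() * nba_stats["blk"].std()
r_fta_blk = cov_1 / std_1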
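As a sanity check, the same formula can be compared against numpy's built-in `corrcoef`. This is a sketch on hypothetical toy arrays, not the nba_stats columns. It assumes the covariance and the standard deviations use the same denominator (n - 1 here, which is the default for both `numpy.cov` and the pandas `.std()` method).

```python
import numpy as np

# Hypothetical toy data (assumption: not taken from nba_stats).
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Covariance and standard deviations both use the n - 1 denominator.
cov_xy = np.cov(x, y)[0, 1]
std_product = np.std(x, ddof=1) * np.std(y, ddof=1)
r_manual = cov_xy / std_product

# Built-in result for comparison.
r_numpy = np.corrcoef(x, y)[0, 1]

# Mixing an n-denominator covariance with n - 1 standard deviations
# shrinks the result by a factor of (n - 1) / n.
r_mismatched = np.cov(x, y, bias=True)[0, 1] / std_product

print(r_manual, r_numpy, r_mismatched)
```

The first two numbers agree. The third is smaller, which is why the covariance and the standard deviations should come from matching definitions.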
Statistics and Linear Algebra 3