Transferred from: http://blog.csdn.net/wsywl/article/details/5727327
Because correlation coefficients are used so frequently in statistics, this series of articles briefly introduces several of them.
Correlation coefficient: a measure of the degree of correlation between two things (called variables in the data).
Given two variables X and Y, the computed correlation coefficient can be interpreted as follows:
(1) When the correlation coefficient is 0, X and Y have no linear relationship.
(2) When Y increases (decreases) as X increases (decreases), the two variables are positively correlated, and the correlation coefficient lies between 0.00 and 1.00.
(3) When Y decreases (increases) as X increases (decreases), the two variables are negatively correlated, and the correlation coefficient lies between -1.00 and 0.00.
The larger the absolute value of the correlation coefficient, the stronger the correlation: the closer the coefficient is to 1 or -1, the stronger the correlation, and the closer it is to 0, the weaker the correlation.
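The three cases above can be sketched with a few made-up data series. This is only an illustration: the data and variable names are assumptions, and the base MATLAB function corrcoef is used to obtain the coefficient.

```matlab
% Illustrative data only; none of these values come from the article.
x     = (1:8)';
yUp   = [2 3 5 4 6 7 9 8]';        % tends to rise with x
yDown = -yUp;                      % tends to fall as x rises
yFlat = [1 -1 -1 1 1 -1 -1 1]';    % no linear trend in x

% corrcoef returns a 2-by-2 matrix; the off-diagonal entry is the coefficient
R = corrcoef(x, yUp);    rUp   = R(1, 2);
R = corrcoef(x, yDown);  rDown = R(1, 2);
R = corrcoef(x, yFlat);  rFlat = R(1, 2);

fprintf('rising: %+.4f\n', rUp);     % positive
fprintf('falling: %+.4f\n', rDown);  % negative, same magnitude as rUp
fprintf('flat: %+.4f\n', rFlat);     % zero up to rounding
```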
The strength of the correlation between two variables is usually judged from the absolute value of the coefficient, using the following ranges:
Correlation coefficient (absolute value)    Strength
0.8-1.0    very strong correlation
0.6-0.8    strong correlation
0.4-0.6    moderate correlation
0.2-0.4    weak correlation
0.0-0.2    very weak correlation or no correlation
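As a small sketch, the ranges above can be turned into a lookup. The variable names are hypothetical, and because the listed ranges share their endpoints, this version assumes a boundary value such as 0.6 belongs to the stronger band.

```matlab
% Hypothetical lookup: map |r| to the verbal label from the ranges above.
r      = -0.73;                          % example coefficient
edges  = [0 0.2 0.4 0.6 0.8];            % lower bound of each band
labels = {'very weak or none', 'weak', 'moderate', 'strong', 'very strong'};

% Pick the highest band whose lower bound |r| reaches
% (assumption: boundary values go to the stronger band).
idx = find(abs(r) >= edges, 1, 'last');

fprintf('r = %.2f -> %s correlation\n', r, labels{idx});
```

The sign of r is deliberately ignored here, since the ranges describe strength only; the direction is read from the sign separately.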
Pearson correlation coefficient
1. Introduction
The Pearson correlation, also known as the product-moment correlation, is a method for measuring linear correlation that was developed by the British statistician Karl Pearson in the 1890s.
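Given two variables X and Y with means \mu_X, \mu_Y and standard deviations \sigma_X, \sigma_Y, the Pearson correlation coefficient between them can be calculated with any of the following formulas:
Formula One:
$$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E[(X-\mu_X)(Y-\mu_Y)]}{\sigma_X \sigma_Y} = \frac{E(XY) - E(X)E(Y)}{\sqrt{E(X^2) - E^2(X)}\,\sqrt{E(Y^2) - E^2(Y)}}$$
Formula Two:
$$\rho_{X,Y} = \frac{N\sum XY - \sum X \sum Y}{\sqrt{N\sum X^2 - \left(\sum X\right)^2}\,\sqrt{N\sum Y^2 - \left(\sum Y\right)^2}}$$
Formula Three:
$$\rho_{X,Y} = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sqrt{\sum (X-\bar{X})^2}\,\sqrt{\sum (Y-\bar{Y})^2}}$$
Formula Four:
$$\rho_{X,Y} = \frac{\sum XY - \frac{\sum X \sum Y}{N}}{\sqrt{\left(\sum X^2 - \frac{\left(\sum X\right)^2}{N}\right)\left(\sum Y^2 - \frac{\left(\sum Y\right)^2}{N}\right)}}$$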
The four formulas listed above are equivalent when the expectation is taken as the mean over the observed sample. Here E denotes the mathematical expectation, cov denotes the covariance, and N is the number of paired observations of the two variables.
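The equivalence can be checked numerically. The sketch below evaluates four common ways of writing the coefficient (an expectation form, raw sums scaled by N, deviations from the means, and raw sums divided by N) on a small made-up sample. The data are illustrative, and E is assumed to be the sample mean.

```matlab
% Illustrative sample; not taken from the article.
X = [1 2 3 4 5];
Y = [2 4 5 4 5];
N = numel(X);

% Expectation form, with E taken as the sample mean (assumption)
E  = @(v) mean(v);
r1 = (E(X .* Y) - E(X) * E(Y)) / ...
     (sqrt(E(X.^2) - E(X)^2) * sqrt(E(Y.^2) - E(Y)^2));

% Raw sums scaled by N
r2 = (N * sum(X .* Y) - sum(X) * sum(Y)) / ...
     (sqrt(N * sum(X.^2) - sum(X)^2) * sqrt(N * sum(Y.^2) - sum(Y)^2));

% Deviations from the means
dX = X - mean(X);
dY = Y - mean(Y);
r3 = sum(dX .* dY) / (sqrt(sum(dX.^2)) * sqrt(sum(dY.^2)));

% Raw sums divided by N
r4 = (sum(X .* Y) - sum(X) * sum(Y) / N) / ...
     sqrt((sum(X.^2) - sum(X)^2 / N) * (sum(Y.^2) - sum(Y)^2 / N));

fprintf('%.6f  %.6f  %.6f  %.6f\n', r1, r2, r3, r4);
```

For this sample the sums are easy to verify by hand: the numerator of the last form is 66 - 15*20/5 = 6 and the denominator is sqrt(10 * 6), so all four values come out at about 0.7746.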
2. Scope of application
The correlation coefficient is defined only when the standard deviations of both variables are nonzero. The Pearson correlation coefficient is appropriate when:
(1) The two variables are linearly related and both are continuous data.
(2) Both variables are approximately normally distributed, or at least follow a unimodal distribution close to normal.
(3) The observations of the two variables come in pairs, and the pairs are independent of one another.
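Two of these conditions are easy to see on toy data. The following sketch uses a hypothetical anonymous function, pearson, written in the deviation form. It shows that a perfectly dependent but non-linear relationship can still give a coefficient of 0, and that a variable with zero standard deviation leaves the coefficient undefined (NaN in MATLAB).

```matlab
% Hypothetical helper in deviation form; a sketch, not a library function.
pearson = @(a, b) sum((a - mean(a)) .* (b - mean(b))) / ...
                  sqrt(sum((a - mean(a)).^2) * sum((b - mean(b)).^2));

x      = -3:3;
yQuad  = x.^2;                  % fully determined by x, but not linearly
yConst = 4 * ones(size(x));     % constant series: standard deviation is 0

rQuad  = pearson(x, yQuad);     % 0: the positive and negative halves cancel
rConst = pearson(x, yConst);    % NaN: the formula reduces to 0/0

fprintf('quadratic: %g\n', rQuad);
fprintf('constant:  %g\n', rConst);
```

A coefficient near 0 therefore rules out only a linear relationship, which is why condition (1) matters.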
3. MATLAB implementation
A MATLAB implementation of the Pearson correlation coefficient (based on Formula Four):
function coeff = myPearson(X, Y)
% Computes the Pearson correlation coefficient of two numeric sequences.
%
% Input:
%   X: first numeric sequence
%   Y: second numeric sequence (same length and orientation as X)
%
% Output:
%   coeff: correlation coefficient of the two input sequences X and Y
%
if length(X) ~= length(Y)
    error('The two numeric sequences must have the same length.');
end
n = length(X);   % number of paired observations
% Numerator of Formula Four
fenzi = sum(X .* Y) - (sum(X) * sum(Y)) / n;
% Denominator of Formula Four
fenmu = sqrt((sum(X .^ 2) - sum(X)^2 / n) * (sum(Y .^ 2) - sum(Y)^2 / n));
coeff = fenzi / fenmu;
end % function myPearson ends
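One design point about the raw-sum form is worth a brief sketch: it subtracts large, nearly equal sums, so it can lose precision when the data sit far from zero, whereas the deviation form centers the data first. The data and the offset below are assumptions chosen to illustrate this. Adding a constant does not change the Pearson coefficient, so the unshifted data give the reference value.

```matlab
% Illustrative data; the offset is an assumed value chosen to stress rounding.
baseX  = [1.1 2.3 2.9 4.2 5.0];
baseY  = [2.0 4.1 4.9 4.2 5.3];
offset = 1e6;
X = baseX + offset;
Y = baseY + offset;
N = numel(X);

% Raw-sum form: differences of large, nearly equal numbers
rRaw = (sum(X .* Y) - sum(X) * sum(Y) / N) / ...
       sqrt((sum(X.^2) - sum(X)^2 / N) * (sum(Y.^2) - sum(Y)^2 / N));

% Deviation form: center first, so the sums stay small
dX = X - mean(X);
dY = Y - mean(Y);
rDev = sum(dX .* dY) / sqrt(sum(dX.^2) * sum(dY.^2));

% Reference from the unshifted data
bX = baseX - mean(baseX);
bY = baseY - mean(baseY);
rRef = sum(bX .* bY) / sqrt(sum(bX.^2) * sum(bY.^2));

fprintf('raw sums:   %.12f\n', rRaw);
fprintf('deviations: %.12f\n', rDev);
fprintf('reference:  %.12f\n', rRef);
```

With this offset the deviation form should track the reference closely, while the raw-sum result may drift in the later digits. For data of ordinary magnitude the two forms agree for practical purposes.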
The Pearson correlation coefficient can also be computed with MATLAB's built-in corr function (part of the Statistics and Machine Learning Toolbox), passing X and Y as column vectors:
coeff = corr(X, Y);
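Where the Statistics and Machine Learning Toolbox is not available, the base MATLAB function corrcoef is an alternative. The sketch below uses made-up data. corrcoef returns a 2-by-2 correlation matrix rather than a scalar, so the coefficient is read from an off-diagonal entry.

```matlab
% Illustrative column vectors; not taken from the article.
X = [1 2 3 4 5]';
Y = [2 4 5 4 5]';

R     = corrcoef(X, Y);   % 2-by-2 matrix: [1 r; r 1]
coeff = R(1, 2);          % Pearson coefficient between X and Y

fprintf('coeff = %.4f\n', coeff);   % about 0.7746 for this sample
```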
4. References
http://zh.wikipedia.org/zh-cn/%E7%9B%B8%E5%85%B3