Reposted from: http://blog.csdn.net/wsywl/article/details/5859751
Spearman rank correlation coefficient
1. Introduction
In statistics, the Spearman rank correlation coefficient is named after Charles Spearman and is usually denoted by the Greek letter ρ (rho). It estimates the correlation between two variables X and Y in cases where the relationship between them can be described by a monotonic function. If neither variable's set of values contains repeated elements, then ρ reaches +1 or -1 when one variable is a perfectly monotonic function of the other (that is, the two variables always move in the same direction for +1, or in opposite directions for -1).
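A small sketch may make the +1 case concrete. The data are hypothetical (Y is chosen as X cubed purely for illustration), and the ranks are obtained with sort, which is only valid here because there are no repeated values. The Spearman coefficient is computed with the rank-difference formula (Formula One) that the MATLAB section of this article also uses, and the Pearson coefficient of the raw values is shown for comparison.

```matlab
% Hypothetical data: Y is a strictly increasing but non-linear function of X
X = 1:6;
Y = X.^3;
N = numel(X);

% Ascending ranks via sort (valid only because there are no ties)
[~, ix] = sort(X);  xr = zeros(1, N);  xr(ix) = 1:N;
[~, iy] = sort(Y);  yr = zeros(1, N);  yr(iy) = 1:N;

% Spearman coefficient from the rank differences
rhoSpearman = 1 - 6 * sum((xr - yr).^2) / (N * (N^2 - 1));

% Pearson coefficient of the raw values, for comparison
dX = X - mean(X);
dY = Y - mean(Y);
rPearson = sum(dX .* dY) / sqrt(sum(dX.^2) * sum(dY.^2));

fprintf('Spearman: %.4f, Pearson: %.4f\n', rhoSpearman, rPearson);
```

Because the two rank sequences are identical, rhoSpearman is exactly 1, while rPearson stays below 1 since the relationship is not linear.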
Suppose the two random variables X and Y (which can also be regarded as two sets) each have N elements, and denote the i-th value (1 <= i <= N) of each by X_i and Y_i. Sort X and Y (both in ascending order or both in descending order) to obtain two rank sets x and y, where x_i is the rank of X_i in X and y_i is the rank of Y_i in Y. Subtracting the corresponding elements of x and y gives the rank difference set d, where d_i = x_i - y_i, 1 <= i <= N. The Spearman rank correlation coefficient between X and Y can be computed from d, or from x and y, as follows:
Calculated from the rank difference set d (Formula One):
$$\rho = 1 - \frac{6 \sum_{i=1}^{N} d_i^2}{N (N^2 - 1)}$$
Calculated from the rank sets x and y (the Spearman rank correlation coefficient can also be regarded as the Pearson correlation coefficient of the ranked variables; the formula below is simply the Pearson correlation coefficient of x and y) (Formula Two):
$$\rho = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \sum_{i=1}^{N} (y_i - \bar{y})^2}}$$
where \bar{x} and \bar{y} are the means of the rank sets x and y.
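As a rough check of Formula One and Formula Two, the sketch below evaluates both on a small tie-free sample. The data are hypothetical and not taken from the original article, and the sort-based ranking is an assumption that only holds when no value is repeated.

```matlab
% Hypothetical tie-free sample data (not from the original article)
X = [10 20 30 40 50];
Y = [1 3 2 5 4];
N = numel(X);

% Ascending ranks via sort (valid only because there are no ties)
[~, ix] = sort(X);  xr = zeros(1, N);  xr(ix) = 1:N;
[~, iy] = sort(Y);  yr = zeros(1, N);  yr(iy) = 1:N;

% Formula One: from the rank differences
d = xr - yr;
rho1 = 1 - 6 * sum(d.^2) / (N * (N^2 - 1));

% Formula Two: Pearson correlation coefficient of the ranks
dx = xr - mean(xr);
dy = yr - mean(yr);
rho2 = sum(dx .* dy) / sqrt(sum(dx.^2) * sum(dy.^2));

fprintf('rho1 = %.4f, rho2 = %.4f\n', rho1, rho2);  % both 0.8000
```

Here the rank differences are [0 -1 1 -1 1], so the sum of squares is 4 and Formula One gives 1 - 24/120 = 0.8; Formula Two gives 8/10 = 0.8 as well, as expected when there are no ties.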
The following is an example of calculating the ranks of the elements in a collection (applicable only to the calculation of the Spearman rank correlation coefficient)
Note that when several values of a variable are equal, each of them is assigned the average of the positions they occupy.
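The averaging rule for equal values can be sketched as follows. The sample values are illustrative only, and the variable names are hypothetical; ranks are assigned in descending order (the largest value gets rank 1), which is the convention the source program in section 3 follows.

```matlab
% Illustrative values containing one tie (1.2 appears twice)
v = [0.8 1.2 1.2 2.3 18];
r = zeros(size(v));
for i = 1:numel(v)
    nGreater = sum(v > v(i));    % elements strictly greater than v(i)
    nEqual   = sum(v == v(i));   % elements equal to v(i), including itself
    % The tied elements occupy positions nGreater+1 ... nGreater+nEqual;
    % their common rank is the mean of those positions
    r(i) = nGreater + (nEqual + 1) / 2;
end
% r is now [5 3.5 3.5 2 1]
```

The two occurrences of 1.2 occupy positions 3 and 4, so each receives the rank 3.5.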
2. Scope of application
The Spearman rank correlation coefficient places less strict requirements on the data than the Pearson correlation coefficient. As long as the observations of the two variables are paired rank data, or rank data converted from observations of continuous variables, the Spearman rank correlation coefficient can be used regardless of the overall distribution of the two variables or the sample size.
3. MATLAB implementation
Source program one:
MATLAB implementation of the Spearman rank correlation coefficient (computed from the rank difference set, using Formula One above)
function coeff = mySpearman(X, Y)
% Computes the Spearman rank correlation coefficient.
%
% Input:
%   X: input numeric sequence
%   Y: input numeric sequence
%
% Output:
%   coeff: correlation coefficient of the two input sequences X and Y
if length(X) ~= length(Y)
    error('The two numeric sequences do not have the same length');
end
N = length(X);        % length of the sequences
Xrank = zeros(1, N);  % stores the rank of each element of X
Yrank = zeros(1, N);  % stores the rank of each element of Y
% compute each value of Xrank (descending order, ties get the average rank)
for i = 1:N
    cont1 = 1;   % 1 + number of elements greater than X(i)
    cont2 = -1;  % number of other elements equal to X(i)
    for j = 1:N
        if X(i) < X(j)
            cont1 = cont1 + 1;
        elseif X(i) == X(j)
            cont2 = cont2 + 1;
        end
    end
    Xrank(i) = cont1 + mean(0:cont2);
end
% compute each value of Yrank (descending order, ties get the average rank)
for i = 1:N
    cont1 = 1;   % 1 + number of elements greater than Y(i)
    cont2 = -1;  % number of other elements equal to Y(i)
    for j = 1:N
        if Y(i) < Y(j)
            cont1 = cont1 + 1;
        elseif Y(i) == Y(j)
            cont2 = cont2 + 1;
        end
    end
    Yrank(i) = cont1 + mean(0:cont2);
end
% compute the Spearman rank correlation coefficient from the rank differences
fenzi = 6 * sum((Xrank - Yrank).^2);  % numerator
fenmu = N * (N^2 - 1);                % denominator
coeff = 1 - fenzi / fenmu;
end % end of function mySpearman
Source program two:
Calculating the Spearman rank correlation coefficient with MATLAB's built-in function (which uses Formula Two above)
coeff = corr(X, Y, 'type', 'Spearman');
Note: when using MATLAB's built-in function to calculate the Spearman rank correlation coefficient, X and Y must be column vectors. The built-in function computes the coefficient with Formula Two. In general, source program one above gives the expected result, but when sequence X or Y contains elements with the same value, its result differs from that of MATLAB's corr function, because Formula One and Formula Two no longer agree when repeated values are present. To resolve this, replace the following three lines of source program one
fenzi = 6 * sum((Xrank - Yrank).^2);
fenmu = N * (N^2 - 1);
coeff = 1 - fenzi / fenmu;
with
coeff = corr(Xrank', Yrank');  % Pearson correlation coefficient
After this change, source program one gives the same result as MATLAB's built-in function when the variables contain repeated values (that is, when at least one value occurs more than once in a variable's set). The modified program can still be used for the general case in which neither variable's set contains repeated elements.
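The size of the discrepancy can be seen on a small sample. The sketch below uses hypothetical data with one repeated value in X and a hypothetical helper, avgRank, that assigns descending average ranks in the same way as the loops of source program one. It relies only on base MATLAB functions.

```matlab
% Hypothetical sample data with one tie in X (not from the original article)
X = [1 2 2 3 4];
Y = [1 3 2 4 5];
N = numel(X);

% Descending ranks; equal values receive the average of their positions
avgRank = @(v) arrayfun(@(t) sum(v > t) + (sum(v == t) + 1) / 2, v);
xr = avgRank(X);   % [5 3.5 3.5 2 1]
yr = avgRank(Y);   % [5 3 4 2 1]

% Formula One: from the rank differences
rho1 = 1 - 6 * sum((xr - yr).^2) / (N * (N^2 - 1));    % 0.975

% Formula Two: Pearson correlation coefficient of the ranks
dx = xr - mean(xr);
dy = yr - mean(yr);
rho2 = sum(dx .* dy) / sqrt(sum(dx.^2) * sum(dy.^2));  % sqrt(0.95), about 0.9747

fprintf('Formula One: %.4f, Formula Two: %.4f\n', rho1, rho2);
```

With the Statistics and Machine Learning Toolbox available, corr(X(:), Y(:), 'type', 'Spearman') would be expected to agree with rho2 rather than rho1; the (:) indexing forces column vectors.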
For the calculation of the Pearson correlation coefficient, see the following article:
Statistical correlation coefficients (1): the Pearson correlation coefficient and its MATLAB implementation
4. References
(1) http://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient