Spearman rank correlation coefficient and its MATLAB implementation

Source: Internet
Author: User

Transferred from: http://blog.csdn.net/wsywl/article/details/5859751

Spearman rank correlation coefficient

1. Introduction

In statistics, the Spearman rank correlation coefficient is named after Charles Spearman and is usually denoted by the Greek letter ρ (rho). It estimates the correlation between two variables X and Y in cases where the relationship between them can be described by a monotonic function. If neither variable's set of values contains repeated elements, then when one variable is a perfect monotonic function of the other (that is, the two variables always change in a consistent direction), ρ reaches +1 or -1.

Suppose the two random variables are X and Y (they can also be regarded as two sets), each with N elements, and the i-th (1 <= i <= N) values taken by the two variables are denoted Xi and Yi. Sort X and Y (both in ascending order or both in descending order) to obtain two rank sets x and y, where xi is the rank of Xi in X and yi is the rank of Yi in Y. Subtracting the elements of x and y pairwise gives the rank difference set d, where di = xi - yi, 1 <= i <= N. The Spearman rank correlation coefficient between X and Y can be computed either from d or from x and y, as follows:

Calculated from the rank difference set d (Formula One):

$$\rho = 1 - \frac{6 \sum_{i=1}^{N} d_i^2}{N (N^2 - 1)}$$

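Calculated from the rank sets x and y (the Spearman rank correlation coefficient can also be regarded as the Pearson correlation coefficient of the two ranked variables; the formula below is in fact the Pearson correlation coefficient of x and y) (Formula Two):

$$\rho = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \sum_{i=1}^{N} (y_i - \bar{y})^2}}$$

where \bar{x} and \bar{y} are the means of the ranks x_i and y_i.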
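As a quick check of the two formulas, the following minimal MATLAB sketch applies both to a pair of small rank vectors. The vectors xr and yr are hypothetical values chosen only for illustration; they contain no ties, so the two results are expected to agree.

```matlab
% Hypothetical rank vectors with no ties (illustration only)
xr = [1 2 3 4 5];
yr = [2 1 4 3 5];
n  = numel(xr);

% Formula One: based on the rank differences
d    = xr - yr;
rho1 = 1 - 6 * sum(d.^2) / (n * (n^2 - 1));

% Formula Two: Pearson correlation coefficient of the ranks
xc   = xr - mean(xr);
yc   = yr - mean(yr);
rho2 = sum(xc .* yc) / sqrt(sum(xc.^2) * sum(yc.^2));

fprintf('Formula One: %.4f\n', rho1);
fprintf('Formula Two: %.4f\n', rho2);
```

For these ranks the sum of squared differences is 4, so Formula One gives 1 - 24/120 = 0.8, and Formula Two gives 8/10 = 0.8.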

The following is an example of computing the ranks of the elements in a set (this ranking applies only to the calculation of the Spearman rank correlation coefficient):

[Example table not preserved in this copy.]

Note that when two or more values of a variable are equal, each of them is assigned the average of the positions they occupy.
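The averaging rule for equal values can be sketched in a few lines of base MATLAB. The sample vector v is hypothetical, and ranks are assigned in descending order (the largest value gets rank 1); this is one possible way to write it, not necessarily how any library does it.

```matlab
v = [7 3 7 1 9];   % hypothetical sample; the value 7 appears twice
n = numel(v);
r = zeros(1, n);

for i = 1:n
    greater = sum(v > v(i));    % elements strictly greater than v(i)
    equal   = sum(v == v(i));   % elements equal to v(i), including itself
    % Tied elements occupy positions greater+1 ... greater+equal;
    % each one receives the mean of those positions.
    r(i) = greater + (equal + 1) / 2;
end

disp(r);
```

Here the two 7s occupy positions 2 and 3, so each receives rank 2.5, and the full rank vector is [2.5 4 2.5 5 1]. The Statistics and Machine Learning Toolbox function tiedrank applies the same averaging rule, but ranks in ascending order.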

2. Scope of application

The Spearman rank correlation coefficient places less strict requirements on the data than the Pearson correlation coefficient. As long as the observations of the two variables are paired rank (ordinal) data, or rank data converted from observations of continuous variables, the Spearman rank correlation coefficient can be used, regardless of the overall distribution of the two variables and of the sample size.
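To illustrate the difference in requirements, the sketch below compares the two coefficients on a relationship that is monotonic but not linear. The data (y = exp(x)) is an assumption made up for this illustration, and only base MATLAB functions are used.

```matlab
x = 1:6;
y = exp(x);   % strictly increasing, but far from linear

% Pearson correlation coefficient of the raw values
R = corrcoef(x, y);
pearson = R(1, 2);

% Spearman: Pearson correlation coefficient of the ranks.
% There are no ties here, so the sort order gives the ranks directly.
rx = zeros(size(x));
ry = zeros(size(y));
[~, ix] = sort(x);
[~, iy] = sort(y);
rx(ix) = 1:numel(x);
ry(iy) = 1:numel(y);

R = corrcoef(rx, ry);
spearman = R(1, 2);

fprintf('Pearson  = %.4f\n', pearson);
fprintf('Spearman = %.4f\n', spearman);
```

Because y always increases with x, the ranks of the two variables coincide and the Spearman coefficient is 1, while the Pearson coefficient stays noticeably below 1.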

3. MATLAB implementation

Source program one:

MATLAB implementation of the Spearman rank correlation coefficient (computed from the rank difference set d, using Formula One above)

    function coeff = mySpearman(X, Y)
    % mySpearman computes the Spearman rank correlation coefficient.
    %
    % Input:
    %   X: input numeric sequence
    %   Y: input numeric sequence
    %
    % Output:
    %   coeff: correlation coefficient of the two input sequences X and Y

    if length(X) ~= length(Y)
        error('The two numeric sequences must have the same length');
    end

    N = length(X);         % length of the sequences
    Xrank = zeros(1, N);   % rank of each element in X
    Yrank = zeros(1, N);   % rank of each element in Y

    % Compute each value in Xrank (descending order: largest value has rank 1)
    for i = 1:N
        cont1 = 1;    % 1 + number of elements greater than X(i)
        cont2 = -1;   % number of other elements equal to X(i) (starts at -1 to exclude X(i) itself)
        for j = 1:N
            if X(i) < X(j)
                cont1 = cont1 + 1;
            elseif X(i) == X(j)
                cont2 = cont2 + 1;
            end
        end
        Xrank(i) = cont1 + mean(0:cont2);   % average position when values are tied
    end

    % Compute each value in Yrank in the same way
    for i = 1:N
        cont1 = 1;    % 1 + number of elements greater than Y(i)
        cont2 = -1;   % number of other elements equal to Y(i)
        for j = 1:N
            if Y(i) < Y(j)
                cont1 = cont1 + 1;
            elseif Y(i) == Y(j)
                cont2 = cont2 + 1;
            end
        end
        Yrank(i) = cont1 + mean(0:cont2);   % average position when values are tied
    end

    % Compute the Spearman rank correlation coefficient from the rank differences (Formula One)
    fenzi = 6 * sum((Xrank - Yrank).^2);   % numerator
    fenmu = N * (N^2 - 1);                 % denominator
    coeff = 1 - fenzi / fenmu;
    end % end of function mySpearman

Source program two:

Computing the Spearman rank correlation coefficient with MATLAB's built-in function (using Formula Two above)

    coeff = corr(X, Y, 'type', 'Spearman');

Note: when using MATLAB's built-in function to compute the Spearman rank correlation coefficient, X and Y must be column vectors. The built-in function computes the coefficient using Formula Two. In general, source program one above gives the desired result, but when sequence X or Y contains elements with equal values, its result differs from that of MATLAB's corr function, because Formula One and Formula Two give different results when ties are present. To resolve this, replace the following three lines in source program one

    fenzi = 6 * sum((Xrank - Yrank).^2);
    fenmu = N * (N^2 - 1);
    coeff = 1 - fenzi / fenmu;

with

    coeff = corr(Xrank', Yrank'); % Pearson correlation coefficient

With this change, the source program gives the same result as MATLAB's built-in function when the variables contain equal element values (that is, when at least one variable's set of values has repeated elements). The modified program can still be used for the general case, in which neither variable's set contains repeated elements.
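The size of the discrepancy caused by ties can be seen on a small case. The sketch below uses hypothetical rank vectors in which two elements of the first sequence share the averaged rank 2.5, and evaluates both formulas directly with base MATLAB functions.

```matlab
% Hypothetical ranks: the first sequence has one tie (two elements with rank 2.5)
xr = [1 2.5 2.5 4];
yr = [1 2   3   4];
n  = numel(xr);

% Formula One: based on the rank differences
rho1 = 1 - 6 * sum((xr - yr).^2) / (n * (n^2 - 1));

% Formula Two: Pearson correlation coefficient of the ranks
xc   = xr - mean(xr);
yc   = yr - mean(yr);
rho2 = sum(xc .* yc) / sqrt(sum(xc.^2) * sum(yc.^2));

fprintf('Formula One: %.4f\n', rho1);
fprintf('Formula Two: %.4f\n', rho2);
```

For these ranks Formula One gives 0.95, while Formula Two gives 4.5 / sqrt(22.5), approximately 0.9487. The gap is small in this case but it is not zero, which is why the Pearson-on-ranks form is the one to use when ties are present.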

For the computation of the Pearson correlation coefficient, see the following article:

Statistical correlation coefficients (1): the Pearson correlation coefficient and its MATLAB implementation

4. References

(1) http://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient
