This is the mail archive of the gsl-discuss@sourceware.org mailing list for the GSL project.



Re: spearman coefficient


Looks perfect, thanks a lot!

No problem. In fact, I don't use it much myself because I prefer
parametric modeling, but I did use it to reproduce other people's
results.

Timothée Flutre


On Tue, May 28, 2013 at 5:44 PM, Patrick Alken
<patrick.alken@colorado.edu> wrote:
> I've added gsl_stats_spearman to the repository and have tested it on a few
> sample datasets. I used Octave and Numerical Recipes as reference examples,
> but I wrote everything from scratch, so there are no copyright issues.
>
> I added the function gsl_sort_vector2, similar to the Numerical Recipes
> sort2() function, which eliminates the need to allocate a permutation and
> a sort vector. The workspace for the rank vectors is passed directly to the
> function, so there is no need to allocate a separate workspace now.
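
To illustrate what sorting two arrays together means, here is a minimal sketch in plain C. It is not the GSL routine: the name cosort and the use of insertion sort are assumptions chosen for brevity. It only shows that the second array can follow the ordering of the first without a permutation array or any other allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not gsl_sort_vector2: sort keys[] in ascending
   order and apply the same reordering to vals[]. Works in place and
   allocates nothing. Insertion sort is O(n^2); it is used here only
   because it is short. */
void cosort(double keys[], double vals[], size_t n)
{
    size_t i;

    assert(n == 0 || (keys != NULL && vals != NULL));

    for (i = 1; i < n; ++i) {
        const double k = keys[i];
        const double v = vals[i];
        size_t j = i;

        /* Shift larger keys (and their partners) one slot to the right. */
        while (j > 0 && keys[j - 1] > k) {
            keys[j] = keys[j - 1];
            vals[j] = vals[j - 1];
            --j;
        }
        keys[j] = k;
        vals[j] = v;
    }
}
```

Calling cosort(k, v, n) leaves k sorted and each v[i] still paired with the key it started with.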
>
> It is possible to write the function to calculate the rank vectors in-place
> in the data vectors, but I opted to keep those inputs untouched to stay
> consistent with the rest of the statistics routines. The user must pass in a
> workspace of size 2*n.
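
The design described above (inputs left untouched, caller-supplied workspace of size 2*n) can be sketched as follows. This is not the code of gsl_stats_spearman: the names cosort and spearman_no_ties are hypothetical, and the sketch assumes n >= 2 and no tied values, so that the classical formula 1 - 6*sum(d^2)/(n*(n^2-1)) applies. Handling ties would require averaged ranks and a Pearson correlation of the rank vectors instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sort keys[] ascending, carrying vals[] along. */
void cosort(double keys[], double vals[], size_t n)
{
    size_t i;
    for (i = 1; i < n; ++i) {
        const double k = keys[i];
        const double v = vals[i];
        size_t j = i;
        while (j > 0 && keys[j - 1] > k) {
            keys[j] = keys[j - 1];
            vals[j] = vals[j - 1];
            --j;
        }
        keys[j] = k;
        vals[j] = v;
    }
}

/* Hypothetical sketch of a Spearman coefficient.
   x[] and y[] are read only; work[] must hold 2*n doubles.
   Assumes n >= 2 and no ties within x[] or within y[]. */
double spearman_no_ties(const double x[], const double y[],
                        size_t n, double work[])
{
    double *rx = work;     /* first half: copy of x, later ranks of x */
    double *ry = work + n; /* second half: copy of y, later ranks of y */
    double sum_d2 = 0.0;
    size_t i;

    assert(n >= 2);

    for (i = 0; i < n; ++i) {
        rx[i] = x[i];
        ry[i] = y[i];
    }

    /* Order the pairs by x, then overwrite x values with their ranks. */
    cosort(rx, ry, n);
    for (i = 0; i < n; ++i)
        rx[i] = (double)(i + 1);

    /* Order the pairs by y (x ranks follow), then rank y the same way. */
    cosort(ry, rx, n);
    for (i = 0; i < n; ++i)
        ry[i] = (double)(i + 1);

    /* rx[i] and ry[i] now hold the two ranks of the same observation. */
    for (i = 0; i < n; ++i) {
        const double d = rx[i] - ry[i];
        sum_d2 += d * d;
    }
    return 1.0 - 6.0 * sum_d2 / ((double)n * ((double)n * (double)n - 1.0));
}
```

The two halves of the workspace play the role of the two rank vectors, which is why a single buffer of 2*n doubles is enough in this sketch.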
>
> I put the function in statistics/covariance_source.c so it will be defined
> for all the different types (float, double, int, short, etc.), and it is
> documented in the manual.
>
> I'm sorry I wasn't able to directly use a lot of your code, but I do think
> this implementation is much more consistent with the rest of the library
> design. If you are using this function regularly in your work, I would
> appreciate any feedback you can give (i.e., testing it with a wide range of
> inputs).
>
> Patrick
>
>
> On 05/25/2013 03:25 PM, Timothée Flutre wrote:
>>
>> Hi Patrick,
>>
>> thanks for your detailed reply. (I don't know why I didn't receive
>> your email; I had to check the GSL mailing list archive to see it,
>> which is why I'm replying directly to you this time.)
>>
>> About introducing a new workspace, I did it based on your advice from last
>> year:
>> http://sourceware.org/ml/gsl-discuss/2012-q1/msg00011.html
>>
>> I don't have a strong opinion on what is the best, but someone else
>> commented on my code and also thought that it would be better to have
>> a workspace:
>> https://gist.github.com/timflutre/1784199#comment-82458
>>
>> Maybe the code could offer two functions, with or without the
>> workspace? In this case, are there any guidelines for naming the
>> functions?
>>
>> I had a look at the implementation in R. The description of the
>> interface is here:
>> http://stat.ethz.ch/R-manual/R-patched/library/stats/html/cor.html
>>
>> Even though it indicates that the argument "method" can take the value
>> "spearman", I don't see it anymore in the R code and thus I am a bit
>> confused by their implementation:
>> https://github.com/wch/r-source/blob/trunk/src/library/stats/R/cor.R#L21
>>
>> Moreover, the R code calls C code:
>>
>> https://github.com/wch/r-source/blob/trunk/src/library/stats/src/cov.c#L623
>>
>> The file with the C code has several macros and functions to compute
>> covariance or correlation, to handle missing data in different ways,
>> to deal with Pearson, Spearman and Kendall coefficients, etc. All this
>> makes it really hard for me to understand it...
>>
>> Finally, I looked at the algorithm in Numerical Recipes in C, the pdf
>> of the book is available here:
>> www2.units.it/ipl/students_area/imm2/files/Numerical_Recipes.pdf
>>
>> However, the GSL web site says that we can't use algorithms from this
>> book because of the non-free license.
>>
>> Also, it seems to me that spear() from Numerical Recipes (pdf page 641)
>> uses the function sort2() (Quicksort with 2 arrays, page 334), which
>> seems to require allocating another array, "istack". Therefore, in
>> the end, it doesn't seem to me that it's much better than my d and
>> perm vectors, which have the advantage of using other functions of the
>> GSL (gsl_sort_vector and gsl_sort_vector_index).
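
For comparison, ranking through a permutation (index) vector can be sketched like this in plain C. The functions sort_index and ranks_from_perm are hypothetical stand-ins written for this note; they are not the code from the gist and do not call GSL. The sketch needs one extra index array of size n and gives tied values the mean of the ranks they span.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill perm[] so that
   x[perm[0]] <= x[perm[1]] <= ... <= x[perm[n-1]].
   x[] itself is not modified. Uses insertion sort for brevity. */
void sort_index(size_t perm[], const double x[], size_t n)
{
    size_t i;

    assert(n == 0 || (perm != NULL && x != NULL));

    for (i = 0; i < n; ++i) {
        size_t j = i;
        /* perm[0..i-1] is already ordered; make room for index i. */
        while (j > 0 && x[perm[j - 1]] > x[i]) {
            perm[j] = perm[j - 1];
            --j;
        }
        perm[j] = i;
    }
}

/* Hypothetical helper: rank[i] receives the 1-based rank of x[i].
   A run of equal values shares the mean of the ranks it covers. */
void ranks_from_perm(double rank[], const size_t perm[],
                     const double x[], size_t n)
{
    size_t i = 0;

    while (i < n) {
        size_t j = i;
        size_t k;
        double mid;

        /* Extend j to the end of the run of values equal to x[perm[i]]. */
        while (j + 1 < n && x[perm[j + 1]] == x[perm[i]])
            ++j;

        /* Mean of the ranks i+1 .. j+1. */
        mid = 0.5 * (double)(i + j) + 1.0;
        for (k = i; k <= j; ++k)
            rank[perm[k]] = mid;

        i = j + 1;
    }
}
```

Compared with sorting the two arrays together, this version keeps the data in place but pays for it with the additional perm array.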
>>
>> But again, I'm really not an expert programmer, in C or any other
>> language. So I tried to see how I could change my code based on what
>> you said, but I don't see any obvious way to do it (except copying the
>> code from Numerical Recipes).
>>
>> If you don't want to include the code as it is in the next release
>> of the GSL, I'm fine with that. Of course, if you have a better
>> understanding of all this and you can explain to me what to do, I can
>> try to help.
>>
>> Best,
>>
>> Timothée Flutre
>
>

