This is the mail archive of the gsl-discuss@sourceware.org mailing list for the GSL project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On Tue, 23 Aug 2005 15:29:29 +0100 Brian Gough <bjg@network-theory.co.uk> wrote: > Hello, > > The multifit functions are a translation of the corresponding MINPACK > routines, which defined 'covar' as in GSL. In MINPACK the definition > of J includes a factor of sigma > > J_ij = dF_i/dx_j > F_i = (y_i-y(x))/sigma_i > > Does this account for the difference above? > > -- Hello, as far as I understand, given that I do not know MINPACK, I would say yes and no. The point is subtle and has to do with the double naure of the least-quare procedure: a generic tools for "fitting things" and a maximum likelihood estimator of a well defined class of statistical models. I hope you'll forgive me if I try to elaborate the issue a little bit below. >From a statistical point of view, you can assume that your data follow the model (I) y_i ~ f(x_i,B) + \w_i * \epsilon_i where \epsilon_i ~ N(0,1) (i.i.d. normally distributed with mean zero and variance 1) or instead the model (II) y_i ~ f(x_i,B) + \w_i * \epsilon_i where \epsilon_i ~ N(0,\sigma) (i.i.d. normally distributed with mean zero and unknon variance). Both prescriptions allow for the specification of weights w_i, but differ on the number of the unknowns. In particular, the variance of the distribution \sigma is estimated from data in Model II while it is assumed known in Model I. Now in GSL the model assumption for the different linear routines (f(x_i,B) = x_i . B) are: function: assumption: gsl_fit_linear Model II gsl_fit_wlinear Model I gsl_multifit_linear Model II gsl_multifit_wlinear Model I which is consistent (even if, AFAIK, undocumented): if weights are provided, than the variance is assumed know, if they are not, it is estimated. The possible inconsistency is with the non-linear function gsl_multifit_covar Model I now weights are NOT explicitly required, but nonetheless the variance is assumed known. For comparison, consider the analogous function in NAG C library, "e04ycc", which instead assume Model II. Probably it would help to explicitly mention the formula used to compute variance-covariance matrix in ALL the routines, so that the average dumb user (like me), by comparing the different formulas, can immediatly understand were differences can possibly arise. What do you think? Apologies for the lenght of the message. Giulio.
Attachment:
pgp00000.pgp
Description: PGP signature
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |