This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: lseek + read = ENOENT


Eric Blake wrote:

>> [I wrote]
>>This seems to be a bug in gcc.  The off_t argument to lseek is a 64-bit
>>type, but instead of being sign-extended to 64 bits, the value passed
>>(-sizeof(data)) is only extended to 32 bits, so is actually +4294967292.
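
The arithmetic in the quoted paragraph can be reproduced without touching a file. The sketch below is only an illustration: uint32_t stands in for a 32-bit size_t and int64_t for a 64-bit off_t, and both function names are hypothetical. With a 4-byte object it yields the +4294967292 mentioned above.

```c
#include <stdint.h>

/* Assumption: uint32_t models a 32-bit size_t, int64_t models a 64-bit off_t. */

/* Models -sizeof(data): negate in unsigned 32-bit arithmetic, then widen. */
int64_t negate_then_widen(uint32_t size)
{
    uint32_t negated = -size;   /* wraps modulo 2^32: 4 becomes 4294967292 */
    return (int64_t)negated;    /* the value fits, so it is preserved and stays positive */
}

/* Models -(off_t)sizeof(data): widen to signed 64-bit first, then negate. */
int64_t widen_then_negate(uint32_t size)
{
    return -(int64_t)size;
}
```

The only difference between the two is the order of negation and widening, which is why no sign extension ever takes place in the first form.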

[This is OT for Cygwin, and probably only of interest to Language Lawyers.]

> No, it is not a bug in gcc.  Read a good book on C, please.

Hmm, I wonder why this mailing list seems to encourage throw-away comments
like that?  I said "seems to be" as I had a suspicion that the type
promotion rules of C may be to blame.  I should have *read* the good book
I have (the ISO C spec, as it happens).

However, I would say that this is one of the areas where I think the C spec
is wrong, as it leads to quite unintuitive semantics in cases like this.
Contrast this with Ada, where (roughly) expression components are coerced to the
required type of the whole expression before arithmetic operators are applied, and
also where intrinsic operations like sizeof yield unconstrained values (i.e.
they are not considered to be 32 bits, 64 bits or whatever until the value
is needed for a subsequent operation).

>>If you write:
>>   int n = -sizeof(data);
>>   lseek(fd, n, SEEK_END);
>>it works as expected.
> 
> Mostly right, because there you are promoting a signed
> 32-bit number to a signed 64-bit number, which
> sign-extends.  However, that approach is risky - if you
> have a file that is bigger than 2 GB, you will not get the
> correct result, because negation of an unsigned greater
> than 2GB results in a positive signed 32-bit value less
> than 2GB, instead of the intended negative 64-bit value
> with absolute value greater than 2GB.

Indeed, you too were "mostly right".  The size of the file is
nothing to do with this - it's the size of the object "data"
which matters.  My expression can indeed go wrong if
the size of data (in bytes) is between 2**31 and 2**32, which
is a trifle unlikely (but perhaps not so in the future).
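
A rough model of the "int n" workaround shows both the case that works and the case that does not. As before, uint32_t stands in for a 32-bit size_t and the function name is hypothetical. The conversion of an out-of-range unsigned value to int32_t is implementation-defined in C; the comment assumes the wrapping behaviour that gcc documents.

```c
#include <stdint.h>

/* Models: int n = -sizeof(data); lseek(fd, n, SEEK_END); */
int64_t offset_via_int(uint32_t size)
{
    uint32_t negated = -size;       /* unsigned negation, modulo 2^32 */
    int32_t n = (int32_t)negated;   /* implementation-defined if above INT32_MAX;
                                       assumed to wrap, as gcc does */
    return (int64_t)n;              /* signed widening sign-extends */
}
```

For a 4-byte object this gives -4, as intended. For an object of 3000000000 bytes the negated value is 1294967296, which already fits in a signed 32-bit int, so the offset comes out positive rather than -3000000000.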

The C spec states that the result of sizeof() is of type size_t, which
must be an unsigned integer type defined in stddef.h.  I don't see
any requirement for it to be 32 bits (just that it must be capable of
holding at least 65535), so if the gcc implementation had chosen
to make size_t 64 bits, the original code would have worked.
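
As a sketch of that last point, uint64_t can stand in for a hypothetical 64-bit size_t (the function name is again made up for illustration). Strictly, converting the wrapped value to a signed 64-bit type is implementation-defined, so "would have worked" relies on the compiler wrapping, which gcc does.

```c
#include <stddef.h>
#include <stdint.h>

/* The only lower bound mentioned above: size_t must hold at least 65535. */
_Static_assert(SIZE_MAX >= 65535, "size_t must hold at least 65535");

/* Assumption: uint64_t models a 64-bit size_t, int64_t a 64-bit off_t. */
int64_t negate_64bit_size(uint64_t size)
{
    uint64_t negated = -size;   /* wraps modulo 2^64: 4 becomes 2^64 - 4 */
    return (int64_t)negated;    /* out of range for int64_t, so implementation-defined;
                                   assumed to wrap to -4, as gcc does */
}
```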

> The safer fix is to call:
> lseek(fd, -(off_t)sizeof(data), SEEK_END);
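
For completeness, here is a minimal sketch of the cast-first form used against a real file. It assumes "data" is an int, which is a guess consistent with the 4294967292 figure earlier in the thread; make_file_with_ints and read_last_int are hypothetical helpers, not part of the original program.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates a scratch file holding `count` ints and
   returns an open descriptor positioned at the end, or -1 on failure. */
int make_file_with_ints(const char *path, const int *values, size_t count)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t want = (ssize_t)(count * sizeof(values[0]));
    if (write(fd, values, (size_t)want) != want) {
        close(fd);
        unlink(path);
        return -1;
    }
    return fd;
}

/* Seeks back sizeof(data) bytes from the end of the file and reads the
   final int into *out.  Returns 0 on success, -1 on failure. */
int read_last_int(int fd, int *out)
{
    int data;
    /* Cast to off_t before negating, so the offset is a negative off_t. */
    if (lseek(fd, -(off_t)sizeof(data), SEEK_END) == (off_t)-1)
        return -1;
    if (read(fd, &data, sizeof(data)) != (ssize_t)sizeof(data))
        return -1;
    *out = data;
    return 0;
}
```

Writing the ints 10, 20, 30 to a scratch file and calling read_last_int on the descriptor should hand back 30.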

-- Cliff


