This is the mail archive of the
ecos-discuss@sources.redhat.com
mailing list for the eCos project.
Re: simultaneous socket write/close causes panic?
>>>>> "Grant" == Grant Edwards <grante@visi.com> writes:
>> The current application code appears fatally flawed and must be
>> fixed. A socket is a shared resource, it should only be
>> manipulated by one thread at a time.
Grant> I think that having one thread reading a socket and a
Grant> second one writing a socket should be allowed (and should
Grant> work). The case in my original post (close/write) is
Grant> definitely dodgy though.
>> If the application is fixed then there is no need to worry
>> about extra locking in the TCP/IP stack, at least not for this
>> problem.
Grant> Unless there are other cases that really should work that
Grant> don't: like one thread reading and one thread writing [IIRC
Grant> that is the one that I found/fixed way back when].
Grant> IMO, an embedded OS should only panic if there's absolutely
Grant> nothing else to do and no way to avoid the situation.
I agree that one thread reading/another thread writing should work.
But I believe the system's behaviour for a concurrent write/close is
undefined, i.e.
    thread A: close(socket_fd);
    thread B: write(socket_fd, buf, size);
is basically equivalent to:
    thread A: free(some_buffer);
    thread B: memcpy(some_buffer, elsewhere, many_kilobytes);
Any application which attempts the latter is fundamentally broken.
Sometimes you can get away with it, sometimes the system will crash.
You would not blame the memory allocation package when a crash occurs.
Similarly the concurrent write/close is fundamentally broken.
Sometimes the application can get away with it, sometimes you will get
sensible error codes, and sometimes the system will crash, which in this
case means raising a panic. The fault is with the application developer, not with
the TCP/IP stack. Admittedly more programmers will have experience
with memory corruption bugs than with the subtleties of socket
programming.
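
To illustrate what fixing this on the application side could look like,
here is a minimal sketch that serializes write and close on one
descriptor with a POSIX mutex. The names (struct shared_fd,
shared_fd_init, shared_fd_write, shared_fd_close) are hypothetical and
are not part of eCos or its TCP/IP stack; the sketch assumes POSIX
threads are available to the application.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: a descriptor plus the lock guarding its lifetime. */
struct shared_fd {
    pthread_mutex_t lock;
    int fd;                 /* set to -1 once the descriptor is closed */
};

void shared_fd_init(struct shared_fd *s, int fd)
{
    pthread_mutex_init(&s->lock, NULL);
    s->fd = fd;
}

/* Write under the lock. If another thread has already closed the
 * descriptor, fail with EBADF instead of touching a stale fd. */
ssize_t shared_fd_write(struct shared_fd *s, const void *buf, size_t len)
{
    ssize_t n;
    int err = 0;

    pthread_mutex_lock(&s->lock);
    if (s->fd < 0) {
        n = -1;
        err = EBADF;
    } else {
        n = write(s->fd, buf, len);
        if (n < 0)
            err = errno;
    }
    pthread_mutex_unlock(&s->lock);

    if (n < 0)
        errno = err;
    return n;
}

/* Close under the same lock, so a close can never overlap a write.
 * Closing twice is harmless: the second call does nothing. */
int shared_fd_close(struct shared_fd *s)
{
    int rc = 0;

    pthread_mutex_lock(&s->lock);
    if (s->fd >= 0) {
        rc = close(s->fd);
        s->fd = -1;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}
```

The lock lives in the application, so the stack needs no extra locking
for this case. The cost is that a blocking write holds the lock and
delays the close until it returns. A separate reader thread does not
need this lock to run alongside the writer, but it does need to be
stopped before the descriptor is closed.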
Adding more asserts to the TCP/IP stack to catch such problems during
development is fine, because those do not add any overhead to the
production system. Adding lots of locking or additional run-time tests
to work around bugs in some applications is much less acceptable. In
my opinion anyway.
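
As a rough sketch of the kind of development-time check meant here, the
following uses the standard C assert() macro, which compiles to nothing
when NDEBUG is defined. The struct sock type and the sock_queue()
function are made up for illustration and do not reflect the actual
data structures in the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket state, for illustration only. */
enum sock_state { SOCK_OPEN, SOCK_CLOSED };

struct sock {
    enum sock_state state;
    size_t bytes_queued;
};

/* Queue data for sending. In a debug build the asserts trap a write on
 * a closed socket; building with -DNDEBUG removes both checks, so the
 * production code pays nothing for them. */
size_t sock_queue(struct sock *s, size_t len)
{
    assert(s != NULL);
    assert(s->state == SOCK_OPEN && "write on a closed socket");

    s->bytes_queued += len;
    return s->bytes_queued;
}
```

In a debug build, calling sock_queue() on a socket whose state is
SOCK_CLOSED aborts at the assert, which points straight at the broken
caller rather than at some later corruption.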
Bart