This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug nptl/13165] pthread_cond_wait() can consume a signal that was sent before it started waiting


http://sourceware.org/bugzilla/show_bug.cgi?id=13165

--- Comment #19 from Mihail Mihaylov <mihaylov.mihail at gmail dot com> 2012-09-20 10:21:39 UTC ---
(In reply to comment #16)
Sorry for the long reply. Please bear with me, because this issue is very
subtle and I don't know how to explain it more succinctly.

First of all, let me clarify that this is a test that exposes the race, and not
the usage scenario that I claim should be supported. The usage scenario is
described in the bug description. Well, actually, I do claim that the scenario
in the test should be supported too, but the scenario in the description makes
more sense.

> I'm not aware of any requirement that pthread_cond_signal should block until a
> waiter has actually woken up. (Your test case relies on it to not block, so
> that it can send out multiple signals while holding the mutex, right?)  I'm
> also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
> you combine both, you will see that the behavior you observe is a valid
> execution.

I'm not making any assumptions about the state of the waiters when
pthread_cond_signal returns. All I'm assuming is that, whether the signaling
thread releases and reacquires the mutex after each signal or sends all signals
without releasing the mutex, at least as many waiters as there were signals
will eventually wake.

But even if this assumption is wrong (and it's not), if you set
releaseMutexBetweenSignals to true, the test will release the mutex after each
sent signal. In this case the test doesn't send multiple signals while holding
the mutex, and the problem still occurs.
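
The assumption above can be sketched in C roughly as follows. This is only an
illustration, not the test case attached to the bug: the names WAITERS,
SIGNALS, blocked, woken and run_scenario are hypothetical, and the parameter
release_between_signals merely mirrors the releaseMutexBetweenSignals switch
mentioned above. Because no new waiters arrive after the signals are sent, this
simplified version is not expected to reproduce the race; it only shows the
property being relied on, namely that at least SIGNALS waiters return from
pthread_cond_wait().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

#define WAITERS 4   /* hypothetical number of waiting threads */
#define SIGNALS 2   /* hypothetical number of signals sent */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int blocked = 0; /* waiters that are about to block or have blocked */
static int woken = 0;   /* waiters that have returned from the wait */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    blocked++;
    /* Deliberately no predicate loop: we count raw wakeups. */
    pthread_cond_wait(&cond, &mutex);
    woken++;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Returns how many waiters were observed awake after SIGNALS signals. */
int run_scenario(bool release_between_signals)
{
    pthread_t threads[WAITERS];
    struct timespec pause = { 0, 1000000 }; /* 1 ms */
    int result = 0;

    blocked = 0;
    woken = 0;
    for (int i = 0; i < WAITERS; i++) {
        int rc = pthread_create(&threads[i], NULL, waiter, NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* Once we hold the mutex and see blocked == WAITERS, every waiter
       counts as blocked on the condvar for any signal we send. */
    pthread_mutex_lock(&mutex);
    while (blocked < WAITERS) {
        pthread_mutex_unlock(&mutex);
        nanosleep(&pause, NULL);
        pthread_mutex_lock(&mutex);
    }
    for (int i = 0; i < SIGNALS; i++) {
        pthread_cond_signal(&cond);
        if (release_between_signals) {
            pthread_mutex_unlock(&mutex);
            pthread_mutex_lock(&mutex);
        }
    }
    pthread_mutex_unlock(&mutex);

    /* Give the signaled waiters a bounded amount of time to wake. */
    for (int tries = 0; tries < 2000; tries++) {
        pthread_mutex_lock(&mutex);
        result = woken;
        pthread_mutex_unlock(&mutex);
        if (result >= SIGNALS)
            break;
        nanosleep(&pause, NULL);
    }

    /* Release the remaining waiters so that all threads can be joined. */
    pthread_mutex_lock(&mutex);
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
    for (int i = 0; i < WAITERS; i++)
        pthread_join(threads[i], NULL);
    return result;
}
```

Calling run_scenario(false) sends all signals while holding the mutex, and
run_scenario(true) releases and reacquires it after each one. In both cases
the property under discussion is that the return value is at least SIGNALS; it
may be larger, since spurious wakeups are permitted.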

As for fairness, this is not about fairness. It is also not about ordering
between the waiters. It's about ordering between waiters and signalers.

I'm getting tired of people jumping to fairness at the first mention of
ordering. You could say that I'm requesting fairness if I wanted the first
single signal to wake the waiter that blocked first. But all I'm requesting is
for the signal to wake at least one of the waiters that started waiting before
the signal was sent. I don't care which one of them.

This is guaranteed by the standard (from the documentation of pthread_cond_wait
and pthread_cond_signal on the Open Group site):

"The pthread_cond_signal() function shall unblock at least one of the threads
that are blocked on the specified condition variable cond (if any threads are
blocked on cond)."

And I think the next quote makes it very clear what threads are considered to
be blocked on the condvar at the time of the call to pthread_cond_signal():

"That is, if another thread is able to acquire the mutex after the
about-to-block thread has released it, then a subsequent call to
pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave
as if it were issued after the about-to-block thread has blocked."

In effect this means that each call to pthread_cond_signal() defines a point in
time and all waiters (or calls to pthread_cond_wait() if you prefer) are either
before this call, or after it. And only the ones that are before it are allowed
to consume the signal sent by this call.

Now, of course in a multiprocessor system it is hard to order events in time,
but that's where the mutex comes in. And if the signaling thread sends multiple
signals while holding the mutex, we can consider all these signals to be
simultaneous. But that doesn't change the validity of the test.

On the other hand, the standard doesn't guarantee the absence of spurious
wakeups. glibc nevertheless tries to prevent them, but the logic for this
prevention is flawed and causes the race that this bug is about.
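
Since the standard permits spurious wakeups, portable callers are expected to
recheck a predicate in a loop around pthread_cond_wait() anyway. A minimal
sketch of that pattern follows; the names lock, ready_cond, ready, payload,
producer, consume and demo are hypothetical and are not taken from glibc or
from the test case in this bug.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static bool ready = false; /* the predicate, protected by lock */
static int payload = 0;    /* data published together with the predicate */

/* Publishes a value and signals while still holding the mutex. */
static void *producer(void *arg)
{
    pthread_mutex_lock(&lock);
    payload = *(int *)arg;
    ready = true;
    pthread_cond_signal(&ready_cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Blocks until the predicate holds; a spurious wakeup just loops again. */
static int consume(void)
{
    int value;

    pthread_mutex_lock(&lock);
    while (!ready)
        pthread_cond_wait(&ready_cond, &lock);
    value = payload;
    ready = false;
    pthread_mutex_unlock(&lock);
    return value;
}

/* Runs one producer against one consumer and returns what was consumed. */
int demo(int value)
{
    pthread_t thread;
    int rc = pthread_create(&thread, NULL, producer, &value);
    int got;

    assert(rc == 0);
    (void)rc;
    got = consume();
    pthread_join(thread, NULL);
    return got;
}
```

With this loop in place, an extra wakeup costs the caller only a recheck of
ready, whereas a signal consumed by a waiter that was not yet blocked can leave
an eligible waiter asleep. That asymmetry is why suppressing spurious wakeups
is the less important of the two properties.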

So the net result is that glibc chose to provide a feature that is not
required, but dropped a much more important feature which is actually required.
Hence, this bug is not a fairness feature request, it is a correctness defect
report.
