This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Thread-, Signal- and Cancellation-safety documentation


On Sun, 2013-05-19 at 04:06 -0300, Alexandre Oliva wrote: 
> On May 17, 2013, Torvald Riegel <triegel@redhat.com> wrote:
> 
> > On Thu, 2013-04-04 at 16:04 -0300, Alexandre Oliva wrote:
> >> On Mar 26, 2013, Rich Felker <dalias@aerifal.cx> wrote:
> >> > The cleanest fix I can envision would be atomically replacing the
> >> > pointer to the locale object once the new one is prepared.
> >> 
> >> If it was a single pointer, that would indeed be a good course of
> >> action, but there are multiple pointers that may each change
> >> independently *or* with a single setlocale call.
> 
> > Getting atomic snapshots of several memory locations without creating
> > contention issues in the common case is doable, especially like in this
> > case when you control the writers and assume that updates are rare
> > (which I guess is the case for locales, right)?
> 
> This is the case indeed.  However, the exposed internal representation
> doesn't seem to have been designed so as to enable this sort of
> copy,modify,reset-pointer implementation.

It wouldn't need indirections through pointers in the internal
representation, if that's what you're worrying about.  Many STM(-like)
algorithms don't use indirections.  With a mostly-read workload, and
some control over how the data is accessed, those algorithms can be
pretty efficient.  But I'd have to look at the concrete case to make a
better estimate of how efficient.
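For a concrete flavor of the indirection-free approach, here is a
minimal seqlock-style sketch (the struct fields and function names are
hypothetical, not glibc's actual representation): readers copy all the
fields and retry if a writer intervened, so rare writers never block
the frequent readers.

```c
#include <stdatomic.h>

/* Hypothetical set of independently-updated locale fields that must
 * be observed together consistently. */
struct locale_snap {
    const char *decimal_point;
    const char *thousands_sep;
};

static struct locale_snap current = { ".", "," };
static atomic_uint seq;   /* even: stable; odd: a writer is active */

/* Rare path: a setlocale-like update of several fields at once. */
void locale_write(struct locale_snap next)
{
    atomic_fetch_add_explicit(&seq, 1, memory_order_relaxed); /* odd */
    atomic_thread_fence(memory_order_release);
    current = next;   /* a strict C11 version would use relaxed atomic
                         stores per field rather than plain stores */
    atomic_fetch_add_explicit(&seq, 1, memory_order_release); /* even */
}

/* Common path: lock-free, consistent read of all fields. */
struct locale_snap locale_read(void)
{
    struct locale_snap snap;
    unsigned s1, s2;
    do {
        s1 = atomic_load_explicit(&seq, memory_order_acquire);
        snap = current;
        atomic_thread_fence(memory_order_acquire);
        s2 = atomic_load_explicit(&seq, memory_order_relaxed);
    } while ((s1 & 1) || s1 != s2);   /* retry if a writer ran */
    return snap;
}
```

A single writer needs nothing more; concurrent writers would have to
serialize on a lock around locale_write, which is cheap if updates are
as rare as setlocale calls.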

> >> Indeed, besides the ctype functions that don't take a locale object, and
> >> use the active one (which might be the global locale object), there are
> >> those that take a pointer to a locale object, and if this object is the
> >> global locale object, it may change from under them.  Even callers of
> >> these functions that attempt to read locale properties once as they
> >> start are toast if they also call any of these interfaces, for then the
> >> early-reading (presumably for internal consistency) won't guarantee that
> >> the callees use the same locale information.
> 
> > And this is exactly an example why saying that "setlocale is
> > thread-safe" isn't sufficient in terms of the definition.  That's not
> > your fault; but I believe we need to be more precise here, even if POSIX
> > isn't.
> 
> > For example, what you'd get with the atomic snapshots and indirection
> > approach outlined above doesn't give you the ordering guarantees that
> > one might expect
> 
> I'm not sure about what ordering guarantees one might expect, or even
> whether such expectations are reasonable.  Can you quote any passage of
> the relevant standards that might have set such expectations, or that
> suggests such expectations are reasonable?

At least http://www.unix.org/whitepapers/reentrant.html states:

In POSIX.1c, a "reentrant function" is defined as a "function whose
effect, when called by two or more threads, is guaranteed to be as if
the threads each executed the function one after another in an undefined
order, even if the actual execution is interleaved" (ISO/IEC
9945:1-1996, §2.2.2).

That hints at some (total?) order.  Note that reentrancy is used to
explain/define the thread-safe property in this paper; the definitions
of thread-safety that I've seen so far in the standard don't take this
route of explanation.

> Indeed, it seems to me that "thread safe" has very little to do with
> ordering guarantees.  IMHO, setlocale (to keep in the current example)
> has no global ordering requirements whatsoever, so it would be perfectly
> legitimate for it to modify a global pointer in a local CPU cache and be
> done with it; any happens-before relationship that might enable other
> threads to get to see the effects of this pointer change would be
> established by *other* behavior of the threads, such as explicit forms
> of inter-thread synchronization, or system calls defined as atomic or
> defining an implicit ordering (say, writing to and reading from opposite
> ends of a pipe, sending and handling a signal, etc).

There must be *some* ordering guarantees.  If there were none, it would
be perfectly fine to always return the initial value of something (e.g.,
the initially set locale) even after it has been changed by the same
thread previously.  Then, you can't actually change anything.  So in the
absence of actions by other threads, it should at least have the
ordering guarantees of a sequential program.

Note that, to state something obvious, we don't want requirements like
"setlocale() is FIFO" because that doesn't make sense.  As you say,
often we'll have ordering guarantees established by something else.  But
if we have, then we can decide whether we want our operations' behavior
to be consistent with the (externally established) ordering; and this
guarantee of consistency is also an ordering guarantee.  So, for
example, having something in a local CPU cache would be fine because as
soon as we synchronize with other CPUs, we'll get an update for the
cache line; but caching in a separate thread-local variable would not be
okay.
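To make that last distinction concrete, here is a deliberately broken
sketch (all names are hypothetical): once the implementation caches a
value in a thread-local variable, no amount of synchronization by the
caller can make an update visible, because the stale copy is refreshed
by nothing.

```c
#include <stddef.h>

static const char *global_locale_name = "C";

/* Problematic: a per-thread cache inside the implementation. */
static _Thread_local const char *cached_name;

const char *query_locale_cached(void)
{
    if (cached_name == NULL)
        cached_name = global_locale_name;  /* filled once, never again */
    return cached_name;  /* stale after global_locale_name changes */
}

void set_locale_name(const char *name)
{
    global_locale_name = name;  /* the cache above never sees this */
}
```

A stale per-CPU cache line, by contrast, is fine: hardware coherence
plus whatever synchronization established the ordering will deliver the
update.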

Second, we need to decide whether we want to give additional ordering
guarantees when we produce a value and another thread consumes it.  For
example, if we run operations A and B, both which produce a value or
some other observable effect, and run B after A: is every other thread
that observes B's results also guaranteed to observe A's result as well?
There's no single right answer to this.  It's likely easier and less
error-prone for programmers if A is always observed as well by the other
thread, but it can require a bit more runtime overhead on some
architectures.  But we need to be clear about what we guarantee, because
depending on one's mindset, both options can be intuitive.
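In C11 terms (the names below are illustrative), the stronger option
corresponds to publishing B's effect with a release store and observing
it with an acquire load; a thread that sees B's result is then
guaranteed to also see A's.

```c
#include <pthread.h>
#include <stdatomic.h>

static int a_result;        /* effect of operation A (plain memory) */
static atomic_int b_done;   /* effect of operation B */

static void *producer(void *arg)
{
    (void)arg;
    a_result = 42;          /* operation A */
    /* operation B: the release ordering publishes A's effect with it */
    atomic_store_explicit(&b_done, 1, memory_order_release);
    return NULL;
}

static int observe(void)
{
    /* spin until B is observed; acquire pairs with the release above */
    while (atomic_load_explicit(&b_done, memory_order_acquire) == 0)
        ;
    return a_result;        /* guaranteed to be 42, never 0 */
}
```

With memory_order_relaxed on both sides, the language standard alone
would not give that guarantee: the observer could see b_done == 1 but
a_result == 0.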

> That's why I believe that the notion of thread safety NOT imposing
> ordering is a conscious decision rather than incompleteness; it's a
> strength rather than a weakness; not imposing ordering where no ordering
> is required enables more efficient implementations.

I have my doubts that it was a conscious choice to give an imprecise
definition.  There are at least some weak ordering guarantees.  In the
end, I think the goal is to make it clear to programmers what is
actually guaranteed, without requiring them to make guesses or draw
complicated conclusions.  The fact that we are discussing what the
standard actually means is an indication that the definition is not
clear enough.

C++ is, based on the opinions of some committee members (meaning I don't
know whether that's an agreed-upon decision), typically trying to give
the stronger ordering guarantees (as in the observe-A-if-you-observed-B
example above).

> Under this line of reasoning, calling setlocale in one thread while
> another thread is in a long-running tight loop that keeps on using an
> old locale pointer that remained in the cache in its CPU is perfectly
> reasonable: when there's no ordering relationship between the setlocale
> call in one thread and any of the iterations of the tight loop in the
> other, why would it even make sense for the standard to impose one?

If the observing thread doesn't observe anything else that might violate
consistency and the ordering guarantees, then yes it's fine to read an
old value.  Guarantees that would force it to read a new value are
progress guarantees, and they're pretty vague in C++11 (C11 is a little
better in that area).  The ordering guarantees we're looking at are
correctness guarantees, not progress guarantees.

> In
> any case, since there isn't any ordering established in the program
> itself (no operations that establish happens-before), any such ordering
> would have to do with implementation details of the hardware that would
> enable such an ordering to be established.

I'm not quite sure I understand what you're saying, but the base
shouldn't be a hardware's memory model, but a more general-purpose
memory model such as C11's memory model.

> Me, I think it would be a
> mistake to bring that into a standard such as POSIX, for it would tie
> the standard to specific hardware implementation choices.

I would think it should have a memory model that is compatible with
C11's.  Then it would be tied to the language's memory model, and thus
hardware-independent (although implementations would of course have to
pay attention to how to map this to the particular hardware's model).

We always need some memory model anyway, or you won't be able to reason
about what happens when several threads access memory.

> Can you imagine how unfortunate it would be if POSIX had chosen to
> specify say atomic, global clock-based ordering requirements for every
> thread-safe function, which might have made sense for uniprocessors of a
> decade ago, and we were now wasting tons of cycles trying to force
> loosely-synchronized multi-distributed-processors with transactional
> memory to abide by that standard?

But a standard cannot avoid having to make a choice of definition if
users need to work with what's being defined.  You can't just throw out
something vague just to avoid making a bad choice.  People will use it,
and if the definition is vague, then chances are high that some people
will understand the definition differently than others.  Once you're in
this position, actually guaranteeing something weaker is likely to
break some code.
It's fine to give just weak guarantees, but then you need to say this
explicitly *and* clarify how users can achieve the stronger guarantees.
The stronger guarantees aren't something that you can necessarily
enforce externally: You typically can if it's just about lacking memory
barriers, but if the implementation is caching internally (eg, in
thread-local vars), you can't enforce an update without the
implementation making this possible (by not caching).  And, if you give
weaker guarantees, you might be making it more difficult to use for
programmers.

(And as a side comment: the transactional memory hardware
implementations proposed so far actually give pretty strong ordering
guarantees.)

> IMHO, POSIX specifies operations that establish happens-before
> relationships, and operations that are to be atomic, but those are the
> exception, rather than the rule.  For everything else, no ordering
> requirements exist, even for thread-safe functions, and IMHO that's how
> it should be.

I've argued above that we at least need some ordering guarantees. Do you
mean "no ordering requirements" literally, or could you come up with a
precise definition of the minimum set of ordering guarantees you want?

IMHO, it's always a trade-off between making stuff fast or easier to
use.  Often, you can make it reasonably fast and easy to use with
stronger guarantees (eg, typical lock-based implementations give you the
stronger guarantees in the A/B example above).
Often, it might be better to pick an easy-to-use default and provide
implementation variants with weaker guarantees for the operations where
you really need it.  C++ atomics take this approach, for example: the
default is sequentially-consistent memory order, and if you want weaker
guarantees you need to say so explicitly.
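C11's atomics, which follow C++'s design here, show the same
default-strong pattern: the plain operations on _Atomic objects are
sequentially consistent, and a weaker ordering has to be requested
explicitly.

```c
#include <stdatomic.h>

static atomic_int hits;

void count_strong(void)
{
    /* default: sequentially consistent, the easy-to-reason-about choice */
    atomic_fetch_add(&hits, 1);
}

void count_weak(void)
{
    /* explicit opt-in to a weaker, potentially cheaper guarantee */
    atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

int read_hits(void)
{
    return atomic_load(&hits);   /* also seq_cst by default */
}
```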


Torvald

