This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH 09/14] Add a new pthread_mutexattr_setelision_np interface.

From: "Carlos O'Donell" <carlos at redhat dot com>
To: Andi Kleen <andi at firstfloor dot org>
Cc: libc-alpha at sourceware dot org, Andi Kleen <ak at linux dot jf dot intel dot com>, Roland McGrath <roland at hack dot frob dot com>
Date: Tue, 02 Jul 2013 12:41:45 -0400
Subject: Re: [PATCH 09/14] Add a new pthread_mutexattr_setelision_np interface.
References: <1372452807-25216-1-git-send-email-andi at firstfloor dot org> <1372452807-25216-10-git-send-email-andi at firstfloor dot org> <51D09CA6 dot 7060700 at redhat dot com> <20130630215353 dot GL6123 at two dot firstfloor dot org> <51D20715 dot 7080704 at redhat dot com> <20130702005420 dot GU6123 at two dot firstfloor dot org>

On 07/01/2013 08:54 PM, Andi Kleen wrote:
> 
> I first should clarify the use models I see for the per lock elision tuning
> interfaces.
> 
> - Experimentation is one of them

OK.

> - One scenario that may well happen is that some distributions don't
> want to force everyone to use elision (and don't trust the adaptive
> algorithms to be good enough). In this case they likely will not
> set the configure flag.  But individual applications may want
> to enable elision, because they have a lock scaling problem.
> Now one way would be to use the environment variables (which
> you also chose not to merge), but then the applications would need to
> trust that elision works for every lock.
> With the per lock tuning interface they can instead opt in
> only for specific locks that they know have scaling problems and that
> they explicitly tested.
> I think supporting this model is important (even though I personally
> prefer elision to be enabled by default).

I agree, but I need to talk this out with the community.

Having a top-down approach to controlling elision per lock
is important. I have already stated that I think a perfect API and
implementation that infers elision correctly 100% of the time 
is an unreasonable goal. Somewhere, sometime, downstream
is going to need to disable elision for just one lock, or enable elision
for just one lock. Upstream ignoring the problem just causes more
work for downstream which will need to add an API/ABI and use that
instead.

The interesting thing is that the form of the solution that we're
using downstream looks like this:
* Add private API to glibc.
* Create alternate library libnewapi.so that uses the private API.
* Expose interface to user via libnewapi.so
* Wait for alternate solution in upstream.
* Migrate libnewapi.so users back to glibc.
* Phase out libnewapi.so and private API.

We could leverage something like this upstream to provide an
experimental libc e.g. libcx.so for use by experimental distros
and to collect feedback from users.

>> Existing glibc functions being poorly designed does not excuse
>> us from trying to do better.
> 
> Ok so you're completely disregarding existing practice.

No, only those interfaces I don't like :-)

> I don't think mallopt() is poorly designed: it does exactly
> what it needs to do.
> 
> For example tuning the mmap threshold (that is controlling when
> to return memory back to the OS) makes a lot of sense.

It makes sense to you because you're an expert. The setting of
the threshold is not trivial to determine. It would again have
been a better interface if the user described their memory allocation
needs or patterns (something they can easily determine) and then
allowed the system to infer an optimal behaviour.

Having said that, the same principle applies here: eventually the
inference by the implementation will be wrong and you'll want a
direct way to control the mmap threshold.
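
For reference, the direct control under discussion can be sketched with
the existing mallopt() call. This is only a minimal sketch, assuming
glibc's <malloc.h>; the wrapper name set_mmap_threshold and the 256 KiB
value used with it are illustrative choices, not recommendations.

```c
#include <assert.h>
#include <malloc.h>

/* Hypothetical wrapper (not a glibc function): ask glibc malloc to
   serve requests of at least `bytes` bytes with mmap().
   Returns 1 on success and 0 on failure, mirroring mallopt().  */
int set_mmap_threshold(int bytes)
{
    assert(bytes >= 0);
    return mallopt(M_MMAP_THRESHOLD, bytes);
}
```

A caller would invoke set_mmap_threshold (256 * 1024) early, before the
allocations it wants to influence. Picking that number is exactly the
part that takes expertise.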

>>> Also for experimenting with the algorithms there needs to be some
>>> way to turn it on/off per lock. That is what the interface does.
>>
>> In which case what you're asking for is an API to be added forever
>> to glibc for a potentially temporary problem.
> 
> It may be temporary if distributions choose to enable
> elision by default (but I suspect even then there will always be some who
> feel the need for per lock tuning)

I think we'd like to be in that position.

> However if distributions choose to disable it by default,
> it's not temporary at all. It would be the only way to use it
> (see above)

That's true.

>> It's probably easier to use the generic tunables interface I'm
>> trying to add to do the kind of experimenting you're interested in,
>> but that's another interface to design.
> 
> Maybe I'm missing something, but how would a generic tunable 
> interface be able to tune something per lock? 

The tunables API is not designed, but the final goal was to apply
the tunables to a given context. The simplest context to use and
manipulate is the global context. That doesn't preclude applying
a set of tunables to a set of critical sections e.g. lock.

> Or are you talking about a generic rwlock_t/mutex_t tuning interface?

That's right, we would need some API functions to take a tuning
context and apply it to a lock, probably applied at initialization
time (with no static initializer support).

e.g.
tunable_context *tctx = create_tunable_context_np ("elidedlocks");
set_tunable_np (tctx, "GLIBC_PTHREAD_ELISION", "elision");
...
pthread_mutexattr_tune_np (tctx);
...
pthread_mutex_init (...);
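
The flow above can be mocked up in a self-contained way to show how a
context might carry settings to an attribute object. This is only a
sketch under assumptions: every type and function here
(tunable_context, tunable_set, tunable_get, mock_mutexattr,
mock_mutexattr_tune) is hypothetical, and none of it is an existing
glibc interface.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TUNABLES 8

/* Hypothetical name/value pair held by a context.  */
struct tunable { char name[48]; char value[48]; };

/* Hypothetical tunables context: a labelled bag of settings.  */
struct tunable_context {
    char label[32];
    struct tunable items[MAX_TUNABLES];
    int count;
};

/* Stand-in for a mutex attribute; only records the elision choice.  */
struct mock_mutexattr { int elision; };

void tunable_context_init(struct tunable_context *ctx, const char *label)
{
    memset(ctx, 0, sizeof *ctx);
    snprintf(ctx->label, sizeof ctx->label, "%s", label);
}

/* Set or overwrite a tunable.  Returns 0 on success, -1 if the
   strings are too long or the context is full.  */
int tunable_set(struct tunable_context *ctx, const char *name,
                const char *value)
{
    if (strlen(name) >= sizeof ctx->items[0].name
        || strlen(value) >= sizeof ctx->items[0].value)
        return -1;
    for (int i = 0; i < ctx->count; i++)
        if (strcmp(ctx->items[i].name, name) == 0) {
            strcpy(ctx->items[i].value, value);
            return 0;
        }
    if (ctx->count == MAX_TUNABLES)
        return -1;
    strcpy(ctx->items[ctx->count].name, name);
    strcpy(ctx->items[ctx->count].value, value);
    ctx->count++;
    return 0;
}

/* Look up a tunable; returns NULL when it was never set.  */
const char *tunable_get(const struct tunable_context *ctx, const char *name)
{
    for (int i = 0; i < ctx->count; i++)
        if (strcmp(ctx->items[i].name, name) == 0)
            return ctx->items[i].value;
    return NULL;
}

/* Copy the elision setting from the context into the attribute.  */
void mock_mutexattr_tune(struct mock_mutexattr *attr,
                         const struct tunable_context *ctx)
{
    const char *v = tunable_get(ctx, "GLIBC_PTHREAD_ELISION");
    attr->elision = (v != NULL && strcmp(v, "elision") == 0);
}
```

The point of the mock is only that the settings travel with the
context, so the same context could be applied to many locks at
initialization time.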

>>> Could be done, but would be likely awkward.
>> Could you describe exactly what you mean by awkward?
> 
> You could potentially end up with a lot of flags/parameters.
> It would be a very complex interface.

I can't disagree with you there.

Describing the properties of the critical section is
still going to be a useful exercise.

>>
>> I'm thinking of:
>>
>> * Describe the average critical section for the lock:
>> - Does I/O? (yes/no)
>> - Size of critical section? (insn count)
>> - Syscalls (yes/no)
>> - How much data is accessed (bytes)
> 
> IO and syscall could be done (although in most cases they
> are the same). The others are not well defined in TSX.
> 
>> - ...
>> * Lock hints recorded in a table.
> 
> In what table? A global table? How do per object tunables
> fit into your model?

There are many ways to implement a lock hinting API.

Yes, I was thinking of a global table, since it would reduce
the need to record the hint information in the mutex attributes.
The table can be used when the lock is initialized to set the
parameters of the lock. This is just a back of the envelope
design right now.
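
To make the global-table idea concrete, here is a minimal sketch. It
assumes locks are identified by a name, and it assumes a simple policy
(elide only when the critical section does no I/O and makes no
syscalls). The struct, the table entries, and both function names are
hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hint record describing a lock's typical critical section.  */
struct lock_hint {
    const char *lock_name;
    bool does_io;
    bool makes_syscalls;
};

/* Hypothetical global table, consulted when a lock is initialized.  */
static const struct lock_hint hint_table[] = {
    { "cache_lock", false, false },
    { "log_lock",   true,  true  },
};

/* Return the hint recorded for lock_name, or NULL if there is none.  */
const struct lock_hint *find_lock_hint(const char *lock_name)
{
    size_t n = sizeof hint_table / sizeof hint_table[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(hint_table[i].lock_name, lock_name) == 0)
            return &hint_table[i];
    return NULL;
}

/* Assumed policy: elide only hinted locks whose critical sections
   avoid I/O and syscalls; locks without a hint are not elided.  */
bool should_elide(const char *lock_name)
{
    const struct lock_hint *h = find_lock_hint(lock_name);
    if (h == NULL)
        return false;
    return !h->does_io && !h->makes_syscalls;
}
```

Because the hints live in the table rather than in the mutex
attributes, the lock initialization path only needs a lookup key.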

The per-object tunables are a different model, and I described
that above. We create a tuning context with all the parameters
set, and then apply it to a particular object. Keep in mind that
tunables have a strict meaning and we are still working towards
a consensus on "How" we would do them e.g.
http://sourceware.org/glibc/wiki/TuningLibraryRuntimeBehavior#How.3F

>> The mallopt interface is a direct interface into the malloc
>> implementation. It shouldn't exist. Instead we should have
>> a very easy to use pluggable mechanism to add alternate malloc
>> implementations.
>> While you can currently plug in an alternate
>> malloc, it's incredibly hard to do so correctly,
> 
> For me it would seem overkill to link in another malloc, just
> because I want to change the mmap threshold.

I agree, but some users want more than just the mmap threshold
changed. They want more and more flags to control fast bins,
fast bin coalescing, and even to never return memory to the
OS. At some point we need to provide a way for users to more
easily plug in their own malloc.

> Also this model doesn't work for tuning things per object.

That's right.

>>  and we don't
>> help. Thus users wanted knobs and over the years they got them.
>> The fixed knobs make malloc very difficult to maintain.
> 
> You can always ignore them if you don't like them anymore, 
> right?

Not as far as I can tell. We didn't document them as such.

If anything I'd like to add stronger language to the manual
about them being hints and the possibility of ignoring them.

It may be too late in that applications are expecting these
to be around forever because we didn't properly document
them as hints.

>> I would expect mallopt to be subsumed into the generic tunables
>> interface where we explicitly exclude the flags from the ABI
>> and make little to no guarantees.
> 
> Not sure how that is different from existing mallopt(). You can 
> just ignore settings you don't like anymore right?

No, the tunables interface would provide even fewer guarantees
and allow us to experiment more with internal knobs.

> Of course the user's program may run poorly, but hopefully
> you only do that if you really have a much better algorithm.

Agreed.

> In fact I was doing some code searches for mallopt() users earlier,
> and I found several alternative malloc libraries that just
> ignore most settings (but implement others, it seems to be even
> portable to other OS)

Yes, but then there are some user applications I've seen that
*need* the settings otherwise they OOM.

>>>> I will review this patch anyway because I feel that the review
>>>> of the patch is instructive and because this patch could get
>>>> picked up by a downstream that desires to add these interfaces
>>>> and maintain them as an experimental interface.
>>>
>>> That would fragment the glibc ABI. Is that what you're suggesting?
>>
>> Downstream can do whatever it wants and make whatever guarantees
>> it wants. I don't want to judge them.
>>
>> A fragmented glibc ABI is no different from running a kernel with
>> or without support for certain features.
> 
> FWIW the kernel tries very hard to stay binary compatible.
> 
> And if I'm allowed to judge I think any ABI fragmentation is an
> incredibly bad thing.

You are absolutely allowed to judge such a strategy and to provide
your own experienced view on it.

Judging downstream decisions is different since we are not
always involved, and they have their own pressures to contend
with. Rather than judging I'd like to see solutions that assist
downstream in solving their problems.

Cheers,
Carlos.

Follow-Ups:
- Re: [PATCH 09/14] Add a new pthread_mutexattr_setelision_np interface.
  - From: Dominik Vogt

References:
- Re: [PATCH 09/14] Add a new pthread_mutexattr_setelision_np interface.
  - From: Carlos O'Donell
- Re: [PATCH 09/14] Add a new pthread_mutexattr_setelision_np interface.
  - From: Andi Kleen
