This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Lock elision: Give PTHREAD_MUTEX_NORMAL-like mutexes a new internal type.


On Thu, Jun 27, 2013 at 11:18:15AM +0200, Torvald Riegel wrote:
> On Thu, 2013-06-27 at 07:38 +0200, Dominik Vogt wrote:
> > On Tue, Jun 25, 2013 at 02:39:58PM +0200, Torvald Riegel wrote:
> > Then let me rephrase my question:  How can you know what you're
> > interested in with zero performance testing up to now?
> 
> What does zero performance testing refer to?  Zero wide-spread testing
> by exposing it to real users?

Zero means zero.  Have you seen a single real number except the ones
I have posted?

Actually, I find the discussion about ABI changes and new APIs
quite strange.  If people were not so intent on getting patches
into 2.18, we could experiment with the interfaces to our hearts'
desire and decide which changes and additions are worth it _after_
some testing is done.  There seems to be so much talk about
fruitless points.  For example, the vast majority of software is
_not_ going to benefit from elision (because mutexes are used only
sparingly, in ways that kill elision, or not at all), so trying to
enable elision for all applications looks irrelevant in my eyes.

> > As it is now, you're just assuming or hoping for certain
> > properties of transactional memory without any evidence that they
> > exist in reality.  I _have_ data on transactional memory that
> > suggests that your hopes will not come true.
> 
> Then post this data.  I assume that you have data on how Haswell's
> transactional memory (TM) performs, because that's what Andi's patches
> are about.

I have already posted data - for z/architecture, of course.
I've not seen any numbers on Haswell, only vague promises that
"everything will be great", and I know enough about HTM to believe
that this is by far too optimistic.

Unfortunately I cannot post the test programs for the time being,
but only describe the algorithms.  And I _can_ run test programs
written by someone else and post (relative) results (i.e.  relative
performance of glibc without elision patches, with elision patches
but disabled, and with elision enabled).

> > As far as I know, nobody has ever done real application tests with
> > transactional memory.
> 
> There's published work for STMs on real applications like memcached.

All these tests with STM are not applicable to HTM because nobody
has ever bothered to simulate the cache effects of a real HTM (as
far as I know).  But the behaviour of an HTM implementation is
almost entirely a sum of cache effects.  For example, STM
implementations never abort transactions because of cache line
conflicts the way HTM implementations do.  STM test results can be
surpassed by HTM in some aspects, but they are always too
optimistic regarding the abort ratio.

> Sun has done tests on real code back when they worked on the Rock TM.
> No published papers on Haswell TM performance AFAIK, but that's no
> surprise given the hardware is new.

I.e. no publicly available data except marketing stuff.  Not even
real hardware.

> > I'll never believe someone has done real world tests unless he
> > documents the precise test setup so that everybody can repeat the
> > tests.  This is because I tried to do these real world tests
> > myself and was unable to find a suitable application that could
> > substantially benefit from lock elision
> 
> Lock elision isn't equal to TM.  TM is the general programming
> abstraction.  HTM and STM are hardware/software implementations of TM.
> Lock elision is something that you can implement with an HTM or STM, but
> STM will be slower of course.

Sure, but if lock elision is implemented using HTM, it will
certainly not surpass the benefits that are possible with HTM
itself.

> > I posted test results some days ago.  The 22 to 45 percent
> > performance loss even with elision disabled does not count?
> 
> As far as I remember this thread, it wasn't quite clear at that time
> whether those results were correct.

The results in the separate thread are real.

> > I.e. at the moment the mere presence of any
> > patches for Intel seems to harm performance for _all_
> > architectures, even the ones that do not implement lock elision,
> > and even if lock elision is configured out.
> 
> See above.  Which code are you referring to?  Andi's patch set, or what
> I posted?

Andi's v10 of the patch was the last ported version.  I cannot see
any performance-relevant changes in the patches since then.  I'll
certainly port a newer version and re-run the tests when new patch
sets no longer pop up twice per week.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

