This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Fixes tree-loop-distribute-patterns issues


On Fri, 21 Jun 2013, Ondrej Bilka wrote:

> Are you sure? Lower optimization levels keep a structure of program
> mostly intact so a single change is unlikely to have big impact on
> performance. If this is so then combination is likely to produce just a
> noise.

"structure of program" can include such things as an explicit conversion 
step from void * to char * (for example).  If the internal representation 
of one compiler version involves an assignment between two internal 
variables with different types to effect that conversion, and another 
compiler version elides that assignment and uses just one internal 
variable, you may get different code, even though all versions would elide 
such an assignment when optimizing.
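
As an illustration of the kind of conversion step meant here, a minimal
sketch (the function name simple_memcpy is hypothetical, not code from
glibc): the two local pointers below exist only to convert the void *
arguments to char *.  Without optimization, a compiler may or may not keep
them as variables distinct from the parameters, and that choice can vary
between compiler versions.

```c
#include <stddef.h>

/* Hypothetical byte-by-byte copy, for illustration only.  */
void *simple_memcpy(void *dst, const void *src, size_t n)
{
    /* Explicit conversion steps from void * to char *.  When optimizing,
       these assignments are expected to disappear; without optimization,
       whether they become separate variables depends on the compiler's
       internal representation.  */
    char *d = dst;
    const char *s = src;

    while (n--)
        *d++ = *s++;

    return dst;
}
```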

> > Any sort of performance measurement involving -O0 is extremely suspect, 
> > simply because performance is essentially not a consideration at all for 
> > -O0 code generation; other matters such as speed of the compiler itself 
> > and debuggability are the considerations involved, and are the things 
> > people may try to avoid regressing across compiler upgrades.
> > 
> Here we need it mainly as a reference, as in this case it is more important than
> actual performance.

I think a comparison is only particularly meaningful when the different 
versions of a function being compared are built with the same compiler, 
with the same options, and run on the same hardware.

If what you want to do is compare <optimized-memcpy-1> and 
<optimized-memcpy-2>, it's not clear to me why a simple version is needed 
at all; just compare the two implementations of interest directly and 
don't involve a third implementation.  But if you choose to do the 
comparison as <optimized-memcpy-1>/<simple-memcpy> compared to 
<optimized-memcpy-2>/<simple-memcpy>,

(a) it doesn't really matter how <simple-memcpy> performs, as long as the 
two comparisons use identical <simple-memcpy>; and

(b) the ratios themselves will be more meaningful to humans if the 
comparison is against <simple-memcpy> built with the same options used for 
normal C code in glibc, rather than against something so stupid it would 
never go in a glibc binary.
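
The ratio-based comparison described above could be sketched roughly as
follows.  This is only an assumption about how such a harness might look:
the names simple_copy, time_copy and ratio_to_baseline are hypothetical,
and clock() is used purely for brevity.  The point is that every candidate
is timed against the same baseline, built with the same compiler and
options and run on the same hardware.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical baseline: plain C, intended to be built with the same
   options as the rest of the code being measured.  */
static void *simple_copy(void *dst, const void *src, size_t n)
{
    char *d = dst;
    const char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* CPU seconds taken by `iters` calls of fn on an n-byte buffer.  The call
   goes through a volatile pointer so the compiler cannot inline or drop
   it.  */
static double time_copy(copy_fn fn, void *dst, const void *src,
                        size_t n, long iters)
{
    copy_fn volatile f = fn;
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        f(dst, src, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Time of a candidate relative to the baseline; a value below 1.0 means
   the candidate was faster.  */
static double ratio_to_baseline(double t_candidate, double t_baseline)
{
    return t_candidate / t_baseline;
}
```

Two candidates would each be passed to time_copy with the same buffer
size and iteration count, and each result divided by the time measured
for simple_copy in the same run.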

-- 
Joseph S. Myers
joseph@codesourcery.com

