This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Preheat CPU in benchtests


On Wed, Apr 24, 2013 at 12:49:12AM +0200, Petr Baudis wrote:
> On Tue, Apr 23, 2013 at 02:11:41PM -0300, Adhemerval Zanella wrote:
> > > On 23-04-2013 12:17, Ondřej Bílka wrote:
> > > On Tue, Apr 23, 2013 at 07:22:16AM -0700, Andi Kleen wrote:
> > >> Ondřej Bílka <neleai@seznam.cz> writes:
> > >>
> > >>> Benchmarks are now affected by cpu scaling when they initially
> > >>> run at low frequency.
> > >>>
> > >>> The following benchmark runs a nonsensical loop first to ensure
> > >>> that benchmarks are measured at maximal frequency. This greatly
> > >>> cuts the time needed to get accurate results.
> 
> It seems to me pre-heating for a fixed time period (e.g. 500ms) would be
> safer than pre-heating for a fixed number of cycles. However, I'm not
> sure about the exact CPU frequency governor rules usually employed.
> 
A 500ms warm-up, combined with the fact that benchmarks are run
automatically by the testsuite, could with 120 functions translate
into a minute of extra running time, which is already too much.
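
For reference, the pre-heat in the patch amounts to a dummy loop along
these lines (a minimal sketch, not the actual patch code; the iteration
count is invented):

/* Sketch: spin on dummy work so the frequency governor ramps the cpu
   to full speed before the first timed run.  The iteration count is a
   guess, not the value used in the patch.  */
static volatile unsigned long sink;

static void
preheat_cpu (void)
{
  unsigned long acc = 1;
  unsigned long i;

  for (i = 0; i < 100000000UL; i++)
    acc = acc * 3 + i;   /* Cheap work the compiler cannot remove.  */
  sink = acc;            /* Defeat dead-code elimination.  */
}

Unlike a fixed 500ms per benchmark, a fixed amount of work finishes as
fast as the machine allows.
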
> > >> FWIW it's generally safer to disable frequency scaling explicitly
> > >> through sysfs (but that needs root), as the reaction time of the
> > >> p-state governor can be unpredictable.
> > > That needs root, so it would require typing a password each time
> > > you run automated benchmarks.
> > >
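
For the record, disabling scaling through sysfs just means writing the
governor name into a file; a minimal sketch (assuming a cpufreq-enabled
kernel, handling cpu0 only, run as root):

/* Sketch: pin cpu0 to the "performance" governor through sysfs.
   Assumes a cpufreq-enabled kernel; a real harness would loop over
   every cpuN directory.  */
#include <stdio.h>

int
main (void)
{
  const char *path
    = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor";
  FILE *f = fopen (path, "w");

  if (f == NULL)
    {
      perror (path);
      return 1;
    }
  fputs ("performance", f);
  fclose (f);
  return 0;
}

A wrapper script could do the same for every cpu, which matches the
point below that this belongs in the harness rather than in the
benchmark itself.
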
> > I see it as being up to the developer to set up the environment and
> > to report their findings and the configuration used. Maybe we could
> > add hooks through env. vars or additional logic in the Makefile/script
> > that runs the benchmark (to bind cpu/memory, set up machine scaling,
> > etc.), but I don't think it should be in the benchmark logic to set
> > up such things.
> 
> Maybe we should just test whether the conditions are right, i.e.
> whether frequency scaling is disabled; if we detect a problem, print a
> fat warning so that the user knows their results aren't reliable, plus
> print a one-liner suggestion for the user to run to fix the situation?
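
Such a check could look roughly like this (a sketch; it assumes the
cpufreq sysfs layout and inspects cpu0 only):

/* Sketch: warn when cpu0 is not running the "performance" governor.  */
#include <stdio.h>
#include <string.h>

int
main (void)
{
  char gov[64] = "";
  FILE *f
    = fopen ("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor", "r");

  if (f != NULL)
    {
      if (fgets (gov, sizeof gov, f) != NULL)
        gov[strcspn (gov, "\n")] = '\0';  /* Strip the trailing newline.  */
      fclose (f);
    }
  if (strcmp (gov, "performance") != 0)
    fprintf (stderr, "warning: cpu frequency scaling appears active "
             "(governor \"%s\"); results may be noisy.\n"
             "to fix: echo performance > "
             "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor\n",
             gov);
  return 0;
}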

A warning has the problem that it will get lost in a wall of text.

The only factor that matters is the performance ratio between
implementations. In a randomized benchmark these ratios are pretty
reliable, so calling the results unreliable would be a lie. The results
are not that replicable, but for that you would also need to rule out
other factors. Then you end up with a lab environment and run the
systematic risk of being surprised when another process thrashes the L3
cache, making the implementation you thought best inferior.

It would also discourage users from running and reading the tests.
There is big uncertainty about what the input, frequency, and machine
distributions are. Acting on benchmarks matters more than having
precise digits beyond the significant ones.


> 
> -- 
> 				Petr "Pasky" Baudis
> 	For every complex problem there is an answer that is clear,
> 	simple, and wrong.  -- H. L. Mencken

-- 

Firmware update in the coffee machine

