This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: [RFC] Monster testcase generator for performance testsuite

From: Yao Qi <yao at codesourcery dot com>
To: Doug Evans <dje at google dot com>
Cc: gdb-patches <gdb-patches at sourceware dot org>
Date: Wed, 7 Jan 2015 17:39:22 +0800
Subject: Re: [RFC] Monster testcase generator for performance testsuite
Authentication-results: sourceware.org; auth=none
References: <m3lhllpkd6 dot fsf at seba dot sebabeach dot org> <87mw5xuzdc dot fsf at codesourcery dot com> <CADPb22TdP5ZG=xHD-9EH1JoyUZtOkD1nZfzcx9TuVOPdJTU++Q at mail dot gmail dot com>

Doug Evans <dje@google.com> writes:

> If a change to gdb increases the time it takes to run a particular command
> by one second is that ok? Maybe. And if my users see the increase
> become ten seconds is that still ok? Also maybe, but I'd like to make the
> case that it'd be preferable to have mechanisms in place to find out sooner
> than later.
>

Yeah, I agree that it is better to find problems sooner rather than later.
That is why we create perf test cases.  If a one-second increase is
sufficient to expose the performance problem, isn't that good enough?  Why
do we still need to run a bigger version that shows a ten-second increase?

> Similarly, if a change to gdb increases memory usage by 40MB is that ok?
> Maybe. And if my users see that increase become 400MB is that still ok?
> Possibly (depending on the nature of the change). But, again, one of my
> goals here is to have in place mechanisms to find out sooner than later.
>

Similarly, if a 40MB increase in memory usage is sufficient to show the
performance problem, why do we still have to use a bigger one?

A perf test case is used to demonstrate real performance problems seen in
some super large programs, but that doesn't mean the perf test case should
be as big as those super large programs.

> Note that, as I said, there's more I wish to add here.
> For example, it's not enough to just machine generate a bunch of generic
> code. We also need the ability to add specific cases that trip gdb up,
> and thus I also plan to add the ability to add hand-written code to
> these benchmarks.
> Plus, my plan is to make gmonster1 contain a variety of such cases
> and use it in multiple benchmarks. Otherwise we're compiling/linking
> multiple programs and I *am* trying to cut down on build times here! :-)
>

That sounds interesting...

>>> These tests currently require separate build-perf and check-perf steps,
>>> which is different from normal perf tests.  However, due to the time
>>> it takes to build the program I've added support for building the pieces
>>> of the test in parallel, and hooking this parallel build support into
>>> the existing framework required some pragmatic compromise.
>>
>> ... so the parallel build part may not be needed.
>
> I'm not sure what the hangup is on supporting parallel builds here.
> Can you elaborate? It's really not that much code, and while I could

I'd like to keep the gdb perf tests simple.

> have done things differently, I'm just using mechanisms that are
> already in place. The only real "complexity" is that the existing
> mechanism is per-.exp-file based, so I needed one .exp file per worker.
> I think we could simplify this with some cleverness, but this isn't
> what I want to focus on right now. Any change will just be to the
> infrastructure, not to the tests. If someone wants to propose a different
> mechanism to achieve the parallelism go for it. OTOH, there is value
> in using existing mechanisms. Another way to go (and I'm not suggesting
> this is a better or worse way, it's just an example) would be to have
> hand-written worker .exp files and check those in. I don't have a
> strong opinion on that, machine generating them is easy enough and
> gives me some flexibility (which is nice) in these early stages.
>
>>> Running the gmonster1-ptype benchmark requires about 8G to link the program,
>>> and 11G to run it under gdb.  I still need to add the ability to
>>> have a small version enabled by default, and turn on the bigger version
>>> from the command line.  I don't expect everyone to have a big enough
>>> machine to run the test configuration that I do.
>>
>> It looks like a monster rather than a perf test case :)
>
> Depends.  How long do your users still wait for gdb to do something?
> My users are still waiting too long for several things (e.g., startup time).
> And I want to be able to measure what my users see.
> And I want to be able to provide upstream with demonstrations of that.
>

IMO, your expectation is beyond the scope and purpose of a perf test
case.  The purpose of each perf test case is to make sure there is no
performance regression and to expose performance problems as the code
evolves.  It does not seem reasonable to me to measure what users see by
running our perf test cases.  Each perf test case measures the
performance of gdb on a certain path, so it doesn't have to behave
exactly the same as the applications users are debugging.

>> It is good to
>> have a small version enabled by default, which requires less than 1 G,
>> for example, to run it under GDB.  How much time it takes to compile
>> (sequential build) and run the small version?
>
> There are mechanisms in place to control the amount of parallelism.
> One could make it part of the test spec, but I'm not sure it'd be useful
> enough.  Thus I think there's no need to compile small testcases
> serially.
>

Is it possible (or necessary) to divide it into two parts: 1) the perf
test case generator and 2) the parallel build?  As we increase the size of
the generated perf test cases, the long compilation time can justify having
a parallel build.

> As for what upstream wants the "default" to be, I don't have
> a strong opinion, beyond it being minimally useful.  If the default isn't
> useful to me, it's easy enough to tweak the test with a local change
> to cover what I need.
>
> Note that I'm not expecting the default to be these
> super long times, which I noted in my original email. OTOH, I do want
> the harness to be able to usefully handle (as in not wait an hour for the
> testcase to be built) the kind of large programs that I need to run the
> tests on.  Thus my plan is to have a harness that can handle what
> I need, but have defaults that don't impose that on everyone.
> Given appropriate knobs it will be easy enough to have useful
> defaults and still be able to run the tests with larger programs.
> And then if my runs find a problem, it will be straightforward for
> me to provide a demonstration of what I'm seeing (which is part
> of what I want to accomplish here).

Yeah, I agree.

-- 
Yao (齐尧)

Follow-Ups:
- Re: [RFC] Monster testcase generator for performance testsuite
  - From: Doug Evans

References:
- [RFC] Monster testcase generator for performance testsuite
  - From: Doug Evans
- Re: [RFC] Monster testcase generator for performance testsuite
  - From: Yao Qi
- Re: [RFC] Monster testcase generator for performance testsuite
  - From: Doug Evans
