This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: BZ#2421 - removing duplicate probe handlers


On Tue, 2006-08-01 at 18:49 -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> On Tue, Aug 01, 2006 at 01:36:38PM -0500, David Smith wrote:
> 
> > [...]  As far as probes go, here I ran into conceptual problems.
> > [...] The way the code generation currently works, it isn't going to
> > be easy to merge a timer probe with a dwarf_derived_probe for
> > example - all the needed information won't be there.  [...]  So, if
> > we merge a begin probe and an end probe together, we're going to end
> > up with either 2 begin probes or 2 end probes.
> 
> Maybe what we need is a slightly different representation of merging
> here.  What we want is not to eliminate all those derived_probes, but
> rather to reuse the code generated for identical ->body trees.  It is
> those trees that are translated in c_unparser::emit_probe.  That may
> be the perfect place to detect/handle probe-level duplication.
> (Instead of emitting new function body that duplicates one already
> seen, it could emit a call to the first copy instead.)
> 
> - FChE

I tried to think through this a bit, and I believe I was thinking too
hard about this.  I decided that we need to optimize the general case
first to get the biggest bang for the buck.

The bugzilla references the case we should be optimizing for - something
like this:

probe kernel.function("sys_read"), kernel.function("sys_write")
{
        log("here")
}

In addition, I realized that there are cases where we don't want to
merge probes even if they are identical (but we could use your idea
above of reusing the code bodies).  Such as this:

probe begin { log("here") }
probe begin { log("here") }

If we merge those 2 probes into 1, we're only going to get 1 output line
instead of 2.


So, I've coded up a patch that merges functions (as the last patch did)
and merges dwarf probes and dwarf return probes (keeping the 2 variants
separate).  This was relatively easy and should give us a good return.

I've also attached a (somewhat contrived) stp script that shows the
benefit of this patch.

original stap:
  generated C file: 1612k
  generated module: 1944k

removing duplicate functions only:
  generated C file: 1420k
  generated module: 1880k

removing duplicate functions and probes:
  generated C file:   48k
  generated module:  920k

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

? output.txt
? patch
? test1.stp
? test2.stp
? test3.stp
? test4.stp
? test5.stp
? runtime/lket/b2a/.deps
? runtime/lket/b2a/Makefile
? runtime/lket/b2a/lket-b2a
Index: elaborate.cxx
===================================================================
RCS file: /cvs/systemtap/src/elaborate.cxx,v
retrieving revision 1.62
diff -u -p -r1.62 elaborate.cxx
--- elaborate.cxx	5 Jun 2006 19:45:56 -0000	1.62
+++ elaborate.cxx	2 Aug 2006 21:03:43 -0000
@@ -24,7 +24,7 @@ extern "C" {
 #include <vector>
 #include <algorithm>
 #include <iterator>
-
+#include <ext/hash_map>
 
 using namespace std;
 
@@ -43,8 +43,8 @@ lex_cast(IN const & in)
 // ------------------------------------------------------------------------
 
 
-derived_probe::derived_probe (probe *p):
-  base (p)
+derived_probe::derived_probe (probe *p, bool merge):
+  base (p), can_merge (merge)
 {
   if (p)
     {
@@ -55,8 +55,8 @@ derived_probe::derived_probe (probe *p):
 }
 
 
-derived_probe::derived_probe (probe *p, probe_point *l):
-  base (p)
+derived_probe::derived_probe (probe *p, probe_point *l, bool merge):
+  base (p), can_merge (merge)
 {
   if (l)
     this->locations.push_back (l);
@@ -68,6 +68,15 @@ derived_probe::derived_probe (probe *p, 
     }
 }
 
+void
+derived_probe::merge (derived_probe* p)
+{
+  // This probe is a duplicate.  Merge it by adding all
+  // the duplicate probe's locations to this probe.
+  vector<probe_point*>::const_iterator pp;
+  for (pp = p->locations.begin(); pp != p->locations.end(); pp++)
+    locations.push_back(*pp);
+}
 
 // ------------------------------------------------------------------------
 // Members of derived_probe_builder
@@ -1550,6 +1559,151 @@ void semantic_pass_opt4 (systemtap_sessi
     }
 }
 
+struct duplicate_function_remover: public functioncall_traversing_visitor
+{
+  systemtap_session& s;
+  bool& relaxed_p;
+  map<functiondecl*, functiondecl*>& duplicate_function_map;
+
+  duplicate_function_remover(systemtap_session& sess, bool& r,
+			     map<functiondecl*, functiondecl*>&dfm):
+    s(sess), relaxed_p(r), duplicate_function_map(dfm) {};
+
+  void visit_functioncall (functioncall* e);
+};
+
+void
+duplicate_function_remover::visit_functioncall (functioncall *e)
+{
+  functioncall_traversing_visitor::visit_functioncall (e);
+
+  // If the current function call reference points to a function that
+  // is a duplicate, replace it.
+  if (duplicate_function_map.count(e->referent) != 0)
+    {
+      if (s.verbose>2)
+	  clog << "Changing " << e->referent->name
+	       << " reference to "
+	       << duplicate_function_map[e->referent]->name
+	       << " reference\n";
+      e->tok = duplicate_function_map[e->referent]->tok;
+      e->function = duplicate_function_map[e->referent]->name;
+      e->referent = duplicate_function_map[e->referent];
+
+      relaxed_p = false;
+    }
+}
+
+static size_t
+hash_functiondecl (functiondecl* f)
+{
+  ostringstream s;
+  __gnu_cxx::hash<const char*> hash_func;
+
+  // Get the "name:args body" of the function in s.  We have to
+  // include the args since the function 'x1(a, b)' is different than
+  // the function 'x2(b, a)' even if the bodies of the two functions
+  // are exactly the same.
+  f->printsig(s);
+  f->body->print(s);
+
+  // printsig puts f->name + ':' on the front.  Remove this
+  // (otherwise, functions would never compare equal).
+  string str = s.str().erase(0, f->name.size() + 1);
+
+  // Actually generate the hash.
+  return hash_func(str.c_str());
+}
+
+static size_t
+hash_probe (derived_probe* p)
+{
+  ostringstream s;
+  __gnu_cxx::hash<const char*> hash_func;
+
+  // For probes, we need the type and body of the probe.
+  s << p->get_type() << " ";
+  p->body->print(s);
+
+  // Actually generate the hash.
+  return hash_func(s.str().c_str());
+}
+
+void semantic_pass_opt5 (systemtap_session& s, bool& relaxed_p)
+{
+  // Walk through all the functions, looking for duplicates.
+  map<size_t, functiondecl*> function_hash_map;
+  map<functiondecl*, functiondecl*> duplicate_function_map;
+  for (unsigned i=0; i < s.functions.size(); i++)
+    {
+      size_t function_hash = hash_functiondecl(s.functions[i]);
+
+      if (function_hash_map.count(function_hash) == 0)
+	// This function is unique.  Remember it.
+	function_hash_map[function_hash] = s.functions[i];
+      else
+	// This function is a duplicate.
+	duplicate_function_map[s.functions[i]]
+	    = function_hash_map[function_hash];
+    }
+
+  // If we have duplicate functions, traverse down the tree, replacing
+  // the appropriate function calls.
+  // duplicate_function_remover::visit_functioncall() handles the
+  // details of replacing the function calls.  Note that we don't
+  // delete the duplicate functiondecl itself, we'll let pass 1 do
+  // that.
+  if (duplicate_function_map.size() != 0)
+    {
+      duplicate_function_remover dfr (s, relaxed_p, duplicate_function_map);
+
+      for (unsigned i=0; i < s.probes.size(); i++)
+	s.probes[i]->body->visit(&dfr);
+    }
+
+  // Walk through all the probes, looking for duplicates
+  map<size_t, derived_probe*> probe_hash_map;
+  for (unsigned i=0; i < s.probes.size(); /* see below */)
+    {
+      // If this probe type cannot be merged at all, even with a
+      // similar probe type, just go on to the next probe.
+      if (! s.probes[i]->can_merge)
+	i++;
+      // This probe type can be merged, if a duplicate is found.
+      else
+	{
+	  size_t probe_hash = hash_probe(s.probes[i]);
+
+	  if (probe_hash_map.count(probe_hash) == 0)
+	    {
+	      // This probe is uniqe.  Remember it.
+	      probe_hash_map[probe_hash] = s.probes[i];
+	      i++;
+	    }
+	  else
+	    {
+	      // This probe is a duplicate.  Remove it by first adding
+	      // all the duplicate probe's locations to the original
+	      // probe, then remove the probe itself.
+	      if (s.verbose>2)
+		{
+		  clog << "Merging probe ";
+		  s.probes[i]->printsig(clog);
+		  clog << " into probe ";
+		  probe_hash_map[probe_hash]->printsig(clog);
+		  clog << endl;
+		}
+
+	      probe_hash_map[probe_hash]->merge(s.probes[i]);
+
+	      s.probes.erase(s.probes.begin() + i);
+	      relaxed_p = false;
+	      // NB: don't increment i
+	    }
+	}
+    }
+}
+
 
 static int
 semantic_pass_optimize (systemtap_session& s)
@@ -1571,6 +1725,7 @@ semantic_pass_optimize (systemtap_sessio
       semantic_pass_opt2 (s, relaxed_p);
       semantic_pass_opt3 (s, relaxed_p);
       semantic_pass_opt4 (s, relaxed_p);
+      semantic_pass_opt5 (s, relaxed_p);
     }
 
   if (s.probes.size() == 0)
Index: elaborate.h
===================================================================
RCS file: /cvs/systemtap/src/elaborate.h,v
retrieving revision 1.31
diff -u -p -r1.31 elaborate.h
--- elaborate.h	2 Jun 2006 23:13:38 -0000	1.31
+++ elaborate.h	2 Aug 2006 21:03:43 -0000
@@ -109,13 +109,16 @@ class translator_output;
 
 struct derived_probe: public probe
 {
-  derived_probe (probe* b);
-  derived_probe (probe* b, probe_point* l);
+  derived_probe (probe* b, bool merge = false);
+  derived_probe (probe* b, probe_point* l, bool merge = false);
   probe* base; // the original parsed probe
   virtual probe* basest () { return base->basest(); }
 
   virtual ~derived_probe () {}
 
+  bool can_merge;
+  virtual const char* get_type () { return("unknown"); }
+
   virtual void emit_registrations (translator_output* o) = 0;
   // (from within module_init):
   // rc = ..... register_or_whatever (ENTRYFN);
@@ -137,6 +140,8 @@ struct derived_probe: public probe
   // From within unparser::emit_common_header, add any extra variables
   // to this probe's context locals.
 
+  virtual void merge (derived_probe* p);
+
 protected:
   void emit_probe_prologue (translator_output* o, const std::string&);
   void emit_probe_epilogue (translator_output* o);
Index: tapsets.cxx
===================================================================
RCS file: /cvs/systemtap/src/tapsets.cxx,v
retrieving revision 1.138
diff -u -p -r1.138 tapsets.cxx
--- tapsets.cxx	1 Aug 2006 02:20:47 -0000	1.138
+++ tapsets.cxx	2 Aug 2006 21:03:43 -0000
@@ -1714,6 +1714,8 @@ struct dwarf_derived_probe : public deri
 		       Dwarf_Addr addr,
 		       dwarf_query & q);
 
+  const char* get_type () { return (has_return ? "dwarf_return" : "dwarf"); }
+
   vector<Dwarf_Addr> probe_points;
   bool has_return;
 
@@ -1739,6 +1741,8 @@ struct dwarf_derived_probe : public deri
   virtual void emit_registrations (translator_output * o);
   virtual void emit_deregistrations (translator_output * o);
   virtual void emit_probe_entries (translator_output * o);
+
+  virtual void merge (derived_probe* p);
 };
 
 // Helper struct to thread through the dwfl callbacks.
@@ -3006,7 +3010,7 @@ dwarf_derived_probe::add_probe_point(str
 dwarf_derived_probe::dwarf_derived_probe (Dwarf_Die *scope_die,
 					  Dwarf_Addr addr,
 					  dwarf_query & q)
-  : derived_probe (q.base_probe, 0 /* location-less */),
+  : derived_probe (q.base_probe, 0 /* location-less */, true /* merge */),
     has_return (q.has_return)
 {
   string module_name(q.dw.module_name);
@@ -3374,6 +3378,28 @@ dwarf_derived_probe::emit_probe_entries 
 
 
 void
+dwarf_derived_probe::merge(derived_probe* p)
+{
+  // Sanity check.  Make sure types are the same.
+  if (strcmp(get_type(), p->get_type()) != 0)
+    throw runtime_error("bad probe merging");
+
+  // This is cheating, but we need a pointer to the
+  // dwarf_derived_probe, not the derived_probe.  Because of the check
+  // above, we can be certain this is a dwarf_derived_probe.
+  const dwarf_derived_probe* p2 = (const dwarf_derived_probe *)p;
+
+  // First call the base class.
+  derived_probe::merge(p);
+
+  // Now handle the dwarf_derived_probe specific stuff.  Merge it by
+  // adding all the duplicate probe's addressed to this probe.
+  vector<Dwarf_Addr>::const_iterator pp;
+  for (pp = p2->probe_points.begin(); pp != p2->probe_points.end(); pp++)
+    probe_points.push_back(*pp);
+}
+
+void
 dwarf_builder::build(systemtap_session & sess,
 		     probe * base,
 		     probe_point * location,
@@ -3765,7 +3791,7 @@ mark_derived_probe::mark_derived_probe (
                                         uintptr_t a,
                                         const string& m,
                                         probe* base):
-  derived_probe (base, 0), sess (s), probe_name (p_n), probe_sig (p_s),
+  derived_probe (base), sess (s), probe_name (p_n), probe_sig (p_s),
   address (a), module (m)
 {
   // create synthetic probe point

Attachment: dup_function.stp
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]