This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug dynamic-link/15310] _dl_sort_fini is O(n^3) causing slow exit when many dsos


http://sourceware.org/bugzilla/show_bug.cgi?id=15310

--- Comment #9 from Don Hatch <dhatch at ilm dot com> 2013-03-28 00:31:40 UTC ---
(In reply to comment #4)
> On Wed, Mar 27, 2013 at 07:47:46AM +0000, dhatch at ilm dot com wrote:
> > http://sourceware.org/bugzilla/show_bug.cgi?id=15310
> > 
> > This is just a topsort, which can be done simply in O(n) time
> > with no fancy data structures.
> > 
> I do not have domain knowledge to understand what is necessary to
> consider.
> 
> However I could write topsort if I knew dependencies. From code I think
> following pseudocode should work upto possibility that some arrows need
> be reversed.
> 
> add_edge(g,x,y) // add edge from x to y
> topsort(g)      // returns topologic ordering of g
> 
> graph *g; 
> for(k=0;k<nmaps;k++)
>   {
>       /* Add dependencies of the object.  */
>       struct link_map **runp = maps[k]->l_initfini;
>       if (runp != NULL)
>         while (*runp != NULL) 
>           add_edge(g,maps[k],*runp);
> 
>           /* Add relocation dependencies.  */
>       if (__builtin_expect (maps[k]->l_reldeps != NULL, 0))
>         {
>           unsigned int m = maps[k]->l_reldeps->act;
>           struct link_map **relmaps = &maps[k]->l_reldeps->list[0];
>           for (j=0;j<m;j++)
>             add_edge(g,relmaps[k],*runp) 
>         }
>   }
> 
>     memcpy(maps,topsort(g));

That looks essentially right,
although there are a couple of nuances in the existing code.
    - under the comment "Do not handle ld.so in secondary namespace"
      (whatever that means), it ignores a dependency "B depends on A"
      if A!=A->l_real  (but doesn't check whether B!=B->l_real).
      I don't really know what this means, or whether this assymmetry
      is intentional...
      QUESTION: can I just omit all edges involving such items
      altogether, not caring where these items come out
      in the output order?
    - it also ignores "B depends on A" if A->l_idx == -1 (i.e.
IDX_STILL_USED)--
      this one I believe I understand--
      it definitely doesn't matter where such items
      come out in the final sort order.
    - static dependendies should override dynamic ones if there's a conflict.
      I'll do two passes in order to get this right, and more well-behaved
      than the existing code--
      see bug 15311 for details.

I hadn't thought about creating the graph structure explicitly as you
suggest...
arguably that would result in cleaner topsort code, but
I was just going to work directly with the data structure that's
already there instead.
Note that an explicit auxiliary graph representation would have size O(N+E)
(number of nodes plus number of edges)
whereas the size of needed auxiliary structures is currently only O(N)
(making it feasible to allocate them as local variables on the runtime stack)
and I wasn't planning on changing that.

The topsort algorithm will be essentially Tarjan's SCC algorithm:
http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
(actually two passes of it).  This is a bit more involved than
a simple reverse postorder, but keeping SCCs contiguous is desireable
(for one thing, the existing code does it-- furthermore,
it provides some nice well-behavedness between the two passes
that I wouldn't be able to prove otherwise).
Tarjan's algorithm is preferable to Kosaraju's since it doesn't require
explicitly creating the reverse graph.

One other issue/question-- I was assuming a recursive implementation
would be out of the question (due to increased runtime stack requirement);
but I'm finding that an iterative implementation of Tarjan's SCC algorithm
is... well, really unpleasant.
I can do it, but it's no fun, and it won't be any fun
to review or maintain it either.  A recursive implementation
would be MUCH cleaner, simpler, more readable and maintainable.
Any thoughts/feelings on whether it's okay to make it recursive?

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]