This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Re: parallelized 'ld'?


Well, I was being a bit facetious in suggesting that linking is just
additions, of course -- obviously there is a good deal of bookkeeping
and bit-shuffling which is collectively likely to take more time than
the actual relocation additions proper.
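
(For concreteness, the "addition proper" I have in mind is roughly the
following -- the struct and field names are invented for illustration
and are not bfd's actual reloc representation:)

/* Illustrative only: not bfd's real data structures.  A typical
   absolute relocation just adds a resolved symbol address plus an
   addend into a word of the output section.  */
#include <string.h>

struct fake_reloc
{
  unsigned long offset;     /* where in the section to patch    */
  long addend;              /* constant from the object file    */
  unsigned long symval;     /* resolved address of the symbol   */
};

static void
apply_fake_reloc (unsigned char *section, const struct fake_reloc *r)
{
  unsigned long word;
  memcpy (&word, section + r->offset, sizeof word);   /* extract    */
  word += r->symval + r->addend;                      /* add        */
  memcpy (section + r->offset, &word, sizeof word);   /* store back */
}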

('ld' being a core and mature GNU utility, I'm presuming all
the low-hanging fruit in terms of monoprocessor performance
tuning has long since been picked.  Am I being naive?)

I should pipe down and generate some harder data before wasting much
more of this list's bandwidth, but in round numbers I believe I'm
seeing somewhat less than 10 minutes of wallclock time for the final
link, which produces a 200MB unstripped executable, vs. about 30
minutes of wallclock time (with eightfold make parallelism) for a
basic compile+link ("basic" meaning it excludes time needed for
frippery like "make depend," which does a wide variety of
app-specific code generation lying outside the normally understood
bounds of "compiling").

As with most builds of large programs, the input to the final link is
mainly .a archives produced by previous links, so the total build time
spent in the linker might be significantly higher:  I need better
instrumentation than I currently have in place to say more.
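
(The crude sort of instrumentation I have in mind is just a wrapper
that sits ahead of the real linker on $PATH, execs it, and logs
wallclock time per invocation -- something along these lines, where
REAL_LD and LOG_FILE are placeholders for whatever the build actually
uses:)

/* Hypothetical "ld" wrapper: install it earlier on $PATH than the
   real linker so every link during the build gets timed and logged.  */
#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

#define REAL_LD  "/usr/bin/ld"          /* placeholder */
#define LOG_FILE "/tmp/ld-times.log"    /* placeholder */

int
main (int argc, char **argv)
{
  struct timeval start, end;
  pid_t pid;
  int status;

  (void) argc;                  /* argv is forwarded wholesale */
  gettimeofday (&start, NULL);

  pid = fork ();
  if (pid < 0)
    return 1;
  if (pid == 0)
    {
      execv (REAL_LD, argv);    /* child: run the real linker */
      _exit (127);
    }

  waitpid (pid, &status, 0);
  gettimeofday (&end, NULL);

  {
    double secs = (end.tv_sec - start.tv_sec)
                  + (end.tv_usec - start.tv_usec) / 1e6;
    FILE *log = fopen (LOG_FILE, "a");
    if (log)
      {
        fprintf (log, "%.2f seconds\n", secs);
        fclose (log);
      }
  }
  return WIFEXITED (status) ? WEXITSTATUS (status) : 1;
}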

Anyhow, with the valuable pointers you folks have contributed as a
guide, I should spend some time analysing more carefully before
wasting any more bandwidth here.

 -- Cynbe

Paul Hilfinger <hilfingr@EECS.Berkeley.EDU> writes:

> The more interesting question you should ask is "what makes our ld's
> so slow?", to be asked just after "how slow is it?".  
> 
> You're doing a bunch of additions to relocate stuff, but in a
> 10,000,000-line C program, how much time are we talking about?
> Suppose it's 100,000,000 additions (which I think is a very high
> estimate).  Just how much processor time does it take to extract a bit
> field, increment its contents, and store it back 100,000,000 times?
> I'm glad you asked: on my several-year-old UltraSparc, about 4
> seconds.  How about the time to perform, let's say, 10,000,000
> external-symbol lookups and definitions using a hashtable?  Using G++
> strings and hash_maps, perhaps 25 seconds on the same equipment.  How
> about to read and write 20MB of .o and executable files?  About a
> second.  What kinds of numbers are you seeing for object file sizes
> and ld times?
> 
> Paul Hilfinger
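
(For anyone who wants to check that 4-second figure on their own
hardware, a toy loop along the following lines is what I take Paul to
be describing -- 100,000,000 in-place increments of a 32-bit field.
The buffer size and addend are arbitrary; build with "cc -O2" and
time the result:)

/* Toy benchmark: extract a 32-bit field, add to it, store it back,
   100,000,000 times -- the raw relocation arithmetic, nothing else.  */
#include <stdio.h>
#include <stdlib.h>

#define NRELOCS 100000000UL

int
main (void)
{
  /* A modest buffer standing in for the output sections.  */
  size_t bufwords = 1 << 20;
  unsigned int *buf = calloc (bufwords, sizeof *buf);
  unsigned long i;

  if (!buf)
    return 1;

  for (i = 0; i < NRELOCS; i++)
    {
      size_t at = i & (bufwords - 1);      /* pick a word  */
      unsigned int field = buf[at];        /* extract      */
      field += 0x1000u + (unsigned int) i; /* "relocate"   */
      buf[at] = field;                     /* store back   */
    }

  printf ("done: %lu fake relocations, checksum %u\n", NRELOCS, buf[0]);
  free (buf);
  return 0;
}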

