This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Piecemeal library loading causes slow startup of big apps


libc-alpha-owner@sources.redhat.com wrote on 09/13/2005 11:26:32 AM:

> Hi,
> 
> as my google SoC project I have been working on improving GNOME startup 
> time, and I see that dynamic linking is one of the culprits.
> 
> GNOME startup is mainly I/O bound, i.e. most of the time is spent 
> waiting for disk seeks. Proof-of-concept work I have done has reduced 
> the disk seeks caused by GNOME itself, but now I have reached the point 
> that most of the disk seeks are caused by ld.so loading dynamic 
> libraries.
> 
> This is because libraries are not loaded immediately in one big 
> sequential read, but in bits and pieces. (I think this is because ld.so 
> mmap()s the library and only page faults the bits it needs into RAM.) 
> For example, gtk+ (~9MB) is loaded piecemeal in about 30 separate 
> out-of-order reads:
> 
> > (gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 0-7
> > (gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 687-718
> > (gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 653-684
> > (gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 34-65
> > (gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 8-33
> > [...]
> > (battstat-applet/4143): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 447-475
> 
> These are real disk reads traced by hooking into the ext3 block read 
> function using a kernel patch. The format is:
> (process/pid): filename start_4k_block-end_4k_block
> 
It looks like the kernel is already doing some read-ahead (8-32 pages) 
for you. Is this not enough? How big is this library (in megabytes)?
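
To make the read-ahead sizes concrete, here is a small sketch that converts one block range from the trace into a read size. The helper name read_size_kib is hypothetical, and it assumes the ranges are inclusive, which the trace suggests (0-7 is followed by 8-33) but does not state.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse a "start-end" range of 4 KiB block numbers,
   as printed in the trace, and return the size of that read in KiB.
   Assumes the range is inclusive. Returns -1 if the text is malformed. */
static long read_size_kib(const char *range)
{
    assert(range != NULL);

    long start, end;
    if (sscanf(range, "%ld-%ld", &start, &end) != 2 || end < start)
        return -1;

    /* Inclusive range, 4 KiB per block. */
    return (end - start + 1) * 4;
}
```

Under that assumption, the range 0-7 is a 32 KiB read and 687-718 is a 128 KiB read.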

> This way of loading libraries visibly hurts performance. If I cat the 
> most frequently-used libraries to /dev/null early in the startup 
> process, I can shave about 10% (~2s) off startup time: reading the 
> libraries puts them in the buffer cache, and when the linker mmaps them 
> it doesn't end up causing seeks.  This is obviously a hack, but I think 
> the process could be made a lot smarter than this.
> 
> For example, would LD_BIND_NOW help me (I suspect not)? Is there a 
> compile-time hint that can tell the linker to load the whole library using 
> read() instead of mmap()? If not, could it be implemented?
> 

LD_BIND_NOW will initialize the PLT sooner but will not touch any 
additional text pages (which is what I think you want).

Most of the time we want to defer loading pages until we know we need 
them. So replacing mmap with read is not a general solution.
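
For comparison, the cat-to-/dev/null workaround from the quoted message amounts to one sequential pass over the file with read(). A minimal sketch follows. The helper name prewarm_file is hypothetical, and nothing like it exists in ld.so. It pulls in every page of the file whether or not the program ever touches it, which is the cost being weighed here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read a file front to back and discard the data,
   so that its pages end up in the page cache.
   Returns the number of bytes read, or -1 on error. */
static long long prewarm_file(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    static char buf[1 << 16];   /* 64 KiB scratch buffer, contents ignored */
    long long total = 0;
    ssize_t n;

    /* One sequential pass; the kernel can read ahead efficiently. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}
```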

It seems like you need something more like posix_madvise() with 
POSIX_MADV_SEQUENTIAL or POSIX_MADV_WILLNEED to get the effect you want. 
This will probably require some loader (ld.so) changes and some way to 
indicate this desire in the ELF file (a special ELF note asking the loader 
to apply POSIX_MADV_WILLNEED to the text segment, or a new PT_LOAD flag?).
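
As a rough user-space illustration of the WILLNEED idea (not loader code), the sketch below maps a whole file read-only and asks the kernel to start reading it in. The helper name map_with_willneed is hypothetical. A real loader would apply the advice only to the relevant segment, and the advice is only a hint that the kernel may ignore.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a file read-only and advise the kernel that
   the whole mapping will be needed soon.
   Returns the mapping (length stored in *len_out), or NULL on failure. */
static void *map_with_willneed(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    /* posix_madvise() returns an error number directly rather than
       setting errno. A failure here is not fatal: the pages would
       simply be faulted in on demand as before. */
    int rc = posix_madvise(p, len, POSIX_MADV_WILLNEED);
    if (rc != 0)
        fprintf(stderr, "posix_madvise: %s\n", strerror(rc));

    *len_out = len;
    return p;
}
```

The caller releases the mapping with munmap(p, len) when done.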

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center

