This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
To improve the startup time of the dynamic linker, I followed an idea the BSD folks implemented (my view is i386-centric but everything can be generalized):

Large applications, especially C++ ones, have a number of PC32 relocations against symbols. The lookup of these symbols is quite expensive. I'm now caching these lookups during the initial relocation of an object (in _dl_relocate_object, elf/dl-reloc.c) and have decreased the relocation time by 45 % in one large application (konqueror).

For comparison, here's the normal linker:

gee:/builds/glibc/cross:[254]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /opt/kde2/bin/konqueror -/
     16634:
     16634:     runtime linker statistics:
     16634:       total startup time in dynamic loader: 699430072 clock cycles
     16634:                 time needed for relocation: 694116497 clock cycles (99.2)
     16634:                      number of relocations: 54130
     16634:                time needed to load objects: 4958742 clock cycles (.7)
konqueror: Unknown option '-/'.
konqueror: Use --help to get a list of available command line options.

And this is my version (note that the number of relocations from cache is a subset of the other relocations):

gee:/builds/glibc/main-gcc-2.95:[254]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /opt/kde2/bin/konqueror -/
     22277:
     22277:     runtime linker statistics:
     22277:       total startup time in dynamic loader: 392157353 clock cycles
     22277:                 time needed for relocation: 386720405 clock cycles (98.6)
     22277:                      number of relocations: 54130
     22277:           number of relocations from cache: 33770
     22277:                time needed to load objects: 5044967 clock cycles (1.2)
konqueror: Unknown option '-/'.
konqueror: Use --help to get a list of available command line options.
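As a rough illustration of the caching idea described above (not the patch itself), here is a minimal sketch under stated assumptions: a toy symbol table stands in for the real scope walk, and `struct def`, `struct cache_slot`, `slow_lookup` and `cached_lookup` are hypothetical names that do not exist in glibc. `R_SYM` and `R_TYPE` use the same bit layout as `ELF32_R_SYM` and `ELF32_R_TYPE`. The cache has one slot per symbol-table index of the object being relocated, so a second relocation against the same symbol skips the expensive search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same bit layout as ELF32_R_SYM / ELF32_R_TYPE: the symbol index is in
   the upper 24 bits of r_info, the relocation type in the low 8 bits.  */
#define R_SYM(info)  ((info) >> 8)
#define R_TYPE(info) ((info) & 0xff)

/* Hypothetical stand-in for a symbol definition found in some object.  */
struct def { const char *name; uint32_t value; };

/* One slot per symbol-table index of the object being relocated.  */
struct cache_slot { const struct def *symbol; };

static unsigned long slow_lookups;  /* number of full searches performed */
static unsigned long cache_hits;    /* number of lookups served from cache */

/* Stand-in for the expensive scope walk: a linear search by name.  */
static const struct def *
slow_lookup (const struct def *defs, size_t ndefs, const char *name)
{
  ++slow_lookups;
  for (size_t i = 0; i < ndefs; ++i)
    if (strcmp (defs[i].name, name) == 0)
      return &defs[i];
  return NULL;
}

/* Resolve the symbol referenced by R_INFO, consulting the cache first.
   NAMES maps a symbol index to its name, like a symbol/string table.
   Failed lookups are not cached, so they are retried every time.  */
static const struct def *
cached_lookup (struct cache_slot *cache, size_t nslots,
               const struct def *defs, size_t ndefs,
               const char *const *names, uint32_t r_info)
{
  uint32_t idx = R_SYM (r_info);

  assert (idx < nslots);
  if (cache[idx].symbol != NULL)
    {
      ++cache_hits;
      return cache[idx].symbol;
    }
  cache[idx].symbol = slow_lookup (defs, ndefs, names[idx]);
  return cache[idx].symbol;
}
```

Indexing by symbol number keeps a hit down to one array access, at the price of allocating and zeroing one slot per symbol for every object, which is the kind of fixed overhead that matters for small programs.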
For smaller applications the cache overhead is too expensive and the application startup time is unfortunately increased (here 7 %):

Normal:

gee:/builds/glibc/cross:[0]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /bin/ls /
     22298:
     22298:     runtime linker statistics:
     22298:       total startup time in dynamic loader: 1280514 clock cycles
     22298:                 time needed for relocation: 892054 clock cycles (69.6)
     22298:                      number of relocations: 362
     22298:                time needed to load objects: 245332 clock cycles (19.1)

With cache:

gee:/builds/glibc/main-gcc-2.95:[0]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /bin/ls /
     22299:
     22299:     runtime linker statistics:
     22299:       total startup time in dynamic loader: 1399818 clock cycles
     22299:                 time needed for relocation: 960693 clock cycles (68.6)
     22299:                      number of relocations: 362
     22299:           number of relocations from cache: 100
     22299:                time needed to load objects: 232376 clock cycles (16.6)

I'm appending a first version as a proof of concept to get your comments. I have the following open questions/things to do:

- Provide patches for platforms besides i386.
- _dl_symcache and _dl_maxchain should be passed as parameters to _dl_lookup_symbol/_dl_lookup_versioned_symbol instead of making them global.
- The code needs reindentation.
- Do I need the version number in the cache? I don't think so (and it works without) but I'm not sure whether I'm missing something.
- Any other ideas to improve this?
- Is this a post-2.2.4 project? I do think so.

How shall I continue with this?

By the way, at the moment I get segfaults in dlfcn/errmsg1 and elf/loadfail that I need to investigate; any help or hints would be appreciated.

This work was influenced by Waldo's paper (discussed on libc-alpha), and I have already got some good advice from Andreas Schwab.

Andreas

2001-07-31  Andreas Jaeger  <aj@suse.de>

        * sysdeps/i386/dl-machine.h (elf_machine_rel): Pass r_info to RESOLVE.
        * sysdeps/generic/ldsodefs.h (Elf_r_info): New.
        Adjust declarations of _dl_lookup_symbol and _dl_lookup_versioned_symbol.

        * elf/rtld.c (print_statistics): Print cache relocations.

        * elf/dl-reloc.c (_dl_relocate_object): Set up cache.

        * elf/dl-lookup.c: New variable _dl_num_cache_relocations.
        (_dl_setup_hash): Set l_nchain.
        (_dl_lookup_symbol): Change parameter to use r_info. Use symbol cache.
        (_dl_lookup_versioned_symbol_skip): Likewise.

============================================================
Index: elf/dl-lookup.c
--- elf/dl-lookup.c 2001/07/06 04:54:46 1.78
+++ elf/dl-lookup.c 2001/07/31 14:18:46
@@ -60,6 +60,7 @@ struct sym_val
 
 /* Statistics function. */
 unsigned long int _dl_num_relocations;
+unsigned long int _dl_num_cache_relocations;
 
 /* During the program run we must not modify the global data of
    loaded shared object simultanously in two threads. Therefore we
@@ -80,6 +81,8 @@ __libc_lock_define (extern, _dl_load_loc
 #define VERSIONED 1
 #include "do-lookup.h"
 
+extern struct symcache *_dl_symcache;
+extern int _dl_maxchain;
 
 /* Add extra dependency on MAP to UNDEF_MAP. */
 static int
@@ -200,18 +203,30 @@ lookup_t
 internal_function
 _dl_lookup_symbol (const char *undef_name, struct link_map *undef_map,
                    const ElfW(Sym) **ref, struct r_scope_elem *symbol_scope[],
-                   int reloc_type, int explicit)
+                   Elf_r_info r_info, int explicit)
 {
   const char *reference_name = undef_map ? undef_map->l_name : NULL;
   const unsigned long int hash = _dl_elf_hash (undef_name);
   struct sym_val current_value = { NULL, NULL };
   struct r_scope_elem **scope;
   int protected;
+  int reloc_type = ELFW(R_TYPE) (r_info);
   int noexec = elf_machine_lookup_noexec_p (reloc_type);
   int noplt = elf_machine_lookup_noplt_p (reloc_type);
+  int symbol_num = ELFW(R_SYM) (r_info);
 
   ++_dl_num_relocations;
 
+  assert (symbol_num < _dl_maxchain);
+  if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol != NULL)
+    {
+      current_value.s = _dl_symcache[symbol_num].symbol;
+      current_value.m = _dl_symcache[symbol_num].object;
+      ++_dl_num_cache_relocations;
+    }
+  else
+    {
+
   /* Search the relevant loaded objects for a definition. */
   for (scope = symbol_scope; *scope; ++scope)
     if (do_lookup (undef_name, hash, *ref, &current_value, *scope, 0, NULL,
@@ -233,7 +248,7 @@ _dl_lookup_symbol (const char *undef_nam
         /* Something went wrong. Perhaps the object we tried to reference
            was just removed. Try finding another definition. */
         return _dl_lookup_symbol (undef_name, undef_map, ref, symbol_scope,
-                                  reloc_type, 0);
+                                  r_info, 0);
 
         break;
       }
@@ -250,7 +265,8 @@ _dl_lookup_symbol (const char *undef_nam
       *ref = NULL;
       return 0;
     }
-
+    }
+
   protected = *ref && ELFW(ST_VISIBILITY) ((*ref)->st_other) == STV_PROTECTED;
 
   if (__builtin_expect (_dl_debug_mask & DL_DEBUG_BINDINGS, 0))
@@ -261,6 +277,13 @@ _dl_lookup_symbol (const char *undef_nam
                       ? current_value.m->l_name : _dl_argv[0],
                       protected ? "protected" : "normal", undef_name);
 
+
+  if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol == NULL)
+    {
+      _dl_symcache[symbol_num].symbol = current_value.s;
+      _dl_symcache[symbol_num].object = current_value.m;
+    }
+
   if (__builtin_expect (protected == 0, 1))
     {
       *ref = current_value.s;
@@ -378,18 +401,30 @@ _dl_lookup_versioned_symbol (const char
                              struct link_map *undef_map, const ElfW(Sym) **ref,
                              struct r_scope_elem *symbol_scope[],
                              const struct r_found_version *version,
-                             int reloc_type, int explicit)
+                             Elf_r_info r_info, int explicit)
 {
   const char *reference_name = undef_map ? undef_map->l_name : NULL;
   const unsigned long int hash = _dl_elf_hash (undef_name);
   struct sym_val current_value = { NULL, NULL };
   struct r_scope_elem **scope;
   int protected;
+  int reloc_type = ELFW(R_TYPE) (r_info);
   int noexec = elf_machine_lookup_noexec_p (reloc_type);
   int noplt = elf_machine_lookup_noplt_p (reloc_type);
+  int symbol_num = ELFW(R_SYM) (r_info);
 
   ++_dl_num_relocations;
 
+  assert (symbol_num < _dl_maxchain);
+  if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol != NULL)
+    {
+      current_value.s = _dl_symcache[symbol_num].symbol;
+      current_value.m = _dl_symcache[symbol_num].object;
+      ++_dl_num_cache_relocations;
+    }
+  else
+
+    {
   /* Search the relevant loaded objects for a definition. */
   for (scope = symbol_scope; *scope; ++scope)
     {
@@ -414,7 +449,7 @@ _dl_lookup_versioned_symbol (const char
                was just removed. Try finding another definition. */
             return _dl_lookup_versioned_symbol (undef_name, undef_map,
                                                 ref, symbol_scope, version,
-                                                reloc_type, 0);
+                                                r_info, 0);
 
           break;
         }
@@ -438,6 +473,8 @@ _dl_lookup_versioned_symbol (const char
           return 0;
         }
     }
+    }
+
 
   if (__builtin_expect (current_value.s == NULL, 0))
     {
@@ -464,6 +501,12 @@ _dl_lookup_versioned_symbol (const char
                       protected ? "protected" : "normal",
                       undef_name, version->name);
 
+  if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol == NULL)
+    {
+      _dl_symcache[symbol_num].symbol = current_value.s;
+      _dl_symcache[symbol_num].object = current_value.m;
+    }
+
   if (__builtin_expect (protected == 0, 1))
     {
       *ref = current_value.s;
@@ -591,14 +634,13 @@ internal_function
 _dl_setup_hash (struct link_map *map)
 {
   Elf_Symndx *hash;
-  Elf_Symndx nchain;
 
   if (!map->l_info[DT_HASH])
     return;
   hash = (void *)(map->l_addr + map->l_info[DT_HASH]->d_un.d_ptr);
 
   map->l_nbuckets = *hash++;
-  nchain = *hash++;
+  map->l_nchain = *hash++;
   map->l_buckets = hash;
   hash += map->l_nbuckets;
   map->l_chain = hash;
@@ -614,7 +656,7 @@ _dl_do_lookup (const char *undef_name, u
                struct link_map *skip, int noexec, int noplt)
 {
   return do_lookup (undef_name, hash, ref, result, scope, i, skip, noexec,
-                    noplt);
+                    noplt);
 }
 
 static int
============================================================
Index: elf/dl-reloc.c
--- elf/dl-reloc.c 2001/07/06 04:54:46 1.54
+++ elf/dl-reloc.c 2001/07/31 14:18:46
@@ -27,6 +27,9 @@
 #include "dynamic-link.h"
 
 
+struct symcache *_dl_symcache;
+int _dl_maxchain;
+
 void
 _dl_relocate_object (struct link_map *l, struct r_scope_elem *scope[],
                      int lazy, int consider_profiling)
@@ -64,12 +67,15 @@ cannot make segment writable for relocat
         }
     }
 
+  _dl_symcache = alloca ((l->l_nchain+1) * sizeof (struct symcache));
+  memset (_dl_symcache, 0, (l->l_nchain+1) * sizeof (struct symcache));
+  _dl_maxchain = l->l_nchain;
   {
     /* Do the actual relocation of the object's GOT and other data. */
 
     /* String table object symbols. */
     const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
-
+
     /* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code. */
 #define RESOLVE_MAP(ref, version, flags) \
     (ELFW(ST_BIND) ((*ref)->st_info) != STB_LOCAL \
@@ -91,6 +97,8 @@ cannot make segment writable for relocat
 #include "dynamic-link.h"
     ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling);
 
+    _dl_symcache = NULL;
+
     if (__builtin_expect (_dl_profile != NULL, 0))
       {
         /* Allocate the array which will contain the already found
============================================================
Index: elf/rtld.c
--- elf/rtld.c 2001/07/26 00:25:54 1.200
+++ elf/rtld.c 2001/07/31 14:18:46
@@ -127,6 +127,7 @@ static hp_timing_t relocate_time;
 static hp_timing_t load_time;
 #endif
 extern unsigned long int _dl_num_relocations; /* in dl-lookup.c */
+extern unsigned long int _dl_num_cache_relocations; /* in dl-lookup.c */
 
 static ElfW(Addr) _dl_start_final (void *arg, struct link_map *bootstrap_map_p,
                                    hp_timing_t start_time);
@@ -1522,6 +1523,8 @@ print_statistics (void)
 #endif
   _dl_debug_printf (" number of relocations: %lu\n",
                     _dl_num_relocations);
+  _dl_debug_printf (" number of relocations from cache: %lu\n",
+                    _dl_num_cache_relocations);
 
 #ifndef HP_TIMING_NONAVAIL
   /* Time spend while loading the object and the dependencies. */
============================================================
Index: include/link.h
--- include/link.h 2001/07/26 00:24:13 1.13
+++ include/link.h 2001/07/31 14:18:47
@@ -1,6 +1,6 @@
 /* Data structure for communication from the run-time dynamic linker for
    loaded ELF shared objects.
-   Copyright (C) 1995-1999, 2000 Free Software Foundation, Inc.
+   Copyright (C) 1995-1999, 2000, 2001 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -158,7 +158,7 @@ struct link_map
     struct link_map *l_loader;
 
     /* Symbol hash table. */
-    Elf_Symndx l_nbuckets;
+    Elf_Symndx l_nbuckets, l_nchain;
     const Elf_Symndx *l_buckets, *l_chain;
 
     unsigned int l_opencount; /* Reference count for dlopen/dlclose. */
============================================================
Index: sysdeps/generic/ldsodefs.h
--- sysdeps/generic/ldsodefs.h 2001/07/06 04:55:49 1.27
+++ sysdeps/generic/ldsodefs.h 2001/07/31 14:18:47
@@ -38,6 +38,14 @@ __BEGIN_DECLS
    `ElfW(TYPE)' is used in place of `Elf32_TYPE' or `Elf64_TYPE'. */
 #define ELFW(type) _ElfW (ELF, __ELF_NATIVE_CLASS, type)
 
+#if __ELF_NATIVE_CLASS == 32
+# define Elf_r_info Elf32_Word
+#elif __ELF_NATIVE_CLASS == 64
+# define Elf_r_info Elf64_Xword
+#else
+# error "__ELF_NATIVE_CLASS unknown."
+#endif
+
 /* All references to the value of l_info[DT_PLTGOT],
   l_info[DT_STRTAB], l_info[DT_SYMTAB], l_info[DT_RELA],
   l_info[DT_REL], l_info[DT_JMPREL], and l_info[VERSYMIDX (DT_VERSYM)]
@@ -136,6 +144,14 @@ struct libname_list
   };
 
 
+/* Data structure to cache symbol lookups. */
+struct symcache
+  {
+    const ElfW (Sym) *symbol; /* Symbol table entry. */
+    struct link_map *object; /* Object defining symbol. */
+  };
+
+
 /* Test whether given NAME matches any of the names of the given object. */
 static __inline int
 __attribute__ ((unused))
@@ -332,7 +348,7 @@ extern lookup_t _dl_lookup_symbol (const
                                    struct link_map *undef_map,
                                    const ElfW(Sym) **sym,
                                    struct r_scope_elem *symbol_scope[],
-                                   int reloc_type, int explicit)
+                                   Elf_r_info r_info, int explicit)
      internal_function;
 
 /* Lookup versioned symbol. */
@@ -341,7 +357,7 @@ extern lookup_t _dl_lookup_versioned_sym
                                              const ElfW(Sym) **sym,
                                              struct r_scope_elem *symbol_scope[],
                                              const struct r_found_version *version,
-                                             int reloc_type, int explicit)
+                                             Elf_r_info r_info, int explicit)
      internal_function;
 
 /* For handling RTLD_NEXT we must be able to skip shared objects. */
============================================================
Index: sysdeps/i386/dl-machine.h
--- sysdeps/i386/dl-machine.h 2001/07/06 04:55:52 1.84
+++ sysdeps/i386/dl-machine.h 2001/07/31 14:18:47
@@ -322,7 +322,7 @@ elf_machine_rel (struct link_map *map, c
 #ifndef RTLD_BOOTSTRAP
       const Elf32_Sym *const refsym = sym;
 #endif
-      Elf32_Addr value = RESOLVE (&sym, version, ELF32_R_TYPE (reloc->r_info));
+      Elf32_Addr value = RESOLVE (&sym, version, reloc->r_info);
       if (sym)
         value += sym->st_value;
 

--
 Andreas Jaeger
  SuSE Labs aj@suse.de
   private aj@arthur.inka.de
    http://www.suse.de/~aj