This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.

Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

RFC: Caching of lookup results during startup



To improve the start up time of the dynamic linker, I followed an idea
the BSD folks implemented (my view is i386 centric but everything can
be generalized):

Large applications, especially C++ ones, have a number of PC32
relocations against symbols.  The lookup of these symbols is quite
expensive.  I'm caching these lookups during the initial relocation of
an object (in dl-relocate) now and have decreased the relocation time
by 45 % in one large application (konqueror).

For comparison, here's the normal linker:
gee:/builds/glibc/cross:[254]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /opt/kde2/bin/konqueror -/
16634:
16634:  runtime linker statistics:
16634:    total startup time in dynamic loader: 699430072 clock cycles
16634:              time needed for relocation: 694116497 clock cycles (99.2)
16634:                   number of relocations: 54130
16634:             time needed to load objects: 4958742 clock cycles (.7)
konqueror: Unknown option '-/'.
konqueror: Use --help to get a list of available command line options.

And this is my version, note that number of relocations from cache is
a subset of the other relocations):
gee:/builds/glibc/main-gcc-2.95:[254]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /opt/kde2/bin/konqueror -/
22277:
22277:  runtime linker statistics:
22277:    total startup time in dynamic loader: 392157353 clock cycles
22277:              time needed for relocation: 386720405 clock cycles (98.6)
22277:                   number of relocations: 54130
22277:                   number of relocations from cache: 33770
22277:             time needed to load objects: 5044967 clock cycles (1.2)
konqueror: Unknown option '-/'.
konqueror: Use --help to get a list of available command line options.

For smaller applications the cache overhead is too expensive and the
application startup time is unfortunatly increased (here 7 %):

Normal:
gee:/builds/glibc/cross:[0]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /bin/ls /
22298:
22298:  runtime linker statistics:
22298:    total startup time in dynamic loader: 1280514 clock cycles
22298:              time needed for relocation: 892054 clock cycles (69.6)
22298:                   number of relocations: 362
22298:             time needed to load objects: 245332 clock cycles (19.1)

With cache:
gee:/builds/glibc/main-gcc-2.95:[0]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /bin/ls /
22299:
22299:  runtime linker statistics:
22299:    total startup time in dynamic loader: 1399818 clock cycles
22299:              time needed for relocation: 960693 clock cycles (68.6)
22299:                   number of relocations: 362
22299:                   number of relocations from cache: 100
22299:             time needed to load objects: 232376 clock cycles (16.6)

I'm appending a first version as proof of concept to get your
comments.

I do have the following open questions/things to do:
- Provide patches for platforms beside i386
- _dl_symcache and _dl_maxchain should be passed as parameters to
  _dl_lookup_symbol/_dl_lookup_versioned_symbol instead of making them
  global.
- The code needs reindentation.
- Do I need the version number in the cache?  I don't think so (and it
  works without) but I'm not sure whether I'm missing something.
- Any other ideas to improve this?
- Is this a post 2.2.4 project?  I do think so.

How shall I continue with this?

Btw. I do get at the moment segvs in dlfcn/errmsg1 and elf/loadfail
that I need to investigate, any help/hints would be appreciated.

This work was influenced by Waldo's paper (discussed on libc-alpha)
and I got already some good advice from Andreas Schwab.

Andreas

2001-07-31  Andreas Jaeger  <aj@suse.de>

	* sysdeps/i386/dl-machine.h (elf_machine_rel): Pass r_info to
	RESOLVE.

	* sysdeps/generic/ldsodefs.h (Elf_r_info): New.
	Adjust declarations of _dl_lookup_symbol and
	_dl_lookup_versioned_symbol.

	* elf/rtld.c (print_statistics): Print cache relocations.

	* elf/dl-reloc.c (_dl_relocate_object): Set up cache.

	* elf/dl-lookup.c: New variable _dl_num_cache_relocations.
	(_dl_setup_hash): Set l_nchain.
	(_dl_lookup_symbol): Change parameter to use r_info.
	Use symbol cache.
	(_dl_lookup_versioned_symbol_skip): Likewise.

============================================================
Index: elf/dl-lookup.c
--- elf/dl-lookup.c	2001/07/06 04:54:46	1.78
+++ elf/dl-lookup.c	2001/07/31 14:18:46
@@ -60,6 +60,7 @@ struct sym_val
 
 /* Statistics function.  */
 unsigned long int _dl_num_relocations;
+unsigned long int _dl_num_cache_relocations;
 
 /* During the program run we must not modify the global data of
    loaded shared object simultanously in two threads.  Therefore we
@@ -80,6 +81,8 @@ __libc_lock_define (extern, _dl_load_loc
 #define VERSIONED	1
 #include "do-lookup.h"
 
+extern struct symcache *_dl_symcache;
+extern int _dl_maxchain;     
 
 /* Add extra dependency on MAP to UNDEF_MAP.  */
 static int
@@ -200,18 +203,30 @@ lookup_t
 internal_function
 _dl_lookup_symbol (const char *undef_name, struct link_map *undef_map,
 		   const ElfW(Sym) **ref, struct r_scope_elem *symbol_scope[],
-		   int reloc_type, int explicit)
+		   Elf_r_info r_info, int explicit)
 {
   const char *reference_name = undef_map ? undef_map->l_name : NULL;
   const unsigned long int hash = _dl_elf_hash (undef_name);
   struct sym_val current_value = { NULL, NULL };
   struct r_scope_elem **scope;
   int protected;
+  int reloc_type = ELFW(R_TYPE) (r_info);
   int noexec = elf_machine_lookup_noexec_p (reloc_type);
   int noplt = elf_machine_lookup_noplt_p (reloc_type);
+  int symbol_num = ELFW(R_SYM) (r_info);
 
   ++_dl_num_relocations;
 
+  assert (symbol_num < _dl_maxchain);
+  if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol != NULL)
+    {
+      current_value.s = _dl_symcache[symbol_num].symbol;
+      current_value.m = _dl_symcache[symbol_num].object;
+      ++_dl_num_cache_relocations;
+    }
+  else
+    {
+
   /* Search the relevant loaded objects for a definition.  */
   for (scope = symbol_scope; *scope; ++scope)
     if (do_lookup (undef_name, hash, *ref, &current_value, *scope, 0, NULL,
@@ -233,7 +248,7 @@ _dl_lookup_symbol (const char *undef_nam
 	  /* Something went wrong.  Perhaps the object we tried to reference
 	     was just removed.  Try finding another definition.  */
 	  return _dl_lookup_symbol (undef_name, undef_map, ref, symbol_scope,
-				    reloc_type, 0);
+				    r_info, 0);
 
 	break;
       }
@@ -250,7 +265,8 @@ _dl_lookup_symbol (const char *undef_nam
       *ref = NULL;
       return 0;
     }
-
+    }
+  
   protected = *ref && ELFW(ST_VISIBILITY) ((*ref)->st_other) == STV_PROTECTED;
 
   if (__builtin_expect (_dl_debug_mask & DL_DEBUG_BINDINGS, 0))
@@ -261,6 +277,13 @@ _dl_lookup_symbol (const char *undef_nam
 		       ? current_value.m->l_name : _dl_argv[0],
 		       protected ? "protected" : "normal", undef_name);
 
+
+  if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol == NULL)
+    {
+      _dl_symcache[symbol_num].symbol = current_value.s;
+      _dl_symcache[symbol_num].object = current_value.m;
+    }
+
   if (__builtin_expect (protected == 0, 1))
     {
       *ref = current_value.s;
@@ -378,18 +401,30 @@ _dl_lookup_versioned_symbol (const char 
 			     struct link_map *undef_map, const ElfW(Sym) **ref,
 			     struct r_scope_elem *symbol_scope[],
 			     const struct r_found_version *version,
-			     int reloc_type, int explicit)
+			     Elf_r_info r_info, int explicit)
 {
   const char *reference_name = undef_map ? undef_map->l_name : NULL;
   const unsigned long int hash = _dl_elf_hash (undef_name);
   struct sym_val current_value = { NULL, NULL };
   struct r_scope_elem **scope;
   int protected;
+  int reloc_type = ELFW(R_TYPE) (r_info);
   int noexec = elf_machine_lookup_noexec_p (reloc_type);
   int noplt = elf_machine_lookup_noplt_p (reloc_type);
+  int symbol_num = ELFW(R_SYM) (r_info);
 
   ++_dl_num_relocations;
 
+  assert (symbol_num < _dl_maxchain);
+  if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol != NULL)
+    {
+      current_value.s = _dl_symcache[symbol_num].symbol;
+      current_value.m = _dl_symcache[symbol_num].object;
+      ++_dl_num_cache_relocations;
+    }
+  else
+
+    {
   /* Search the relevant loaded objects for a definition.  */
   for (scope = symbol_scope; *scope; ++scope)
     {
@@ -414,7 +449,7 @@ _dl_lookup_versioned_symbol (const char 
 	       was just removed.  Try finding another definition.  */
 	    return _dl_lookup_versioned_symbol (undef_name, undef_map, ref,
 						symbol_scope, version,
-						reloc_type, 0);
+						r_info, 0);
 
 	  break;
 	}
@@ -438,6 +473,8 @@ _dl_lookup_versioned_symbol (const char 
 	  return 0;
 	}
     }
+    }
+  
 
   if (__builtin_expect (current_value.s == NULL, 0))
     {
@@ -464,6 +501,12 @@ _dl_lookup_versioned_symbol (const char 
 		       protected ? "protected" : "normal",
 		      undef_name, version->name);
 
+  if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol == NULL)
+    {
+      _dl_symcache[symbol_num].symbol = current_value.s;
+      _dl_symcache[symbol_num].object = current_value.m;
+    }
+
   if (__builtin_expect (protected == 0, 1))
     {
       *ref = current_value.s;
@@ -591,14 +634,13 @@ internal_function
 _dl_setup_hash (struct link_map *map)
 {
   Elf_Symndx *hash;
-  Elf_Symndx nchain;
 
   if (!map->l_info[DT_HASH])
     return;
   hash = (void *)(map->l_addr + map->l_info[DT_HASH]->d_un.d_ptr);
 
   map->l_nbuckets = *hash++;
-  nchain = *hash++;
+  map->l_nchain = *hash++;
   map->l_buckets = hash;
   hash += map->l_nbuckets;
   map->l_chain = hash;
@@ -614,7 +656,7 @@ _dl_do_lookup (const char *undef_name, u
 	       struct link_map *skip, int noexec, int noplt)
 {
   return do_lookup (undef_name, hash, ref, result, scope, i, skip, noexec,
-  		    noplt);
+		    noplt);
 }
 
 static int
============================================================
Index: elf/dl-reloc.c
--- elf/dl-reloc.c	2001/07/06 04:54:46	1.54
+++ elf/dl-reloc.c	2001/07/31 14:18:46
@@ -27,6 +27,9 @@
 #include "dynamic-link.h"
 
 
+struct symcache *_dl_symcache;
+int _dl_maxchain;
+
 void
 _dl_relocate_object (struct link_map *l, struct r_scope_elem *scope[],
 		     int lazy, int consider_profiling)
@@ -64,12 +67,15 @@ cannot make segment writable for relocat
 	  }
     }
 
+  _dl_symcache = alloca ((l->l_nchain+1) * sizeof (struct symcache));
+  memset (_dl_symcache, 0, (l->l_nchain+1) * sizeof (struct symcache));
+  _dl_maxchain = l->l_nchain;
   {
     /* Do the actual relocation of the object's GOT and other data.  */
 
     /* String table object symbols.  */
     const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
-
+    
     /* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code.  */
 #define RESOLVE_MAP(ref, version, flags) \
     (ELFW(ST_BIND) ((*ref)->st_info) != STB_LOCAL			      \
@@ -91,6 +97,8 @@ cannot make segment writable for relocat
 #include "dynamic-link.h"
     ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling);
 
+    _dl_symcache = NULL;
+    
     if (__builtin_expect (_dl_profile != NULL, 0))
       {
 	/* Allocate the array which will contain the already found
============================================================
Index: elf/rtld.c
--- elf/rtld.c	2001/07/26 00:25:54	1.200
+++ elf/rtld.c	2001/07/31 14:18:46
@@ -127,6 +127,7 @@ static hp_timing_t relocate_time;
 static hp_timing_t load_time;
 #endif
 extern unsigned long int _dl_num_relocations;	/* in dl-lookup.c */
+extern unsigned long int _dl_num_cache_relocations;	/* in dl-lookup.c */
 
 static ElfW(Addr) _dl_start_final (void *arg, struct link_map *bootstrap_map_p,
 				   hp_timing_t start_time);
@@ -1522,6 +1523,8 @@ print_statistics (void)
 #endif
   _dl_debug_printf ("                 number of relocations: %lu\n",
 		    _dl_num_relocations);
+  _dl_debug_printf ("                 number of relocations from cache: %lu\n",
+		    _dl_num_cache_relocations);
 
 #ifndef HP_TIMING_NONAVAIL
   /* Time spend while loading the object and the dependencies.  */
============================================================
Index: include/link.h
--- include/link.h	2001/07/26 00:24:13	1.13
+++ include/link.h	2001/07/31 14:18:47
@@ -1,6 +1,6 @@
 /* Data structure for communication from the run-time dynamic linker for
    loaded ELF shared objects.
-   Copyright (C) 1995-1999, 2000 Free Software Foundation, Inc.
+   Copyright (C) 1995-1999, 2000, 2001 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -158,7 +158,7 @@ struct link_map
     struct link_map *l_loader;
 
     /* Symbol hash table.  */
-    Elf_Symndx l_nbuckets;
+    Elf_Symndx l_nbuckets, l_nchain;
     const Elf_Symndx *l_buckets, *l_chain;
 
     unsigned int l_opencount;	/* Reference count for dlopen/dlclose.  */
============================================================
Index: sysdeps/generic/ldsodefs.h
--- sysdeps/generic/ldsodefs.h	2001/07/06 04:55:49	1.27
+++ sysdeps/generic/ldsodefs.h	2001/07/31 14:18:47
@@ -38,6 +38,14 @@ __BEGIN_DECLS
    `ElfW(TYPE)' is used in place of `Elf32_TYPE' or `Elf64_TYPE'.  */
 #define ELFW(type)	_ElfW (ELF, __ELF_NATIVE_CLASS, type)
 
+#if __ELF_NATIVE_CLASS == 32
+# define Elf_r_info Elf32_Word
+#elif __ELF_NATIVE_CLASS == 32
+# define Elf_r_info Elf64_Xword
+#else
+# error "__ELF_NATIVE_CLASS unknown."
+#endif
+
 /* All references to the value of l_info[DT_PLTGOT],
   l_info[DT_STRTAB], l_info[DT_SYMTAB], l_info[DT_RELA],
   l_info[DT_REL], l_info[DT_JMPREL], and l_info[VERSYMIDX (DT_VERSYM)]
@@ -136,6 +144,14 @@ struct libname_list
   };
 
 
+/* Data structure to cache symbol lookups.  */
+struct symcache
+  {
+    const ElfW (Sym) *symbol;		/* Symbol table entry.  */
+    struct link_map *object;	/* Object defining symbol.  */
+  };
+
+
 /* Test whether given NAME matches any of the names of the given object.  */
 static __inline int
 __attribute__ ((unused))
@@ -332,7 +348,7 @@ extern lookup_t _dl_lookup_symbol (const
 				   struct link_map *undef_map,
 				   const ElfW(Sym) **sym,
 				   struct r_scope_elem *symbol_scope[],
-				   int reloc_type, int explicit)
+				   Elf_r_info r_info, int explicit)
      internal_function;
 
 /* Lookup versioned symbol.  */
@@ -341,7 +357,7 @@ extern lookup_t _dl_lookup_versioned_sym
 					     const ElfW(Sym) **sym,
 					     struct r_scope_elem *symbol_scope[],
 					     const struct r_found_version *version,
-					     int reloc_type, int explicit)
+					     Elf_r_info r_info, int explicit)
      internal_function;
 
 /* For handling RTLD_NEXT we must be able to skip shared objects.  */
============================================================
Index: sysdeps/i386/dl-machine.h
--- sysdeps/i386/dl-machine.h	2001/07/06 04:55:52	1.84
+++ sysdeps/i386/dl-machine.h	2001/07/31 14:18:47
@@ -322,7 +322,7 @@ elf_machine_rel (struct link_map *map, c
 #ifndef RTLD_BOOTSTRAP
       const Elf32_Sym *const refsym = sym;
 #endif
-      Elf32_Addr value = RESOLVE (&sym, version, ELF32_R_TYPE (reloc->r_info));
+      Elf32_Addr value = RESOLVE (&sym, version, reloc->r_info);
       if (sym)
 	value += sym->st_value;
 

-- 
 Andreas Jaeger
  SuSE Labs aj@suse.de
   private aj@arthur.inka.de
    http://www.suse.de/~aj


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]