This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch] skipping import libraries for performance reasons - direct auto-import of dll's


Okay, I've built and tested with this patch, and looked at the code a little more closely than previously. I've attached a modified version of Ralf's patch [but don't commit my version; wait for Ralf to regen].

The modified patch fixes a number of compiler warnings (use "%lx" not "%x" for long int variables; avoid use of uninitialized variables, etc). I've also fixed up the formatting, and added a few changes to correct a misfeature that I discovered.

---------------------------------------------------------------

I've initialized the [data|bss]_[start|end] variables as:

/* Initialization with start > end guarantees that is_data will not be
set by mistake, and avoids compiler warning */
unsigned long data_start = 1;
unsigned long data_end = 0;
unsigned long bss_start = 1;
unsigned long bss_end = 0;

so that the statement below doesn't cause a compiler warning...or a run time error. It is possible (I think) for a DLL to not have a .bss section at all, in which case bss_[start|end] never get initialized without the change above.

is_data = (func_rva >= data_start && func_rva < data_end )
|| (func_rva >= bss_start && func_rva < bss_end);

---------------------------------------------------------------

At one point, Ralf uses the following

/* skip unwanted symbols, which are exported in buggy auto-import
releases */
if (strstr(erva + name_rva,"_nm_") == 0)

What's the real purpose of this? It disallows my_symbol_nm_foo, as well
as _nm_foo or _imp_nm_foo or whatever it is that you're trying to screen out. Would it be better to use something like this, instead:

if (strncmp(erva+name_rva,"_nm_",4) != 0)

which would screen out only those symbols that *begin* with _nm_? My modified patch does NOT make this change, but I wonder if it should.

---------------------------------------------------------------

For the most part, it works as advertised. I did run in to one problem though. If I create a file structure like this:

/usr/local/bin/cygfoo.dll
/usr/local/lib/libfoo.dll.a -> /usr/local/bin/cygfoo.dll

Which seems like a logical thing to do, given that we're using the DLL to "substitute" for a true import lib. This way, you can do

gcc -o bar.exe bar.o -L/usr/local/lib -lfoo

and ld will use the symlink libfoo.dll.a to satisfy the dependency. Unfortunately, this doesn't work, because ld doesn't realize that "libfoo.dll.a" is actually a (symlink to) a DLL, and the pe_implied_import_dll routine is never called.

I know there are OTHER ways to set up the filesystem so that the gcc command above will work, such as:

/usr/local/bin/cygfoo.dll
/usr/local/lib/libfoo.dll -> /usr/local/bin/cygfoo.dll

or even

/usr/local/bin/cygfoo.dl
/usr/local/lib/cygfoo.dll -> /usr/local/bin/cygfoo.dll

But my point is that the original filesystem setup *should* work but does not. The problem is in emultempl/pe.em (line 1395):

if (bfd_get_format (entry->the_bfd) == bfd_object)
{
const char *ext = entry->filename + strlen (entry->filename) - 4;
if (strcmp (ext, ".dll") == 0 || strcmp (ext, ".DLL") == 0)
return pe_implied_import_dll (entry->filename);
}
#endif
return false;
}

As you can see, pe_implied_import_dll is only called if the filename ends in .dll or .DLL. We know that the DLL itself must have a name that ends in .dll(.DLL), but the linker ought to be able to recognize a symlink-to-a-dll as well(*). The stuff above should be replaced by something like the following:

if (bfd_get_format (entry->the_bfd) == bfd_object)
{
char fbuf[PATH_MAX];
const char *ext;
if (realpath(entry->filename,fbuf) == NULL)
strncpy(fbuf,entry->filename,PATH_MAX);
ext = fbuf + strlen (fbuf) - 4;
if (strcmp (ext, ".dll") == 0 || strcmp (ext, ".DLL") == 0)
return pe_implied_import_dll (entry->filename);
}
#endif
return false;
}

Only problem: there's no guarantee that realpath or PATH_MAX is available, so we need to jump thru some hoops to define LD_PATHMAX to PATH_MAX or MAXPATHLEN or whatever, depending on what headers are available...

So, we have to play games in ld/sysdep.h, and modify configure.in (and run autoconf and autoheader) ...but once that's done, the /usr/local/lib/libfoo.dll.a -> /usr/local/bin/cygfoo.dll scenario works.

- - - - - - - - - - - - - - - - - - - -
(*) symlink-to-a-dll would be INVALID without this change (already in Ralf's patch):

+ /* use internal dll name instead of filename
+ to enable symbolic dll linking */
+ dll_name = pe_as32 (expdata + 12) + erva ;

Without it, the symlink's name would get embedded into the target as a dependency -- and the Windows Runtime Loader would get really confused since it doesn't understant symlinks, and only loads files that DO end in .dll. So that's why this "problem" never came up before; it's only worth consideration given Ralf's change...but Ralf's change should be accompanied by the configure.in/config.in changes.
- - - - - - - - - - - - - - - - - - - -

---------------------------------------------------------------

I've split the patch into two pieces:
ld-auto-import-dll.patch-csw
the main changes
ld-auto-import-dll.patch-csw2.gz
the configure and config.in changes created by running
autoconf and autoheader.

Any comments on the revised patch? Is there a better way to handle the realpath()/REALPATH() thing?

2002-11-28 Ralf Habacker <Ralf.Habacker@freenet.de>
Charles Wilson <cwilson@ece.gatech.edu>

* ld/config.in: regenerate
* ld/configure: regenerate
* ld/configure.in: add check for realpath function
* ld/deffile.h: add .data field to def_file_import
structure
* ld/pe-dll.c (pe_proces_import_defs): use .data
field of def_file_import structure to initialize
flag_data field of def_file_export structure
(pe_implied_import_dll): new variables exp_funcbase
and [data|bss]_[start|end]. Use DLL's internal name
to set dll_name, not filename (which may be a symlink).
Scan the sections and initialize [data|bss]_[start|end].
When scanning the export table, skip _nm_ symbols, and
mark any symbols whose rva indicates that it is in the
.bss or .data sections as data.
* ld/sysdep.h: include limits.h and sys/param.h, and
define LD_PATHMAX as appropriate. Also define REALPATH
as realpath if it exists, NULL otherwise
* ld/emultempl/pe.em (gld_${EMULATION_NAME}_after_open):
call pe_process_import_defs before pe_find_data_imports,
so that auto-import will check the virtual implib as well
as "real" implibs.
(gld_${EMULATION_NAME}_recognized_file): use REALPATH to
follow symlinks to their target; check that the target's
extension is .dll before calling pe_implied_import_dll(),
not the filename itself (which may be a symlink).

--Chuck
Index: ld/configure.in
===================================================================
RCS file: /cvs/src/src/ld/configure.in,v
retrieving revision 1.20
diff -u -r1.20 configure.in
--- ld/configure.in	12 Nov 2002 10:08:23 -0000	1.20
+++ ld/configure.in	28 Nov 2002 20:54:49 -0000
@@ -83,7 +83,7 @@
 AC_SUBST(NATIVE_LIB_DIRS)
 
 AC_CHECK_HEADERS(string.h strings.h stdlib.h unistd.h)
-AC_CHECK_FUNCS(sbrk)
+AC_CHECK_FUNCS(sbrk realpath)
 AC_HEADER_DIRENT
 
 BFD_BINARY_FOPEN
Index: ld/deffile.h
===================================================================
RCS file: /cvs/src/src/ld/deffile.h,v
retrieving revision 1.5
diff -u -r1.5 deffile.h
--- ld/deffile.h	13 Mar 2001 06:14:27 -0000	1.5
+++ ld/deffile.h	28 Nov 2002 20:54:50 -0000
@@ -52,6 +52,7 @@
   def_file_module *module;	/* always set */
   char *name;			/* may be NULL; either this or ordinal will be set */
   int ordinal;			/* may be -1 */
+  int data;			/* = 1 if data */
 } def_file_import;
 
 typedef struct def_file {
Index: ld/pe-dll.c
===================================================================
RCS file: /cvs/src/src/ld/pe-dll.c,v
retrieving revision 1.48
diff -u -r1.48 pe-dll.c
--- ld/pe-dll.c	14 Nov 2002 18:03:16 -0000	1.48
+++ ld/pe-dll.c	28 Nov 2002 20:56:18 -0000
@@ -2418,7 +2418,7 @@
 		exp.hint = exp.ordinal >= 0 ? exp.ordinal : 0;
 		exp.flag_private = 0;
 		exp.flag_constant = 0;
-		exp.flag_data = 0;
+		exp.flag_data = pe_def_file->imports[i].data;
 		exp.flag_noname = exp.name ? 0 : 1;
 		one = make_one (&exp, output_bfd);
 		add_bfd_to_link (one, one->filename, link_info);
@@ -2490,11 +2490,18 @@
 {
   bfd *dll;
   unsigned long pe_header_offset, opthdr_ofs, num_entries, i;
-  unsigned long export_rva, export_size, nsections, secptr, expptr;
+  unsigned long export_rva, export_size, nsections, secptr, expptr,exp_funcbase;
   unsigned char *expdata, *erva;
   unsigned long name_rvas, ordinals, nexp, ordbase;
   const char *dll_name;
 
+  /* Initialization with start > end guarantees that is_data will not be
+     set by mistake, and avoids compiler warning */
+  unsigned long data_start = 1;
+  unsigned long data_end   = 0;
+  unsigned long bss_start  = 1;
+  unsigned long bss_end    = 0;
+
   /* No, I can't use bfd here.  kernel32.dll puts its export table in
      the middle of the .rdata section.  */
   dll = bfd_openr (filename, pe_details->target_name);
@@ -2511,11 +2518,8 @@
       return false;
     }
 
-  dll_name = filename;
-  for (i = 0; filename[i]; i++)
-    if (filename[i] == '/' || filename[i] == '\\' || filename[i] == ':')
-      dll_name = filename + i + 1;
-
+  /* get pe-header, optional header and numbers  
+     of export entries */
   pe_header_offset = pe_get32 (dll, 0x3c);
   opthdr_ofs = pe_header_offset + 4 + 20;
   num_entries = pe_get32 (dll, opthdr_ofs + 92);
@@ -2530,6 +2534,7 @@
 	    pe_get16 (dll, pe_header_offset + 4 + 16));
   expptr = 0;
 
+  /* get the rva and size of the export section */ 
   for (i = 0; i < nsections; i++)
     {
       char sname[8];
@@ -2550,6 +2555,43 @@
 	}
     }
 
+  /* scan sections and store the base and 
+     size of the data and bss segments in 
+     data/base_start/end */ 
+  for (i = 0; i < nsections; i++)
+    {
+      unsigned long secptr1 = secptr + 40 * i;
+
+      unsigned long vsize = pe_get32 (dll, secptr1 + 8);
+      unsigned long vaddr = pe_get32 (dll, secptr1 + 12);
+      unsigned long flags = pe_get32 (dll, secptr1 + 36);
+      char sec_name[9];
+
+      sec_name[8] = '\0';
+      bfd_seek (dll, (file_ptr) secptr1 + 0, SEEK_SET);
+      bfd_bread (sec_name, (bfd_size_type) 8, dll);
+
+      if (strcmp(sec_name,".data") == 0)
+	{
+	  data_start = vaddr;
+	  data_end = vaddr + vsize;
+	  if (pe_dll_extra_pe_debug)
+	    {
+	      printf("%s %s: 0x%08lx-0x%08lx (0x%08lx)\n",__FUNCTION__,sec_name,vaddr,vaddr+vsize,flags);
+	    
+	    }
+        }
+      else if (strcmp(sec_name,".bss") == 0)
+	{
+	  bss_start = vaddr;
+	  bss_end = vaddr + vsize;
+	  if (pe_dll_extra_pe_debug)
+	    {
+	      printf("%s %s: 0x%08lx-0x%08lx (0x%08lx)\n",__FUNCTION__,sec_name,vaddr,vaddr+vsize,flags);
+	    }
+	}
+    }
+
   expdata = (unsigned char *) xmalloc (export_size);
   bfd_seek (dll, (file_ptr) expptr, SEEK_SET);
   bfd_bread (expdata, (bfd_size_type) export_size, dll);
@@ -2562,16 +2604,42 @@
   name_rvas = pe_as32 (expdata + 32);
   ordinals = pe_as32 (expdata + 36);
   ordbase = pe_as32 (expdata + 16);
+  exp_funcbase = pe_as32 (expdata + 28);
+
+  /* use internal dll name instead of filename 
+     to enable symbolic dll linking */ 
+  dll_name = pe_as32 (expdata + 12) + erva ;
 
+  /* iterate through the list of symbols */ 
   for (i = 0; i < nexp; i++)
     {
+      /* pointer to the names vector */ 
       unsigned long name_rva = pe_as32 (erva + name_rvas + i * 4);
       def_file_import *imp;
 
-      imp = def_file_add_import (pe_def_file, erva + name_rva, dll_name,
-				 i, 0);
-    }
+      /* pointer to the function address vector */ 
+      unsigned long func_rva = pe_as32 (erva + exp_funcbase + i * 4);
+      int is_data = 0;
 
+      /* skip unwanted symbols, which are exported in buggy auto-import releases */
+      if (strstr(erva + name_rva,"_nm_") == 0)
+	{
+	  /* is_data is true if the address is in the data or bss segment */ 
+	  is_data = (func_rva >= data_start && func_rva < data_end )
+		    || (func_rva >= bss_start && func_rva < bss_end);
+	  imp = def_file_add_import (pe_def_file, erva + name_rva, dll_name,
+		i, 0);
+
+	  /* mark symbole type */
+	  imp->data = is_data;
+
+	  if (pe_dll_extra_pe_debug)
+	    {
+	      printf("%s dll-name: %s sym: %s addr: 0x%lx %s\n",
+		     __FUNCTION__, dll_name, erva + name_rva, func_rva, is_data ? "(data)" : "");
+	    }
+	}
+    }
   return true;
 }
 
Index: ld/sysdep.h
===================================================================
RCS file: /cvs/src/src/ld/sysdep.h,v
retrieving revision 1.2
diff -u -r1.2 sysdep.h
--- ld/sysdep.h	13 Mar 2001 06:14:27 -0000	1.2
+++ ld/sysdep.h	28 Nov 2002 20:56:19 -0000
@@ -48,6 +48,30 @@
 #include <unistd.h>
 #endif
 
+/* for PATH_MAX */
+#ifdef HAVE_LIMITS_H
+#include <limits.h>
+#endif
+/* for MAXPATHLEN */
+#ifdef HAVE_SYS_PARAM_H
+#include <sys/param.h>
+#endif
+#ifdef PATH_MAX
+# define LD_PATHMAX PATH_MAX
+#else
+# ifdef MAXPATHLEN
+#  define LD_PATHMAX MAXPATHLEN
+# else
+#  define LD_PATHMAX 1024
+# endif
+#endif
+
+#ifdef HAVE_REALPATH
+# define REALPATH(a,b) realpath(a,b)
+#else
+# define REALPATH(a,b) NULL
+#endif
+
 #ifdef USE_BINARY_FOPEN
 #include "fopen-bin.h"
 #else
Index: ld/emultempl/pe.em
===================================================================
RCS file: /cvs/src/src/ld/emultempl/pe.em,v
retrieving revision 1.69
diff -u -r1.69 pe.em
--- ld/emultempl/pe.em	14 Nov 2002 18:03:16 -0000	1.69
+++ ld/emultempl/pe.em	28 Nov 2002 20:56:37 -0000
@@ -1009,9 +1009,10 @@
   if (pe_enable_stdcall_fixup) /* -1=warn or 1=disable */
     pe_fixup_stdcalls ();
 
+  pe_process_import_defs(output_bfd, &link_info);
+
   pe_find_data_imports ();
 
-  pe_process_import_defs(output_bfd, &link_info);
   if (link_info.shared)
     pe_dll_build_sections (output_bfd, &link_info);
 
@@ -1388,10 +1389,13 @@
 #endif
   if (bfd_get_format (entry->the_bfd) == bfd_object)
     {
-      const char *ext = entry->filename + strlen (entry->filename) - 4;
-
+      char fbuf[LD_PATHMAX];
+      const char *ext;
+      if (REALPATH(entry->filename,fbuf) == NULL)
+	strncpy(fbuf,entry->filename,LD_PATHMAX);
+      ext = fbuf + strlen (fbuf) - 4;
       if (strcmp (ext, ".dll") == 0 || strcmp (ext, ".DLL") == 0)
-	return pe_implied_import_dll (entry->filename);
+	return pe_implied_import_dll (fbuf);
     }
 #endif
   return false;

Attachment: ld-auto-import-dll.patch-csw2.gz
Description: GNU Zip compressed data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]