This is the mail archive of the cygwin-apps@cygwin.com mailing list for the Cygwin project.



[RFA] pei386 dll: auto-import patch


I have taken Paul Sokolovsky's auto-import and export-filtering patch,
as modified by Robert Collins, and have updated it to current CVS.  I've
added documentation concerning auto-import to ld.texinfo (under the
Machine Dependent section).  In addition to the extensive testing of the
older version (as has been discussed on and off this list for the past
several weeks, esp. last week), I have rebuilt current CVS binutils with
this patch, and it works as expected.  It adds desirable behavior to ld
for x86 targets on Windows-derived platforms, such as mingw and cygwin.
I believe all the issues raised by members of those communities have
been addressed.

This patch actually commingles two behaviors: additional auto-EXport 
filtering and auto-import of DATA items.  However, the auto-EXport 
improvements are a prerequisite for the auto-import to work properly. 
Also, the auto-EXport filtering merely extends the current behavior 
using established hooks.  Thus, it's not really a "new" feature, and it is
combined with the novel auto-import patch, for which it is a necessary
prerequisite.
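
To make the export-filtering rules concrete, here is a minimal standalone
sketch of the name-based checks only.  It is not code from the patch: the
helper name name_survives_default_excludes is hypothetical, the
archive/object-file checks (libgcc, libstdc++, libmingw32, crtN.o) are
left out, and all checks are lumped together even though the patch applies
the "_imp__" test unconditionally and the rest only when default excludes
are enabled.  Names are shown as auto_export() sees them, i.e. after
process_def_file() has stripped one leading underscore.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only; the patch does this work
   inside auto_export() in ld/pe-dll.c.  Returns 1 if N survives the
   name-based default excludes, 0 if it would be filtered out.  */
static int
name_survives_default_excludes (const char *n)
{
  /* Compared with strcmp in the patch: exact matches only.  */
  static const char *const exact[] = {
    "DllMain@12", "DllEntryPoint@0", "DllMainCRTStartup@12", "impure_ptr"
  };
  /* Compared with strncmp in the patch, so these act as prefixes.  */
  static const char *const prefixes[] = {
    "_imp__", "__rtti_", "__builtin_", "_head_",
    "_cygwin_dll_entry@12", "_cygwin_crt0_common@8",
    "_cygwin_noncygwin_dll_entry@12", "_fmode", "_impure_ptr",
    "cygwin_attach_dll", "cygwin_premain0", "cygwin_premain1",
    "cygwin_premain2", "cygwin_premain3", "environ"
  };
  size_t i, len;

  for (i = 0; i < sizeof exact / sizeof exact[0]; i++)
    if (strcmp (n, exact[i]) == 0)
      return 0;

  for (i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++)
    if (strncmp (n, prefixes[i], strlen (prefixes[i])) == 0)
      return 0;

  /* Names ending in "_iname" describe the DLL's internal layout.  */
  len = strlen (n);
  if (len > 6 && strcmp (n + len - 6, "_iname") == 0)
    return 0;

  return 1;
}
```

Because the cygwin-specific names are compared with strncmp, the sketch
(like the patch) also filters any symbol that merely starts with one of
them, for example "environment".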

Please apply.

--Charles Wilson

2001-07-31  Paul Sokolovsky  <paul.sokolovsky@technologist.com>

         * ld/pe-dll.c: new variable pe_dll_gory_debug. New static
         variable current_sec (static struct sec *)
         * ld/pe-dll.c (auto_export): change API, pass abfd for
         contextual filtering: don't export any symbols that come
         from libgcc, libstdc++, libmingw32, crt0.o, crt1.o
         or crt2.o.
         Do not export symbols starting with
            "_imp__" (e.g. don't re-export imported symbols)
            "__rtti_" or "__builtin_" (C++ support)
         Do not export the following symbols
            "DllMainCRTStartup@12" "_cygwin_dll_entry@12"
            "_cygwin_crt0_common@8" "_cygwin_noncygwin_dll_entry@12"
            "_fmode" "_impure_ptr" "cygwin_attach_dll" "cygwin_premain0"
            "cygwin_premain1" "cygwin_premain2" "cygwin_premain3"
	   "environ"
         Do not export symbols for specifying internal layout of DLLs
            exclude symbols starting with "_head_"
            exclude symbols ending with "_iname"
         * ld/pe-dll.c (process_def_file): Don't export undefined
	symbols. Do not export symbols starting with  "_imp__" (e.g.
	don't re-export imported symbols).  Call auto_export() with
	new API.
         * ld/pe-dll.c (pe_walk_relocs_of_symbol): New function.
         * ld/pe-dll.c (generate_reloc): add optional gory debugging
         * ld/pe-dll.c (pe_dll_generate_def_file): eliminate extraneous
         initial blank line in output
         * ld/pe-dll.c (make_one): enlarge symtab to make room for
         __nm__ symbols (DATA auto-import support).
         * ld/pe-dll.c (make_singleton_name_thunk): New function.
         * ld/pe-dll.c (make_import_fixup_mark): New function.
         * ld/pe-dll.c (make_import_fixup_entry): New function.
         * ld/pe-dll.c (pe_create_import_fixup): New function.
         * ld/pe-dll.c (add_bfd_to_link): make this function non-static.
	Specify that name argument is a CONST char *.

         * ld/pe-dll.h: declare new variables int pe_dll_auto_import and
         pe_dll_gory_debug; declare new functions
	pe_walk_relocs_of_symbol and pe_create_import_fixup.

         * ld/emultempl/pe.em: add options --disable-auto-import and
         --enable-gory-debug.  New variable data_import_dll.
         * ld/emultempl/pe.em (make_import_fixup): New function.
         * ld/emultempl/pe.em (pe_find_data_imports): New function.
         * ld/emultempl/pe.em (pr_sym): New function.
         * ld/emultempl/pe.em (gld_${EMULATION_NAME}_after_open): Add
	optional gory debugging. Call pe_find_data_imports.  .idata is
	NOT code, it is data; flag it so.

         * ld/ldlang.c (load_symbols): check if
         ldemul_unrecognized_file(entry) prior to calling
         bfd_get_error()

         * bfd/cofflink.c: new variable pe_dll_auto_import
         * bfd/cofflink.c (coff_link_check_ar_symbols): also search for
         __imp__symbol as well as _symbol.

         * bfd/linker.c (_bfd_generic_link_add_archive_symbols): also
         search for __imp__symbol as well as _symbol.

2001-07-31  Charles Wilson  <cwilson@ece.gatech.edu>
	
	* ld/gen-doc.texi: process I80386 arch-specific documentation in
	ld.texinfo

         * ld/ld.texinfo: add additional documentation for
         --export-all-symbols.  Document --out-implib,
         --enable-auto-image-base, --disable-auto-image-base,
         --dll-search-prefix, and --disable-auto-import.  Add ix86
         machine-specific documentation; currently, only documents
         the auto-import changes for DLLs.

Index: bfd/cofflink.c
===================================================================
RCS file: /cvs/src/src/bfd/cofflink.c,v
retrieving revision 1.24
diff -u -r1.24 cofflink.c
--- cofflink.c	2001/07/03 15:49:46	1.24
+++ cofflink.c	2001/07/31 16:17:47
@@ -28,6 +28,9 @@
 #include "coff/internal.h"
 #include "libcoff.h"
 
+/* Setting this to 0 disables PE autoimport support */
+int pe_dll_auto_import=1;
+
 static boolean coff_link_add_object_symbols
   PARAMS ((bfd *, struct bfd_link_info *));
 static boolean coff_link_check_archive_element
@@ -277,6 +280,14 @@
 	    return false;
 	  h = bfd_link_hash_lookup (info->hash, name, false, false, true);
 
+          /*PS*/
+          if (!h && pe_dll_auto_import)
+            {
+              if (!strncmp(name,"__imp_",6))
+              {
+                h = bfd_link_hash_lookup (info->hash, name+6, false, false, true);
+              }
+            }
 	  /* We are only interested in symbols that are currently
 	     undefined.  If a symbol is currently known to be common,
 	     COFF linkers do not bring in an object file which defines
Index: bfd/linker.c
===================================================================
RCS file: /cvs/src/src/bfd/linker.c,v
retrieving revision 1.10
diff -u -r1.10 linker.c
--- linker.c	2001/07/05 22:40:16	1.10
+++ linker.c	2001/07/31 16:17:51
@@ -1003,10 +1003,22 @@
       arh = archive_hash_lookup (&arsym_hash, h->root.string, false, false);
       if (arh == (struct archive_hash_entry *) NULL)
 	{
+         /* If we haven't found the symbol itself, look for its
+            import thunk.  */
+         extern int pe_dll_auto_import;
+
+          if (pe_dll_auto_import)
+            {
+              char *buf=alloca(strlen(h->root.string)+10);
+              sprintf(buf,"__imp_%s",h->root.string);
+              arh = archive_hash_lookup (&arsym_hash, buf, false, false);
+            }
+          if (arh == (struct archive_hash_entry *) NULL)
+    	  {
 	  pundef = &(*pundef)->next;
 	  continue;
 	}
-
+      }
       /* Look at all the objects which define this symbol.  */
       for (l = arh->defs; l != (struct archive_list *) NULL; l = l->next)
 	{
Index: ld/gen-doc.texi
===================================================================
RCS file: /cvs/src/src/ld/gen-doc.texi,v
retrieving revision 1.2
diff -u -r1.2 gen-doc.texi
--- gen-doc.texi	2000/06/20 13:29:06	1.2
+++ gen-doc.texi	2001/07/31 16:18:09
@@ -5,6 +5,7 @@
 @c 2. Specific target machines
 @set H8300
 @set I960
+@set I80386
 @set TICOFF
 
 @c 3. Properties of this configuration
Index: ld/ld.texinfo
===================================================================
RCS file: /cvs/src/src/ld/ld.texinfo,v
retrieving revision 1.42
diff -u -r1.42 ld.texinfo
--- ld.texinfo	2001/07/30 18:12:07	1.42
+++ ld.texinfo	2001/07/31 16:18:17
@@ -137,6 +137,9 @@
 @ifset I960
 * i960::                        ld and the Intel 960 family
 @end ifset
+@ifset I80386
+* ix86::                        ld and the Intel x86 family
+@end ifset
 @ifset TICOFF
 * TI COFF::                     ld and the TI COFF
 @end ifset
@@ -1601,8 +1604,22 @@
 explicitly exported via DEF files or implicitly exported via function
 attributes, the default is to not export anything else unless this
 option is given.  Note that the symbols @code{DllMain@@12},
-@code{DllEntryPoint@@0}, and @code{impure_ptr} will not be automatically
-exported.
+@code{DllEntryPoint@@0}, @code{DllMainCRTStartup@@12}, and 
+@code{impure_ptr} will not be automatically
+exported.  Also, symbols imported from other DLLs will not be 
+re-exported, nor will symbols specifying the DLL's internal layout 
+such as those beginning with @code{_head_} or ending with 
+@code{_iname}.  In addition, no symbols from @code{libgcc}, 
+@code{libstdc++}, @code{libmingw32}, or @code{crtX.o} will be exported.
+Symbols whose names begin with @code{__rtti_} or @code{__builtin_} will
+not be exported, to help with C++ DLLs.  Finally, there is an
+extensive list of cygwin-private symbols that are not exported 
+(obviously, this applies only when building DLLs for cygwin targets).
+These cygwin-excludes are: @code{_cygwin_dll_entry@@12}, 
+@code{_cygwin_crt0_common@@8}, @code{_cygwin_noncygwin_dll_entry@@12},
+@code{_fmode}, @code{_impure_ptr}, @code{cygwin_attach_dll}, 
+@code{cygwin_premain0}, @code{cygwin_premain1}, @code{cygwin_premain2},
+@code{cygwin_premain3}, and @code{environ}. 
 
 @kindex --exclude-symbols
 @item --exclude-symbols @var{symbol},@var{symbol},...
@@ -1672,6 +1689,48 @@
 library with @code{dlltool} or may be used as a reference to
 automatically or implicitly exported symbols.
 
+@cindex DLLs, creating
+@kindex --out-implib
+@item --out-implib @var{file}
+The linker will create the file @var{file} which will contain an
+import lib corresponding to the DLL the linker is generating. This
+import lib (which should be called @code{*.dll.a} or @code{*.a})
+may be used to link clients against the generated DLL; this behavior
+makes it possible to skip a separate @code{dlltool} import library
+creation step.
+
+@cindex DLLs, creating
+@kindex --enable-auto-image-base
+@item --enable-auto-image-base
+Automatically choose the image base for DLLs, unless one is specified
+using the @code{--image-base} argument.  By using a hash generated
+from the dllname to create unique image bases for each DLL, in-memory
+collisions and relocations which can delay program execution are
+avoided.
+
+@cindex DLLs, creating
+@kindex --disable-auto-image-base
+@item --disable-auto-image-base
+Do not automatically generate a unique image base.  If there is no
+user-specified image base (@code{--image-base}) then use the platform
+default.
+
+@cindex DLLs, linking to
+@kindex --dll-search-prefix
+@item --dll-search-prefix @var{string}
+When linking dynamically to a dll without an import library,
+search for @code{<string><basename>.dll} in preference to 
+@code{lib<basename>.dll}. This behavior allows easy distinction
+between DLLs built for the various "subplatforms": native, cygwin,
+uwin, pw, etc.  For instance, cygwin DLLs typically use
+@code{--dll-search-prefix=cyg}. 
+
+@cindex DLLs, linking to
+@kindex --disable-auto-import
+@item --disable-auto-import
+Do not do sophisticated linking of @code{_symbol} to 
+@code{__imp__symbol} for DATA references.
+
 @kindex --section-alignment
 @item --section-alignment
 Sets the section alignment.  Sections in memory will always begin at
@@ -4057,6 +4116,7 @@
 @menu
 * H8/300::                      @code{ld} and the H8/300
 * i960::                        @code{ld} and the Intel 960 family
+* ix86::                        @code{ld} and the Intel x86 family
 * ARM::				@code{ld} and the ARM family
 * HPPA ELF32::                  @code{ld} and HPPA 32-bit ELF
 @ifset TICOFF
@@ -4115,6 +4175,106 @@
 these chips.
 @end ifset
 @end ifclear
+
+@ifset I80386
+@ifclear GENERIC
+@raisesections
+@end ifclear
+
+@node ix86
+@section @code{ld} and the Intel x86 Family
+
+@table @emph
+@cindex ix86 DLL support
+@code{ld} can create DLLs that operate with various runtimes available
+on a common x86 operating system.  These runtimes include native (using 
+the mingw "platform"), cygwin, and pw.
+
+@cindex DLLs, creating
+@cindex DLLs, linking to
+@item auto-import from DLLs 
+@enumerate
+@item
+With this feature on, DLL clients can import variables from a DLL 
+without any concern from their side (for example, without any source
+code modifications).  Auto-import is on by default, and the behavior
+can be disabled via the @code{--disable-auto-import} flag.
+
+@item
+This is done completely within the bounds of the PE specification (to be
+fair, there's a minor violation of the spec at one point, but in practice 
+auto-import works on all known variants of that common x86 operating
+system).  So, the resulting DLL can be used with any other PE 
+compiler/linker.
+
+@item
+Auto-import is fully compatible with the standard import method, in which
+variables are decorated using attribute modifiers. Libraries of either
+type may be mixed together.
+
+@item
+Overhead (space): 8 bytes per imported symbol, plus 20 for each
+reference to it; Overhead (load time): negligible; Overhead 
+(virtual/physical memory): should be less than effect of DLL 
+relocation.
+@end enumerate
+
+Motivation
+
+The obvious and only way to get rid of dllimport insanity is 
+to make the client access the variable directly in the DLL, bypassing 
+the extra dereference imposed by ordinary DLL runtime linking.
+I.e., whenever the client contains something like
+
+@code{mov dll_var,%eax}
+
+the address of dll_var in the instruction should be relocated to point 
+into the loaded DLL. The aim is to make the OS loader do so, and then 
+make ld help with that.  The import section of a PE file is laid out as 
+follows: there's a vector of structures, each describing the imports 
+from a particular DLL. Each such structure points to two other 
+parallel vectors: one holding imported names, and one which 
+will hold the address of the corresponding imported name. So, the 
+solution is to de-vectorize these structures, making import 
+locations sparse and pointing directly into code.
+
+Implementation
+
+For each reference to a data symbol to be imported from a DLL (that 
+is, a symbol named <sym> for which __imp_<sym> is 
+found in the implib), an import fixup entry is generated. That 
+entry is of type IMAGE_IMPORT_DESCRIPTOR and is stored in the .idata$3 
+subsection. Each fixup entry contains a pointer to the symbol's address 
+within the .text section (marked with a __fuN_<sym> symbol, where N is 
+an integer), a pointer to the DLL name (so, the DLL name is referenced 
+by multiple entries), and a pointer to the symbol name thunk. The symbol 
+name thunk is a singleton vector (__nm_thnk_<symbol>) pointing to an 
+IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) directly containing the 
+imported name. Here comes the "on the edge" problem mentioned above: the 
+PE specification says that the name vector (OriginalFirstThunk) should 
+run in parallel with the address vector (FirstThunk), i.e. that they 
+should have the same number of elements and be terminated with zero. We 
+violate this, since FirstThunk points directly into machine code. But in 
+practice, the OS loader is implemented the sane way: it goes through 
+OriginalFirstThunk and puts addresses into FirstThunk, not something 
+else. It should be noted once again that the DLL and symbol name 
+structures are reused across fixup entries and should be there 
+anyway to support the standard import mechanism, so the sustained 
+overhead is 20 bytes per reference. Another question is whether having 
+several IMAGE_IMPORT_DESCRIPTORs for the same DLL is possible. The answer 
+is yes; it is done even by the native compiler/linker (libth32's functions 
+in fact reside in the windows9x kernel32.dll, so if you use it, you have 
+two IMAGE_IMPORT_DESCRIPTORs for kernel32.dll). Yet another question is 
+whether referencing the same PE structures several times is valid. 
+The answer is: why not? Prohibiting that (detecting the violation) would 
+require more work on behalf of the loader than not doing it.
+
+@end table
+
+@ifclear GENERIC
+@lowersections
+@end ifclear
+@end ifset
 
 @ifset I960
 @ifclear GENERIC
Index: ld/ldlang.c
===================================================================
RCS file: /cvs/src/src/ld/ldlang.c,v
retrieving revision 1.55
diff -u -r1.55 ldlang.c
--- ldlang.c	2001/07/19 16:21:39	1.55
+++ ldlang.c	2001/07/31 16:18:22
@@ -1452,12 +1452,12 @@
       bfd_error_type err;
       lang_statement_list_type *hold;
       boolean bad_load = true;
-      
-      err = bfd_get_error ();
 
       /* See if the emulation has some special knowledge.  */
       if (ldemul_unrecognized_file (entry))
-	return true;
+        return true;
+      
+      err = bfd_get_error ();
 
       if (err == bfd_error_file_ambiguously_recognized)
 	{
Index: ld/pe-dll.c
===================================================================
RCS file: /cvs/src/src/ld/pe-dll.c,v
retrieving revision 1.23
diff -u -r1.23 pe-dll.c
--- pe-dll.c	2001/03/13 06:14:27	1.23
+++ pe-dll.c	2001/07/31 16:18:24
@@ -54,6 +54,84 @@
 
  ************************************************************************/
 
+/************************************************************************
+
+ Auto-import feature by Paul Sokolovsky
+
+ Quick facts:
+
+ 1. With this feature on, DLL clients can import variables from DLL
+ without any concern from their side (for example, without any source
+ code modifications).
+
+ 2. This is done completely in bounds of the PE specification (to be fair,
+ there's a place where it pokes nose out of, but in practise it works).
+ So, resulting module can be used with any other PE compiler/linker.
+
+ 3. Auto-import is fully compatible with standard import method and they
+ can be mixed together.
+
+ 4. Overheads: space: 8 bytes per imported symbol, plus 20 for each
+ reference to it; load time: negligible; virtual/physical memory: should be
+ less than effect of DLL relocation, and I sincerely hope it doesn't affect
+ DLL sharability (too much).
+
+ Idea
+
+ The obvious and only way to get rid of dllimport insanity is to make the
+ client access the variable directly in the DLL, bypassing the extra
+ dereference. I.e., whenever the client contains something like
+
+ mov dll_var,%eax
+
+ the address of dll_var in the instruction should be relocated to point into
+ the loaded DLL. The aim is to make the OS loader do so, and then make ld help
+ with that. The import section of a PE file is laid out as follows: there's a
+ vector of structures, each describing the imports from a particular DLL. Each
+ such structure points to two other parallel vectors: one holding imported
+ names, and one which will hold the address of the corresponding imported
+ name. So, the solution is to de-vectorize these structures, making import
+ locations sparse and pointing directly into code. Before continuing, it is
+ worth noting that, while the author strives to make PE act ELF-like, there
+ are some other people making ELF act PE-like: elfvector, ;-) .
+
+ Implementation
+
+ For each reference to a data symbol to be imported from a DLL (that is, a
+ symbol named <sym> for which __imp_<sym> is found in the implib), an import
+ fixup entry is generated. That entry is of type IMAGE_IMPORT_DESCRIPTOR and
+ is stored in the .idata$3 subsection. Each fixup entry contains a pointer to
+ the symbol's address within the .text section (marked with a __fuN_<sym>
+ symbol, where N is an integer), a pointer to the DLL name (so, the DLL name
+ is referenced by multiple entries), and a pointer to the symbol name thunk.
+ The symbol name thunk is a singleton vector (__nm_thnk_<symbol>) pointing to
+ an IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) directly containing the
+ imported name. Here comes the "on the edge" problem mentioned above: the PE
+ specification says that the name vector (OriginalFirstThunk) should run in
+ parallel with the address vector (FirstThunk), i.e. that they should have
+ the same number of elements and be terminated with zero. We violate this,
+ since FirstThunk points directly into machine code. But in practise, the OS
+ loader is implemented the sane way: it goes through OriginalFirstThunk and
+ puts addresses into FirstThunk, not something else. It should be noted once
+ again that the DLL and symbol name structures are reused across fixup
+ entries and should be there anyway to support the standard import mechanism,
+ so the sustained overhead is 20 bytes per reference. Another question is
+ whether having several IMAGE_IMPORT_DESCRIPTORs for the same DLL is
+ possible. The answer is yes; it is done even by the native compiler/linker
+ (libth32's functions in fact reside in the windows9x kernel32.dll, so if you
+ use it, you have two IMAGE_IMPORT_DESCRIPTORs for kernel32.dll). Yet another
+ question is whether referencing the same PE structures several times is
+ valid. The answer is: why not? Prohibiting that (detecting the violation)
+ would require more work on behalf of the loader than not doing it.
+
+
+ See also: ld/emultempl/pe.em
+
+ ************************************************************************/
+
+void
+add_bfd_to_link (bfd *abfd, CONST char *name, struct bfd_link_info *link_info);
+
 /* for emultempl/pe.em */
 
 def_file *pe_def_file = 0;
@@ -63,6 +141,7 @@
 int pe_dll_stdcall_aliases = 0;
 int pe_dll_warn_dup_exports = 0;
 int pe_dll_compat_implib = 0;
+int pe_dll_gory_debug = 0;
 
 /************************************************************************
 
@@ -231,24 +310,103 @@
   free (local_copy);
 }
 
+/*
+   abfd is a bfd containing n (or NULL)
+   It can be used for contextual checks.
+*/
 static int
-auto_export (d, n)
+auto_export (abfd, d, n)
+     bfd *abfd;
      def_file *d;
      const char *n;
 {
   int i;
   struct exclude_list_struct *ex;
+
+  /* we should not re-export imported stuff */
+  if (strncmp (n, "_imp__",6) == 0)
+    return 0;
+
   for (i = 0; i < d->num_exports; i++)
     if (strcmp (d->exports[i].name, n) == 0)
       return 0;
   if (pe_dll_do_default_excludes)
     {
+if (pe_dll_gory_debug) printf("considering exporting: %s "
+	"abfd=%p, abfd->my_arc=%p\n",n,(void *)abfd,(void *)(abfd ? abfd->my_archive : NULL));
+      /* First of all, make context checks:
+         Don't export anything from libgcc */
+
+      if (abfd
+          && abfd->my_archive)
+        {
+          /* Do not specify suffix explicitly, to allow for dllized versions */
+          if (strstr(abfd->my_archive->filename,"libgcc.")) return 0;
+          if (strstr(abfd->my_archive->filename,"libstdc++.")) return 0;
+          if (strstr(abfd->my_archive->filename,"libmingw32.")) return 0;
+        }
+
+      {
+        /* skip .*crt\d\.o */
+        char *p;
+        if (abfd
+            && (p=strstr(abfd->filename,"crt"))
+            && (isdigit(p[3]) && p[4]=='.' && p[5]=='o' && p[6]==0)) return 0;
+      }
+
+#if 0
+      /* Don't export any 'reserved' symbols */
+      if (*n && *n=='_' && n[1]=='_') return 0;
+#endif
+
       if (strcmp (n, "DllMain@12") == 0)
 	return 0;
       if (strcmp (n, "DllEntryPoint@0") == 0)
 	return 0;
+      if (strcmp (n, "DllMainCRTStartup@12") == 0)
+	return 0;
       if (strcmp (n, "impure_ptr") == 0)
 	return 0;
+
+      if (strncmp (n, "__rtti_", 7) == 0)
+	return 0;
+      if (strncmp (n, "__builtin_", 10) == 0)
+	return 0;
+
+      /* Cygwin specific: don't export imported cygwin symbols */
+      if (strncmp (n, "_cygwin_dll_entry@12", 20) == 0)
+	return 0;
+      if (strncmp (n, "_cygwin_crt0_common@8", 21) == 0)
+        return 0;
+      if (strncmp (n, "_cygwin_noncygwin_dll_entry@12", 30) == 0)
+        return 0;
+      if (strncmp (n, "_fmode", 6) == 0)
+        return 0;
+      if (strncmp (n, "_impure_ptr", 11) == 0)
+        return 0;
+      if (strncmp (n, "cygwin_attach_dll", 17) == 0)
+        return 0;
+      if (strncmp (n, "cygwin_premain0", 15) == 0)
+        return 0;
+      if (strncmp (n, "cygwin_premain1", 15) == 0)
+        return 0;
+      if (strncmp (n, "cygwin_premain2", 15) == 0)
+        return 0;
+      if (strncmp (n, "cygwin_premain3", 15) == 0)
+        return 0;
+      if (strncmp (n, "environ", 7) == 0)
+        return 0;
+
+      /* Don't export symbols for specifying DLL's internal layout */
+
+      if (strncmp (n, "_head_", 6) == 0)
+	return 0;
+
+      {
+        int len = strlen(n);
+        if (len>6 && strncmp (n + len - 6, "_iname", 7) == 0)
+	  return 0;
+      }
     }
   for (ex = excludes; ex; ex = ex->next)
     if (strcmp (n, ex->string) == 0)
@@ -302,14 +460,29 @@
 	  for (j = 0; j < nsyms; j++)
 	    {
 	      /* We should export symbols which are either global or not
-	         anything at all.  (.bss data is the latter)  */
-	      if ((symbols[j]->flags & BSF_GLOBAL)
-		  || (symbols[j]->flags == BSF_NO_FLAGS))
+	         anything at all.  (.bss data is the latter)
+                 We should not export undefined symbols
+              */
+	      if (symbols[j]->section!=&bfd_und_section
+                  && ((symbols[j]->flags & BSF_GLOBAL)
+		      || (symbols[j]->flags == BFD_FORT_COMM_DEFAULT_VALUE)))
 		{
 		  const char *sn = symbols[j]->name;
+
+                  /* we should not re-export imported stuff */
+                  {
+                    char *name = (char *) xmalloc (strlen (sn) + 2 + 6);
+                    sprintf (name, "%s%s", U("_imp_"), sn);
+                    blhe = bfd_link_hash_lookup (info->hash, name,
+                        			 false, false, false);
+                    free (name);
+
+                    if (blhe && blhe->type == bfd_link_hash_defined) continue;
+                  }
+
 		  if (*sn == '_')
 		    sn++;
-		  if (auto_export (pe_def_file, sn))
+		  if (auto_export (b, pe_def_file, sn))
                     {
                       def_file_export *p;
                       p=def_file_add_export (pe_def_file, sn, 0, -1);
@@ -350,7 +523,7 @@
 	    {
 	      char *tmp = xstrdup (pe_def_file->exports[i].name);
 	      *(strchr (tmp, '@')) = 0;
-	      if (auto_export (pe_def_file, tmp))
+	      if (auto_export (NULL, pe_def_file, tmp))
 		def_file_add_export (pe_def_file, tmp,
 				     pe_def_file->exports[i].internal_name, -1);
 	      else
@@ -731,6 +904,58 @@
     }
 }
 
+
+static struct sec *current_sec;
+
+void
+pe_walk_relocs_of_symbol (info, name, cb)
+     struct bfd_link_info *info;
+     CONST char *name;
+     int (*cb)(arelent*);
+{
+  bfd *b;
+  struct sec *s;
+
+  for (b = info->input_bfds; b; b = b->link_next)
+    {
+      arelent **relocs;
+      int relsize, nrelocs, i;
+
+      for (s = b->sections; s; s = s->next)
+	{
+	  asymbol **symbols;
+	  int nsyms, symsize;
+	int flags = bfd_get_section_flags (b, s);
+
+	/* Skip discarded linkonce sections */
+	if (flags & SEC_LINK_ONCE
+	    && s->output_section == bfd_abs_section_ptr)
+	  continue;
+
+          current_sec=s;
+
+	  symsize = bfd_get_symtab_upper_bound (b);
+	  symbols = (asymbol **) xmalloc (symsize);
+	  nsyms = bfd_canonicalize_symtab (b, symbols);
+
+	  relsize = bfd_get_reloc_upper_bound (b, s);
+	  relocs = (arelent **) xmalloc ((size_t) relsize);
+	  nrelocs = bfd_canonicalize_reloc (b, s, relocs, symbols);
+
+	  for (i = 0; i < nrelocs; i++)
+	    {
+              struct symbol_cache_entry *sym = *relocs[i]->sym_ptr_ptr;
+              if (!strcmp(name,sym->name)) cb(relocs[i]);
+	    }
+	  free (relocs);
+	  /* Warning: the allocated symbols are remembered in BFD and reused
+	     later, so don't free them! */
+	  /* free (symbols); */
+	}
+    }
+
+}
+
 /************************************************************************
 
  Gather all the relocations and build the .reloc section
@@ -801,6 +1026,10 @@
 
 	  for (i = 0; i < nrelocs; i++)
 	    {
+if (pe_dll_gory_debug) {
+struct symbol_cache_entry *sym = *relocs[i]->sym_ptr_ptr;
+printf("rel: %s\n",sym->name);
+}
 	      if (!relocs[i]->howto->pc_relative
 		  && relocs[i]->howto->type != pe_details->imagebase_reloc)
 		{
@@ -1039,7 +1268,7 @@
 
       if (pe_def_file->num_exports > 0)
 	{
-	  fprintf (out, "\nEXPORTS\n\n");
+	  fprintf (out, "EXPORTS\n");
 	  for (i = 0; i < pe_def_file->num_exports; i++)
 	    {
 	      def_file_export *e = pe_def_file->exports + i;
@@ -1445,7 +1674,7 @@
   bfd_set_arch_mach (abfd, pe_details->bfd_arch, 0);
 
   symptr = 0;
-  symtab = (asymbol **) xmalloc (10 * sizeof (asymbol *));
+  symtab = (asymbol **) xmalloc (11 * sizeof (asymbol *));
   tx  = quick_section (abfd, ".text",    SEC_CODE|SEC_HAS_CONTENTS, 2);
   id7 = quick_section (abfd, ".idata$7", SEC_HAS_CONTENTS, 2);
   id5 = quick_section (abfd, ".idata$5", SEC_HAS_CONTENTS, 2);
@@ -1455,6 +1684,9 @@
     quick_symbol (abfd, U (""), exp->internal_name, "", tx, BSF_GLOBAL, 0);
   quick_symbol (abfd, U ("_head_"), dll_symname, "", UNDSEC, BSF_GLOBAL, 0);
   quick_symbol (abfd, U ("_imp__"), exp->internal_name, "", id5, BSF_GLOBAL, 0);
+  /* symbol to reference ord/name of imported symbol, used to implement
+     auto-import */
+  quick_symbol (abfd, U("_nm__"), exp->internal_name, "", id6, BSF_GLOBAL, 0);
   if (pe_dll_compat_implib)
     quick_symbol (abfd, U ("__imp_"), exp->internal_name, "",
 		  id5, BSF_GLOBAL, 0);
@@ -1553,6 +1785,175 @@
   return abfd;
 }
 
+static bfd *
+make_singleton_name_thunk (import, parent)
+     char *import;
+     bfd *parent;
+{
+  /* name thunks go to idata$4 */
+
+  asection *id4;
+  unsigned char *d4;
+  char *oname;
+  bfd *abfd;
+
+  oname = (char *) xmalloc (20);
+  sprintf (oname, "nmth%06d.o", tmp_seq);
+  tmp_seq++;
+
+  abfd = bfd_create (oname, parent);
+  bfd_find_target (pe_details->object_target, abfd);
+  bfd_make_writable (abfd);
+
+  bfd_set_format (abfd, bfd_object);
+  bfd_set_arch_mach (abfd, pe_details->bfd_arch, 0);
+
+  symptr = 0;
+  symtab = (asymbol **) xmalloc (3 * sizeof (asymbol *));
+  id4 = quick_section (abfd, ".idata$4", SEC_HAS_CONTENTS, 2);
+  quick_symbol (abfd, U("_nm_thnk_"), import, "", id4, BSF_GLOBAL, 0);
+  quick_symbol (abfd, U("_nm_"), import, "", UNDSEC, BSF_GLOBAL, 0);
+
+  bfd_set_section_size (abfd, id4, 8);
+  d4 = (unsigned char *) xmalloc (8);
+  id4->contents = d4;
+  memset (d4, 0, 8);
+  quick_reloc (abfd, 0, BFD_RELOC_RVA, 2);
+  save_relocs (id4);
+
+  bfd_set_symtab (abfd, symtab, symptr);
+
+  bfd_set_section_contents (abfd, id4, d4, 0, 8);
+
+  bfd_make_readable (abfd);
+  return abfd;
+}
+
+char *
+make_import_fixup_mark (rel)
+     arelent *rel;
+{
+  /* we convert reloc to symbol, for later reference */
+  static int counter;
+  static char fixup_name[300];
+
+  struct symbol_cache_entry *sym = *rel->sym_ptr_ptr;
+
+  bfd *abfd=bfd_asymbol_bfd(sym);
+  struct coff_link_hash_entry *myh=NULL;
+
+  sprintf(fixup_name,"__fu%d_%s",counter++,sym->name);
+  bfd_coff_link_add_one_symbol(&link_info,
+                                   abfd,
+                                   fixup_name,
+                                   BSF_GLOBAL,
+                                   current_sec, /* sym->section, */
+                                   rel->address,
+                                   NULL,
+                                   true,
+                                   false,
+                                   (struct bfd_link_hash_entry **) &myh);
+
+/*printf("type:%d\n",myh->type);
+printf("%s\n",myh->root.u.def.section->name);
+*/
+  return fixup_name;
+}
+
+
+/*
+ *	.section	.idata$3
+ *	.rva		__nm_thnk_SYM (singleton thunk with name of func)
+ *	.long		0
+ *	.long		0
+ *	.rva		__my_dll_iname (name of dll)
+ *	.rva		__fuNN_SYM (pointer to reference (address) in text)
+ *
+ */
+
+static bfd *
+make_import_fixup_entry (name,fixup_name,dll_symname,parent)
+     char *name;
+     char *fixup_name;
+     char *dll_symname;
+     bfd *parent;
+{
+  asection *id3;
+  unsigned char *d3;
+  char *oname;
+  bfd *abfd;
+
+  oname = (char *) xmalloc (20);
+  sprintf (oname, "fu%06d.o", tmp_seq);
+  tmp_seq++;
+
+  abfd = bfd_create (oname, parent);
+  bfd_find_target (pe_details->object_target, abfd);
+  bfd_make_writable (abfd);
+
+  bfd_set_format (abfd, bfd_object);
+  bfd_set_arch_mach (abfd, pe_details->bfd_arch, 0);
+
+  symptr = 0;
+  symtab = (asymbol **) xmalloc (6 * sizeof (asymbol *));
+  id3 = quick_section (abfd, ".idata$3", SEC_HAS_CONTENTS, 2);
+//  quick_symbol (abfd, U("_head_"), dll_symname, "", id2, BSF_GLOBAL, 0);
+
+  quick_symbol (abfd, U("_nm_thnk_"), name, "", UNDSEC, BSF_GLOBAL, 0);
+  quick_symbol (abfd, U(""), dll_symname, "_iname", UNDSEC, BSF_GLOBAL, 0);
+  quick_symbol (abfd, "", fixup_name, "", UNDSEC, BSF_GLOBAL, 0);
+
+  bfd_set_section_size (abfd, id3, 20);
+  d3 = (unsigned char *) xmalloc (20);
+  id3->contents = d3;
+  memset (d3, 0, 20);
+
+  quick_reloc (abfd,  0, BFD_RELOC_RVA, 1);
+  quick_reloc (abfd, 12, BFD_RELOC_RVA, 2);
+  quick_reloc (abfd, 16, BFD_RELOC_RVA, 3);
+  save_relocs (id3);
+
+  bfd_set_symtab (abfd, symtab, symptr);
+
+  bfd_set_section_contents (abfd, id3, d3, 0, 20);
+
+  bfd_make_readable (abfd);
+  return abfd;
+}
+
+void
+pe_create_import_fixup (rel)
+     arelent *rel;
+{
+  char buf[300];
+  struct symbol_cache_entry *sym = *rel->sym_ptr_ptr;
+  struct bfd_link_hash_entry *name_thunk_sym;
+  CONST char *name = sym->name;
+  char *fixup_name = make_import_fixup_mark(rel);
+
+  sprintf(buf,U("_nm_thnk_%s"),name);
+
+  name_thunk_sym =
+           bfd_link_hash_lookup (link_info.hash, buf, 0, 0, 1);
+
+  if (!name_thunk_sym || name_thunk_sym->type != bfd_link_hash_defined)
+  {
+    bfd *b=make_singleton_name_thunk (name, output_bfd);
+    add_bfd_to_link (b, b->filename, &link_info);
+
+    /* If we ever use auto-import, the text section must be made writable */
+    config.text_read_only=false;
+  }
+
+  {
+    extern char *data_import_dll;
+    bfd *b=make_import_fixup_entry (name,fixup_name,data_import_dll,output_bfd);
+    add_bfd_to_link (b, b->filename, &link_info);
+  }
+
+}
+
+
 void
 pe_dll_generate_implib (def, impfilename)
      def_file *def;
@@ -1628,10 +2029,10 @@
     }
 }
 
-static void
+void
 add_bfd_to_link (abfd, name, link_info)
      bfd *abfd;
-     char *name;
+     CONST char *name;
      struct bfd_link_info *link_info;
 {
   lang_input_statement_type *fake_file;
Index: ld/pe-dll.h
===================================================================
RCS file: /cvs/src/src/ld/pe-dll.h,v
retrieving revision 1.3
diff -u -r1.3 pe-dll.h
--- pe-dll.h	2001/03/13 06:14:27	1.3
+++ pe-dll.h	2001/07/31 16:18:24
@@ -33,6 +33,10 @@
 extern int pe_dll_stdcall_aliases;
 extern int pe_dll_warn_dup_exports;
 extern int pe_dll_compat_implib;
+/* This resides in bfd */
+//_BFD_IMPORT 
+extern int pe_dll_auto_import;
+extern int pe_dll_gory_debug;
 
 extern void pe_dll_id_target PARAMS ((const char *));
 extern void pe_dll_add_excludes PARAMS ((const char *));
@@ -45,4 +49,9 @@
 extern void pe_dll_fill_sections PARAMS ((bfd *, struct bfd_link_info *));
 extern void pe_exe_fill_sections PARAMS ((bfd *, struct bfd_link_info *));
 
+extern void pe_walk_relocs_of_symbol PARAMS ((struct bfd_link_info *info,
+                                              CONST char *name,
+                                              int (*cb)(arelent*)));
+
+extern void pe_create_import_fixup PARAMS ((arelent *rel));
 #endif /* PE_DLL_H */
Index: ld/emultempl/pe.em
===================================================================
RCS file: /cvs/src/src/ld/emultempl/pe.em,v
retrieving revision 1.45
diff -u -r1.45 pe.em
--- pe.em	2001/07/11 08:11:16	1.45
+++ pe.em	2001/07/31 16:18:29
@@ -146,6 +146,7 @@
     ldfile_output_architecture = bfd_arch_${ARCH};
   output_filename = "${EXECUTABLE_NAME:-a.exe}";
 #ifdef DLL_SUPPORT
+  config.dynamic_link = true;
   config.has_shared = 1;
 
 #if (PE_DEF_SUBSYSTEM == 9) || (PE_DEF_SUBSYSTEM == 2)
@@ -191,6 +192,8 @@
 #define OPTION_DISABLE_AUTO_IMAGE_BASE	(OPTION_ENABLE_AUTO_IMAGE_BASE + 1)
 #define OPTION_DLL_SEARCH_PREFIX	(OPTION_DISABLE_AUTO_IMAGE_BASE + 1)
 #define OPTION_NO_DEFAULT_EXCLUDES	(OPTION_DLL_SEARCH_PREFIX + 1)
+#define OPTION_DLL_DISABLE_AUTO_IMPORT	(OPTION_NO_DEFAULT_EXCLUDES + 1)
+#define OPTION_DLL_ENABLE_GORY_DEBUG	(OPTION_DLL_DISABLE_AUTO_IMPORT + 1)
 
 static struct option longopts[] = {
   /* PE options */
@@ -228,6 +231,8 @@
   {"disable-auto-image-base", no_argument, NULL, OPTION_DISABLE_AUTO_IMAGE_BASE},
   {"dll-search-prefix", required_argument, NULL, OPTION_DLL_SEARCH_PREFIX},
   {"no-default-excludes", no_argument, NULL, OPTION_NO_DEFAULT_EXCLUDES},
+  {"disable-auto-import", no_argument, NULL, OPTION_DLL_DISABLE_AUTO_IMPORT},
+  {"enable-gory-debug", no_argument, NULL, OPTION_DLL_ENABLE_GORY_DEBUG},
 #endif
   {NULL, no_argument, NULL, 0}
 };
@@ -313,6 +318,8 @@
   fprintf (file, _("  --dll-search-prefix=<string>       When linking dynamically to a dll witout an\n"));
   fprintf (file, _("                                       importlib, use <string><basename>.dll \n"));
   fprintf (file, _("                                       in preference to lib<basename>.dll \n"));
+  fprintf (file, _("  --disable-auto-import              Do not do sophisticated linking of _sym to \n"));
+  fprintf (file, _("                                       __imp_sym for DATA references\n"));
 #endif
 }
 
@@ -583,6 +590,12 @@
     case OPTION_NO_DEFAULT_EXCLUDES:
       pe_dll_do_default_excludes = 0;
       break;
+    case OPTION_DLL_DISABLE_AUTO_IMPORT:
+      pe_dll_auto_import = 0;
+      break;
+    case OPTION_DLL_ENABLE_GORY_DEBUG:
+      pe_dll_gory_debug = 1;
+      break;
 #endif
     }
   return 1;
@@ -733,6 +746,8 @@
   static int gave_warning_message = 0;
   struct bfd_link_hash_entry *undef, *sym;
   char *at;
+  if (pe_dll_gory_debug) printf("%s\n",__FUNCTION__);
+
   for (undef = link_info.hash->undefs; undef; undef=undef->next)
     if (undef->type == bfd_link_hash_undefined)
     {
@@ -791,11 +806,101 @@
       }
     }
 }
+
+static int
+make_import_fixup (rel)
+  arelent *rel;
+{
+  struct symbol_cache_entry *sym = *rel->sym_ptr_ptr;
+//  bfd *b;
+
+  if (pe_dll_gory_debug) printf("arelent: %s@%#x: add=%li\n",sym->name,(unsigned int)rel->address,(long)rel->addend);
+  pe_create_import_fixup(rel);
+  return 1;
+}
+
+char *data_import_dll;
+
+static void
+pe_find_data_imports ()
+{
+  struct bfd_link_hash_entry *undef, *sym;
+  for (undef = link_info.hash->undefs; undef; undef=undef->next)
+    if (undef->type == bfd_link_hash_undefined)
+    {
+      /* C++ symbols are *long* */
+      char buf[4096];
+if (pe_dll_gory_debug) printf("%s:%s\n",__FUNCTION__,undef->root.string);
+      sprintf(buf,"__imp_%s",undef->root.string);
+
+	sym = bfd_link_hash_lookup (link_info.hash, buf, 0, 0, 1);
+	if (sym && sym->type == bfd_link_hash_defined)
+	{
+          einfo (_("Warning: resolving %s by linking to %s (auto-import)\n"),
+    	     undef->root.string, buf);
+
+          {
+            bfd *b=sym->u.def.section->owner;
+            asymbol **symbols;
+            int nsyms, symsize, i;
+     
+            symsize = bfd_get_symtab_upper_bound (b);
+            symbols = (asymbol **) xmalloc (symsize);
+            nsyms = bfd_canonicalize_symtab (b, symbols);
+
+            for (i = 0; i < nsyms; i++)
+	    {
+              if (strncmp(symbols[i]->name,"__head_",sizeof("__head_")-1))
+              	continue;
+if (pe_dll_gory_debug) printf("->%s\n",symbols[i]->name);
+              data_import_dll=(char*)(symbols[i]->name+sizeof("__head_")-1);
+              break;
+            }
+          }
+
+
+          pe_walk_relocs_of_symbol(&link_info, undef->root.string, make_import_fixup);
+
+          /* let's differentiate it somehow from defined */
+	  undef->type = bfd_link_hash_defweak;
+          /* We replace the original name with the __imp_ prefixed one. This
+          1) may trash memory and 2) leads to duplicate symbol generation.
+          Still, IMHO it's better than having the name polluted. */
+	  undef->root.string = sym->root.string;
+	  undef->u.def.value = sym->u.def.value;
+	  undef->u.def.section = sym->u.def.section;
+      }
+    }
+}
 #endif /* DLL_SUPPORT */
 
+static boolean
+pr_sym (h, string)
+  struct bfd_hash_entry *h;
+  PTR string;
+{
+if (pe_dll_gory_debug) printf("+%s\n",h->string);
+  return true;
+}
+
+
 static void
 gld_${EMULATION_NAME}_after_open ()
 {
+
+if (pe_dll_gory_debug) 
+{
+  bfd *a;
+  struct bfd_link_hash_entry *sym;
+  printf("%s()\n",__FUNCTION__);
+
+  for (sym = link_info.hash->undefs; sym; sym=sym->next)
+    printf("-%s\n",sym->root.string);
+  bfd_hash_traverse(&link_info.hash->table,pr_sym,NULL);
+
+  for (a=link_info.input_bfds; a; a=a->link_next)
+    printf("*%s\n",a->filename);
+}
   /* Pass the wacky PE command line options into the output bfd.
      FIXME: This should be done via a function, rather than by
      including an internal BFD header.  */
@@ -810,6 +915,8 @@
   if (pe_enable_stdcall_fixup) /* -1=warn or 1=disable */
     pe_fixup_stdcalls ();
 
+  pe_find_data_imports ();
+
   pe_process_import_defs(output_bfd, &link_info);
   if (link_info.shared)
     pe_dll_build_sections (output_bfd, &link_info);
@@ -1251,6 +1358,16 @@
   if (pe_out_def_filename)
     pe_dll_generate_def_file (pe_out_def_filename);
 #endif /* DLL_SUPPORT */
+
+/* Something marks .idata as code; force it back to data here */
+  {
+    asection *asec = bfd_get_section_by_name (output_bfd, ".idata");
+    if (asec)
+      {
+          asec->flags &= ~SEC_CODE;
+          asec->flags |= SEC_DATA;
+      }
+  }
 }
 
 
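A note on the .idata$3 record built by make_import_fixup_entry above: it is 20 bytes long and receives RVA relocations at offsets 0, 12 and 16, which lines up with the five 32-bit fields of a PE import descriptor. The sketch below only illustrates that layout. The struct, its field names and check_import_descriptor_layout are hypothetical and are not part of the patch or of BFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 20-byte .idata$3 record from the patch.
   Field names are illustrative only; the comments give the matching
   line of the assembly-style comment in pe-dll.c.  */
struct import_descriptor
{
  uint32_t name_thunk_rva;  /* offset 0:  .rva  __nm_thnk_SYM  */
  uint32_t time_date_stamp; /* offset 4:  .long 0              */
  uint32_t forwarder_chain; /* offset 8:  .long 0              */
  uint32_t dll_name_rva;    /* offset 12: .rva  __my_dll_iname */
  uint32_t fixup_rva;       /* offset 16: .rva  __fuNN_SYM     */
};

/* Check that the fields sit at the offsets the patch passes to
   quick_reloc (0, 12 and 16) and return the record size.  */
size_t
check_import_descriptor_layout (void)
{
  assert (offsetof (struct import_descriptor, name_thunk_rva) == 0);
  assert (offsetof (struct import_descriptor, dll_name_rva) == 12);
  assert (offsetof (struct import_descriptor, fixup_rva) == 16);
  return sizeof (struct import_descriptor);
}
```

Because the last field points at the reference inside the text section rather than at a separate address table, the imported address is presumably written straight into the code at load time, which would explain why the patch clears config.text_read_only.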

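The patch builds several derived symbol names with sprintf into fixed-size buffers (300 and 4096 bytes) and locates the DLL name by skipping a "__head_" prefix. As a rough sketch of the same naming scheme with explicit bounds checks, one could write the helpers below. All three function names are hypothetical, and the sample symbol names in the usage note are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Build "__imp_<sym>", the name looked up by pe_find_data_imports.
   Returns 0 on success, -1 if the result does not fit in buf.  */
int
make_imp_name (char *buf, size_t size, const char *sym)
{
  int n = snprintf (buf, size, "__imp_%s", sym);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}

/* Build "__fu<counter>_<sym>", the fixup marker name produced by
   make_import_fixup_mark.  Same return convention as above.  */
int
make_fixup_name (char *buf, size_t size, int counter, const char *sym)
{
  int n = snprintf (buf, size, "__fu%d_%s", counter, sym);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}

/* Return the part of SYM that follows a "__head_" prefix, or NULL
   if SYM does not start with that prefix.  strncmp stops at the
   terminating NUL, so short names are never read past their end.  */
const char *
dll_name_from_head_symbol (const char *sym)
{
  static const char prefix[] = "__head_";
  size_t len = sizeof (prefix) - 1;
  return strncmp (sym, prefix, len) == 0 ? sym + len : NULL;
}
```

For an undefined data symbol _foo, make_imp_name yields __imp__foo and make_fixup_name with counter 0 yields __fu0__foo. A head symbol such as __head_libfoo_dll gives libfoo_dll.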