This is the mail archive of the cygwin-apps mailing list for the Cygwin project.


Re: [patch/rebase] Add a rebase database to keep track of DLL addresses


Hi Chuck,

On Jul  5 15:38, Charles Wilson wrote:
> On 7/2/2011 4:41 PM, Corinna Vinschen wrote:
> > On Jul  2 09:15, Corinna Vinschen wrote:
> >> On Jul  2 01:14, Charles Wilson wrote:
> >>> I'll take a look next week.
> 
> Comments inline.
> 
> > New patch attached with the following changes:
> > 
> > - I introduced a last minute bug into load_image_info.  Fixed.
> > - merge_image_info now eliminates duplicate DLLs given on the command
> > - print_image_info now also eliminates duplicates and handles the files
> > - What's more important, print_image_info now checks for overlapping DLLs
> 
> img_info_cmp: what if a or b is NULL? do we care?  Ditto for a->name,
> b->name -- POSIX doesn't mandate the behavior of strcmp if args are null...

Neither the a/b pointers, nor the name strings can become NULL.

> Also, what encoding do we end up with (on cygwin)? does that affect the
> use of strcmp() to compare names?

See Eric's reply.

> Since the image bases are stored (in our database and data structures)
> as ULONG, how do we support 64bit systems and rebasing above 4G?  Or is
> that fodder for a future patch?

I was thinking about that.  First of all, imagehelper does not support
64 bit files so far.  What we need is a ReBaseImage64 implementation.
Alternatively, we use Windows ReBaseImage64 and drop imagehelper.

Without this decision first, I didn't want to clutter the database
file format with stuff it doesn't need.

What's more, the rebase addresses of 32 and 64 bit DLLs have nothing to
do with each other.  Since they never share the same VM space, they are
disjoint.  I mulled over a combined database format for both versions,
but it doesn't make sense.  I think what we should do is keep two
database files, one for 32 and one for 64 bit DLLs, and handle them
independently.  Change the magic number and you're all set.
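
A minimal sketch of the two-database idea, assuming one file and one
magic tag per bitness.  Only "rBiI" and /etc/rebase.image_info come from
the patch; the 64 bit file name and the "rB64" tag are invented here
purely for illustration, as are the names db_kind_t, db_kind_for and
magic_matches.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical description of one rebase database flavor. */
typedef struct
{
  const char *path;	/* Database file name.                  */
  char magic[4];	/* 4 byte tag, not NUL-terminated.      */
} db_kind_t;

static const db_kind_t db_kinds[2] =
{
  /* 32 bit: file name and magic as used in the patch. */
  { "/etc/rebase.image_info",   { 'r', 'B', 'i', 'I' } },
  /* 64 bit: both values are made up for this sketch. */
  { "/etc/rebase.image_info64", { 'r', 'B', '6', '4' } },
};

/* Pick the database flavor matching the DLL bitness. */
const db_kind_t *
db_kind_for (int is_64bit)
{
  return &db_kinds[is_64bit ? 1 : 0];
}

/* Return nonzero if the magic read from a file header belongs to the
   given flavor.  A 32 bit tool would reject a 64 bit file and vice
   versa, so the two databases never get mixed up. */
int
magic_matches (const db_kind_t *kind, const char hdr_magic[4])
{
  assert (kind != NULL && hdr_magic != NULL);
  return memcmp (kind->magic, hdr_magic, 4) == 0;
}
```

With this split, the record layout of each file could change (for
instance to wider addresses in the 64 bit one) without touching the
other.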

> > Index: rebase.c
> > ===================================================================
> 
> > +#define IMG_INFO_FILE	"/etc/rebase.image_info"
> > +char tmp_file[] = "/etc/rebase.image_info.XXXXXX";
> 
> Hmm.  I don't think this will work as expected in the native MinGW
> build, given the whole absolute pathname "problem".  Then, even on
> cygwin/msys it'd also be nice to have some sort of support for configure
> --sysconfdir and --prefix...
> 
> Perhaps something like...
> 
> #define xstr(s) str(s)
> #define str(s) #s
> #if defined __CYGWIN__ || defined __MSYS__
> # define IMG_INFO_FILE xstr(SYSCONFDIR) "/rebase.image_info"
> char tmp_file[] = xstr(SYSCONFDIR) "/rebase.image_info.XXXXXX";
> #else
> /* pseudo-relocatable for native mingw build */
> # define IMG_INFO_FILE "/../etc/rebase.image_info"
> char tmp_file[] = "/../etc/rebase.image_info.XXXXXX";
> #endif

I don't understand.  What exactly is "pseudo-relocatable" in this
drive-absolute expression?  I added that to my patch, though.

> ...with appropriate -DSYSCONFDIR flags added in Makefile.in (for the
> cygwin|msys case), as well as a bit of code elsewhere, to prepend the
> installation directory of rebase.exe itself for the mingw case.

That's something which somebody else will have to add.
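
For whoever picks this up, here is a rough sketch of what prepending
the installation directory might look like.  It is not part of the
patch.  The helper name db_path_from_exe is hypothetical, and the caller
is assumed to already have the full path of rebase.exe (on native
Windows that would typically come from GetModuleFileName, which is not
shown here).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<dir of exe>/../etc/rebase.image_info".
   Accepts both '/' and '\\' as directory separators in exe_path.
   Returns 0 on success, -1 if exe_path has no directory part or the
   result does not fit into out. */
int
db_path_from_exe (const char *exe_path, char *out, size_t out_size)
{
  const char *slash, *bslash, *sep;
  int n;

  assert (exe_path != NULL && out != NULL);
  slash = strrchr (exe_path, '/');
  bslash = strrchr (exe_path, '\\');
  /* Use whichever separator occurs last in the string. */
  sep = slash;
  if (bslash && (!sep || bslash > sep))
    sep = bslash;
  if (!sep)
    return -1;
  /* Copy only the directory part, then append the relative suffix. */
  n = snprintf (out, out_size, "%.*s/../etc/rebase.image_info",
		(int) (sep - exe_path), exe_path);
  return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```

The same buffer could then be used to derive the mkstemp template by
appending ".XXXXXX".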

> Or maybe, for the mingw case, the database should be in cwd()? or
> directly in same directory as rebase.exe?

That's something which somebody else will have to add.  I really don't
care at all for the mingw case.

> > +  if (write (fd, &hdr, sizeof hdr) < 0)
> > +  else if (write (fd, img_info_list, img_info_size * sizeof (img_info_t)) < 0)
> 
> Hmm...is sizeof(struct) the same, on both 64bit and 32bit windows, and
> between cygwin|msys|mingw, and with/without -mms-bitfields and/or
> __attribute__((ms_struct))?

Skipping the 64 bit issue which needs another decision first, I don't
understand the question.  Whatever the sizeof img_info_t is, it's a
constant expression on the system for which rebase has been compiled.

> This would force MS-style struct padding and bitfield sizing for all
> three platforms...

Why?  You don't want to share the stuff.

> So...ok.  But there should probably be some documentation up at the top
> of rebase.c for the database format.

Ok, I'll add comments to the structure layout.  It's really not
complicated enough to ask for more.

> OK, so now that I look at it, I wonder about a few more things:
> 
> 1) should the header have a db_version field -- what if we change the
> format, either by rearranging, changing the "official" encoding for the
> name strings, or add new fields to the records?

Sure, why not.

> 2) Should we worry about the on-disk format being "the same" between
> cygwin and msys and native?

No.

> 3) It's nice to have constant size records for all DLLs "up front" and
> then the variable-sized strings "at the end", since that allows for
> random-access.

Actually, it has nothing to do with random access but with ease of
programming.
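
To illustrate the "ease of programming" point, here is a sketch of the
layout: header, then all fixed-size records in one block, then the
names.  The structs are simplified stand-ins for img_info_hdr_t and
img_info_t, the fixed-width integer types and the use of stdio instead
of read/write are assumptions of this sketch, and db_write/db_read are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the patch's header and record structs. */
typedef struct { char magic[4]; uint32_t count; } hdr_t;
typedef struct { uint32_t name_size, base, size; } rec_t;

/* Write the header, then the whole record table in one call, then every
   name (name_size bytes each, trailing NUL included). */
int
db_write (FILE *fp, const rec_t *recs, const char *const *names,
	  uint32_t count)
{
  hdr_t hdr;
  uint32_t i;

  memcpy (hdr.magic, "rBiI", 4);
  hdr.count = count;
  if (fwrite (&hdr, sizeof hdr, 1, fp) != 1)
    return -1;
  if (count > 0 && fwrite (recs, sizeof *recs, count, fp) != count)
    return -1;
  for (i = 0; i < count; ++i)
    {
      assert (recs[i].name_size == strlen (names[i]) + 1);
      if (fwrite (names[i], 1, recs[i].name_size, fp) != recs[i].name_size)
	return -1;
    }
  return 0;
}

/* Read everything back: one read for the table, one per name.  On
   success the caller owns *recs_out, *names_out and every name. */
int
db_read (FILE *fp, rec_t **recs_out, char ***names_out, uint32_t *count_out)
{
  hdr_t hdr;
  rec_t *recs = NULL;
  char **names = NULL;
  uint32_t i;

  if (fread (&hdr, sizeof hdr, 1, fp) != 1
      || memcmp (hdr.magic, "rBiI", 4) != 0)
    return -1;
  /* calloc keeps unread name pointers NULL, which simplifies cleanup. */
  recs = calloc ((size_t) hdr.count + 1, sizeof *recs);
  names = calloc ((size_t) hdr.count + 1, sizeof *names);
  if (!recs || !names)
    goto fail;
  if (hdr.count > 0
      && fread (recs, sizeof *recs, hdr.count, fp) != hdr.count)
    goto fail;
  for (i = 0; i < hdr.count; ++i)
    {
      if (recs[i].name_size == 0)
	goto fail;
      names[i] = malloc (recs[i].name_size);
      if (!names[i]
	  || fread (names[i], 1, recs[i].name_size, fp) != recs[i].name_size
	  || names[i][recs[i].name_size - 1] != '\0')
	goto fail;
    }
  *recs_out = recs;
  *names_out = names;
  *count_out = hdr.count;
  return 0;

fail:
  if (names)
    for (i = 0; i < hdr.count && names[i]; ++i)
      free (names[i]);
  free (names);
  free (recs);
  return -1;
}
```

Because every record has the same size, the table needs no parsing at
all, and the name lengths needed for the string reads are already in
memory by the time the strings are reached.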

> But...we don't DO random access, so that's not much of
> an issue (although it does allow for easy direct inspection of a hexdump
> of the db).  The cost tho is that (a) we store a useless pointer in the
> DB, and (b) the data for the Nth DLL is separated from the NAME of that
> DLL, so it's hard to tell which DLL actually uses any given record.

a) 4 bytes per record.  The entire size of the database file is just
   a tiny bit over 20K on my machine, with 311 DLLs installed, most of
   it the string table.  When was the last time you had to worry about
   a waste of 1244 bytes on your hard drive?

b) Why do you want to look into the file?  That's what rebase -s -i
   is for.

> Maybe if rebase.exe had an explicit "--dump-db-for-debugging" option

Still, rebase -s -i

> that reads in the db and prints it out (that is, unlike -s, it would
> totally ignore the actual DLLs), then there's no real need to "inspect"
> the on-disk data, and we can continue to use this nice -- fast to read,
> fairly compact with only 4 wasted bytes per entry -- binary format.

Again, why would you want to do that?  There's no information to gain.
rebase -s -i prints the database content, unless the real DLL differs
from the database storage.  If so, and if DLLs now overlap, you get
the information printed on the screen.  That's what you need to decide
if some more rebasing might be necessary.  What crucial information do
you gain from printing the database as is, if it doesn't match reality?
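
The overlap check mentioned here boils down to something like the
following sketch, which mirrors the idea in print_image_info but uses a
simplified struct.  The names img_t and mark_overlaps are hypothetical;
slot_size stands for the image size rounded up to the allocation
granularity.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified image record for this sketch. */
typedef struct
{
  unsigned long base;		/* Current image base.                */
  unsigned long slot_size;	/* Size rounded to allocation slots.  */
  int overlaps;			/* Set to 1 if it collides with another. */
} img_t;

static int
by_base (const void *a, const void *b)
{
  unsigned long abase = ((const img_t *) a)->base;
  unsigned long bbase = ((const img_t *) b)->base;

  return (abase > bbase) - (abase < bbase);
}

/* Sort by base address, then flag every pair of images whose address
   ranges intersect.  Returns the number of flagged images. */
size_t
mark_overlaps (img_t *list, size_t n)
{
  size_t i, j, flagged = 0;

  assert (list != NULL || n == 0);
  if (n == 0)
    return 0;
  qsort (list, n, sizeof *list, by_base);
  for (i = 0; i < n; ++i)
    list[i].overlaps = 0;
  for (i = 0; i < n; ++i)
    {
      unsigned long end = list[i].base + list[i].slot_size;

      /* Every later image starting below our end collides with us. */
      for (j = i + 1; j < n && list[j].base < end; ++j)
	list[i].overlaps = list[j].overlaps = 1;
    }
  for (i = 0; i < n; ++i)
    flagged += list[i].overlaps;
  return flagged;
}
```

Images that merely touch (one starts exactly where the previous slot
ends) are not flagged, which matches the strict comparison used in the
patch.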

> > +load_image_info ()
> ...
> > +  /* Check the header. */
> > +  if (memcmp (hdr.magic, IMG_INFO_MAGIC, 4) != 0)
> > +    {
> > +      fprintf (stderr, "%s: \"%s\" is not a valid rebase database.\n",
> > +	       progname, IMG_INFO_FILE);
> > +      close (fd);
> > +      return -1;
> > +    }
> 
> And here's where the version number would be checked.  If we had
> multiple versions, I guess you'd dispatch to the appropriate 'reader'
> routine for each version here as well.

Ok.
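
A minimal sketch of such a check, assuming a header that starts with
the 4 byte magic followed by a version number.  The enum, the struct
and check_header are hypothetical names for this example only; a future
version 2 would simply get its own case label and reader routine.

```c
#include <assert.h>
#include <string.h>

enum db_check { DB_OK, DB_BAD_MAGIC, DB_BAD_VERSION };

/* Only the leading part of the header matters for this check. */
typedef struct
{
  char magic[4];
  unsigned long version;
} hdr_t;

/* Validate magic first, then dispatch on the version number. */
enum db_check
check_header (const hdr_t *hdr)
{
  assert (hdr != NULL);
  if (memcmp (hdr->magic, "rBiI", 4) != 0)
    return DB_BAD_MAGIC;
  switch (hdr->version)
    {
    case 1:
      /* The version 1 reader would be called from here. */
      return DB_OK;
    default:
      /* Unknown, probably newer format: refuse to touch the file. */
      return DB_BAD_VERSION;
    }
}
```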

> > +#ifdef __CYGWIN__
> > +	|| !strcmp (img_info_list[i].name, "/usr/bin/cygwin1.dll")
> > +#endif
> 
> replace with something like
> 
> #if defined __MSYS__
> 	|| endsWith (img_info_list[i].name, "/msys-1.0.dll")
> #elif defined __CYGWIN__
>         || endsWith (img_info_list[i].name, "/cygwin1.dll")
> #endif
> 
> endsWithI allows to handle both "/usr/bin/cygwin1.dll" and
> "/bin/cygwin1.dll" -- ...I for case-insensitive?

The /usr/bin path and the cygwin DLL name are always lower case.  The
code in collect_image_info converts the path so that all stuff under
/usr/bin will show up as /usr/bin, even if you used /bin on the command
line.  There's no reason to check for /bin.

> but perhaps that is made unnecessary by the "back and forth to POSIX"
> name manipulations; I'm not sure.

Right.

> Probably for clarity, all the places that deal with the cygwin DLL
> itself should be marked
> 
> #if defined(__CYGWIN__) || defined(__MSYS__)

Any chance I can concentrate on the Cygwin stuff and you add MSYS
later?

> >  done
> >  
> > +# Check if rebase database already exists.
> > +database_exists="no"
> > +[ -f "/etc/rebase.image_info" ] && database_exists="yes"
> 
> In this case, we can unconditionally use
> 
> [ -f "@SYSCONFDIR@/rebase.image_info" ]
> 
> because the scripts are only useful with cygwin and msys.


Ok.  New patch attached.  Thanks for reviewing.


Corinna


Index: Makefile.in
===================================================================
RCS file: /sourceware/projects/cygwin-apps-home/cvsfiles/rebase/Makefile.in,v
retrieving revision 1.4
diff -u -p -r1.4 Makefile.in
--- Makefile.in	28 Jun 2011 19:43:19 -0000	1.4
+++ Makefile.in	6 Jul 2011 08:33:30 -0000
@@ -56,7 +56,7 @@ FGREP = @FGREP@
 ASH = @ASH@
 
 DEFAULT_INCLUDES = -I. -I$(srcdir) -I$(srcdir)/imagehelper
-DEFS = @DEFS@ -DVERSION='"$(PACKAGE_VERSION)"' -DLIB_VERSION='"$(LIB_VERSION)"'
+DEFS = @DEFS@ -DVERSION='"$(PACKAGE_VERSION)"' -DLIB_VERSION='"$(LIB_VERSION)"' -DSYSCONFDIR='"$(sysconfdir)"'
 
 override CFLAGS+=-Wall -Werror
 override CXXFLAGS+=-Wall -Werror
@@ -109,6 +109,7 @@ getopt_long.$(O):: getopt_long.c getopt.
 # bindir and friends in your shell scripts"
 edit = sed \
 	-e 's|@bindir[@]|$(bindir)|g' \
+	-e 's|@sysconfdir[@]|$(sysconfdir)|g' \
 	-e 's|@pkgdatadir[@]|$(pkgdatadir)|g' \
 	-e 's|@prefix[@]|$(prefix)|g' \
 	-e 's|@exec_prefix[@]|$(exec_prefix)|g' \
Index: rebase.c
===================================================================
RCS file: /sourceware/projects/cygwin-apps-home/cvsfiles/rebase/rebase.c,v
retrieving revision 1.4
diff -u -p -r1.4 rebase.c
--- rebase.c	29 Jun 2011 14:58:55 -0000	1.4
+++ rebase.c	6 Jul 2011 08:33:30 -0000
@@ -19,20 +19,29 @@
 #include <stdlib.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <sys/param.h>
+#if defined(__CYGWIN__) || defined(__MSYS__)
+#include <sys/cygwin.h>
+#endif
 #include <fcntl.h>
 #include <string.h>
 #include <unistd.h>
 #include <locale.h>
 #include <getopt.h>
 #include <string.h>
+#include <errno.h>
 #include "imagehelper.h"
 
+BOOL save_image_info ();
+BOOL load_image_info ();
+BOOL merge_image_info ();
 BOOL collect_image_info (const char *pathname);
 void print_image_info ();
-BOOL rebase (const char *pathname, ULONG *new_image_base);
+BOOL rebase (const char *pathname, ULONG *new_image_base, BOOL down_flag);
 void parse_args (int argc, char *argv[]);
 unsigned long string_to_ulong (const char *string);
 void usage ();
+void help ();
 BOOL is_rebaseable (const char *pathname);
 FILE *file_list_fopen (const char *file_list);
 char *file_list_fgets (char *buf, int size, FILE *file);
@@ -42,25 +51,57 @@ void version ();
 ULONG image_base = 0;
 BOOL down_flag = FALSE;
 BOOL image_info_flag = FALSE;
+BOOL image_storage_flag = FALSE;
+BOOL force_rebase_flag = FALSE;
 ULONG offset = 0;
 int args_index = 0;
 int verbose = 0;
 const char *file_list = 0;
 const char *stdin_file_list = "-";
 
+const char *progname;
+
+const char IMG_INFO_MAGIC[4] = "rBiI";
+const ULONG IMG_INFO_VERSION = 1;
+
 ULONG ALLOCATION_SLOT;	/* Allocation granularity. */
 
+typedef struct _img_info_hdr
+{
+  char  magic[4];	/* Always IMG_INFO_MAGIC.                            */
+  ULONG version;	/* Database version, always set to IMG_INFO_VERSION. */
+  ULONG base;		/* Base address (-b) used to generate database.      */
+  ULONG offset;		/* Offset (-o) used to generate database.            */
+  BOOL  down_flag;	/* Always TRUE right now.                            */
+  ULONG count;		/* Number of img_info_t entries following header.    */
+} img_info_hdr_t;
+
 typedef struct _img_info
 {
-  const char *name;
-  ULONG base;
-  ULONG size;
+  char *name;		/* Absolute path to DLL.  The strings are stored     */
+			/* right after the img_info_t table, in the same     */
+			/* order as the img_info_t entries.                  */
+  ULONG name_size;	/* Length of name string including trailing NUL.     */
+  ULONG base;		/* Base address the DLL has been rebased to.         */
+  ULONG size;		/* Size of the DLL at rebased time.                  */
+  ULONG slot_size;	/* Size of the DLL rounded to allocation granularity.*/
+  struct {		/* Flags                                             */
+    unsigned needs_rebasing : 1; /* Set to 0 in the database.  Used only     */
+    				 /* during rebasing.                         */
+  } flag;
 } img_info_t;
 
 img_info_t *img_info_list = NULL;
 unsigned int img_info_size = 0;
+unsigned int img_info_rebase_start = 0;
 unsigned int img_info_max_size = 0;
 
+#if !defined (__CYGWIN__) && !defined (__MSYS__)
+#define SYSCONFDIR "/../etc"
+#endif
+#define IMG_INFO_FILE SYSCONFDIR "/rebase.image_info"
+char tmp_file[] =     SYSCONFDIR "/rebase.image_info.XXXXXX";
+
 #ifdef __CYGWIN__
 ULONG cygwin_dll_image_base = 0;
 ULONG cygwin_dll_image_size = 0;
@@ -75,11 +116,18 @@ main (int argc, char *argv[])
   BOOL status;
 
   setlocale (LC_ALL, "");
+  progname = (progname = strrchr (argv[0], '/')) ? progname + 1 : argv[0];
   parse_args (argc, argv);
-  new_image_base = image_base;
   GetSystemInfo (&si);
   ALLOCATION_SLOT = si.dwAllocationGranularity;
 
+  if (image_storage_flag)
+    {
+      if (load_image_info () < 0)
+      	return 2;
+      img_info_rebase_start = img_info_size;
+    }
+  new_image_base = image_base;
 #ifdef __CYGWIN__
   /* Fetch the Cygwin DLLs data to make sure that DLLs aren't rebased
      into the memory area taken by the Cygwin DLL. */
@@ -94,42 +142,481 @@ main (int argc, char *argv[])
   cygwin_dll_image_size += 3 * ALLOCATION_SLOT + 8 * ALLOCATION_SLOT;
 #endif
 
-  /* Rebase file list, if specified. */
+  /* Collect/Rebase file list, if specified. */
   if (file_list)
     {
-      status = TRUE;
       char filename[MAX_PATH + 2];
       FILE *file = file_list_fopen (file_list);
       if (!file)
-	exit (2);
+	return 2;
 
+      status = TRUE;
       while (file_list_fgets (filename, MAX_PATH + 2, file))
 	{
-	  status = image_info_flag ? collect_image_info (filename)
-				    : rebase (filename, &new_image_base);
+	  status = (image_info_flag || image_storage_flag)
+		   ? collect_image_info (filename)
+		   : rebase (filename, &new_image_base, down_flag);
 	  if (!status)
 	    break;
 	}
 
       file_list_fclose (file);
       if (!status)
-	exit (2);
+	return 2;
     }
 
-  /* Rebase command line arguments. */
+  /* Collect/Rebase command line arguments. */
   for (i = args_index; i < argc; i++)
     {
       const char *filename = argv[i];
-      status = image_info_flag ? collect_image_info (filename)
-				 : rebase (filename, &new_image_base);
+      status = (image_info_flag || image_storage_flag)
+	       ? collect_image_info (filename)
+	       : rebase (filename, &new_image_base, down_flag);
       if (!status)
-	exit (2);
+	return 2;
     }
 
   if (image_info_flag)
     print_image_info ();
+  else if (image_storage_flag && img_info_size > 0)
+    {
+      merge_image_info ();
+      status = TRUE;
+      for (i = 0; i < img_info_size; ++i)
+	if (img_info_list[i].flag.needs_rebasing)
+	  {
+	    new_image_base = img_info_list[i].base;
+	    status = rebase (img_info_list[i].name, &new_image_base, FALSE);
+	    if (status)
+	      img_info_list[i].flag.needs_rebasing = 0;
+	  }
+      if (save_image_info () < 0)
+	return 2;
+    }
+
+  return 0;
+}
+
+int
+img_info_cmp (const void *a, const void *b)
+{
+  ULONG abase = ((img_info_t *) a)->base;
+  ULONG bbase = ((img_info_t *) b)->base;
+
+  if (abase < bbase)
+    return -1;
+  if (abase > bbase)
+    return 1;
+  return strcmp (((img_info_t *) a)->name, ((img_info_t *) b)->name);
+}
 
-  exit (0);
+int
+img_info_name_cmp (const void *a, const void *b)
+{
+  return strcmp (((img_info_t *) a)->name, ((img_info_t *) b)->name);
+}
+
+int
+save_image_info ()
+{
+  int i, fd;
+  int ret = 0;
+  img_info_hdr_t hdr;
+
+  /* Remove all DLLs which couldn't be rebased from the list before storing
+     it in the database file. */
+  for (i = 0; i < img_info_size; ++i)
+    if (img_info_list[i].flag.needs_rebasing)
+      img_info_list[i--] = img_info_list[--img_info_size];
+  /* Create a temporary file to write to. */
+  fd = mkstemp (tmp_file);
+  if (fd < 0)
+    {
+      fprintf (stderr, "%s: failed to create temporary rebase database: %s\n",
+	       progname, strerror (errno));
+      return -1;
+    }
+  qsort (img_info_list, img_info_size, sizeof (img_info_t), img_info_name_cmp);
+  /* First write the header. */
+  memcpy (hdr.magic, IMG_INFO_MAGIC, 4);
+  hdr.version = IMG_INFO_VERSION;
+  hdr.base = image_base;
+  hdr.offset = offset;
+  hdr.down_flag = down_flag;
+  hdr.count = img_info_size;
+  if (write (fd, &hdr, sizeof hdr) < 0)
+    {
+      fprintf (stderr, "%s: failed to write rebase database: %s\n",
+	       progname, strerror (errno));
+      ret = -1;
+    }
+  /* Write the list. */
+  else if (write (fd, img_info_list, img_info_size * sizeof (img_info_t)) < 0)
+    {
+      fprintf (stderr, "%s: failed to write rebase database: %s\n",
+	       progname, strerror (errno));
+      ret = -1;
+    }
+  else
+    {
+      int i;
+
+      /* Write all strings. */
+      for (i = 0; i < img_info_size; ++i)
+	if (write (fd, img_info_list[i].name,
+		   strlen (img_info_list[i].name) + 1) < 0)
+	  {
+	    fprintf (stderr, "%s: failed to write rebase database: %s\n",
+		     progname, strerror (errno));
+	    ret = -1;
+	    break;
+	  }
+    }
+  fchmod (fd, 0660);
+  close (fd);
+  if (ret < 0)
+    unlink (tmp_file);
+  else
+    {
+      if (unlink (IMG_INFO_FILE) < 0 && errno != ENOENT)
+	{
+	  fprintf (stderr,
+		   "%s: failed to remove old rebase database file \"%s\":\n"
+		   "%s\n"
+		   "The new rebase database is stored in \"%s\".\n"
+		   "Manually remove \"%s\" and rename \"%s\" to \"%s\",\n"
+		   "otherwise the new rebase database will be unusable.\n",
+		   progname, IMG_INFO_FILE,
+		   strerror (errno),
+		   tmp_file,
+		   IMG_INFO_FILE, tmp_file, IMG_INFO_FILE);
+	  ret = -1;
+	}
+      else if (rename (tmp_file, IMG_INFO_FILE) < 0)
+	{
+	  fprintf (stderr,
+		   "%s: failed to rename \"%s\" to \"%s\":\n"
+		   "%s\n"
+		   "Manually rename \"%s\" to \"%s\",\n"
+		   "otherwise the new rebase database will be unusable.\n",
+		   progname, tmp_file, IMG_INFO_FILE,
+		   strerror (errno),
+		   tmp_file, IMG_INFO_FILE);
+	  ret = -1;
+	}
+    }
+  return ret;
+}
+
+int
+load_image_info ()
+{
+  int fd;
+  int ret = 0;
+  int i;
+  img_info_hdr_t hdr;
+
+  fd = open (IMG_INFO_FILE, O_RDONLY);
+  if (fd < 0)
+    {
+      /* It's no error if the file doesn't exist.  However, in this case
+	 the -b option is mandatory. */
+      if (errno == ENOENT && image_base)
+      	return 0;
+      fprintf (stderr, "%s: failed to open rebase database \"%s\":\n%s\n",
+	       progname, IMG_INFO_FILE, strerror (errno));
+      return -1;
+    }
+  /* First read the header. */
+  if (read (fd, &hdr, sizeof hdr) < 0)
+    {
+      fprintf (stderr, "%s: failed to read rebase database \"%s\":\n%s\n",
+	       progname, IMG_INFO_FILE, strerror (errno));
+      close (fd);
+      return -1;
+    }
+  /* Check the header. */
+  if (memcmp (hdr.magic, IMG_INFO_MAGIC, 4) != 0)
+    {
+      fprintf (stderr, "%s: \"%s\" is not a valid rebase database.\n",
+	       progname, IMG_INFO_FILE);
+      close (fd);
+      return -1;
+    }
+  if (hdr.version != IMG_INFO_VERSION)
+    {
+      fprintf (stderr, "%s: \"%s\" is a version %lu rebase database.\n"
+		       "I can only handle versions up to %lu.\n",
+	       progname, IMG_INFO_FILE, hdr.version, IMG_INFO_VERSION);
+      close (fd);
+      return -1;
+    }
+  /* If no new image base has been specified, use the one from the header. */
+  if (image_base == 0)
+    {
+      image_base = hdr.base;
+      down_flag = hdr.down_flag;
+    }
+  if (offset == 0)
+    offset = hdr.offset;
+  /* Don't enforce rebasing if address and offset are unchanged or taken from
+     the file anyway. */
+  if (image_base == hdr.base && offset == hdr.offset)
+    force_rebase_flag = FALSE;
+  img_info_size = hdr.count;
+  /* Allocate memory for the image list. */
+  if (ret == 0)
+    {
+      img_info_max_size = roundup (img_info_size, 100);
+      img_info_list = (img_info_t *) calloc (img_info_max_size,
+					     sizeof (img_info_t));
+      if (!img_info_list)
+	{
+	  fprintf (stderr, "%s: Out of memory.\n", progname);
+	  ret = -1;
+	}
+    }
+  /* Now read the list. */
+  if (ret == 0
+      && read (fd, img_info_list, img_info_size * sizeof (img_info_t)) < 0)
+    {
+      fprintf (stderr, "%s: failed to read rebase database \"%s\":\n%s\n",
+	       progname, IMG_INFO_FILE, strerror (errno));
+      ret = -1;
+    }
+  /* Make sure all pointers are NULL. */
+  if (ret == 0)
+    for (i = 0; i < img_info_size; ++i)
+      img_info_list[i].name = NULL;
+  /* Finally, read the strings. */
+  if (ret == 0)
+    {
+      for (i = 0; i < img_info_size; ++i)
+	{
+	  img_info_list[i].name = (char *)
+				  malloc (img_info_list[i].name_size);
+	  if (!img_info_list[i].name)
+	    {
+	      fprintf (stderr, "%s: Out of memory.\n", progname);
+	      ret = -1;
+	      break;
+	    }
+	  if (read (fd, img_info_list[i].name,
+		    img_info_list[i].name_size) < 0)
+	    {
+	      fprintf (stderr, "%s: failed to read rebase database \"%s\": "
+		       "%s\n", progname, IMG_INFO_FILE, strerror (errno));
+	      ret = -1;
+	      break;
+	    }
+	}
+    }
+  close (fd);
+  /* On failure, free all allocated memory and set list pointer to NULL. */
+  if (ret < 0)
+    {
+      for (i = 0; i < img_info_size && img_info_list[i].name; ++i)
+	free (img_info_list[i].name);
+      free (img_info_list);
+      img_info_list = NULL;
+      img_info_size = 0;
+      img_info_max_size = 0;
+    }
+  return ret;
+}
+
+int
+merge_image_info ()
+{
+  int i, end;
+  img_info_t *match;
+  ULONG floating_image_base;
+
+  /* Sort new files from command line by name. */
+  qsort (img_info_list + img_info_rebase_start,
+	 img_info_size - img_info_rebase_start, sizeof (img_info_t),
+	 img_info_name_cmp);
+  /* Iterate through new files and eliminate duplicates. */
+  for (i = img_info_rebase_start; i + 1 < img_info_size; ++i)
+    if ((img_info_list[i].name_size == img_info_list[i + 1].name_size
+	 && !strcmp (img_info_list[i].name, img_info_list[i + 1].name))
+#ifdef __CYGWIN__
+	|| !strcmp (img_info_list[i].name, "/usr/bin/cygwin1.dll")
+#endif
+       )
+      {
+	free (img_info_list[i].name);
+	memmove (img_info_list + i, img_info_list + i + 1, 
+		 (img_info_size - i - 1) * sizeof (img_info_t));
+	--img_info_size;
+	--i;
+      }
+  /* Iterate through new files and see if they are already available in
+     existing database. */
+  if (img_info_rebase_start)
+    {
+      for (i = img_info_rebase_start; i < img_info_size; ++i)
+	{
+	  match = bsearch (&img_info_list[i], img_info_list,
+			   img_info_rebase_start, sizeof (img_info_t),
+			   img_info_name_cmp);
+	  if (match)
+	    {
+	      /* We found a match.  Now test if the "new" file is actually
+		 the old file, or if it at least fits into the memory slot
+		 of the old file.  If so, screw the new file into the old slot.
+		 Otherwise set base to 0 to indicate that this DLL needs a new
+		 base address. */
+	      if (match->base != img_info_list[i].base
+		  || match->slot_size < img_info_list[i].slot_size)
+		{
+		  /* Reuse the old address if possible. */
+		  if (match->slot_size < img_info_list[i].slot_size)
+		    match->base = 0;
+		  match->flag.needs_rebasing = 1;
+		}
+	      /* Unconditionally overwrite old with new size. */
+	      match->size = img_info_list[i].size;
+	      match->slot_size = img_info_list[i].slot_size;
+	      /* Remove new entry from array. */
+	      free (img_info_list[i].name);
+	      img_info_list[i--] = img_info_list[--img_info_size];
+	    }
+	  else
+	    /* Not in database yet.  Set base to 0 to choose a new one. */
+	    img_info_list[i].base = 0;
+	}
+      /* After eliminating the duplicates, check if the user requested
+	 a new base address on the command line.  If so, overwrite all
+	 base addresses with 0 and set img_info_rebase_start to 0, to
+	 skip any further test. */
+      if (force_rebase_flag)
+	img_info_rebase_start = 0;
+    }
+  if (!img_info_rebase_start)
+    {
+      /* No database yet or enforcing a new base address.  Set base of all
+	 DLLs to 0. */
+      for (i = 0; i < img_info_size; ++i)
+	img_info_list[i].base = 0;
+    }
+
+  /* Now sort the old part of the list by base address. */
+  if (img_info_rebase_start)
+    qsort (img_info_list, img_info_rebase_start, sizeof (img_info_t),
+	   img_info_cmp);
+  /* Perform several tests on the information fetched from the database
+     to match with reality. */
+  for (i = 0; i < img_info_rebase_start; ++i)
+    {
+      ULONG cur_base, cur_size, slot_size;
+
+      /* Files with the needs_rebasing flag set have been checked already. */
+      if (img_info_list[i].flag.needs_rebasing)
+	continue;
+      /* Check if the files in the old list still exist.  Drop nonexistent
+	 or inaccessible files. */
+      if (access (img_info_list[i].name, F_OK) == -1
+	  || !GetImageInfos (img_info_list[i].name, &cur_base, &cur_size))
+      	{
+	  free (img_info_list[i].name);
+	  memmove (img_info_list + i, img_info_list + i + 1,
+		   (img_info_size - i - 1) * sizeof (img_info_t));
+	  --img_info_rebase_start;
+	  --img_info_size;
+	  continue;
+	}
+      slot_size = roundup2 (cur_size, ALLOCATION_SLOT);
+      /* If the file has been reinstalled, try to rebase to the same address
+	 in the first place. */
+      if (cur_base != img_info_list[i].base)
+	{
+	  img_info_list[i].flag.needs_rebasing = 1;
+	  /* Set cur_base to the old base to simplify subsequent tests. */
+	  cur_base = img_info_list[i].base;
+	}
+      /* However, if the DLL got bigger and doesn't fit into its slot
+	 anymore, rebase this DLL from scratch. */
+      if (i + 1 < img_info_rebase_start
+	  && cur_base + slot_size + offset >= img_info_list[i + 1].base)
+	img_info_list[i].base = 0;
+      /* Does the file match the base address requirements?  If not,
+	 rebase from scratch. */
+      else if ((down_flag && cur_base + slot_size + offset >= image_base)
+	       || (!down_flag && cur_base < image_base))
+	img_info_list[i].base = 0;
+      /* Unconditionally overwrite old with new size. */
+      img_info_list[i].size = cur_size;
+      img_info_list[i].slot_size = slot_size;
+      /* Make sure all DLLs with base address 0 have the needs_rebasing
+	 flag set. */
+      if (img_info_list[i].base == 0)
+	img_info_list[i].flag.needs_rebasing = 1;
+    }
+  /* The remainder of the function expects img_info_size to be > 0. */
+  if (img_info_size == 0)
+    return 0;
+
+  /* Now sort entire list by base address.  The files with address 0 will
+     be first. */
+  if (!force_rebase_flag)
+    qsort (img_info_list, img_info_size, sizeof (img_info_t), img_info_cmp);
+  /* Try to fit all DLLs with base address 0 into the given list. */
+  /* FIXME: This loop only implements the top-down case.  Implement a
+     bottom-up case, too, at one point. */
+  floating_image_base = image_base;
+  end = img_info_size - 1;
+  while (img_info_list[0].base == 0)
+    {
+      ULONG new_base;
+
+      /* Skip trailing entries as long as there is no hole. */
+       while (img_info_list[end].base + img_info_list[end].slot_size + offset
+	     >= floating_image_base)
+	{
+	  floating_image_base = img_info_list[end].base;
+	  --end;
+	}
+      /* Test if one of the DLLs with address 0 fits into the hole. */
+      for (i = 0, new_base = 0; img_info_list[i].base == 0; ++i, new_base = 0)
+	{
+	  new_base = floating_image_base - img_info_list[i].slot_size - offset;
+	  if (new_base >= img_info_list[end].base
+			  + img_info_list[end].slot_size
+#ifdef __CYGWIN__
+	      /* Don't overlap the Cygwin DLL. */
+	      && (new_base >= cygwin_dll_image_base + cygwin_dll_image_size
+		  || new_base + img_info_list[i].slot_size
+		     <= cygwin_dll_image_base)
+#endif
+	     )
+	    break;
+	}
+      /* Found a match.  Mount into list. */
+      if (new_base)
+	{
+	  img_info_t tmp = img_info_list[i];
+	  tmp.base = new_base;
+	  memmove (img_info_list + i, img_info_list + i + 1,
+		   (end - i) * sizeof (img_info_t));
+	  img_info_list[end] = tmp;
+	  continue;
+	}
+      /* Nothing matches.  Set floating_image_base to the start of the
+	 uppermost DLL at this point and try again. */
+#ifdef __CYGWIN__
+      if (floating_image_base >= cygwin_dll_image_base + cygwin_dll_image_size
+	  && img_info_list[end].base < cygwin_dll_image_base)
+	floating_image_base = cygwin_dll_image_base;
+      else
+#endif
+	{
+	  floating_image_base = img_info_list[end].base;
+	  --end;
+	}
+    }
+
+  return 0;
 }
 
 BOOL
@@ -143,7 +630,7 @@ collect_image_info (const char *pathname
       return TRUE;
     }
 
-  if (img_info_size <= img_info_max_size)
+  if (img_info_size >= img_info_max_size)
     {
       img_info_max_size += 100;
       img_info_list = (img_info_t *) realloc (img_info_list,
@@ -151,45 +638,126 @@ collect_image_info (const char *pathname
 					      * sizeof (img_info_t));
       if (!img_info_list)
 	{
-	  fprintf (stderr, "Out of memory.\n");
+	  fprintf (stderr, "%s: Out of memory.\n", progname);
 	  exit (2);
 	}
     }
 
   if (GetImageInfos (pathname, &img_info_list[img_info_size].base,
 			       &img_info_list[img_info_size].size))
-    img_info_list[img_info_size++].name = strdup (pathname);
+    {
+      img_info_list[img_info_size].slot_size
+	= roundup2 (img_info_list[img_info_size].size, ALLOCATION_SLOT);
+      img_info_list[img_info_size].flag.needs_rebasing = 1;
+      /* This back and forth from POSIX to Win32 is a way to get a full path
+	 more thoroughly.  For instance, the difference between /bin and
+	 /usr/bin will be eliminated. */
+#if defined (__MSYS__)
+      {
+	char w32_path[MAX_PATH];
+	char full_path[MAX_PATH];
+      	cygwin_conv_to_full_win32_path (pathname, w32_path);
+	cygwin_conv_to_full_posix_path (w32_path, full_path);
+	img_info_list[img_info_size].name = strdup (full_path);
+	img_info_list[img_info_size].name_size = strlen (full_path) + 1;
+      }
+#elif defined (__CYGWIN__)
+      {
+	PWSTR w32_path = cygwin_create_path (CCP_POSIX_TO_WIN_W, pathname);
+	img_info_list[img_info_size].name
+	  = cygwin_create_path (CCP_WIN_W_TO_POSIX, w32_path);
+	free (w32_path);
+	img_info_list[img_info_size].name_size
+	  = strlen (img_info_list[img_info_size].name) + 1;
+      }
+#else
+      {
+	char full_path[MAX_PATH];
+	GetFullPathName (pathname, MAX_PATH, full_path, NULL);
+	img_info_list[img_info_size].name = strdup (full_path);
+	img_info_list[img_info_size].name_size = strlen (full_path) + 1;
+      }
+#endif
+      ++img_info_size;
+    }
   return TRUE;
 }
 
-int
-img_info_cmp (const void *a, const void *b)
-{
-  ULONG abase = ((img_info_t *) a)->base;
-  ULONG bbase = ((img_info_t *) b)->base;
-
-  if (abase < bbase)
-    return -1;
-  if (abase > bbase)
-    return 1;
-  return strcmp (((img_info_t *) a)->name, ((img_info_t *) b)->name);
-}
-
 void
 print_image_info ()
 {
   unsigned int i;
 
+  /* Sort list by name. */
+  qsort (img_info_list, img_info_size, sizeof (img_info_t), img_info_name_cmp);
+  /* Iterate through list and eliminate duplicates. */
+  for (i = 0; i + 1 < img_info_size; ++i)
+    if (img_info_list[i].name_size == img_info_list[i + 1].name_size
+	&& !strcmp (img_info_list[i].name, img_info_list[i + 1].name))
+      {
+	/* Remove duplicate, but prefer one from the command line over one
+	   from the database, because the one from the command line reflects
+	   the reality, while the database is wishful thinking. */
+	if (img_info_list[i].flag.needs_rebasing == 0)
+	  {
+	    free (img_info_list[i].name);
+	    memmove (img_info_list + i, img_info_list + i + 1, 
+		     (img_info_size - i - 1) * sizeof (img_info_t));
+	  }
+	else
+	  {
+	    free (img_info_list[i + 1].name);
+	    if (i + 2 < img_info_size)
+	      memmove (img_info_list + i + 1, img_info_list + i + 2, 
+		       (img_info_size - i - 2) * sizeof (img_info_t));
+	  }
+	--img_info_size;
+	--i;
+      }
+  /* For entries loaded from database, collect image info to reflect reality.
+     Also, collect_image_info sets needs_rebasing to 1, so reset here. */
+  for (i = 0; i < img_info_size; ++i)
+    {
+      if (img_info_list[i].flag.needs_rebasing == 0)
+	{
+	  ULONG base, size;
+
+	  if (GetImageInfos (img_info_list[i].name, &base, &size))
+	    {
+	      img_info_list[i].base = base;
+	      img_info_list[i].size = size;
+	      img_info_list[i].slot_size
+		= roundup2 (img_info_list[i].size, ALLOCATION_SLOT);
+	    }
+	}
+      else
+	img_info_list[i].flag.needs_rebasing = 0;
+    }
+  /* Now sort by address. */
   qsort (img_info_list, img_info_size, sizeof (img_info_t), img_info_cmp);
   for (i = 0; i < img_info_size; ++i)
-    printf ("%-47s base 0x%08lx size 0x%08lx\n",
-	    img_info_list[i].name,
-	    img_info_list[i].base,
-	    img_info_list[i].size);
+    {
+      int tst;
+      ULONG end = img_info_list[i].base + img_info_list[i].slot_size;
+
+      /* Check for overlap and mark both DLLs. */
+      for (tst = i + 1;
+	   tst < img_info_size && img_info_list[tst].base < end;
+	   ++tst)
+	{
+	  img_info_list[i].flag.needs_rebasing = 1;
+	  img_info_list[tst].flag.needs_rebasing = 1;
+	}
+      printf ("%-45s base 0x%08lx size 0x%08lx %c\n",
+	      img_info_list[i].name,
+	      img_info_list[i].base,
+	      img_info_list[i].size,
+	      img_info_list[i].flag.needs_rebasing ? '*' : ' ');
+    }
 }
 
 BOOL
-rebase (const char *pathname, ULONG *new_image_base)
+rebase (const char *pathname, ULONG *new_image_base, BOOL down_flag)
 {
   ULONG old_image_size, old_image_base, new_image_size, prev_new_image_base;
   BOOL status, status2;
@@ -311,7 +879,7 @@ retry:
 void
 parse_args (int argc, char *argv[])
 {
-  const char *anOptions = "b:dio:T:vV";
+  const char *anOptions = "b:dhio:sT:vV";
   int anOption = 0;
 
   while ((anOption = getopt (argc, argv, anOptions)) != -1)
@@ -320,6 +888,7 @@ parse_args (int argc, char *argv[])
 	{
 	case 'b':
 	  image_base = string_to_ulong (optarg);
+	  force_rebase_flag = TRUE;
 	  break;
 	case 'd':
 	  down_flag = TRUE;
@@ -329,6 +898,12 @@ parse_args (int argc, char *argv[])
 	  break;
 	case 'o':
 	  offset = string_to_ulong (optarg);
+	  force_rebase_flag = TRUE;
+	  break;
+	case 's':
+	  image_storage_flag = TRUE;
+	  /* FIXME: For now enforce top-down rebasing when using the database.*/
+	  down_flag = TRUE;
 	  break;
 	case 'T':
 	  file_list = optarg;
@@ -336,6 +911,10 @@ parse_args (int argc, char *argv[])
 	case 'v':
 	  verbose = TRUE;
 	  break;
+	case 'h':
+	  help ();
+	  exit (1);
+	  break;
 	case 'V':
 	  version ();
 	  exit (1);
@@ -347,7 +926,7 @@ parse_args (int argc, char *argv[])
 	}
     }
 
-  if ((image_base == 0 && !image_info_flag)
+  if ((image_base == 0 && !image_info_flag && !image_storage_flag)
       || (image_base && image_info_flag))
     {
       usage ();
@@ -369,9 +948,40 @@ void
 usage ()
 {
   fprintf (stderr,
-	   "usage: rebase -b BaseAddress [-Vdv] [-o Offset] "
+	   "usage: %s -b BaseAddress [-Vdfsv] [-o Offset] "
 	   "[-T FileList | -] Files...\n"
-	   "       rebase -i [-T FileList | -] Files...\n");
+	   "       rebase -i [-s] [-T FileList | -] Files...\n"
+	   "       rebase -h for full help text\n", progname);
+}
+
+void
+help ()
+{
+  printf ("\
+Usage: %s [OPTIONS] file(s)...\n\
+Rebase PE files, usually DLLs, to a specified address or address range.\n\
+\n\
+  -b BaseAddress     Specifies the base address at which to start rebasing.\n\
+  -s                 Utilize the rebase database to find unused memory slots\n\
+                     to rebase the files on the command line to. (Implies -d).\n\
+                     If -b is given, too, the database gets recreated.\n\
+  -i                 Rather than rebasing, just print the current base\n\
+                     address and size of a file.\n\
+\n\
+  One of the options -b, -s or -i is mandatory.  If no rebase database exists\n\
+  yet, -b is required together with -s.\n\
+\n\
+  -d                 Treat the BaseAddress as upper ceiling and rebase\n\
+                     files top-down from there.  Without this option the\n\
+                     files are rebased from BaseAddress bottom-up.\n\
+  -o Offset          Specify an offset between subsequent rebase addresses.\n\
+                     Default is no offset.\n\
+  -T FileList        Also rebase the files specified in the file \"FileList\".\n\
+                     The format of \"FileList\" is one file per line.\n\
+  -v                 Print some debug output.\n\
+  -V                 Print version info and exit.\n\
+  -h                 This help.\n",
+	  progname);
 }
 
 BOOL
Index: rebaseall.in
===================================================================
RCS file: /sourceware/projects/cygwin-apps-home/cvsfiles/rebase/rebaseall.in,v
retrieving revision 1.3
diff -u -p -r1.3 rebaseall.in
--- rebaseall.in	28 Jun 2011 19:43:19 -0000	1.3
+++ rebaseall.in	6 Jul 2011 08:33:30 -0000
@@ -13,7 +13,7 @@
 #
 # Written by Jason Tishler <jason@tishler.net>
 #
-# $Id: rebaseall.in,v 1.3 2011/06/28 19:43:19 corinna Exp $
+# $Id: rebaseall.in,v 1.2 2011/06/21 15:40:10 corinna Exp $
 #
 
 # Define constants
@@ -46,7 +46,7 @@ cleanup()
 trap cleanup 1 2 15
 
 # Set defaults
-BaseAddress=$DefaultBaseAddress
+BaseAddress=""
 Offset=$DefaultOffset
 Verbose=$DefaultVerbose
 FileList=$DefaultFileList
@@ -82,6 +82,17 @@ do
     esac
 done
 
+# Check if rebase database already exists.
+database_exists="no"
+[ -f "@sysconfdir@/rebase.image_info" ] && database_exists="yes"
+
+# If BaseAddress has not been specified, and the rebase database doesn't exist
+# yet, set BaseAddress to default.
+if [ -z "${BaseAddress}" -a "${database_exists}" != "yes" ]
+then
+  BaseAddress=$DefaultBaseAddress
+fi
+
 # Set temp directory
 TmpDir="${TMP:-${TEMP:-/tmp}}"
 
@@ -120,8 +131,12 @@ then
     cat "$FileList" >>"$TmpFile"
 fi
 
-# Rebase files
-rebase $Verbose -d -b $BaseAddress -o $Offset -T "$TmpFile"
+if [ -z "${BaseAddress}" ]
+then
+  ./rebase $Verbose -s -T "$TmpFile"
+else
+  ./rebase $Verbose -s -b $BaseAddress -o $Offset -T "$TmpFile"
+fi
 ExitCode=$?
 
 # Clean up
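
For anyone who wants to play with the duplicate handling in print_image_info outside of rebase, here is a minimal standalone sketch. It is not the code from the patch: sketch_entry_t, sketch_name_cmp and sketch_dedupe are hypothetical names, from_cmdline stands in for flag.needs_rebasing, and the names are not owned by the list, so nothing is freed. It also compacts the array in a single pass instead of calling memmove per duplicate, but the preference is the same: an entry from the command line wins over one from the database.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for img_info_t. */
typedef struct
{
  const char *name;	/* not owned here; the patch strdup()s and free()s */
  int from_cmdline;	/* plays the role of flag.needs_rebasing */
} sketch_entry_t;

/* Order entries by path name so that duplicates become neighbours. */
int
sketch_name_cmp (const void *a, const void *b)
{
  return strcmp (((const sketch_entry_t *) a)->name,
		 ((const sketch_entry_t *) b)->name);
}

/* Sort by name and drop duplicate names in place.  When two entries share
   a name, keep the one given on the command line.  Returns the new count. */
size_t
sketch_dedupe (sketch_entry_t *list, size_t count)
{
  size_t in, out = 0;

  assert (list != NULL || count == 0);
  if (count == 0)
    return 0;
  qsort (list, count, sizeof *list, sketch_name_cmp);
  for (in = 1; in < count; ++in)
    {
      if (strcmp (list[out].name, list[in].name) == 0)
	{
	  /* Duplicate: replace the kept entry only if the new one comes
	     from the command line and the kept one does not. */
	  if (!list[out].from_cmdline && list[in].from_cmdline)
	    list[out] = list[in];
	}
      else
	list[++out] = list[in];
    }
  return out + 1;
}
```

Since qsort is not stable, the order of two equal names after sorting is unspecified, which is why the sketch checks both entries rather than relying on their position.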

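The overlap check at the end of print_image_info can be tried in isolation along the same lines. Again this is only a sketch under assumptions, not the patch itself: sketch_info_t, sketch_base_cmp and sketch_mark_overlaps are hypothetical names, slot_size is assumed to be already rounded up to the allocation slot, and the comparison ignores the name tie-break that img_info_cmp has.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for img_info_t. */
typedef struct
{
  const char *name;
  unsigned long base;
  unsigned long slot_size;	/* size rounded up to the allocation slot */
  int needs_rebasing;
} sketch_info_t;

/* Order entries by base address only. */
int
sketch_base_cmp (const void *a, const void *b)
{
  unsigned long abase = ((const sketch_info_t *) a)->base;
  unsigned long bbase = ((const sketch_info_t *) b)->base;

  return (abase > bbase) - (abase < bbase);
}

/* Sort by base address, then flag both members of every pair whose
   slots intersect.  Returns the number of flagged entries. */
size_t
sketch_mark_overlaps (sketch_info_t *list, size_t count)
{
  size_t i, tst, flagged = 0;

  assert (list != NULL || count == 0);
  if (count == 0)
    return 0;
  qsort (list, count, sizeof *list, sketch_base_cmp);
  for (i = 0; i < count; ++i)
    {
      unsigned long end = list[i].base + list[i].slot_size;

      /* Every later entry starting below our end address collides. */
      for (tst = i + 1; tst < count && list[tst].base < end; ++tst)
	{
	  list[i].needs_rebasing = 1;
	  list[tst].needs_rebasing = 1;
	}
    }
  for (i = 0; i < count; ++i)
    if (list[i].needs_rebasing)
      ++flagged;
  return flagged;
}
```

Because the list is sorted by base address first, the inner loop can stop at the first entry that starts at or above the current end address, so non-overlapping lists cost one comparison per entry.
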
-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

