This is the mail archive of the cygwin-apps mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch/rebase] Add a rebase database to keep track of DLL addresses


On Jul 23 13:01, Charles Wilson wrote:
> On 7/23/2011 6:11 AM, Corinna Vinschen wrote:
> > There's definitely a bug in the Mingw code, though.  I reviewed the
> > patch, comments inline.
> 
> :-(
> 
> >> I don't know why, but when I added O_BINARY everything was copacetic.
> > 
> > LF->CRLF conversion?
> 
> Well, I guess -- but there aren't (or shouldn't be) any LFs *or* CRs in
> the filenames of DLLs.

The file header is binary data.  It can also contain 0x10 bytes which
are converted to 0x13 0x10 at write time.  Maybe that somehow threw it
off.

>    read(fd, array[i].name, array[i].name_size)
> 
> returned success (or, at least, a non-negative value), but nothing was
> stored in array[i].name.  That's one reason I switched to calloc for the

How do you know there was nothing stored?  Maybe it was just not what
you expected.  Say, if read returned 1 and wrote a 0x0 byte, it can
easily be missed.

But actually that doesn't matter.  Using O_BINARY is the right thing to
do anyway, so I'm glad you stumbled over this problem before we made a
release :}

> >> When I say quick-n-dirty, I mean: lots of duplicated and only slightly
> >> modified code from rebase.c.  There's room for code consolidation, so
> >> this bit could be put off until later.
> > 
> > Right.  Especially all the db file information should go into a new
> > header file.
> 
> Ack -- but that's for another patch. the Q is, add quick-n-dirty now,
> then consolidate and reorganize later -- or reorganize first, and then
> (or simultaneous with) adding the new developer's utility.  Either way,
> doesn't matter to me.

I'd prefer to put just the type information into a header right from the
start.  The rest can be changed later.

> >> +override CXX_LDFLAGS+=@EXTRA_CXX_LDFLAG_OVERRIDES@
> > 
> > Why is CXX_LDFLAGS necessary?  I see what you do but I can't imagine the
> > msys compiler doesn't know -static-libstdc++.
> 
> That's precisely why. msys compiler is 3.4.x which supports only
> -static-libgcc.

Oh boy.

> >> +  mingw|msys )
> >> +    # Also exclude the commands we're using below, which is not ideal
> >> +    /bin/ps -s | /bin/sed -e '1d' | /bin/awk '{print $NF}' |\
> >> +      /bin/sed -e '/\/bin\/ps$/d'   -e '/\/bin\/ps\.exe$/d'  \
> >> +               -e '/\/bin\/sed$/d'  -e '/\/bin\/sed\.exe$/d' \
> >> +	       -e '/\/bin\/awk$/d'  -e '/\/bin\/awk\.exe$/d' \
> >> +	       -e '/\/bin\/sort$/d' -e '/\/bin\/sort\.exe$/d' \
> >> +	       -e '/\/bin\/uniq$/d' -e '/\/bin\/uniq\.exe$/d' \
> >> +               -e '/\/bin\/dash$/d' -e '/\/bin\/dash\.exe$/d' |\
> > 
> > What's the reason to do that?  I don't see how that should be necessary.
> > All the tools are used before peflags is called, so there's no problem
> > to change them as well, just as on Cygwin.
> [...]
> Since the procedure, on msys, to detect processes requires at minimum
> ps.exe, awk, and dash, we have to 'sed-out' those three applications --
> and sed.exe itself -- to avoid 'false positives'.  (uniq and sort are
> cosmetic, but if we use 'em, then we need to sed them out too).

The entire expression is just too complicated and error prone, IMHO.
AFAICS, all of what's needed can be done in a single invocation of
GA-"swiss army knife"-WK, and it's probably not even complicated...

[...time passes...]

What about this:

  /bin/ps -s | /bin/gawk '\
    # Count number of running ash or dash. \
    /\/bin\/(d)?ash(\.exe)?$/{ ash_cnt++; } \
    # Count number of ps and gawk. \
    /\/bin\/ps(\.exe)?$/{ cnt++; } \
    /\/bin\/gawk(\.exe)?$/{ cnt++; } \
    END{ \
      # Uncomment for testing: \
      # printf "TOTAL: %d CNT: %d ASH_CNT: %d\n", NR, cnt, ash_cnt; \
      # Only one of ps and gawk each may run. \
      if (cnt > 2) \
	exit 0; \
      # The total number of allowed processes is one less than the number \
      # of input lines.  The extra line is the ps header output. \
      if (NR - cnt - ash_cnt > 1) \
	exit 0; \
      # All is well. \
      exit 1; \
    }'
  ProcessResult=$?

It's more comment than code and it does what you need.

> > Does it have strtoll?  It should have since the function is available
> > in Cygwin since October 2001, which means it was available in Cygwin
> > 1.3.4 already.  Msys has been forked after that, afaics.
> > 
> > So, if we have strtoll, you could simply use that and cast the result to
> > uint64_t, rather than to paste some external strtoull implementation.
> 
> No, it doesn't have strtoll:
> 
> objdump -p /c/MinGW/msys/1.0/bin/msys-1.0.dll | grep strtoll
> 
> Anyway, here's the fork date for msys:
> 
> Wed Sep 12 01:03:36 2001  Christopher Faylor <...>
> 
> The strto[u]ll stuff was exported from cygwin two weeks later:
> Mon Oct 1 14:25:00 2001  Charles Wilson  <...>
> having been added to newlib the same day:
> 2001-10-01  Charles Wilson  <...>

Oh boy.

> > There's no reason to use calloc, it's overwritten by the subsequent
> > read() anyway.  This is differnt from the calloc for img_info_list,
> > which will only get partially filled by the read call.
> 
> No, that's not what happened on mingw.
> 
> I was getting a "successful" return (e.g. non-negative return value)
> from read(), but *nothing* was written into the .name buffer. So, later

Are you sure this wasn't when it used textmode by accident?  This sounds
*SO* unlikely to happen in O_BINARY mode, even for the stone-age old
code msys is based upon.

> >> +#else
> >> +  {
> >> +    /* Borrow cygwin code for extracting module path, but use ANSI */
> >> +    char exepath[PATH_MAX];
> > 
> > PATH_MAX?  How big is that in native Win32?  If it's equivalent to
> > MAX_PATH, you don't have to worry about long path prefixes.
> 
> limits.h:
> #define PATH_MAX        259
> 
> stdlib.h:
> #define       MAX_PATH        (260)

So long path names wouldn't work anyway.  What about just allocating
a buffer of size 32K to be on the safe side?  Or just ignore long
paths since they are probably not used anyway.

> > [...]
> > Fortunately the strcpy's are wrong, so the path is just what
> > GetModuleFileNameA returned, a plain Win32 path, which is what you need.
> > So, in fact you should just use the path returned by GetModuleFileNameA
> > and...
> > 
> >> +    /* strip off exename and trailing slash */
> > 
> > ..yes, exactly.
> 
> OK.  All that stuff with NT native paths is so much black magic to me,

It's everything but.  \?? is basically just a folder(*) in the native NT
namespace which contains symlinks to devices.  The names of these
symlinks constitute the Win32 device names.  For instance, Win32 drive
C:.  Prepend \??\ and you get the NT name of the device: \??\C:.
However, \??\C: is actually just a symlink to the NT device name
\Device\HarddiskVolume1.  Fetch the tool "winobj" from sysinternals.
It's pretty instructive.

As for Win32 long pathnames.  Those start with \\?, for instance \\?\C:.
The path conversion routine in ntdll.dll just exchanges the second
backslash with a question mark and, voila, it has the NT path.

Same for network paths.  The short name is \\server\share, the long
pathname is \\?\UNC\server\share, the NT name is \??\UNC\server\share.
And UNC is just a symlink to the network redirector device \Device\Mup.

(*) Actually \?? is a virtual folder which merges a system-wide folder
    (\GLOBAL??) and a per-session folder (\Sessions\0\DosDevices\$session-id). 

> >> +#if defined(__MSYS__)
> >> +/* implementation adapted from google andriod bionic libc's
> > 
> > s/andriod/android
> > 
> > However, as mentioned above, I'd remove this code and just create a
> > small wrapper around the strtoll function.
> 
> ....except we don't yet have that on msys, either.

Oh boy.

> >> +ProcessResult=0
> >> +case $Platform in
> >> +  mingw|msys )
> >> +    # Also exclude the commands we're using below, which is not ideal
> >> +    /bin/ps -s | /bin/sed -e '1d' | /bin/awk '{print $NF}' |\
> >> +      /bin/sed -e '/\/bin\/ps$/d'   -e '/\/bin\/ps\.exe$/d'  \
> >> +               -e '/\/bin\/sed$/d'  -e '/\/bin\/sed\.exe$/d' \
> >> +               -e '/\/bin\/awk$/d'  -e '/\/bin\/awk\.exe$/d' \
> >> +               -e '/\/bin\/sort$/d' -e '/\/bin\/sort\.exe$/d' \
> >> +               -e '/\/bin\/uniq$/d' -e '/\/bin\/uniq\.exe$/d' \
> >> +               -e '/\/bin\/dash$/d' -e '/\/bin\/dash\.exe$/d' |\
> >> +               sort | uniq | grep -E '.'
> > 
> > Same as in peflagsall.in, I don't see why this should be necessary.
> 
> As explained above, it /is/ necessary -- we want to bail if we detect
> ANY active cygwin/msys processes, but avoid false positives due to the
> commands used to DETECT those processes.

Just use a simple ps | gawk as outlined above.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]