This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: ASLR sometimes stops working on Vista with 1.7? [was: Re: Cygwin 1.7 release (was ...)]


[Caution, another long reply.  For those just looking for a simple
 workaround, skip to line 86 of the mail body.]

On Jun  5 10:56, Charles Wilson wrote:
> Corinna Vinschen wrote:
> > I can reproduce the "unable to remap" on W7RC by running `cygport
> > automake1.11-1.11-10 compile'.
> 
> Uhhh...I'm glad to hear that? Or not...
> 
> >  The culprit in my case is always the
> > same DLL, a run-time loaded perl DLL called Cwd.dll.  Even after
> > rebaseall, it still doesn't work because the Windows Loader tries to
> > load the DLL into an entirely different address.
> 
> You did reboot, right? IIRC Windows only calculates the new base

No.  Because *none* of my DLLs is marked to be ASLR compatible.  I'm
testing what happens OOTB.  The entire problem starts already in the
parent when the DLL base address is 0x6ee00000.  The parent inevitably
rebases the DLL to a very low address like 0xa00000 or even 0x900000 at
load time.  The child then fails to map the DLL to the same address.

However, if I rebase the DLL to some other spot, like 0x65000000, then
the DLL is loaded at that address exactly, and everything works fine.  I
still don't think this has anything to do with ASLR.  ASLR only
complicates the picture.  AFAICS, there's no guarantee that the address
computed by ASLR will stay free of collisions forever.  It only eases
the underlying problem by chance, when the chosen addresses happen to
have a low probability of collision.

The problem is not the fact that the DLL is rebased at all in the
parent.  Even though in my case the address range 0x6ee00000-0x6ee08000
isn't taken by another DLL, it could be taken by memory dynamically
allocated by one of the formerly loaded DLLs.
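To make the failure mode concrete, here is a minimal sketch of the
check that effectively decides whether a forked child can place a DLL
at the address the parent used.  It is a toy model, not Cygwin's fork
code: the function names and the child's stray reservation at 0x9f0000
are hypothetical, and only the 0x8000 image size and the 0xa00000 and
0x65000000 bases are taken from the description above.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # (start address, size in bytes)


def overlaps(a: Region, b: Region) -> bool:
    """Return True if the two half-open address ranges intersect."""
    a_start, a_size = a
    b_start, b_size = b
    return a_start < b_start + b_size and b_start < a_start + a_size


def child_remap(parent_base: int, size: int, child_in_use: List[Region]) -> str:
    """Model the child trying to load the DLL where the parent had it."""
    wanted = (parent_base, size)
    if any(overlaps(wanted, region) for region in child_in_use):
        return "unable to remap"
    return "ok"


# Hypothetical: the child already holds a reservation near the heap.
child_regions = [(0x9F0000, 0x20000)]

# Parent had the DLL relocated to a very low address: the child collides.
low = child_remap(0xA00000, 0x8000, child_regions)

# DLL rebased to a quiet spot: the child can use the same address.
high = child_remap(0x65000000, 0x8000, child_regions)
```

In this model the low base fails and the high base succeeds, which
matches the observation that the trouble comes from where the DLL ends
up, not from the fact that it was relocated.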

The real shit starts with the fact that W7 (and Vista, too, apparently)
rebases the DLL to an address which is so very low in the address space
of the application.  This is uncomfortably near to where the process
heap is expected to be.  When Vista was new, we had the problem already
in a somewhat different way.  Note this comment in Cygwin's heap.cc:

/* For some obscure reason Vista and 2003 sometimes reserve space after
   calls to CreateProcess overlapping the spot where the heap has been
   allocated.  This apparently spoils fork.  The behaviour looks quite
   arbitrary.  Experiments on Vista show a memory size of 0x37e000 or
   0x1fd000 overlapping the usual heap by at most 0x1ed000.  So what
   we do here is to allocate the heap with an extra slop of (by default)
   0x400000 and set the appropriate pointers to the start of the heap
   area + slop.  A forking child then creates its heap at the new start
   address and without the slop factor.  Since this is not entirely
   foolproof we add a registry setting "heap_slop_in_mb" so the slop
   factor can be influenced by the user if the need arises. */
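The arithmetic behind that comment can be sketched as follows.  This
is only an illustration of the slop idea, not the code in heap.cc: the
function names and the reservation base 0x20000000 are hypothetical,
and the model assumes the stray reservation starts at the beginning of
the reserved area.  The 0x400000 default and the 0x1ed000 worst-case
overlap come from the comment itself.

```python
MB = 0x100000
DEFAULT_SLOP = 0x400000      # default slop named in the comment
WORST_OVERLAP = 0x1ED000     # largest overlap reported in the comment


def slop_from_mb(heap_slop_in_mb: int) -> int:
    """Convert a slop value given in megabytes to bytes."""
    return heap_slop_in_mb * MB


def heap_start(reserved_base: int, slop: int = DEFAULT_SLOP) -> int:
    """The usable heap begins 'slop' bytes into the reserved area."""
    return reserved_base + slop


def child_heap_is_clear(reserved_base: int, slop: int, overlap: int) -> bool:
    """True if a stray reservation covering the first 'overlap' bytes of
    the reserved area ends before the usable heap starts."""
    return reserved_base + overlap <= heap_start(reserved_base, slop)


base = 0x20000000  # hypothetical address of the reserved area

with_slop = child_heap_is_clear(base, DEFAULT_SLOP, WORST_OVERLAP)
without_slop = child_heap_is_clear(base, 0, WORST_OVERLAP)
```

With the default slop the worst observed overlap stays inside the
slop area; without it, the overlap lands on the heap itself.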

This problem with dynamically linked DLLs looks quite similar.  Some
space after the heap is reserved in the child which wasn't reserved
in the parent.

If Vista/W7 refrained from picking the lowest available address in
the parent in the first place, the entire problem might go away (aka
"occurs very, very seldom").

> > I think I'm going to ask MSFT if there's any workaround for
> > this problem.
> 
> If my understanding of ASLR is correct, then ASLR *ought* to have solved
> this problem, except for systems with a LOT of dynbase-marked DLLs that
> have been loaded during the same boot session, such that you "run out"
> of ASLR-tracked addresses (The ASLR mappings are shared across all
> processes, are persistent for the entire logon session, I think -- so
> you could eventually run out).

As I mentioned above, I don't think that ASLR can solve this problem
once and for all.  Wherever any DLL is rebased to by the ASLR
mechanism, there's a chance that the address is already taken in the
child when LoadLibrary is called for the dynamically loaded DLL.

> But IMO it is not working, for some reason, with the perl DLLs.  Note
> that it's not always Cwd.dll.  If you reboot, rename Cwd.dll to
> something else, and keep going, a few things will happen:
>  1) perl won't work quite right, because the Cwd.dll really is needed by
> the scripts that 'use Cwd;'
>  2) ignoring that, keep going. Eventually the remap problem will hit
> another perl DLL. In my case, Posix.dll.

Here's another thought:

I examined the address layout of the perl process again, and it struck
me as weird that the base addresses of all the DLLs which get dynamically
loaded by perl are so close together.  It looks like the problem is
actually aggravated by the order in which the DLLs are rebased by
rebaseall, and the order in which the DLLs are loaded into the running
process.
Some perl DLL (Dumper.dll?) allocates additional memory and that's right
after its own image.  That's where Cwd.dll is based to.  Cwd.dll gets
rebased and ... poof.

What I did then was to change the offset to rebaseall:

ash$ rebaseall -o 0x20000   (default is 0x10000)

Then I reinstalled /bin/cyggmp-3.dll and reran cygport.  This time
it ran fine.  This is still w/o ASLR flags.

In this configuration, I can reproduce running cygport successfully
every time.
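Why a larger offset can help is easy to see in a toy layout model.
The sketch below is not rebaseall's actual algorithm: it assumes bases
are handed out downward from a top address, with each image rounded up
to 64K and followed by a gap of "offset" bytes.  The top address, the
image sizes and the 0x18000 of extra memory are all hypothetical; only
the two offsets, 0x10000 and 0x20000, come from the command above.

```python
from typing import Dict, List, Tuple

ALIGN = 0x10000  # 64K allocation granularity


def round_up(value: int, align: int = ALIGN) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def layout(dlls: List[Tuple[str, int]], top: int, offset: int) -> Dict[str, int]:
    """Assign base addresses downward from top, leaving 'offset' bytes
    of free space above each image."""
    bases = {}
    cursor = top
    for name, size in dlls:
        cursor -= round_up(size) + offset
        bases[name] = cursor
    return bases


def spills_into(bases: Dict[str, int], sizes: Dict[str, int],
                lower: str, upper: str, extra: int) -> bool:
    """True if 'extra' bytes allocated right after the image of 'lower'
    reach the preferred base of 'upper'."""
    end_of_lower = bases[lower] + round_up(sizes[lower])
    return end_of_lower + extra > bases[upper]


# Hypothetical image sizes; Cwd.dll is based first, so it sits higher.
dlls = [("Cwd.dll", 0x8000), ("Dumper.dll", 0xC000)]
sizes = dict(dlls)
top = 0x70000000   # hypothetical starting address
extra = 0x18000    # hypothetical memory allocated after Dumper.dll

tight = layout(dlls, top, 0x10000)
roomy = layout(dlls, top, 0x20000)

collides_default = spills_into(tight, sizes, "Dumper.dll", "Cwd.dll", extra)
collides_bigger = spills_into(roomy, sizes, "Dumper.dll", "Cwd.dll", extra)
```

In this model the default offset leaves too little room for the extra
allocation, so the neighbouring DLL's preferred base is already taken,
while the doubled offset keeps it free.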

> Could it be possible that cygwin's dlopen (or fork) implementation is
> doing something that occasionally defeats ASLR, such that eventually a
> perl parent process [**] dlopen's Cwd.dll at the wrong memory location?

Not that I can see.  The data describing the loaded DLLs is copied
from the parent's memory into a stack slot.  There's no other memory
allocation going on.  Well, except when LoadLibrary has already
failed.

> [**] obviously this perl "parent" process was itself invoked as a
> fork/exec from, say, bash, but we've long since gotten past the exec()
> for perl, if we're down to dlopening DLLs needed by virtue of 'use'
> statements in a particular .pl script
> 
> Hmmm...what if it's a race condition in fork/exec during a chain of
> perl's? Let's take a look at what happens in autoreconf...(note that
> this is all supposition. I hope it is accurate, and believe it is
> reasonable so, but I haven't explicitly straced the process)

What I see only affects one single perl parent and the forked child.
There's not a single perl process involved which had the dynamically
loaded DLLs loaded at the correct (aka "desired") spot in memory.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

