[PATCH RFC] fork: reduce chances for "address space is already occupied" errors

Michael Haubenwallner michael.haubenwallner@ssi-schaefer.com
Thu Mar 28 08:34:00 GMT 2019


Hi Corinna,

On 3/27/19 10:16 AM, Corinna Vinschen wrote:
> On Mar 27 09:26, Michael Haubenwallner wrote:
>> On 3/26/19 7:28 PM, Corinna Vinschen wrote:
>>> On Mar 26 19:25, Corinna Vinschen wrote:
>>>> On Mar 26 18:10, Michael Haubenwallner wrote:
>>>>> Hi Corinna,
>>>>>
>>>>> as I do still encounter fork errors (address space needed by <dll> is
>>>>> already occupied) with dynamically loaded dlls (but unrelated to
>>>>> replaced dlls), one of them repeating even upon multiple retries,
>>>>
>>>> Why didn't rebase fix that?
>>
>> As far as I understand, rebasing is about touching already installed
>> dlls as well, which would require restarting all Cygwin processes.
>> As the problem is about some dll built during a larger build job,
>> this is not something that feels useful to me.
> 
> Wait, let me understand what's going on.  IIUC you're building DLLs
> which are then used during the build job itself, right?

Exactly.
FWIW, the CI builds also set up a Cygwin instance from scratch,
as I'm also after testing Cygwin (v3) itself to some degree:
https://dev.azure.com/gentoo-prefix/ci-builds/_build

However, I've not found a commandline option for setup.exe to install
"test" versions...

> 
>>> Btw., is that 32 or 64 bit?  Both?
>>
>> I'm on 64bit only, can't say for 32bit.  And while in theory possible,
>> I'm not after supporting 32bit Cygwin in Gentoo Prefix at all...
> 
> If so, then I'm really curious how many DLLs are affected and why this
> occurs on 64 bit.
> 
> As you know, 64 bit has a defined memory layout.  Binutils ld is
> supposed to base the DLLs to a pseudo-random address in the area between
> 0x4:00000000 and 0x6:00000000.  This area is occupied by un-rebased DLLs
> only.  8 Gigs is a *lot* of space for DLLs.
> 
> That also means that the DLLs should not at all collide with windows
> objects (typically reserved in the lesser 2 Gigs area), unless they
> collide with themselves.  At least that's the idea.
> 
> Can you check what addresses the freshly built DLLs are based on by LD?
> Is there a chance that the algorithm used in LD is too dumb?

I've added a system_printf to dll_list::reserve_space() that fires when a
dynloaded dll was relocated, and each new address was below 0x0:01000000.
The attached output also shows the preferred addresses, each of them above
0x4:00000000.
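
For anyone who wants to check the attached output mechanically, here is a
minimal sketch of a parser. It assumes the line format of my ad-hoc
system_printf exactly as it appears in the attachment; the names
(parse_line, classify, the constants) are made up for this example and are
not part of Cygwin. The range bounds are the ones mentioned in this thread.

```python
import re

# Area where ld is supposed to base un-rebased 64-bit DLLs (per this thread).
PREFERRED_LOW = 0x4_0000_0000
PREFERRED_HIGH = 0x6_0000_0000
# All relocated addresses observed so far were below this limit.
LOW_LIMIT = 0x0100_0000

# Matches the ad-hoc debug line format seen in the attached output.
LINE_RE = re.compile(
    r"dll_list::reserve_space: (?P<dll>\S+)"
    r" preferring 0x(?P<pref>[0-9A-Fa-f]+)"
    r" was loaded to 0x(?P<actual>[0-9A-Fa-f]+)"
)


def parse_line(line):
    """Return (dll, preferred, actual) or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return m.group("dll"), int(m.group("pref"), 16), int(m.group("actual"), 16)


def classify(line):
    """Report whether the preferred base is in range and the load address is low."""
    parsed = parse_line(line)
    if parsed is None:
        return None
    dll, pref, actual = parsed
    return {
        "dll": dll,
        "preferred_in_range": PREFERRED_LOW <= pref < PREFERRED_HIGH,
        "loaded_low": actual < LOW_LIMIT,
    }
```

Feeding each line of the attachment to classify() should show whether a
preferred base is sane while the actual load address is in the low area.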

> 
> Or, hmm.  Is there a chance that newer Windows loads dynamically loaded
> DLLs wherever it likes, ignoring the base address, ASLR-like, even
> if the DLL is marked as non-ASLR-aware?  But then again, we should have
> a lot more complaints on the list...

I've done this test on Windows Server 2012R2, but the problem exists on
2016 and 2019 as well (I'm not testing with other Windows versions).

>>>>>  I'm
>>>>> coming up with attached patch.
>>>>>
>>>>> What do you think about it?
>>>>
>>>> I'm not opposed to this patch but I don't quite follow the description.
>>>> threadinterface->Init only creates three event objects.  From what I can
>>>> tell, Events are stored in Paged and Nonpaged Pools, so they don't
>>>> affect the process's VM.  What am I missing?
>>
>> Honestly, I'm not completely sure whether this patch really does help:
>> Beyond the Events, there also is CreateNamedPipe and CreateFile used
>> in fhandler_pipe::create via sigproc_init, and these causing the address
>> conflicts with some dll actually is nothing more than a wild guess:
>> While their returned handles are below the conflicting dll address,
>> who can tell what these API calls do allocate internally?
> 
> The handles are not addresses.  If the sigproc_init stuff collides,
> I only see two chances for that, the process-local read/write buffers
> of the signal pipe, and the stack of the read_sig thread.
> 
> If this patch helps your situation, we can pull it in and test it,
> but I think your situation asks for more debugging along the lines
> of the DLL rebasing above.

With this patch the collisions seem to be gone, yet the relocations still
happen.

Thanks!
/haubi/
-------------- next part --------------
     29 [main] python2.7 51113 dll_list::reserve_space: libpython2.7.dll preferring 0x53BB50000 was loaded to 0xA80000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\var\tmp\portage\dev-lang\python-2.7.16\work\x86_64-pc-cygwin\libpython2.7.dll)
      2 [main] python2.7 52526 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x650000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 54352 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x6E0000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 55760 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x7C0000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      3 [main] python2.7 55763 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x7C0000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 55766 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x7C0000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 55769 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x7C0000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 56079 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x900000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 56082 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x900000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      3 [main] python2.7 56085 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x900000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      3 [main] python2.7 56088 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x900000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 56091 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x900000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 59201 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0xC10000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      3 [main] python3.6m 58103 dll_list::reserve_space: libpython3.6m.dll preferring 0x4288D0000 was loaded to 0xB00000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\var\tmp\portage\dev-lang\python-3.6.8\work\Python-3.6.8\libpython3.6m.dll)
      4 [main] python2.7 8889 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x730000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 10208 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x730000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 10211 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x730000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 10214 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x730000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 10217 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x730000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 10467 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x750000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 10470 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x750000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 10473 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x750000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 10476 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x750000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      4 [main] python2.7 10479 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x750000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 11941 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0x8A0000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
      2 [main] python2.7 25387 dll_list::reserve_space: cygcrypto-1.1.dll preferring 0x41C650000 was loaded to 0xB20000 (\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)

