This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: Extend faq.using to discuss fork failures


On Fri, Aug 19, 2011 at 09:43:10AM -0400, Ryan Johnson wrote:
>Hi all,
>
>I propose to add an entry to cygwin's faq.using which covers fork 
>failures. Frankly, I'm surprised it wasn't there years ago... it's 
>certainly frequently-asked, and the answer is always the same. Right now 
>users have to trawl the archives to figure out what to do (or more 
>likely, just blindly spam the list and get told to rebase and/or trawl 
>the list archives).
>
>Also, what is the status of "the spawn family of calls provided by 
>Cygwin" [1]? There's nothing about it at the API page [2], and a search 
>through the user guide [3] comes up empty as well. Searching /usr/include 
>turns up only /usr/include/process.h, which contains only the function 
>declarations and a single comment -- "This file comes with MSDOS and 
>WIN32 systems" -- indicating that Windows, not cygwin, provides the 
>functions (which, incidentally, are deprecated in favor of the 
>posix-compliant _spawn* instead [4]). Would it make sense to update the 
>docs to mention these are native Windows functions, and update the 
>headers to include the non-deprecated function signatures?

I appreciate that you're trying to do this.  I was actually going to
ask someone if they wanted to write a section like this but assumed
I wouldn't get any takers.

Wrt the spawn functions, they hark back to a time when Cygwin was
confused about what API it was exporting.  They *are* deprecated.  I
don't see any pressing need to document them.

(And, yes, we know about the posix functions with _spawn in their names)

>[1] http://www.cygwin.com/cygwin-ug-net/highlights.html#ov-hi-process
>[2] http://cygwin.com/cygwin-api/
>[3] http://cygwin.com/cygwin-ug-net/cygwin-ug-net-nochunks.html.gz
>[4] http://msdn.microsoft.com/en-us/library/ms235383%28v=vs.80%29.aspx
>
>Seed text below...
>
>Thoughts?
>Ryan
>
>Why does fork fail so often on my system?

I'd prefer something like "Is there a way to fix fork failures?"

>..., and reports of fork failures are 
>probably the single most common thread topic in the cygwin mailing list.

I don't think comments like this are appropriate.  If we ever reach a
magical time when fork failures are fixed (and Corinna's proposed
changes to run rebase during setup.exe should at least cut back on them),
then this would be out of date.  It doesn't provide any useful information
to the user anyway.

>Common error messages include:
>- unable to remap $dll to same address as parent
>- couldn't allocate heap
>- died waiting for dll loading
>- child -1 - died waiting for longjmp before initialization
>- STATUS_ACCESS_VIOLATION
>- resource temporarily unavailable
>
>The problem often (re)appears or worsens after installing or updating 
>cygwin packages (which can undo the effects of rebaseall and peflagsall, 
>see below). Applications which dynamically compile and load dlls (e.g. 
>perl, ruby, some lisps, building gcc from sources) are also especially 
>prone to fork failures for the same reason. Fork failures in general 
>also became significantly more common with the introduction of Vista and 
>Win7, whose address space layout randomization (ASLR) often causes child 
>processes to spawn with dlls, thread stacks, heaps, and other memory 
>objects allocated in different locations than the parent. While cygwin 
>compensates for as many of these relocations as possible, there always 
>remains a possibility of fork failures.
>
>If you find that frequent fork failures interfere with normal use of 
>cygwin, please try the following steps:
>
>1. Disable or uninstall applications known to interfere with cygwin (see 
>http://cygwin.com/faq/faq.using.html#faq.using.bloda). Many of them 
>inject dlls into processes at inconsistent locations, which breaks 
>fork() semantics.
>
>2. Rebase your system (see /usr/share/doc/Cygwin/rebase-3.0.1.README). 
>Every dll in the system specifies a base address -- the preferred memory 
>location it should load at -- and the Windows loader does not break ties 
>consistently when it encounters base address conflicts.
>
>3. With Vista and later, use peflagsall to set the TS-aware bit on all 
>cygwin dlls (see /usr/share/doc/Cygwin/rebase-3.0.1.README, reboot 
>needed for changes to take effect). This exploits a side effect of 
>address space layout randomization which (ironically) causes dlls to 
>nearly always load at the same address.
>
>4. If you have access to the source code of the offending application 
>(this applies to all cygwin packages), consider replacing calls to 
>fork() with calls to the spawn family of functions. These are a native 
>(= reliable and highly efficient) replacement for fork+exec, which is by 
>far the most common usage of fork(), and are documented at 
>http://msdn.microsoft.com/en-us/library/20y988d2%28v=VS.100%29.aspx.
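
For readers who want to see what replacing a fork+exec pair looks like in
source, here is a minimal sketch using the POSIX posix_spawnp() interface
rather than the process.h spawn* calls discussed above.  The helper name
run_child is hypothetical, and the sketch assumes the platform provides
<spawn.h>; whether this avoids fork failures on a given Cygwin release is
not established in this thread.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: start argv[0] (looked up on PATH) as a child
 * process, wait for it, and return its exit status.
 * Returns -1 if the child could not be started or did not exit normally. */
int run_child(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp() reports failure through its return value (an error
     * number), not through errno.  NULL file actions and attributes mean
     * the child inherits the parent's descriptors and signal settings. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    /* Reap the child so it does not linger as a zombie. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would build a NULL-terminated argument vector, for example
{ "ls", "-l", NULL }, and pass it to run_child().  This only covers the
fork-then-exec pattern; code that relies on the child continuing to run
the parent's own image after fork() cannot be converted this way.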

I appreciate your thoroughness but I think there are way too many words
above.  The FAQ should be solution-oriented.  If it is important to
discuss the details behind why fork() fails then maybe another section
could be added.  Otherwise, I'd prefer to see something which shows the
error messages and then, as briefly as possible, shows solutions.

While people do ask "Why does fork fail?", the majority of the askers
don't really care.  They are really asking "How do I make Cygwin fork
work?"  So, I don't think that it is really FAQ-appropriate to dive
too deep here.

And, again, we don't want to tell people to use non-POSIX solutions
except as a last resort.  Telling people to rewrite their source code
flies in the face of what Cygwin is trying to do.

(And, yes, I presciently can hear the argument to the above paragraph
coming)

cgf

