This is the mail archive of the cygwin-patches mailing list for the Cygwin project.



Re: Improvements to fork handling (2/5)


On 24/05/2011 12:14 PM, Corinna Vinschen wrote:
On May 22 14:42, Ryan Johnson wrote:
On 21/05/2011 9:44 PM, Christopher Faylor wrote:
On Wed, May 11, 2011 at 02:31:37PM -0400, Ryan Johnson wrote:
Hi all,

This patch has the parent sort its dll list topologically by
dependencies. Previously, attempts to load a DLL_LOAD dll risked pulling
in dependencies automatically, and the latter would then not benefit
from the code which "encourages" them to land in the right places. The
dependency tracking is achieved using a simple class which allows us to
introspect a mapped dll image and pull out the dependencies it lists.
The code currently rebuilds the dependency list at every fork rather
than attempt to update it properly as modules are loaded and unloaded.
Note that the topsort optimization affects only cygwin dlls, so any
windows dlls which are pulled in dynamically (directly or indirectly)
will still impose the usual risk of address space clobbers.
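
A minimal sketch of the dependency-first ordering described above, not the patch's actual code. The `DllNode` struct and `topsort_dlls` function are hypothetical names invented for illustration, and the dependency names are assumed to have already been read out of each mapped image. Dependencies that are not in the list (for example Windows system dlls) are simply skipped, which mirrors the note that only cygwin dlls are affected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one entry of the parent's dll list.
struct DllNode {
    std::string name;
    std::vector<std::string> deps;  // names of the dlls this one imports from
};

// Returns indices into `dlls`, ordered so that every dll appears after the
// dlls it depends on. Dependencies missing from the list are ignored.
std::vector<std::size_t> topsort_dlls(const std::vector<DllNode>& dlls) {
    std::unordered_map<std::string, std::size_t> index;
    for (std::size_t i = 0; i < dlls.size(); ++i)
        index.emplace(dlls[i].name, i);

    enum Mark { Unseen, InProgress, Done };
    struct Frame { std::size_t node; std::size_t next_dep; };

    std::vector<Mark> mark(dlls.size(), Unseen);
    std::vector<std::size_t> order;
    std::vector<Frame> stack;
    order.reserve(dlls.size());

    // Iterative post-order depth-first search: a dll is emitted only after
    // all of its in-list dependencies have been emitted.
    for (std::size_t root = 0; root < dlls.size(); ++root) {
        if (mark[root] != Unseen) continue;
        mark[root] = InProgress;
        stack.push_back({root, 0});
        while (!stack.empty()) {
            Frame& f = stack.back();
            const DllNode& d = dlls[f.node];
            if (f.next_dep < d.deps.size()) {
                auto it = index.find(d.deps[f.next_dep++]);
                if (it != index.end() && mark[it->second] == Unseen) {
                    mark[it->second] = InProgress;
                    stack.push_back({it->second, 0});
                }
                // An InProgress target would mean a cycle; that edge is skipped.
            } else {
                mark[f.node] = Done;
                order.push_back(f.node);
                stack.pop_back();
            }
        }
    }
    assert(order.size() == dlls.size());
    return order;
}
```

Each dll and each dependency edge is visited once, so the work is linear in the size of the list plus the number of listed dependencies.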
This seems CPU and memory intensive during a time which we already
know is very slow.  Is the benefit really worth it?  How much more robust
does it make forking?
Topological sorting is O(n), so there's no asymptotic change in
performance. Looking up dependencies inside a dll is *very* cheap
Btw., isn't the resulting dependency list identical to the list
maintained in the PEB_LDR_DATA member InInitializationOrderModuleList?
Or, in other words, can't we just use the data which is already
available?
I read somewhere that dll initialization is not guaranteed to happen in any particular order, and from what I've seen so far I believe it.

I think that's one reason (among many) why cygwin has to factor the user's initialization routines out of the normal dll init function: they might call functions in other dlls which might not have been initialized yet. From what I can tell, though, mapping of all dlls in a batch completes before any initialization routines run.

Even assuming I'm wrong and dependency order === initialization order, we'd still have to find a way to isolate those dlls which are both cygwin-aware and dynamically loaded, because those are the only ones we care about. Doing that would also be expensive because we'd be searching the cygwin dll list for each dll in the PEB's list.
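
A rough sketch of the filtering step that paragraph describes, under stated assumptions: `CygDll` and `filter_loader_order` are hypothetical names, the loader-order names are assumed to have been collected from the PEB's list already, and names are compared case-sensitively for simplicity (Windows module names are not case-sensitive).

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an entry in cygwin's own dll list.
struct CygDll {
    std::string name;
    bool dynamically_loaded;  // true for dlls loaded at run time (DLL_LOAD)
};

// Keeps, in loader order, only the names that are both cygwin-aware and
// dynamically loaded. Every loader entry triggers a linear search of the
// cygwin list, which is the per-dll cost mentioned above.
std::vector<std::string> filter_loader_order(
        const std::vector<std::string>& loader_order,
        const std::vector<CygDll>& cygwin_dlls) {
    std::vector<std::string> kept;
    for (const std::string& name : loader_order) {
        auto it = std::find_if(
            cygwin_dlls.begin(), cygwin_dlls.end(),
            [&](const CygDll& d) { return d.name == name; });
        if (it != cygwin_dlls.end() && it->dynamically_loaded)
            kept.push_back(name);
    }
    return kept;
}
```

With n entries in the loader's list and m in the cygwin list, this does on the order of n*m string comparisons per fork.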

The best way to improve performance of this part of fork() would be to figure out how to force a dll to load in the right place on the first try. Achieving this admittedly "difficult" task would eliminate multiple syscalls per dll, the aggregate cost of which dwarfs that of the topsort unless I'm very mistaken.

Ryan

