This is the mail archive of the cygwin-developers mailing list for the Cygwin project.
Re: About the dll search algorithm of dlopen
On 2016-08-22 14:15, Michael Haubenwallner wrote:
> Hi Corinna,
>
> On 08/20/2016 09:32 PM, Corinna Vinschen wrote:
>>>>>
>>>>> One way around YA code duplication could be some kind of path iterator
>>>>> class which could be used from find_exec as well as from
>>>>> get_full_path_of_dll.
>
>>>> 0001.patch is a draft for some new cygwin::pathfinder class, with
>>>> 0002.patch adding the executable's directory as searchpath, and
>>>> 0003.patch to search the PATH environment as well.
>>>>
>>>> Thoughts?
>>
>> Ok, that might be disappointing now because you already put so much work
>> into it, but I actually expected some more discussion first. I have two
>> problems with this.
>>
>> I'm not a big fan of templates.
>
> Never mind, it's been some template exercise to me anyway.
>
>> What I had in mind was a *simple* class which gets told if it searches
>> for libs or executables and then checks the different paths accordingly,
>> kind of a copy of find_exec as a class, just additionally handling the
>> prefix issue for DLLs.
>
> What I'm more interested in for such a class is the actual API for use
> by dlopen() and exec(), and the final list of files searched for - with
> these use cases coming to my mind:
>
> Libraries/dlls with final search path "/lib:/morelibs":
> L1) dlopen("libN.so")
> L2) dlopen("libN.dll")
> L3) dlopen("cygN.dll")
> L4) dlopen("N.so")
> L5) dlopen("N.dll")
> Executables with final search path "/bin:/moreexes"
> X1) exec("X")
> X2) exec("X.exe")
> X3) exec("X.com")
>
> Instead of API calls similar to:
> L1) find(dll, "N", ["/lib", "/morelibs"])
> L2) find(dll, "N", ["/lib", "/morelibs"])
> L3) find(dll, "N", ["/lib", "/morelibs"])
> L4) find(dll, "N", ["/lib", "/morelibs"])
> L5) find(dll, "N", ["/lib", "/morelibs"])
> X1) find(exe, "X", ["/bin", "/moreexes"])
> X2) find(exe, "X", ["/bin", "/moreexes"])
> X3) find(exe, "X", ["/bin", "/moreexes"])
> it feels necessary to support more explicit naming, as in:
> L1) find(["libN.so", "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
> L2) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
> L3) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
> L4) find(["N.so",                "N.dll"   ], ["/lib/../bin","/lib","/morelibs"])
> L5) find([                       "N.dll"   ], ["/lib/../bin","/lib","/morelibs"])
> X1) find(["X", "X.exe", "X.com"], ["/bin","/moreexes"])
> X2) find(["X", "X.exe"         ], ["/bin","/moreexes"])
> X3) find(["X",          "X.com"], ["/bin","/moreexes"])
>
> Where the find method does not need to actually know whether it searches
> for a dll or an exe, but dlopen() and exec() instead define the file
> names to search for. This is what the patch draft does in dlopen.
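For illustration only, the explicit-naming idea in rows L1-L5 and X1-X3 above could be sketched as follows. This is not the pathfinder patch. The helper names `dll_candidates`, `exe_candidates` and `find` are hypothetical, the expansion rules are inferred from those eight rows alone, and the loop order (directories outer, names inner) is an assumption the mail does not settle.

```python
import os
import posixpath


def dll_candidates(name):
    """File names to try for dlopen(name), following rows L1-L5 above."""
    stem, ext = posixpath.splitext(name)
    if ext not in (".so", ".dll"):
        return [name]  # outside the table: assumed to be tried verbatim
    if stem.startswith(("lib", "cyg")):
        base = stem[3:]
        names = ["cyg" + base + ".dll", "lib" + base + ".dll"]
    else:
        names = [stem + ".dll"]
    if ext == ".so":
        names.insert(0, name)  # the literal .so name is tried first
    return names


def exe_candidates(name):
    """File names to try for exec(name), following rows X1-X3 above."""
    stem, ext = posixpath.splitext(name)
    if ext in (".exe", ".com"):
        return [stem, name]
    return [name, name + ".exe", name + ".com"]


def find(names, dirs):
    """Return the first existing dir/name, or None.

    find() knows nothing about dlls or exes; the caller supplies the names.
    Assumption: every name is tried in one directory before moving on.
    """
    for d in dirs:
        for n in names:
            path = os.path.join(d, n)
            if os.path.isfile(path):
                return path
    return None
```

With this split, dlopen() would call something like `find(dll_candidates("libN.so"), dirs)` and exec() something like `find(exe_candidates("X"), dirs)`, so the search loop itself stays free of dll or exe special cases.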
>
>>>>>>>> *) The directory of the current main executable should be searched
>>>>>>>> after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>>>>>>> And PATH should be searched before /usr/bin:/usr/lib as well.
>>>>>>>
>>>>>>> Checking the executable path and $PATH are Windows concepts. dlopen
>>>>>>> doesn't do that on POSIX systems and we're not doing that either.
>>>>>>
>>>>>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
>>>>>> which is completely missing in Cygwin as far as I can see.
>>>>>
>>>>> RPATH and RUNPATH are ELF dynamic loader features, not supported by
>>>>> PE/COFF.
>>>>
>>>> In any case, to me it does feel quite important to have the (almost) same
>>>> dll search algorithm with dlopen() as with CreateProcess().
>>
>> Last but not least I'm not yet convinced if it's *really* a good idea to
>> prepend the executable path to the DLL search path unconditionally. Be
>> it as it is in terms of DT_RUNPATH, why is the application dir a good
>> choice at all, unless we're talking Windows semantics? Which we don't.
>> Also, if loading from the application's dir from dlopen is important for
>> you, you can emulate it by adding the application dir to LD_LIBRARY_PATH.
>
> As long as there is lack of a Cygwin specific dll loader to find the
> dlls to load during process startup, we're bound to Windows semantics.
>
> For dlopen, it is more important to find the same dll file as would be
> found when the exe was linked against that dll file, rather than using
> the Linux-known algorithm and environment variables - and differ from
> process startup: Both really should result in the same algorithm here,
> even if that means some difference compared to Linux.
>
> As far as I understand, lack of DT_RUNPATH (besides /etc/ld.so.conf)
> support during process start was the main reason for the dlls to install
> into /lib/../bin instead of /lib at all, to be found at process start
> because of residing in the application's bin dir:
> Why should that be different for dlopen?
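As a rough sketch of the /lib/../bin convention mentioned above: the example search paths in this mail turn "/lib:/morelibs" into "/lib/../bin", "/lib", "/morelibs". The helper name `with_bin_siblings` is hypothetical, and the rule that only directories literally named "lib" get a "../bin" sibling is an assumption read off that single example, not a statement about Cygwin's code.

```python
import posixpath


def with_bin_siblings(lib_dirs):
    """Put DIR/../bin in front of every DIR whose last component is 'lib'.

    Assumption: other directories (such as /morelibs) are left untouched,
    as in the example search path shown in the mail.
    """
    out = []
    for d in lib_dirs:
        trimmed = d.rstrip("/")
        if posixpath.basename(trimmed) == "lib":
            out.append(trimmed + "/../bin")
        out.append(d)
    return out
```

For example, `with_bin_siblings("/lib:/morelibs".split(":"))` yields the three-entry list used in the find() rows of this mail.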
>
>> I checked for the usage of DT_RUNPATH/DT_RPATH on Fedora 23 and only a
>> limited number of packages use it (texlive, samba, python, man-db,
>> swipl, and a few more). Some of them, like texlive, even use it wrongly,
>> with RPATH pointing to a non-existing build dir. There are also a few
>> stray "/usr/lib64" settings, but all in all it's not used to point to
>> the dir the application is installed to, but rather to some package specific
>> subdir, e.g. /usr/lib64/samba, /usr/lib64/swipl-7.2.3/lib/x86_64-linux,
>> etc.
>
> On Linux, the binaries installed in /usr usually rely on the Linux
> loader to be configured via /etc/ld.so.conf to find their runtime
> libs in /usr/lib.
>
> Please remember: This whole thing is not a problem with packages
> installed to /usr, but with packages installed to /somewhere/else
> that provide runtime libraries that are also available in /usr.
>
> Using LD_LIBRARY_PATH pointing to /somewhere/else/lib may break the
> binaries found in /usr/bin - and agreed, searching PATH doesn't make
> it better, as PATH is the "LD_LIBRARY_PATH" for Windows.
>
>> IMHO this means just adding the application's bin dir is most of the time
>> an unused or even wrong workaround.
>
> Although GetModuleHandle may reduce that pressure for dlopen - as long as
> the application's bin dir is searched at process start, it really should
> be searched by dlopen too, even if for /usr/bin/* this might indeed become
> redundant, as we always add /usr/bin in dlopen - which really mimics
> the /etc/ld.so.conf content actually, although that one is unavailable
> to process startup.
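To make the order under discussion concrete, here is a minimal sketch of the search list as proposed in the quoted thread: LD_LIBRARY_PATH first, then the main executable's directory, then PATH, then /usr/bin:/usr/lib. It illustrates the proposal only, which Corinna objects to above, and does not describe current dlopen behavior. The function name `dlopen_search_dirs` and its parameters are hypothetical, and dropping later duplicates is an added assumption.

```python
import posixpath


def dlopen_search_dirs(ld_library_path, exe_path, path_env,
                       defaults=("/usr/bin", "/usr/lib")):
    """Build the directory list in the order proposed in the thread."""
    dirs = [d for d in ld_library_path.split(":") if d]
    dirs.append(posixpath.dirname(exe_path))  # the main executable's directory
    dirs += [d for d in path_env.split(":") if d]
    dirs += list(defaults)

    # Assumption: a directory listed twice is searched only at its first position.
    seen = set()
    ordered = []
    for d in dirs:
        if d not in seen:
            seen.add(d)
            ordered.append(d)
    return ordered
```

For an executable in /somewhere/else/bin, that directory ends up ahead of /usr/bin, which is the case the mail is concerned with. For an executable in /usr/bin, the extra entry collapses into the default and changes nothing.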
Just mentioning that changed dlopen semantics will in all likelihood affect
libtool. Do we have any libtool hackers around (besides me, I do not have
time) since Chuck disappeared?
Cheers,
Peter