[lldb-dev] dlsym() and RTLD_FIRST

Todd Fiala tfiala at google.com
Wed Aug 27 12:35:22 PDT 2014


I'm personally okay with this on the Linux side, with the caveat as always
that if we find an issue with this, we back out the change.

-Todd


On Wed, Aug 27, 2014 at 11:51 AM, Zachary Turner <zturner at google.com> wrote:

> Combined with the fact that plugins are essentially a dead codepath (going
> by Enrico's earlier comment), I think it's probably ok to just remove this
> and use LLVM's default DynamicLibrary loader.  Any objections?
>
>
> On Wed, Aug 27, 2014 at 11:49 AM, Zachary Turner <zturner at google.com>
> wrote:
>
>> The documentation of dlsym() says this:
>>
>> ------
>> The *dlsym*() function shall search for the named symbol in all objects
>> loaded automatically *as a result of loading the object referenced by *
>> *handle* (see *dlopen*()
>> <http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html>).
>> Load ordering is used in *dlsym*() operations upon the global symbol
>> object. The symbol resolution algorithm used shall be dependency order as
>> described in *dlopen*()
>> <http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html>.
>> ------
>>
>> So, it appears that this is perhaps a non-issue.  In other words, if you
>> dlopen plugin A and get back handle A, and then you dlopen plugin B and get
>> back handle B, there is no chance that dlsym against B would find a symbol
>> in plugin A.  It will search only B and B's direct and indirect
>> dependencies.
>>
>> There is a remote possibility that a plugin A could link against *another
>> plugin B.  *However, as long as A provides a LLDBPluginInitialize
>> method, it will always be found first as described by the section on
>> "dependency ordering" in the documentation of dlopen().
>>
>> -----
>> Dependency ordering uses a breadth-first order *starting with a given
>> object*, then all of its dependencies, then any dependents of those,
>> iterating until all dependencies are satisfied.
>>
>>
>> On Wed, Aug 27, 2014 at 11:32 AM, Greg Clayton <gclayton at apple.com>
>> wrote:
>>
>>>
>>> > On Aug 26, 2014, at 5:09 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>> >
>>> > I think my understanding of what it does is correct, but maybe Greg or
>>> someone can confirm.  Basically, it tries to dlopen() the module at the
>>> path specified, and then search for the symbols LLDBPluginInitialize and
>>> LLDBPluginTerminate.  If it finds them, it calls them.  If it doesn't, the
>>> plugin load fails.  According to the documentation of dladdr(), it appears
>>> that the process for locating this symbol involves first searching the
>>> module specified in the argument to dlopen(), and then searching any
>>> dependent modules.  If it is found in any of these, it succeeds.  This
>>> optimization (using RTLD_FIRST and the filename comparison), causes this
>>> search to fail if the symbol is found in a dependent module, but not the
>>> original module.
>>>
>>> Bingo. We don't want to call any other version of LLDBPluginInitialize
>>> or LLDBPluginTerminate from any other plug-in. It isn't clear that the
>>> dynamic linker sticks to dependent modules, I would need to check on that.
>>> It might search all loaded shared libraries in the current process. Not
>>> sure if that differs between Mac and Linux. Probably not.
>>>
>>> >
>>> > I will try to verify that this is correct with someone who knows more
>>> than me about Linux, Mac, and dynamic linking on these platforms, but if
>>> correct then it doesn't seem like there is any risk to removing this.  That
>>> said, I'm interested in who is actually use these plugins.  The best way to
>>> find out if it's going to break something is to talk to the people who
>>> depend on this code.
>>>
>>> Again, we _only_ want to call LLDBPluginInitialize or
>>> LLDBPluginTerminate from the exact shared library we are trying to ask for
>>> the symbols from, not from anywhere else.
>>>
>>> Greg
>>> >
>>> >
>>> > On Tue, Aug 26, 2014 at 5:01 PM, Todd Fiala <tfiala at google.com> wrote:
>>> > Ah ok.
>>> >
>>> > It's worth figuring out what it does (really) before we consider
>>> removing it.
>>> >
>>> >
>>> >
>>> > On Tue, Aug 26, 2014 at 4:53 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>> > That part is a good question, which I don't totally understand.
>>> There's a function Debugger::LoadPlugin() though, which accepts a path to a
>>> plugin to load.  It's called there.  This also appears to be exposed
>>> through the "plugin load" command.
>>> >
>>> >
>>> > On Tue, Aug 26, 2014 at 4:32 PM, Todd Fiala <tfiala at google.com> wrote:
>>> > I guess the thing to do is make sure we're certain we understand the
>>> behavior, which is perhaps best captured in a test.  (i.e. test it with the
>>> RTLD_FIRST behavior where it does something, then verify it does something
>>> different without the flag.  Then, once we agree it is not useful behavior
>>> for us, look at removing it).
>>> >
>>> > By valid plugin, you're referring to shared libraries, right?  (What
>>> plugins are we referring to here, at what load point?)
>>> >
>>> >
>>> > On Tue, Aug 26, 2014 at 4:29 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>> > Just as a counterpoint, unless I'm misunderstanding this code, I don't
>>> see it actually having a noticeable impact on stability.  The search
>>> limiting will only be a factor in a case where you attempt to load
>>> something that *isn't a valid plugin*.  It's already an error path.  In
>>> fact, this code worked fine before the change was made, and was only made
>>> to imitate what appears to have been an optimization that was
>>> Mac-specific.  The change for Mac doesn't seem to have been strictly
>>> necessary either, but just an optimization.  It's actually not an
>>> optimization for Linux, because the dynamic loader will still search every
>>> module on linux, it will just fail anyway.
>>> >
>>> >
>>> > On Tue, Aug 26, 2014 at 4:21 PM, Todd Fiala <tfiala at google.com> wrote:
>>> > Probably the way I'd look at this right now is that support in Linux
>>> is a bit dicey and we're doing our best to stabilize (starting with single
>>> path for remote/local debugging, and making that stable and fast).  In an
>>> effort to stabilize, I'd prefer to limit how much code change we do on the
>>> Linux end until we have a more stable product.
>>> >
>>> > So while we could potentially take that out, I'd rather avoid making
>>> changes just because it might be simpler, as it might also add yet another
>>> error scenario on the Linux side.  Right now I value similarity to MacOSX
>>> execution over code reduction.  Once we're a lot more stable on the Linux
>>> side, I'd be much more interested in revisiting with some actual use cases
>>> to see diffs in performance and scope of usage.
>>> >
>>> > Just my 2 cents...
>>> >
>>> >
>>> > On Tue, Aug 26, 2014 at 3:47 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>> > The review is up on the LLVM side.  One point which was raised, and
>>> which I agree with, is that the presence of the string makes the class much
>>> heavier.  This string is only needed to mimic MacOSX's RTLD_FIRST behavior
>>> on other posix platforms.  However, going back through the history of when
>>> this was added, I never actually saw a use case from anyone saying "we
>>> *need* this on Linux".  See the full original thread at [1].  But the TL;DR
>>> is that the flag is nice to have on MacOSX, and the filename comparison was
>>> added to Linux to maintain parity.
>>> >
>>> > If nobody actually knows of a specific example of why this is
>>> necessary on Linux, can we just remove this behavior on Linux?  My
>>> understanding is that the only thing which will change by removing this for
>>> Linux is the following: Imagine a plugin X is loaded, and X has a library
>>> dependency on Y and Z.  X doesn't contain the plugin Initialize or
>>> Terminate symbol, but Y or Z does.  With the filename comparison code,
>>> LoadPlugin would fail, and without it, it would succeed and use the symbol
>>> found in Y or Z.  I can understand that with the comparison the algorithm
>>> is a bit better, but it seems such an extremely unusual edge case that I
>>> don't think it's a big deal to remove it from the Linux side.
>>> >
>>> > Thoughts?
>>> >
>>> > [1] -
>>> http://thread.gmane.org/gmane.comp.debugging.lldb.devel/300/focus=302
>>> >
>>> >
>>> > On Thu, Aug 21, 2014 at 5:27 PM, Greg Clayton <gclayton at apple.com>
>>> wrote:
>>> > Sounds good to me. Hopefully if they don't want that they might accept
>>> an extra boolean argument that can specify to only look in the current
>>> shared library and then we can switch over to using LLVM's DynamicLibrary.
>>> >
>>> > > On Aug 21, 2014, at 4:22 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>> > >
>>> > > This seems like the only case we ever want, so I'm going to post a
>>> patch to LLVM's DynamicLibrary class to use RTLD_FIRST on Apple, and a
>>> similar method of checking the module filespec on other platforms, and see
>>> if they accept it.  If so, I will convert our Plugin code to use LLVM's
>>> DynamicLibrary and then delete our DynamicLibrary
>>> > >
>>> > >
>>> > > On Thu, Aug 21, 2014 at 4:08 PM, Greg Clayton <gclayton at apple.com>
>>> wrote:
>>> > >
>>> > > > On Aug 21, 2014, at 3:31 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>> > > >
>>> > > > Can someone explain this flag to me?
>>> > >
>>> > > It says "only look in this binary, don't look in any others. We are
>>> looking for a plug-in initialization function and we don't want to get one
>>> back from another dylib.
>>> > >
>>> > > As Enrico said, the email from a while back details this:
>>> > >
>>> > > http://comments.gmane.org/gmane.comp.debugging.lldb.devel/305
>>> > >
>>> > > >  I've read the documentation, but it's still not clear to me.  If
>>> you ask dlsym() to search some module X, why would it ever search modules
>>> other than X?
>>> > >
>>> > > I don't know but it does.
>>> > >
>>> > > >
>>> > > > The reason I ask about this is that llvm support library already
>>> has a DynamicLibrary class whose purpose almost exactly matches what we're
>>> using the Host::DynamicLibrary related functions for.  However, it doesn't
>>> use the RTLD_FIRST flag, and so I'm not sure what the implications are of
>>> us using it and deleting our own DynamicLibrary code.
>>> > >
>>> > > It would be nice if we could specify this flag so we either find the
>>> symbol from libx.dylib or we don't. We don't want to find the symbol in
>>> liby.dylib and call it in our case.
>>> > >
>>> >
>>> >
>>> >
>>> > _______________________________________________
>>> > lldb-dev mailing list
>>> > lldb-dev at cs.uiuc.edu
>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Todd Fiala |   Software Engineer |     tfiala at google.com |
>>> 650-943-3180
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Todd Fiala |   Software Engineer |     tfiala at google.com |
>>> 650-943-3180
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Todd Fiala |   Software Engineer |     tfiala at google.com |
>>> 650-943-3180
>>> >
>>> >
>>>
>>>
>>
>


-- 
Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20140827/4751f438/attachment.html>


More information about the lldb-dev mailing list