[lldb-dev] dlsym() and RTLD_FIRST

Zachary Turner zturner at google.com
Wed Aug 27 12:40:34 PDT 2014


I'm fine with that, although I would prefer to address it by fixing
DynamicLibrary in LLVM to support this behavior in LLVM if we find that it
breaks something.  Sharing as much code as possible should be an explicit
goal IMO, and especially duplicating entire classes should be shunned quite
strongly.  Nevertheless, if any fallout should arise, I will make sure to
deal with it until it's not broken.


On Wed, Aug 27, 2014 at 12:35 PM, Todd Fiala <tfiala at google.com> wrote:

> I'm personally okay with this on the Linux side, with the caveat as always
> that if we find an issue with this, we back out the change.
>
> -Todd
>
>
> On Wed, Aug 27, 2014 at 11:51 AM, Zachary Turner <zturner at google.com>
> wrote:
>
>> Combined with the fact that plugins are essentially a dead codepath
>> (going by Enrico's earlier comment), I think it's probably ok to just
>> remove this and use LLVM's default DynamicLibrary loader.  Any objections?
>>
>>
>> On Wed, Aug 27, 2014 at 11:49 AM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> The documentation of dlsym() says this:
>>>
>>> ------
>>> The *dlsym*() function shall search for the named symbol in all objects
>>> loaded automatically *as a result of loading the object referenced by *
>>> *handle* (see *dlopen*()
>>> <http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html>).
>>> Load ordering is used in *dlsym*() operations upon the global symbol
>>> object. The symbol resolution algorithm used shall be dependency order as
>>> described in *dlopen*()
>>> <http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html>.
>>> ------
>>>
>>> So, it appears that this is perhaps a non-issue.  In other words, if you
>>> dlopen plugin A and get back handle A, and then you dlopen plugin B and get
>>> back handle B, there is no chance that dlsym against B would find a symbol
>>> in plugin A.  It will search only B and B's direct and indirect
>>> dependencies.
>>>
>>> There is a remote possibility that a plugin A could link against *another
>>> plugin B.  *However, as long as A provides a LLDBPluginInitialize
>>> method, it will always be found first as described by the section on
>>> "dependency ordering" in the documentation of dlopen().
>>>
>>> -----
>>> Dependency ordering uses a breadth-first order *starting with a given
>>> object*, then all of its dependencies, then any dependents of those,
>>> iterating until all dependencies are satisfied.
>>>
>>>
>>> On Wed, Aug 27, 2014 at 11:32 AM, Greg Clayton <gclayton at apple.com>
>>> wrote:
>>>
>>>>
>>>> > On Aug 26, 2014, at 5:09 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>> >
>>>> > I think my understanding of what it does is correct, but maybe Greg
>>>> or someone can confirm.  Basically, it tries to dlopen() the module at the
>>>> path specified, and then search for the symbols LLDBPluginInitialize and
>>>> LLDBPluginTerminate.  If it finds them, it calls them.  If it doesn't, the
>>>> plugin load fails.  According to the documentation of dladdr(), it appears
>>>> that the process for locating this symbol involves first searching the
>>>> module specified in the argument to dlopen(), and then searching any
>>>> dependent modules.  If it is found in any of these, it succeeds.  This
>>>> optimization (using RTLD_FIRST and the filename comparison), causes this
>>>> search to fail if the symbol is found in a dependent module, but not the
>>>> original module.
>>>>
>>>> Bingo. We don't want to call any other version of LLDBPluginInitialize
>>>> or LLDBPluginTerminate from any other plug-in. It isn't clear that the
>>>> dynamic linker sticks to dependent modules, I would need to check on that.
>>>> It might search all loaded shared libraries in the current process. Not
>>>> sure if that differs between Mac and Linux. Probably not.
>>>>
>>>> >
>>>> > I will try to verify that this is correct with someone who knows more
>>>> than me about Linux, Mac, and dynamic linking on these platforms, but if
>>>> correct then it doesn't seem like there is any risk to removing this.  That
>>>> said, I'm interested in who is actually use these plugins.  The best way to
>>>> find out if it's going to break something is to talk to the people who
>>>> depend on this code.
>>>>
>>>> Again, we _only_ want to call LLDBPluginInitialize or
>>>> LLDBPluginTerminate from the exact shared library we are trying to ask for
>>>> the symbols from, not from anywhere else.
>>>>
>>>> Greg
>>>> >
>>>> >
>>>> > On Tue, Aug 26, 2014 at 5:01 PM, Todd Fiala <tfiala at google.com>
>>>> wrote:
>>>> > Ah ok.
>>>> >
>>>> > It's worth figuring out what it does (really) before we consider
>>>> removing it.
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Aug 26, 2014 at 4:53 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>> > That part is a good question, which I don't totally understand.
>>>> There's a function Debugger::LoadPlugin() though, which accepts a path to a
>>>> plugin to load.  It's called there.  This also appears to be exposed
>>>> through the "plugin load" command.
>>>> >
>>>> >
>>>> > On Tue, Aug 26, 2014 at 4:32 PM, Todd Fiala <tfiala at google.com>
>>>> wrote:
>>>> > I guess the thing to do is make sure we're certain we understand the
>>>> behavior, which is perhaps best captured in a test.  (i.e. test it with the
>>>> RTLD_FIRST behavior where it does something, then verify it does something
>>>> different without the flag.  Then, once we agree it is not useful behavior
>>>> for us, look at removing it).
>>>> >
>>>> > By valid plugin, you're referring to shared libraries, right?  (What
>>>> plugins are we referring to here, at what load point?)
>>>> >
>>>> >
>>>> > On Tue, Aug 26, 2014 at 4:29 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>> > Just as a counterpoint, unless I'm misunderstanding this code, I
>>>> don't see it actually having a noticeable impact on stability.  The search
>>>> limiting will only be a factor in a case where you attempt to load
>>>> something that *isn't a valid plugin*.  It's already an error path.  In
>>>> fact, this code worked fine before the change was made, and was only made
>>>> to imitate what appears to have been an optimization that was
>>>> Mac-specific.  The change for Mac doesn't seem to have been strictly
>>>> necessary either, but just an optimization.  It's actually not an
>>>> optimization for Linux, because the dynamic loader will still search every
>>>> module on linux, it will just fail anyway.
>>>> >
>>>> >
>>>> > On Tue, Aug 26, 2014 at 4:21 PM, Todd Fiala <tfiala at google.com>
>>>> wrote:
>>>> > Probably the way I'd look at this right now is that support in Linux
>>>> is a bit dicey and we're doing our best to stabilize (starting with single
>>>> path for remote/local debugging, and making that stable and fast).  In an
>>>> effort to stabilize, I'd prefer to limit how much code change we do on the
>>>> Linux end until we have a more stable product.
>>>> >
>>>> > So while we could potentially take that out, I'd rather avoid making
>>>> changes just because it might be simpler, as it might also add yet another
>>>> error scenario on the Linux side.  Right now I value similarity to MacOSX
>>>> execution over code reduction.  Once we're a lot more stable on the Linux
>>>> side, I'd be much more interested in revisiting with some actual use cases
>>>> to see diffs in performance and scope of usage.
>>>> >
>>>> > Just my 2 cents...
>>>> >
>>>> >
>>>> > On Tue, Aug 26, 2014 at 3:47 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>> > The review is up on the LLVM side.  One point which was raised, and
>>>> which I agree with, is that the presence of the string makes the class much
>>>> heavier.  This string is only needed to mimic MacOSX's RTLD_FIRST behavior
>>>> on other posix platforms.  However, going back through the history of when
>>>> this was added, I never actually saw a use case from anyone saying "we
>>>> *need* this on Linux".  See the full original thread at [1].  But the TL;DR
>>>> is that the flag is nice to have on MacOSX, and the filename comparison was
>>>> added to Linux to maintain parity.
>>>> >
>>>> > If nobody actually knows of a specific example of why this is
>>>> necessary on Linux, can we just remove this behavior on Linux?  My
>>>> understanding is that the only thing which will change by removing this for
>>>> Linux is the following: Imagine a plugin X is loaded, and X has a library
>>>> dependency on Y and Z.  X doesn't contain the plugin Initialize or
>>>> Terminate symbol, but Y or Z does.  With the filename comparison code,
>>>> LoadPlugin would fail, and without it, it would succeed and use the symbol
>>>> found in Y or Z.  I can understand that with the comparison the algorithm
>>>> is a bit better, but it seems such an extremely unusual edge case that I
>>>> don't think it's a big deal to remove it from the Linux side.
>>>> >
>>>> > Thoughts?
>>>> >
>>>> > [1] -
>>>> http://thread.gmane.org/gmane.comp.debugging.lldb.devel/300/focus=302
>>>> >
>>>> >
>>>> > On Thu, Aug 21, 2014 at 5:27 PM, Greg Clayton <gclayton at apple.com>
>>>> wrote:
>>>> > Sounds good to me. Hopefully if they don't want that they might
>>>> accept an extra boolean argument that can specify to only look in the
>>>> current shared library and then we can switch over to using LLVM's
>>>> DynamicLibrary.
>>>> >
>>>> > > On Aug 21, 2014, at 4:22 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>> > >
>>>> > > This seems like the only case we ever want, so I'm going to post a
>>>> patch to LLVM's DynamicLibrary class to use RTLD_FIRST on Apple, and a
>>>> similar method of checking the module filespec on other platforms, and see
>>>> if they accept it.  If so, I will convert our Plugin code to use LLVM's
>>>> DynamicLibrary and then delete our DynamicLibrary
>>>> > >
>>>> > >
>>>> > > On Thu, Aug 21, 2014 at 4:08 PM, Greg Clayton <gclayton at apple.com>
>>>> wrote:
>>>> > >
>>>> > > > On Aug 21, 2014, at 3:31 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>> > > >
>>>> > > > Can someone explain this flag to me?
>>>> > >
>>>> > > It says "only look in this binary, don't look in any others. We are
>>>> looking for a plug-in initialization function and we don't want to get one
>>>> back from another dylib.
>>>> > >
>>>> > > As Enrico said, the email from a while back details this:
>>>> > >
>>>> > > http://comments.gmane.org/gmane.comp.debugging.lldb.devel/305
>>>> > >
>>>> > > >  I've read the documentation, but it's still not clear to me.  If
>>>> you ask dlsym() to search some module X, why would it ever search modules
>>>> other than X?
>>>> > >
>>>> > > I don't know but it does.
>>>> > >
>>>> > > >
>>>> > > > The reason I ask about this is that llvm support library already
>>>> has a DynamicLibrary class whose purpose almost exactly matches what we're
>>>> using the Host::DynamicLibrary related functions for.  However, it doesn't
>>>> use the RTLD_FIRST flag, and so I'm not sure what the implications are of
>>>> us using it and deleting our own DynamicLibrary code.
>>>> > >
>>>> > > It would be nice if we could specify this flag so we either find
>>>> the symbol from libx.dylib or we don't. We don't want to find the symbol in
>>>> liby.dylib and call it in our case.
>>>> > >
>>>> >
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > lldb-dev mailing list
>>>> > lldb-dev at cs.uiuc.edu
>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Todd Fiala |   Software Engineer |     tfiala at google.com |
>>>> 650-943-3180
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Todd Fiala |   Software Engineer |     tfiala at google.com |
>>>> 650-943-3180
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Todd Fiala |   Software Engineer |     tfiala at google.com |
>>>> 650-943-3180
>>>> >
>>>> >
>>>>
>>>>
>>>
>>
>
>
> --
> Todd Fiala | Software Engineer |  tfiala at google.com |  650-943-3180
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20140827/47208ead/attachment.html>


More information about the lldb-dev mailing list