[lldb-dev] dlsym() and RTLD_FIRST
Greg Clayton
gclayton at apple.com
Wed Aug 27 12:58:46 PDT 2014
Just checked out docs for dlsym() and they do match what you found:
dlsym() returns the address of the code or data location specified by the null-terminated character string
symbol. Which libraries and bundles are searched depends on the handle parameter.
If dlsym() is called with a handle, returned by dlopen() then only that image and any libraries it depends on are searched for symbol.
So go ahead and use the llvm version.
Greg
> On Aug 27, 2014, at 12:41 PM, Todd Fiala <tfiala at google.com> wrote:
>
> Deal.
>
>
> On Wed, Aug 27, 2014 at 12:40 PM, Zachary Turner <zturner at google.com> wrote:
> I'm fine with that, although I would prefer to address it by fixing DynamicLibrary in LLVM to support this behavior in LLVM if we find that it breaks something. Sharing as much code as possible should be an explicit goal IMO, and especially duplicating entire classes should be shunned quite strongly. Nevertheless, if any fallout should arise, I will make sure to deal with it until it's not broken.
>
>
> On Wed, Aug 27, 2014 at 12:35 PM, Todd Fiala <tfiala at google.com> wrote:
> I'm personally okay with this on the Linux side, with the caveat as always that if we find an issue with this, we back out the change.
>
> -Todd
>
>
> On Wed, Aug 27, 2014 at 11:51 AM, Zachary Turner <zturner at google.com> wrote:
> Combined with the fact that plugins are essentially a dead codepath (going by Enrico's earlier comment), I think it's probably ok to just remove this and use LLVM's default DynamicLibrary loader. Any objections?
>
>
> On Wed, Aug 27, 2014 at 11:49 AM, Zachary Turner <zturner at google.com> wrote:
> The documentation of dlsym() says this:
>
> ------
> The dlsym() function shall search for the named symbol in all objects loaded automatically as a result of loading the object referenced by handle (see dlopen()). Load ordering is used in dlsym() operations upon the global symbol object. The symbol resolution algorithm used shall be dependency order as described in dlopen().
> ------
>
> So, it appears that this is perhaps a non-issue. In other words, if you dlopen plugin A and get back handle A, and then you dlopen plugin B and get back handle B, there is no chance that dlsym against B would find a symbol in plugin A. It will search only B and B's direct and indirect dependencies.
>
> There is a remote possibility that a plugin A could link against another plugin B. However, as long as A provides a LLDBPluginInitialize method, it will always be found first as described by the section on "dependency ordering" in the documentation of dlopen().
>
> -----
> Dependency ordering uses a breadth-first order starting with a given object, then all of its dependencies, then any dependents of those, iterating until all dependencies are satisfied.
>
>
> On Wed, Aug 27, 2014 at 11:32 AM, Greg Clayton <gclayton at apple.com> wrote:
>
> > On Aug 26, 2014, at 5:09 PM, Zachary Turner <zturner at google.com> wrote:
> >
> > I think my understanding of what it does is correct, but maybe Greg or someone can confirm. Basically, it tries to dlopen() the module at the path specified, and then search for the symbols LLDBPluginInitialize and LLDBPluginTerminate. If it finds them, it calls them. If it doesn't, the plugin load fails. According to the documentation of dladdr(), it appears that the process for locating this symbol involves first searching the module specified in the argument to dlopen(), and then searching any dependent modules. If it is found in any of these, it succeeds. This optimization (using RTLD_FIRST and the filename comparison), causes this search to fail if the symbol is found in a dependent module, but not the original module.
>
> Bingo. We don't want to call any other version of LLDBPluginInitialize or LLDBPluginTerminate from any other plug-in. It isn't clear that the dynamic linker sticks to dependent modules, I would need to check on that. It might search all loaded shared libraries in the current process. Not sure if that differs between Mac and Linux. Probably not.
>
> >
> > I will try to verify that this is correct with someone who knows more than me about Linux, Mac, and dynamic linking on these platforms, but if correct then it doesn't seem like there is any risk to removing this. That said, I'm interested in who is actually use these plugins. The best way to find out if it's going to break something is to talk to the people who depend on this code.
>
> Again, we _only_ want to call LLDBPluginInitialize or LLDBPluginTerminate from the exact shared library we are trying to ask for the symbols from, not from anywhere else.
>
> Greg
> >
> >
> > On Tue, Aug 26, 2014 at 5:01 PM, Todd Fiala <tfiala at google.com> wrote:
> > Ah ok.
> >
> > It's worth figuring out what it does (really) before we consider removing it.
> >
> >
> >
> > On Tue, Aug 26, 2014 at 4:53 PM, Zachary Turner <zturner at google.com> wrote:
> > That part is a good question, which I don't totally understand. There's a function Debugger::LoadPlugin() though, which accepts a path to a plugin to load. It's called there. This also appears to be exposed through the "plugin load" command.
> >
> >
> > On Tue, Aug 26, 2014 at 4:32 PM, Todd Fiala <tfiala at google.com> wrote:
> > I guess the thing to do is make sure we're certain we understand the behavior, which is perhaps best captured in a test. (i.e. test it with the RTLD_FIRST behavior where it does something, then verify it does something different without the flag. Then, once we agree it is not useful behavior for us, look at removing it).
> >
> > By valid plugin, you're referring to shared libraries, right? (What plugins are we referring to here, at what load point?)
> >
> >
> > On Tue, Aug 26, 2014 at 4:29 PM, Zachary Turner <zturner at google.com> wrote:
> > Just as a counterpoint, unless I'm misunderstanding this code, I don't see it actually having a noticeable impact on stability. The search limiting will only be a factor in a case where you attempt to load something that *isn't a valid plugin*. It's already an error path. In fact, this code worked fine before the change was made, and was only made to imitate what appears to have been an optimization that was Mac-specific. The change for Mac doesn't seem to have been strictly necessary either, but just an optimization. It's actually not an optimization for Linux, because the dynamic loader will still search every module on linux, it will just fail anyway.
> >
> >
> > On Tue, Aug 26, 2014 at 4:21 PM, Todd Fiala <tfiala at google.com> wrote:
> > Probably the way I'd look at this right now is that support in Linux is a bit dicey and we're doing our best to stabilize (starting with single path for remote/local debugging, and making that stable and fast). In an effort to stabilize, I'd prefer to limit how much code change we do on the Linux end until we have a more stable product.
> >
> > So while we could potentially take that out, I'd rather avoid making changes just because it might be simpler, as it might also add yet another error scenario on the Linux side. Right now I value similarity to MacOSX execution over code reduction. Once we're a lot more stable on the Linux side, I'd be much more interested in revisiting with some actual use cases to see diffs in performance and scope of usage.
> >
> > Just my 2 cents...
> >
> >
> > On Tue, Aug 26, 2014 at 3:47 PM, Zachary Turner <zturner at google.com> wrote:
> > The review is up on the LLVM side. One point which was raised, and which I agree with, is that the presence of the string makes the class much heavier. This string is only needed to mimic MacOSX's RTLD_FIRST behavior on other posix platforms. However, going back through the history of when this was added, I never actually saw a use case from anyone saying "we *need* this on Linux". See the full original thread at [1]. But the TL;DR is that the flag is nice to have on MacOSX, and the filename comparison was added to Linux to maintain parity.
> >
> > If nobody actually knows of a specific example of why this is necessary on Linux, can we just remove this behavior on Linux? My understanding is that the only thing which will change by removing this for Linux is the following: Imagine a plugin X is loaded, and X has a library dependency on Y and Z. X doesn't contain the plugin Initialize or Terminate symbol, but Y or Z does. With the filename comparison code, LoadPlugin would fail, and without it, it would succeed and use the symbol found in Y or Z. I can understand that with the comparison the algorithm is a bit better, but it seems such an extremely unusual edge case that I don't think it's a big deal to remove it from the Linux side.
> >
> > Thoughts?
> >
> > [1] - http://thread.gmane.org/gmane.comp.debugging.lldb.devel/300/focus=302
> >
> >
> > On Thu, Aug 21, 2014 at 5:27 PM, Greg Clayton <gclayton at apple.com> wrote:
> > Sounds good to me. Hopefully if they don't want that they might accept an extra boolean argument that can specify to only look in the current shared library and then we can switch over to using LLVM's DynamicLibrary.
> >
> > > On Aug 21, 2014, at 4:22 PM, Zachary Turner <zturner at google.com> wrote:
> > >
> > > This seems like the only case we ever want, so I'm going to post a patch to LLVM's DynamicLibrary class to use RTLD_FIRST on Apple, and a similar method of checking the module filespec on other platforms, and see if they accept it. If so, I will convert our Plugin code to use LLVM's DynamicLibrary and then delete our DynamicLibrary
> > >
> > >
> > > On Thu, Aug 21, 2014 at 4:08 PM, Greg Clayton <gclayton at apple.com> wrote:
> > >
> > > > On Aug 21, 2014, at 3:31 PM, Zachary Turner <zturner at google.com> wrote:
> > > >
> > > > Can someone explain this flag to me?
> > >
> > > It says "only look in this binary, don't look in any others. We are looking for a plug-in initialization function and we don't want to get one back from another dylib.
> > >
> > > As Enrico said, the email from a while back details this:
> > >
> > > http://comments.gmane.org/gmane.comp.debugging.lldb.devel/305
> > >
> > > > I've read the documentation, but it's still not clear to me. If you ask dlsym() to search some module X, why would it ever search modules other than X?
> > >
> > > I don't know but it does.
> > >
> > > >
> > > > The reason I ask about this is that llvm support library already has a DynamicLibrary class whose purpose almost exactly matches what we're using the Host::DynamicLibrary related functions for. However, it doesn't use the RTLD_FIRST flag, and so I'm not sure what the implications are of us using it and deleting our own DynamicLibrary code.
> > >
> > > It would be nice if we could specify this flag so we either find the symbol from libx.dylib or we don't. We don't want to find the symbol in liby.dylib and call it in our case.
> > >
> >
> >
> >
> > _______________________________________________
> > lldb-dev mailing list
> > lldb-dev at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> >
> >
> >
> > --
> > Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
> >
> >
> >
> >
> >
> > --
> > Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
> >
> >
> >
> >
> >
> > --
> > Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
> >
> >
>
>
>
>
>
>
> --
> Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
>
>
>
>
>
> --
> Todd Fiala | Software Engineer | tfiala at google.com | 650-943-3180
>
More information about the lldb-dev
mailing list