[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB

Nick Kledzik kledzik at apple.com
Thu Jul 10 14:34:42 PDT 2014


On Jul 10, 2014, at 5:56 AM, Tim Northover <t.p.northover at gmail.com> wrote:
> On 3 July 2014 01:09, Nick Kledzik <kledzik at apple.com> wrote:
>> You don’t want to load the
>> indirect dylibs as each direct dylib is loaded because one of the indirect
>> ones may later turn out to be a direct one, and the order determines
>> the two-level-namespace ordinal used which we want to remain deterministic.
> 
> I've finally got back to this issue and I'm not sure what you mean
> here. My tests suggest that ld64 performs a depth-first search of the
> libraries and we *do* want to load them at the same time (or at least
> make sure they're considered at the same time for resolution
> purposes). For example (reproduced by tmp.sh attached):

Tim,  for that simple case, it does not matter if you do a depth-first or
breadth-first load of the dylibs. But things get more complicated.
I hope none of this makes your eyes bleed...

One issue is that before 10.6/iOS3.1 we did not have LC_REEXPORT_DYLIB.
Instead of the parent saying it re-exports a child, a child may have a load
command which said the name of the parent which re-exported it.  
Maybe it has been long enough that we can drop support for this in lld.  But
to implement support, you have to open every child dylib and look to see
if it says the parent re-exports it!  To tell if a dylib uses the new or old style
of re-export, check the mach_header flags: the MH_NO_REEXPORTED_DYLIBS bit
is set on new style dylibs (with no LC_REEXPORT_DYLIB commands).
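
To make the flag check concrete, here is a minimal sketch (not lld code) that
tests the bit in a raw header. It assumes a little-endian 64-bit Mach-O file;
the function name is made up for illustration, and the constants are the values
from <mach-o/loader.h>.

```python
import struct

# Constants as defined in <mach-o/loader.h>.
MH_MAGIC_64 = 0xFEEDFACF
MH_NO_REEXPORTED_DYLIBS = 0x100000


def has_no_reexports_flag(header_bytes: bytes) -> bool:
    """Return True if MH_NO_REEXPORTED_DYLIBS is set in a mach_header_64.

    Hypothetical helper: only handles little-endian 64-bit headers.
    """
    # mach_header_64 is eight 32-bit fields:
    # magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved
    fields = struct.unpack_from("<8I", header_bytes)
    magic, flags = fields[0], fields[6]
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O header")
    return bool(flags & MH_NO_REEXPORTED_DYLIBS)
```

When the bit is set, the linker knows the dylib has no LC_REEXPORT_DYLIB
commands to follow. When it is clear, the dylib may still be an old style
parent, which is the case that forces opening the children.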

Another feature is that re-exports are convenient for build time (fewer dylibs
to specify) but slow down runtime because dyld has to search multiple dylibs
for a symbol.  In your example, the two-level ordinal in main says that _foo
is in libwrapper, but dyld looks there and does not find it, but then notices 
that libwrapper re-exports libfoo.dylib, so dyld then searches libfoo.dylib.
To improve performance, the linker has an optimization which can “hoist”
“public” dylibs up.  An example is a Cocoa app that just links with Cocoa.framework
and calls _objc_msgSend.  Well, Cocoa re-exports AppKit which re-exports
Foundation which re-exports libobjc.dylib which actually implements 
_objc_msgSend.  Rather than recording that _objc_msgSend is in Cocoa in
the app binary (which would cause dyld to do a lot of searching), the linker
sees that libobjc.dylib is in /usr/lib/ which means it is a public dylib.
Therefore, the developer could have added -lobjc and the linker would 
have recorded _objc_msgSend came from that.  So the linker pretends
the user added -lobjc and then records _objc_msgSend as coming from it.
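
The hoisting decision can be sketched roughly as follows. This is only an
illustration of the idea, not ld64's actual rules: the function names and the
dict-based tables are hypothetical, and "public" is reduced to the single
/usr/lib/ test mentioned above.

```python
def find_definer(dylib, symbol, reexports, exports):
    """Depth-first walk of the re-export chain starting at `dylib`.

    Returns the path of the first dylib that exports `symbol`, or None.
    """
    if symbol in exports.get(dylib, set()):
        return dylib
    for child in reexports.get(dylib, []):
        found = find_definer(child, symbol, reexports, exports)
        if found is not None:
            return found
    return None


def is_public(path):
    # Assumption: only the /usr/lib/ rule from the text is modeled here.
    return path.startswith("/usr/lib/")


def dylib_to_record(direct, symbol, reexports, exports):
    """Pick the dylib whose ordinal should be recorded for `symbol`."""
    definer = find_definer(direct, symbol, reexports, exports)
    if definer is None:
        return None  # symbol is not reachable through this dylib
    # Hoist: record the real definer if the user could have linked it directly.
    return definer if is_public(definer) else direct


# The Cocoa chain from the text, as toy tables.
COCOA = "/System/Library/Frameworks/Cocoa.framework/Cocoa"
APPKIT = "/System/Library/Frameworks/AppKit.framework/AppKit"
FOUNDATION = "/System/Library/Frameworks/Foundation.framework/Foundation"
LIBOBJC = "/usr/lib/libobjc.dylib"

REEXPORTS = {COCOA: [APPKIT], APPKIT: [FOUNDATION], FOUNDATION: [LIBOBJC]}
EXPORTS = {LIBOBJC: {"_objc_msgSend"}}
```

With these tables, dylib_to_record(COCOA, "_objc_msgSend", REEXPORTS, EXPORTS)
picks libobjc.dylib rather than Cocoa, so dyld would not need to walk the chain
at runtime.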

And a more recent feature (tied into clang modules) is “auto-linking”.  The 
compiler can now emit LC_LINKER_OPTION load commands into .o files.
These tell the linker about frameworks and libraries that *might* be needed
during the link and are only really added to the link if doing so would
resolve some undefined symbol. 
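
A rough sketch of that "only if it helps" rule, under the assumption that
candidates are considered in the order they were seen. The function and the
table shapes are hypothetical, not how any real linker stores this.

```python
def resolve_autolinks(undefined, candidates, exports):
    """Decide which auto-link candidates actually join the link.

    undefined:  iterable of still-undefined symbol names
    candidates: dylib names from LC_LINKER_OPTION, in a fixed order
    exports:    dict mapping dylib name -> set of exported symbols
    Returns (added_dylibs, symbols_still_undefined).
    """
    remaining = set(undefined)
    added = []
    for lib in candidates:  # fixed order keeps the result deterministic
        provided = remaining & exports.get(lib, set())
        if provided:
            # This candidate resolves something, so it becomes a real dependency.
            added.append(lib)
            remaining -= provided
    return added, remaining
```

A candidate that resolves nothing is never added, so it never gets a load
command or an ordinal.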

Lastly, two-level-namespace ordinals are based on the index of each
LC_LOAD_DYLIB (and friends) in the binary.  We must have a load command
for each library on the command line (you can force a dependency on a dylib
even if nothing is used from it). There may be additional load dylib commands 
based on auto-linking or hoisting.  But overall we want links to be stable and
deterministic.  There should not be race conditions that result in different 
possible binaries from the same link.

The net of all these features (in my mind) is that the linker needs to maintain
a couple of “pools” of dylibs:
1) dylibs with an assigned ordinal (initially the dylibs directly on command line)
2) indirect dylibs 
3) auto-link dylibs in waiting
Via various rules, dylibs in 2 or 3 may get moved up to 1. We need stable rules
so that the ordinals always reproduce.
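
The three pools could be modeled along these lines. This is a sketch of the
bookkeeping only: the class and method names are invented, and it assumes
ordinals are 1-based positions in pool 1.

```python
class DylibPools:
    """Toy model of the three pools; not lld's data structure."""

    def __init__(self, command_line_dylibs):
        self.ordered = list(command_line_dylibs)  # pool 1: have an ordinal
        self.indirect = []                        # pool 2: indirect dylibs
        self.autolink = []                        # pool 3: auto-link, waiting

    def add_indirect(self, path):
        # A dylib already in pool 1 stays there; do not track it twice.
        if path not in self.ordered and path not in self.indirect:
            self.indirect.append(path)

    def add_autolink(self, path):
        if path not in self.ordered and path not in self.autolink:
            self.autolink.append(path)

    def ordinal(self, path):
        """1-based ordinal for dylibs in pool 1, else None."""
        if path in self.ordered:
            return self.ordered.index(path) + 1
        return None

    def promote(self, path):
        """Move a dylib from pool 2 or 3 into pool 1 and return its ordinal."""
        if path not in self.ordered:
            for pool in (self.indirect, self.autolink):
                if path in pool:
                    pool.remove(path)
            # Appending keeps every earlier ordinal unchanged.
            self.ordered.append(path)
        return self.ordinal(path)
```

Because promotion only appends, the ordinals of the command-line dylibs never
shift. The result is reproducible as long as promotions happen in a fixed
order, for example the order in which undefined symbols are resolved.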

-Nick




