[PATCH] D59709: [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible

Fri Sep 6 11:37:27 PDT 2019

tejohnson marked an inline comment as done.
tejohnson added inline comments.
Herald added a subscriber: ychen.

================
Comment at: llvm/trunk/lib/LTO/LTO.cpp:334
+        S->setCanAutoHide(VI.canAutoHide() &&
+                          !GUIDPreservedSymbols.count(VI.getGUID()));
+      }
----------------
steven_wu wrote:
> tejohnson wrote:
> > tejohnson wrote:
> > > steven_wu wrote:
> > > > The regression I saw is exactly the situation mentioned in the comments but I am not so sure about the conclusion here. For ld64, it can see whether a symbol can be auto hide or not, and it prefers the one that is not auto hide. This means if there are other copies of the symbol that is not autohide outside the summary and not autohide, the one can be autohide should never be prevailing.
> > > > 
> > > > Is that the case for other linker?
> > > > The regression I saw is exactly the situation mentioned in the comments but I am not so sure about the conclusion here.
> > > 
> > > By the situation mentioned in the comments I assume you mean the part about symbols in the GUIDPreservedSymbols set because they are visible outside the summary? In your situation, why are there so many symbols in the preserved symbols set? Are there a lot of symbols in native code being linked in?
> > > 
> > > I guess it is different in lld since in the case I encountered (in an internal application), the linkonce_odr copy was prevailing, not the weak_odr copy. If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere, then we may need to figure out a way to communicate that.
> > > If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere
> > 
> > I meant if there is a *non*-autohide (weak_odr) copy available of course...
> > By the situation mentioned in the comments I assume you mean the part about symbols in the GUIDPreservedSymbols set because they are visible outside the summary? In your situation, why are there so many symbols in the preserved symbols set? Are there a lot of symbols in native code being linked in?
> > 
> 
> Yes. The situation is the code is linking a static c++ library (native code) with linkonce autohide libcxx symbols. Since ld64 decides to coalesce away the copy in the native code and use the version inside the LTO, it has to add it to mustPreserve symbols. Now even both copies can be autohide, it is not autohide because it is in the preserve list.
> 
> 
> > I guess it is different in lld since in the case I encountered (in an internal application), the linkonce_odr copy was prevailing, not the weak_odr copy. If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere, then we may need to figure out a way to communicate that.
> > 
> 
> This sounds like a linker bug but I am not familiar with lld LTO pipeline to be sure. Should lld mark weak_odr copy as prevailing? I also would like to understand how lld is using mustPreserve symbols because I thought lld doesn't need it.
> For ld64, it uses mustPreserve symbol to communicate if the weak/linkonce symbols are prevailing in this case.
> 
In the new LTO API, symbols are added to the GUIDPreservedSymbols set if they are visible outside the summary (either in a bitcode file without a summary or a native object).

We could do better if the linker indicated whether these symbols in objects without summaries were eligible for auto hide, which in the compiler is based both on the linkage type and on it having the global unnamed addr flag - does the linker know both of these for  native objects?

I'm not an lld expert, but looking back at my original internal discussion with @pcc on this issue, it appears to be a difference in ld64 (Mach-O) semantics and ELF semantics. Here is an excerpt:

pcc wrote:
> tejohnson wrote:
>> Is there some other criteria that should be used to prevent the symbol from being marked Hidden in this case?
>
> It seems that the rule should be: if we see at least one weak_odr definition (i.e. explicit instantiation) that should prevent the relaxation from default to hidden. The reason is that the weak_odr makes it possible to have another translation unit that does not implicitly instantiate the function, including a translation unit in another DSO. Similar logic is implemented in ld64: if one symbol is weak_odr unnamed_addr (i.e. non-auto-hide) and the other is linkonce_odr unnamed_addr (i.e. auto-hide), the linker selects the non-auto-hide one.
> https://github.com/apple-opensource/ld64/blob/master/src/ld/SymbolTable.cpp#L228
>
> Unfortunately, this is at odds with ELF semantics where the "most hidden" visibility wins, so I think this means that we would not be able to mark linkonce_odr unnamed_addr symbols as hidden in the non-LTO case, as was suggested at one point on llvm-dev I believe.

So perhaps the best thing to do in this case is have an interface for the linker to indicate the correct semantics with regard to whether most or least hidden wins?

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59709/new/

https://reviews.llvm.org/D59709