[PATCH] D63932: [GlobalDCE] Dead Virtual Function Elimination

Tue Jul 30 10:25:13 PDT 2019

pcc added a comment.

In D63932#1606248 <https://reviews.llvm.org/D63932#1606248>, @ostannard wrote:

> In that example, with everything having default ELF visibility, all of the vtables will get vcall_visibility public, which can't be optimised by VFE, and won't ever be relaxed to one of the tighter visibilities.
>
> The cases which can be optimised are (assuming "visibility" means visibility of the most-visible dynamic base class):
>
> - Classes in an anonymous namespace, which aren't visible outside their translation unit. Probably doesn't happen very often, but can be optimised without LTO.
> - Classes visible outside the translation unit, but not outside the LTO unit. This generally means hidden ELF visibility, unless the lto_visibility_public attribute is used. These can't be optimised without LTO, but can be with it.
>
>   I implemented the second case by adding the LinkageUnit visibility, which can't be optimised by VFE, but can be reduced to TranslationUnit when LTO internalisation happens. This could also have been done by not changing the visibility at LTO time, and instead leting GlobalDCE know if it is running post-LTO. Both of these should give the same results, but the way I used represents the visibility fully in the IR, without having the extra state of "are we doing LTO?".

Okay, I somehow missed that the relaxation to TranslationUnit was guarded on it being LinkageUnit to start with. So at least from that perspective I don't see any soundness issues. It still doesn't seem like a good idea to do the relaxation in LTO though, since the visibility-outside-of-LTO of the vtable symbols is not a reliable indicator whether the type is used externally (I think that's what confused me to begin with). Here are a couple of cases where things might go wrong:

- Take the example from my earlier message, give the "main executable" and "DSO" hidden visibility, build the "main executable" with LTO and the "DSO" without LTO, and statically link them both into the same executable. We run into the same problem where the `Plugin1` vtable is potentially not referenced and thus misoptimised. Yes, this is a violation of the LTO visibility rules, but the example shows that we only detect it sometimes. I think that if we did want to detect cases where the LTO visibility rules are clearly being violated, the outcome should be to issue a diagnostic and not to silently proceed with optimizations disabled, since the violation might be masking other undetected issues. That really seems orthogonal to this patch, though.
- Imagine linking part of a program with `ld -r` with LTO -- all symbols including vtable symbols will appear to be externally visible, which is not necessarily a violation of the LTO visibility rules because they may not be used externally in the final link. So we end up pessimising unnecessarily.

I gave this some more thought and it seems that the frontend has all of the information required to make a determination about whether to allow VFE, without needing to teach LTO to relax visibility. We can make the frontend do this:

- If `-flto` was passed (see `CodeGenOptions::PrepareForLTO`), attach "allow VFE" metadata if class has hidden LTO visibility.
- Otherwise, attach "allow VFE" metadata if class is not `isExternallyVisible`.

Now the behaviour of GlobalDCE can be made conditional on the "allow VFE" metadata alone. This also has the advantage of simplifying the IR by making "allow VFE" a boolean rather than a tri-state as it is now.

================
Comment at: llvm/lib/Transforms/IPO/GlobalDCE.cpp:183
+    // unit, we know that we can see all virtual functions which might use it,
+    // so VFE is safe.
+    if (auto GO = dyn_cast<GlobalObject>(&GV)) {
----------------
ostannard wrote:
> pcc wrote:
> > Not necessarily, at least in this implementation, because "vtable symbol can be internalized" doesn't imply "all vcalls are visible". See main response.
> This is checking the vcall_visibility, which will only be VCallVisibilityTranslationUnit if all base classes which contain virtual functions are private to this translation unit, which does imply "all vcalls are visible".
Ack.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63932/new/

https://reviews.llvm.org/D63932