[PATCH] D155659: [WPD][LLD] Add option to validate RTTI is enabled on all native types and prevent devirtualization on types with native RTTI

Tue Jul 25 17:20:58 PDT 2023

modimo added a comment.

In D155659#4533387 <https://reviews.llvm.org/D155659#4533387>, @tejohnson wrote:

> In D155659#4529471 <https://reviews.llvm.org/D155659#4529471>, @tejohnson wrote:
>
>> I'm curious to give this a try internally on a few codes and see how frequently it ends up disabling WPD.
>
> I cranked through a bunch of builds with this change and thankfully while they all do have at least one vtable from an -fno-rtti native object, there are only a handful of unique symbols (which all appear safe), so we could consider using --lto-known-safe-vtables to allowlist them. I did find a couple that seem spurious (see comment inline below about this).

Nice! That mirrors my experience as well.

================
Comment at: lld/ELF/Driver.cpp:1098
+  for (auto &s : vtableSymbolsWithNoRTTI) {
+    warn("--lto-validate-all-vtables-have-type-infos: RTTI missing for vtable "
+         "_ZTV" +
----------------
tejohnson wrote:
> Prefer message() over warn() because the latter causes builds using -Werror to fail.
Is that the case for lld? Looking through the equivalent functionality is under `--fatal-warnings` and on a small example `-Werror` doesn't affect this flag/cause the build to fail.

================
Comment at: llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:2413
+  // case bail on the optimization.
+  if (ValidateAllVtablesHaveTypeInfos && !AllVtablesHaveTypeInfos)
+    return;
----------------
tejohnson wrote:
> modimo wrote:
> > modimo wrote:
> > > tejohnson wrote:
> > > > modimo wrote:
> > > > > tejohnson wrote:
> > > > > > Rather than doing this down here in LTO/WPD could the linker simply unset the HasWholeProgramVisibility config flag? That would also allow WPD to proceed on types with hidden LTO visibility. This early return would prevent any and all WPD which seems overly conservative in the case of hidden LTO visibility classes. 
> > > > > That makes sense although it does tie this flag's functionality to requiring `--lto-whole-program-visibility`. 
> > > > > 
> > > > > Doing that though means we can instead pass the blocklist to `updateVCallVisibilityInIndex`/`updateVCallVisibilityInModule` similarly to how D91583 does it for dynamically exported symbols which would be cleaner. Thoughts on that approach?
> > > > > That makes sense although it does tie this flag's functionality to requiring --lto-whole-program-visibility.
> > > > 
> > > > What would be the use case of the proposed handling without --lto-whole-program-visibility? Are you saying that there are cases where the normal LTO visibility is incorrect?
> > > > 
> > > > > Doing that though means we can instead pass the blocklist to updateVCallVisibilityInIndex/updateVCallVisibilityInModule similarly to how D91583 does it for dynamically exported symbols which would be cleaner. Thoughts on that approach?
> > > > 
> > > > Yep I think that would be cleaner.
> > > > What would be the use case of the proposed handling without --lto-whole-program-visibility? Are you saying that there are cases where the normal LTO visibility is incorrect?
> > > 
> > > I don't have a known case so this is more theoretical. Currently there's an assertion that it's on the user to make sure LTO visibility is correct but in this case and in D91583 we can catch violations and prevent them from causing problems. How much this should also apply to normal LTO visibility is a question but thinking more about it is orthogonal to this change.
> > > 
> > > 
> > Ah right, the issue with doing this in `updateVCallVisibilityInIndex`/`updateVCallVisibilityInModule` is that vcall visibility is keyed off the vtable symbol. However, TypeID and RTTI are both keyed off of the typename symbol. There's not always a translation from typename to vtable since abstract base classes wouldn't have vtables and by the time we get the association we're in the same place the logic is right now.
> > 
> > Given that, I think I'll keep the logic as-is.
> > There's not always a translation from typename to vtable since abstract base classes wouldn't have vtables and by the time we get the association we're in the same place the logic is right now.
> 
> Can you clarify the case you are concerned about and how this is handled in the lld code that expects a translation from vtable to typename? I tried an example with an abstract base class and do get a vtable and typename.
Inputs/devirt_validate_vtable_typeinfos.ll has `A` as an abstract base class and `Native` deriving from it. Building down to an object file we get a vtable/typeinfo/typeid symbol for `Native` but only the typeinfo/typeid symbol for `A`:
```
~/llvm-project/lld/test/ELF/lto# ~/llvm-project/build-rel/bin/llc -filetype=obj Inputs/devirt_validate_vtable_typeinfos.ll -o devirt_validate_vtable_typeinfos.o
~/llvm-project/lld/test/ELF/lto# readelf -Ws devirt_validate_vtable_typeinfos.o | grep ZT
     5: 0000000000000000    16 OBJECT  WEAK   DEFAULT    3 _ZTVN10__cxxabiv117__class_type_infoE
     6: 0000000000000010    16 OBJECT  WEAK   DEFAULT    3 _ZTVN10__cxxabiv120__si_class_type_infoE
     7: 0000000000000020    32 OBJECT  WEAK   DEFAULT    3 _ZTV6Native
     8: 0000000000000050    24 OBJECT  WEAK   DEFAULT    3 _ZTI6Native
     9: 0000000000000040     8 OBJECT  WEAK   DEFAULT    3 _ZTS6Native
    10: 0000000000000070    16 OBJECT  WEAK   DEFAULT    3 _ZTI1A
    11: 0000000000000068     3 OBJECT  WEAK   DEFAULT    3 _ZTS1A
```

In LLD we're ensuring there's a map from every vtable symbol to its type information but we expect additional type information without vtables for these cases.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155659/new/

https://reviews.llvm.org/D155659