[PATCH] D155659: [WPD][LLD] Add option to validate RTTI is enabled on all native types and prevent devirtualization on types with native RTTI

Wed Jul 26 10:58:52 PDT 2023

modimo added inline comments.

================
Comment at: lld/ELF/Driver.cpp:1098
+  for (auto &s : vtableSymbolsWithNoRTTI) {
+    warn("--lto-validate-all-vtables-have-type-infos: RTTI missing for vtable "
+         "_ZTV" +
----------------
tejohnson wrote:
> modimo wrote:
> > tejohnson wrote:
> > > Prefer message() over warn() because the latter causes builds using -Werror to fail.
> > Is that the case for lld? Looking through the equivalent functionality is under `--fatal-warnings` and on a small example `-Werror` doesn't affect this flag/cause the build to fail.
> Sorry, you are correct. We use --fatal-warnings for this on linker actions (and -Werror on compile actions). In general, I think this should be an informational message, not a warning, since it is being handled automatically.
The case I want to catch is when an existing project changes its native dependencies and disables the optimization which would better fit a warning. In the general case agreed that this isn't a warning. I don't have much experience in linker protocol here, @MaskRay thoughts? 

================
Comment at: llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:2413
+  // case bail on the optimization.
+  if (ValidateAllVtablesHaveTypeInfos && !AllVtablesHaveTypeInfos)
+    return;
----------------
tejohnson wrote:
> modimo wrote:
> > modimo wrote:
> > > tejohnson wrote:
> > > > modimo wrote:
> > > > > modimo wrote:
> > > > > > tejohnson wrote:
> > > > > > > modimo wrote:
> > > > > > > > tejohnson wrote:
> > > > > > > > > Rather than doing this down here in LTO/WPD could the linker simply unset the HasWholeProgramVisibility config flag? That would also allow WPD to proceed on types with hidden LTO visibility. This early return would prevent any and all WPD which seems overly conservative in the case of hidden LTO visibility classes. 
> > > > > > > > That makes sense although it does tie this flag's functionality to requiring `--lto-whole-program-visibility`. 
> > > > > > > > 
> > > > > > > > Doing that though means we can instead pass the blocklist to `updateVCallVisibilityInIndex`/`updateVCallVisibilityInModule` similarly to how D91583 does it for dynamically exported symbols which would be cleaner. Thoughts on that approach?
> > > > > > > > That makes sense although it does tie this flag's functionality to requiring --lto-whole-program-visibility.
> > > > > > > 
> > > > > > > What would be the use case of the proposed handling without --lto-whole-program-visibility? Are you saying that there are cases where the normal LTO visibility is incorrect?
> > > > > > > 
> > > > > > > > Doing that though means we can instead pass the blocklist to updateVCallVisibilityInIndex/updateVCallVisibilityInModule similarly to how D91583 does it for dynamically exported symbols which would be cleaner. Thoughts on that approach?
> > > > > > > 
> > > > > > > Yep I think that would be cleaner.
> > > > > > > What would be the use case of the proposed handling without --lto-whole-program-visibility? Are you saying that there are cases where the normal LTO visibility is incorrect?
> > > > > > 
> > > > > > I don't have a known case so this is more theoretical. Currently there's an assertion that it's on the user to make sure LTO visibility is correct but in this case and in D91583 we can catch violations and prevent them from causing problems. How much this should also apply to normal LTO visibility is a question but thinking more about it is orthogonal to this change.
> > > > > > 
> > > > > > 
> > > > > Ah right, the issue with doing this in `updateVCallVisibilityInIndex`/`updateVCallVisibilityInModule` is that vcall visibility is keyed off the vtable symbol. However, TypeID and RTTI are both keyed off of the typename symbol. There's not always a translation from typename to vtable since abstract base classes wouldn't have vtables and by the time we get the association we're in the same place the logic is right now.
> > > > > 
> > > > > Given that, I think I'll keep the logic as-is.
> > > > > There's not always a translation from typename to vtable since abstract base classes wouldn't have vtables and by the time we get the association we're in the same place the logic is right now.
> > > > 
> > > > Can you clarify the case you are concerned about and how this is handled in the lld code that expects a translation from vtable to typename? I tried an example with an abstract base class and do get a vtable and typename.
> > > Inputs/devirt_validate_vtable_typeinfos.ll has `A` as an abstract base class and `Native` deriving from it. Building down to an object file we get a vtable/typeinfo/typeid symbol for `Native` but only the typeinfo/typeid symbol for `A`:
> > > ```
> > > ~/llvm-project/lld/test/ELF/lto# ~/llvm-project/build-rel/bin/llc -filetype=obj Inputs/devirt_validate_vtable_typeinfos.ll -o devirt_validate_vtable_typeinfos.o
> > > ~/llvm-project/lld/test/ELF/lto# readelf -Ws devirt_validate_vtable_typeinfos.o | grep ZT
> > >      5: 0000000000000000    16 OBJECT  WEAK   DEFAULT    3 _ZTVN10__cxxabiv117__class_type_infoE
> > >      6: 0000000000000010    16 OBJECT  WEAK   DEFAULT    3 _ZTVN10__cxxabiv120__si_class_type_infoE
> > >      7: 0000000000000020    32 OBJECT  WEAK   DEFAULT    3 _ZTV6Native
> > >      8: 0000000000000050    24 OBJECT  WEAK   DEFAULT    3 _ZTI6Native
> > >      9: 0000000000000040     8 OBJECT  WEAK   DEFAULT    3 _ZTS6Native
> > >     10: 0000000000000070    16 OBJECT  WEAK   DEFAULT    3 _ZTI1A
> > >     11: 0000000000000068     3 OBJECT  WEAK   DEFAULT    3 _ZTS1A
> > > ```
> > > 
> > > In LLD we're ensuring there's a map from every vtable symbol to its type information but we expect additional type information without vtables for these cases.
> > That being said, the information to map typeid->[associated vtables] is `typeIdCompatibleVtableMap` for thinLTO and in full LTO we have the actual vtable global variables which contains `!type` metadata that maps back to typeid.
> > 
> > For thinLTO, we can save a scan of `typeIdCompatibleVtableMap` for `updateVCallVisibilityInIndex` by currently combining it with the scan in `DevirtIndex::run` however that's probably not a big deal.
> > 
> > For full LTO the information is better passed through `updateVCallVisibilityInModule` since we don't build the corresponding `TypeIDMap` until codegen and carrying this information around until then is too much.
> > 
> > I think I've come back around to doing it in `updateVCallVisibilityInIndex`/`updateVCallVisibilityInModule`. A little less efficient for thinLTO but keeps consistency with full LTO.
> Ok, thanks. I think I have lost track of what the change will be - is it to replace passing down a global flag AllVtablesHaveTypeInfos, or is it to replace what is being down below in DevirtIndex::run()? For the former alone it doesn't seem worth it, but it would be nice to move the handling from DevirtIndex::run() into the vcall_visibility updates.
Ah sorry yeah there's 2 different things. For passing down `AllVtablesHaveTypeInfos` I like the approach of unsetting --lto-whole-program-visibility instead. For moving the safety logic out of DevirtIndex::run() making this work for full LTO wants the logic to be in vcall_visibility and it makes sense to be consistent even if slightly *less* efficient for thinLTO.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155659/new/

https://reviews.llvm.org/D155659