[llvm] 0a5949d - [WPD] Fix handling of pure virtual base class

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 23 16:07:31 PST 2021


Author: Teresa Johnson
Date: 2021-02-23T16:07:09-08:00
New Revision: 0a5949dcfa31d599353fb4ccf4d207bdae7c7281

URL: https://github.com/llvm/llvm-project/commit/0a5949dcfa31d599353fb4ccf4d207bdae7c7281
DIFF: https://github.com/llvm/llvm-project/commit/0a5949dcfa31d599353fb4ccf4d207bdae7c7281.diff

LOG: [WPD] Fix handling of pure virtual base class

The fix in 3c4c205060c9398da705eb71b63ddd8a04999de9 caused an assert in
the case of a pure virtual base class. In that case, the vTableFuncs
list on the summary will be empty, so we were hitting the new assert
that the linkage type was not available_externally.

In the case of pure virtual, we do not want to assert, and additionally
need to set VS so that we don't treat it conservatively and quit the
analysis of the type id early.

This exposed a pre-existing issue where we were not updating the vcall
visibility on pure virtual functions when whole program visibility was
specified. We were skipping updating the visibility on any global vars
that didn't have any vTableFuncs, which meant all pure virtual were not
updated, and the later analysis would block any devirtualization of
calls that had a type id used on those pure virtual vtables (see the
handling in the other code modified in this patch). Simply remove that
check. It will mean that we may update the vcall visibility on global
vars that aren't vtables, but that setting is ignored for any global
vars that didn't have type metadata anyway.

Added a new test case that asserted without removing the assert, and
that requires the other fixes in this patch (updateVCallVisibilityInIndex
and not skipping all vtables without virtual funcs) to get a successful
devirtualization with index-only WPD. I added cases to test hybrid and
regular LTO for completeness, although those already worked without the
fixes here.

With this final fix, a clang multistage bootstrap with WPD builds and
runs all tests successfully.

Differential Revision: https://reviews.llvm.org/D97126

Added: 
    llvm/test/ThinLTO/X86/devirt_pure_virtual_base.ll

Modified: 
    llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
index 6fca30c68751..14214562f032 100644
--- a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+++ b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
@@ -814,7 +814,7 @@ void updateVCallVisibilityInIndex(
   for (auto &P : Index) {
     for (auto &S : P.second.SummaryList) {
       auto *GVar = dyn_cast<GlobalVarSummary>(S.get());
-      if (!GVar || GVar->vTableFuncs().empty() ||
+      if (!GVar ||
           GVar->getVCallVisibility() != GlobalObject::VCallVisibilityPublic ||
           // Don't upgrade the visibility for symbols exported to the dynamic
           // linker, as we have no information on their eventual use.
@@ -1005,10 +1005,8 @@ bool DevirtIndex::tryFindVirtualCallTargets(
   for (const TypeIdOffsetVtableInfo &P : TIdInfo) {
     // Find a representative copy of the vtable initializer.
     // We can have multiple available_externally, linkonce_odr and weak_odr
-    // vtable initializers, however currently clang does not attach type
-    // metadata to available_externally, and therefore the summary will not
-    // contain any vtable functions. We can also have multiple external
-    // vtable initializers in the case of comdats, which we cannot check here.
+    // vtable initializers. We can also have multiple external vtable
+    // initializers in the case of comdats, which we cannot check here.
     // The linker should give an error in this case.
     //
     // Also, handle the case of same-named local Vtables with the same path
@@ -1024,16 +1022,21 @@ bool DevirtIndex::tryFindVirtualCallTargets(
         LocalFound = true;
       }
       auto *CurVS = cast<GlobalVarSummary>(S->getBaseObject());
-      if (!CurVS->vTableFuncs().empty()) {
+      if (!CurVS->vTableFuncs().empty() ||
+          // Previously clang did not attach the necessary type metadata to
+          // available_externally vtables, in which case there would not
+          // be any vtable functions listed in the summary and we need
+          // to treat this case conservatively (in case the bitcode is old).
+          // However, we will also not have any vtable functions in the
+          // case of a pure virtual base class. In that case we do want
+          // to set VS to avoid treating it conservatively.
+          !GlobalValue::isAvailableExternallyLinkage(S->linkage())) {
         VS = CurVS;
         // We cannot perform whole program devirtualization analysis on a vtable
         // with public LTO visibility.
         if (VS->getVCallVisibility() == GlobalObject::VCallVisibilityPublic)
           return false;
-      } else
-        // Currently clang will not attach the necessary type metadata to
-        // available_externally vtables.
-        assert(GlobalValue::isAvailableExternallyLinkage(S->linkage()));
+      }
     }
     // There will be no VS if all copies are available_externally having no
     // type metadata. In that case we can't safely perform WPD.

diff  --git a/llvm/test/ThinLTO/X86/devirt_pure_virtual_base.ll b/llvm/test/ThinLTO/X86/devirt_pure_virtual_base.ll
new file mode 100644
index 000000000000..ab3d6a1e7be9
--- /dev/null
+++ b/llvm/test/ThinLTO/X86/devirt_pure_virtual_base.ll
@@ -0,0 +1,112 @@
+;; Test that pure virtual base is handled correctly.
+
+;; Index based WPD
+;; Generate unsplit module with summary for ThinLTO index-based WPD.
+; RUN: opt --thinlto-bc -o %t1a.o %s
+; RUN: llvm-lto2 run %t1a.o -save-temps -pass-remarks=. \
+; RUN:   -whole-program-visibility \
+; RUN:   -o %t3a \
+; RUN:   -r=%t1a.o,_start,plx \
+; RUN:   -r=%t1a.o,_ZTV1A,p \
+; RUN:   -r=%t1a.o,_ZTV1B,p \
+; RUN:   -r=%t1a.o,_ZN1A1fEi,p \
+; RUN:   -r=%t1a.o,_ZN1A1nEi,p \
+; RUN:   -r=%t1a.o,_ZN1B1fEi,p \
+; RUN:   -r=%t1a.o,__cxa_pure_virtual, \
+; RUN:   2>&1 | FileCheck %s --check-prefix=REMARK
+; RUN: llvm-dis %t3a.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
+
+;; Hybrid WPD
+;; Generate split module with summary for hybrid Thin/Regular LTO WPD.
+; RUN: opt --thinlto-bc --thinlto-split-lto-unit -o %t1b.o %s
+; RUN: llvm-lto2 run %t1b.o -save-temps -pass-remarks=. \
+; RUN:   -whole-program-visibility \
+; RUN:   -o %t3b \
+; RUN:   -r=%t1b.o,_start,plx \
+; RUN:   -r=%t1b.o,_ZTV1A, \
+; RUN:   -r=%t1b.o,_ZTV1B, \
+; RUN:   -r=%t1b.o,__cxa_pure_virtual, \
+; RUN:   -r=%t1b.o,_ZN1A1fEi,p \
+; RUN:   -r=%t1b.o,_ZN1A1nEi,p \
+; RUN:   -r=%t1b.o,_ZN1B1fEi,p \
+; RUN:   -r=%t1b.o,_ZTV1A,p \
+; RUN:   -r=%t1b.o,_ZTV1B,p \
+; RUN:   -r=%t1b.o,_ZN1A1nEi, \
+; RUN:   -r=%t1b.o,_ZN1B1fEi, \
+; RUN:   -r=%t1b.o,__cxa_pure_virtual, \
+; RUN:   2>&1 | FileCheck %s --check-prefix=REMARK
+; RUN: llvm-dis %t3b.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
+
+;; Regular LTO WPD
+; RUN: opt -o %t1c.o %s
+; RUN: llvm-lto2 run %t1c.o -save-temps -pass-remarks=. \
+; RUN:   -whole-program-visibility \
+; RUN:   -o %t3c \
+; RUN:   -r=%t1c.o,_start,plx \
+; RUN:   -r=%t1c.o,_ZTV1A,p \
+; RUN:   -r=%t1c.o,_ZTV1B,p \
+; RUN:   -r=%t1c.o,_ZN1A1fEi,p \
+; RUN:   -r=%t1c.o,_ZN1A1nEi,p \
+; RUN:   -r=%t1c.o,_ZN1B1fEi,p \
+; RUN:   -r=%t1c.o,__cxa_pure_virtual, \
+; RUN:   2>&1 | FileCheck %s --check-prefix=REMARK
+; RUN: llvm-dis %t3c.0.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
+
+; REMARK-DAG: single-impl: devirtualized a call to _ZN1A1nEi
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-grtev4-linux-gnu"
+
+%struct.A = type { i32 (...)** }
+%struct.B = type { %struct.A }
+
+ at _ZTV1A = linkonce_odr unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* undef, i8* bitcast (void ()* @__cxa_pure_virtual to i8*), i8* bitcast (void ()* @__cxa_pure_virtual to i8*)] }, !type !0, !vcall_visibility !2
+ at _ZTV1B = linkonce_odr unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* undef, i8* bitcast (i32 (%struct.B*, i32)* @_ZN1B1fEi to i8*), i8* bitcast (i32 (%struct.A*, i32)* @_ZN1A1nEi to i8*)] }, !type !0, !type !1, !vcall_visibility !2
+
+;; Prevent the vtables from being dead code eliminated.
+ at llvm.used = appending global [2 x i8*] [ i8* bitcast ( { [4 x i8*] }* @_ZTV1A to i8*), i8* bitcast ( { [4 x i8*] }* @_ZTV1B to i8*)]
+
+; CHECK-IR-LABEL: define dso_local i32 @_start
+define i32 @_start(%struct.A* %obj, i32 %a) {
+entry:
+  %0 = bitcast %struct.A* %obj to i8***
+  %vtable = load i8**, i8*** %0
+  %1 = bitcast i8** %vtable to i8*
+  %p = call i1 @llvm.type.test(i8* %1, metadata !"_ZTS1A")
+  call void @llvm.assume(i1 %p)
+  %fptrptr = getelementptr i8*, i8** %vtable, i32 1
+  %2 = bitcast i8** %fptrptr to i32 (%struct.A*, i32)**
+  %fptr1 = load i32 (%struct.A*, i32)*, i32 (%struct.A*, i32)** %2, align 8
+
+  ;; Check that the call was devirtualized.
+  ; CHECK-IR: %call = tail call i32 @_ZN1A1nEi
+  ; CHECK-NODEVIRT-IR: %call = tail call i32 %fptr1
+  %call = tail call i32 %fptr1(%struct.A* nonnull %obj, i32 %a)
+
+  ret i32 %call
+}
+; CHECK-IR-LABEL: ret i32
+; CHECK-IR-NEXT: }
+
+declare i1 @llvm.type.test(i8*, metadata)
+declare void @llvm.assume(i1)
+declare void @__cxa_pure_virtual() unnamed_addr
+
+define linkonce_odr i32 @_ZN1A1fEi(%struct.A* %this, i32 %a) #0 {
+   ret i32 0
+}
+
+define linkonce_odr i32 @_ZN1A1nEi(%struct.A* %this, i32 %a) #0 {
+   ret i32 0
+}
+
+define linkonce_odr i32 @_ZN1B1fEi(%struct.B* %this, i32 %a) #0 {
+   ret i32 0
+}
+
+;; Make sure we don't inline or otherwise optimize out the direct calls.
+attributes #0 = { noinline optnone }
+
+!0 = !{i64 16, !"_ZTS1A"}
+!1 = !{i64 16, !"_ZTS1B"}
+!2 = !{i64 0}


        


More information about the llvm-commits mailing list