[llvm] [WPD]Regard unreachable function as a possible devirtualizable target (PR #115668)

Mingming Liu via llvm-commits llvm-commits at lists.llvm.org
Sun Nov 10 12:08:03 PST 2024


https://github.com/minglotus-6 created https://github.com/llvm/llvm-project/pull/115668

https://reviews.llvm.org/D115492 skips unreachable functions (pure virtual functions) and potentially allows more static de-virtualizations.  

This PR proposes to undo the change, because it turns out there are other unreachable functions that a general program wants to run and fail intentionally, with `LOG(FATAL)` or `CHECK` [0] for example. While many real-world applications are encouraged to check-fail sparingly, they are allowed to do so on critical errors (e.g., misconfiguration or bug is detected during server startup).
* Implementation-wise, this PR keeps the one-bit 'unreachable' state in bitcode and updates WPD analysis.

Where the observation comes from:
* To rollout WPD to a wider set of production binaries, we did a pre-release test to catch source code UB [1] that might be exposed by WPD if any. 
* With WPD enabled, a couple of unit tests failed its death tests [2], where the test program is expected to terminate but it doesn't.  The test failure is consistently reproducible (i.e., not flaky) and the pattern is illustrated as 
  ```
  Death test: ptr->VirtualMethod(args)
    Result: failed to die.
  Error msg:
    ...
  ```
 * Debugging shows the test doesn't terminate as expected because the virtual method that contains `LOG(FATAL)` or `CHECK` failures are `unreachable` and thereby not counted as WPD de-virtualizable target. 

[0] https://abseil.io/docs/cpp/guides/logging#CHECK and https://abseil.io/docs/cpp/guides/logging#configuration-and-flags
[1] For example, the source code has a virtual call that is performed on a pointer with a different type from `reinterpret_cast`. This relies on undefined behavior.
[2] https://google.github.io/googletest/reference/assertions.html#death

>From 830c0d493aee2f86a4e8e0e6711df93f53399364 Mon Sep 17 00:00:00 2001
From: mingmingl <mingmingl at google.com>
Date: Sun, 10 Nov 2024 11:34:30 -0800
Subject: [PATCH] [WPD]Regard unreachable function as a possible
 devirtualizable target

---
 llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp  | 17 +++++++++++++++++
 ...ngle_after_filtering_unreachable_function.ll | 12 ++++++++----
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
index 78516cadcf2313..f72cffcf75d2dd 100644
--- a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+++ b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
@@ -167,6 +167,18 @@ static cl::list<std::string>
                       cl::desc("Prevent function(s) from being devirtualized"),
                       cl::Hidden, cl::CommaSeparated);
 
+/// A function is unreachable if its entry block ends with 'unreachable' IR
+/// instruction. In some cases, the program intends to run such functions and
+/// terminate, for instance, a unit test may run a death test. A non-test
+/// program might (or allowed to) invoke such functions to report failures
+/// (whether/when it's a good practice or not is a different topic). Regard
+/// unreachable function as possible devirtualize targets to keep the program
+/// behavior.
+static cl::opt<bool> WholeProgramDevirtKeepUnreachableFunction(
+    "wholeprogramdevirt-keep-unreachable-function",
+    cl::desc("Regard unreachable functions as possible devirtualize targets."),
+    cl::Hidden, cl::init(true));
+
 /// If explicitly specified, the devirt module pass will stop transformation
 /// once the total number of devirtualizations reach the cutoff value. Setting
 /// this option to 0 explicitly will do 0 devirtualization.
@@ -386,6 +398,9 @@ template <> struct DenseMapInfo<VTableSlotSummary> {
 //   2) All function summaries indicate it's unreachable
 //   3) There is no non-function with the same GUID (which is rare)
 static bool mustBeUnreachableFunction(ValueInfo TheFnVI) {
+  if (WholeProgramDevirtKeepUnreachableFunction)
+    return false;
+
   if ((!TheFnVI) || TheFnVI.getSummaryList().empty()) {
     // Returns false if ValueInfo is absent, or the summary list is empty
     // (e.g., function declarations).
@@ -2241,6 +2256,8 @@ DevirtModule::lookUpFunctionValueInfo(Function *TheFn,
 
 bool DevirtModule::mustBeUnreachableFunction(
     Function *const F, ModuleSummaryIndex *ExportSummary) {
+  if (WholeProgramDevirtKeepUnreachableFunction)
+    return false;
   // First, learn unreachability by analyzing function IR.
   if (!F->isDeclaration()) {
     // A function must be unreachable if its entry block ends with an
diff --git a/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll b/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
index 457120b9c6f410..599d3296bb163b 100644
--- a/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
+++ b/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
@@ -1,11 +1,15 @@
+; Test that static devirtualization doesn't happen because there are two
+; devirtualizable targets. Unreachable functions are kept in the devirtualizable
+; target set by default.
+; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt %s 2>&1 | FileCheck %s  --implicit-check-not="single-impl"
+
 ; Test that regular LTO will analyze IR, detect unreachable functions and discard unreachable functions
 ; when finding virtual call targets.
 ; In this test case, the unreachable function is the virtual deleting destructor of an abstract class.
+; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-keep-unreachable-function=false %s 2>&1 | FileCheck %s --check-prefix=DEVIRT
 
-; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt %s 2>&1 | FileCheck %s
-
-; CHECK: remark: tmp.cc:21:3: single-impl: devirtualized a call to _ZN7DerivedD0Ev
-; CHECK: remark: <unknown>:0:0: devirtualized _ZN7DerivedD0Ev
+; DEVIRT: remark: tmp.cc:21:3: single-impl: devirtualized a call to _ZN7DerivedD0Ev
+; DEVIRT: remark: <unknown>:0:0: devirtualized _ZN7DerivedD0Ev
 
 source_filename = "tmp.cc"
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"



More information about the llvm-commits mailing list