[llvm] [WPD]Regard unreachable function as a possible devirtualizable target (PR #115668)
Mingming Liu via llvm-commits
llvm-commits at lists.llvm.org
Sun Nov 10 12:08:03 PST 2024
https://github.com/minglotus-6 created https://github.com/llvm/llvm-project/pull/115668
https://reviews.llvm.org/D115492 skips unreachable functions (pure virtual functions) and potentially allows more static de-virtualizations.
This PR proposes to undo the change, because it turns out there are other unreachable functions that a general program wants to run and fail intentionally, with `LOG(FATAL)` or `CHECK` [0] for example. While many real-world applications are encouraged to check-fail sparingly, they are allowed to do so on critical errors (e.g., misconfiguration or bug is detected during server startup).
* Implementation-wise, this PR keeps the one-bit 'unreachable' state in bitcode and updates WPD analysis.
Where the observation comes from:
* To rollout WPD to a wider set of production binaries, we did a pre-release test to catch source code UB [1] that might be exposed by WPD if any.
* With WPD enabled, a couple of unit tests failed its death tests [2], where the test program is expected to terminate but it doesn't. The test failure is consistently reproducible (i.e., not flaky) and the pattern is illustrated as
```
Death test: ptr->VirtualMethod(args)
Result: failed to die.
Error msg:
...
```
* Debugging shows the test doesn't terminate as expected because the virtual method that contains `LOG(FATAL)` or `CHECK` failures are `unreachable` and thereby not counted as WPD de-virtualizable target.
[0] https://abseil.io/docs/cpp/guides/logging#CHECK and https://abseil.io/docs/cpp/guides/logging#configuration-and-flags
[1] For example, the source code has a virtual call that is performed on a pointer with a different type from `reinterpret_cast`. This relies on undefined behavior.
[2] https://google.github.io/googletest/reference/assertions.html#death
>From 830c0d493aee2f86a4e8e0e6711df93f53399364 Mon Sep 17 00:00:00 2001
From: mingmingl <mingmingl at google.com>
Date: Sun, 10 Nov 2024 11:34:30 -0800
Subject: [PATCH] [WPD]Regard unreachable function as a possible
devirtualizable target
---
llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp | 17 +++++++++++++++++
...ngle_after_filtering_unreachable_function.ll | 12 ++++++++----
2 files changed, 25 insertions(+), 4 deletions(-)
diff --git a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
index 78516cadcf2313..f72cffcf75d2dd 100644
--- a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+++ b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
@@ -167,6 +167,18 @@ static cl::list<std::string>
cl::desc("Prevent function(s) from being devirtualized"),
cl::Hidden, cl::CommaSeparated);
+/// A function is unreachable if its entry block ends with 'unreachable' IR
+/// instruction. In some cases, the program intends to run such functions and
+/// terminate, for instance, a unit test may run a death test. A non-test
+/// program might (or allowed to) invoke such functions to report failures
+/// (whether/when it's a good practice or not is a different topic). Regard
+/// unreachable function as possible devirtualize targets to keep the program
+/// behavior.
+static cl::opt<bool> WholeProgramDevirtKeepUnreachableFunction(
+ "wholeprogramdevirt-keep-unreachable-function",
+ cl::desc("Regard unreachable functions as possible devirtualize targets."),
+ cl::Hidden, cl::init(true));
+
/// If explicitly specified, the devirt module pass will stop transformation
/// once the total number of devirtualizations reach the cutoff value. Setting
/// this option to 0 explicitly will do 0 devirtualization.
@@ -386,6 +398,9 @@ template <> struct DenseMapInfo<VTableSlotSummary> {
// 2) All function summaries indicate it's unreachable
// 3) There is no non-function with the same GUID (which is rare)
static bool mustBeUnreachableFunction(ValueInfo TheFnVI) {
+ if (WholeProgramDevirtKeepUnreachableFunction)
+ return false;
+
if ((!TheFnVI) || TheFnVI.getSummaryList().empty()) {
// Returns false if ValueInfo is absent, or the summary list is empty
// (e.g., function declarations).
@@ -2241,6 +2256,8 @@ DevirtModule::lookUpFunctionValueInfo(Function *TheFn,
bool DevirtModule::mustBeUnreachableFunction(
Function *const F, ModuleSummaryIndex *ExportSummary) {
+ if (WholeProgramDevirtKeepUnreachableFunction)
+ return false;
// First, learn unreachability by analyzing function IR.
if (!F->isDeclaration()) {
// A function must be unreachable if its entry block ends with an
diff --git a/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll b/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
index 457120b9c6f410..599d3296bb163b 100644
--- a/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
+++ b/llvm/test/Transforms/WholeProgramDevirt/devirt_single_after_filtering_unreachable_function.ll
@@ -1,11 +1,15 @@
+; Test that static devirtualization doesn't happen because there are two
+; devirtualizable targets. Unreachable functions are kept in the devirtualizable
+; target set by default.
+; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt %s 2>&1 | FileCheck %s --implicit-check-not="single-impl"
+
; Test that regular LTO will analyze IR, detect unreachable functions and discard unreachable functions
; when finding virtual call targets.
; In this test case, the unreachable function is the virtual deleting destructor of an abstract class.
+; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-keep-unreachable-function=false %s 2>&1 | FileCheck %s --check-prefix=DEVIRT
-; RUN: opt -S -passes=wholeprogramdevirt -whole-program-visibility -pass-remarks=wholeprogramdevirt %s 2>&1 | FileCheck %s
-
-; CHECK: remark: tmp.cc:21:3: single-impl: devirtualized a call to _ZN7DerivedD0Ev
-; CHECK: remark: <unknown>:0:0: devirtualized _ZN7DerivedD0Ev
+; DEVIRT: remark: tmp.cc:21:3: single-impl: devirtualized a call to _ZN7DerivedD0Ev
+; DEVIRT: remark: <unknown>:0:0: devirtualized _ZN7DerivedD0Ev
source_filename = "tmp.cc"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
More information about the llvm-commits
mailing list