[llvm] [BOLT] Never call fixBranches() on non-simple functions (PR #141112)
Maksim Panchenko via llvm-commits
llvm-commits at lists.llvm.org
Thu May 22 10:57:08 PDT 2025
https://github.com/maksfb created https://github.com/llvm/llvm-project/pull/141112
We should never call fixBranches() on a function with an invalid CFG. For example, ValidateInternalCalls modifies the CFG for its own internal analysis. At the same time, it marks the function as non-simple, on the assumption that fixBranches() will never run on that function.
However, calculateEmittedSize() calls fixBranches() by default, which can lead to a range of issues, including assertions firing inside fixBranches().
The fix is to return the original size for non-simple (or ignored) functions in calculateEmittedSize(), since such functions are supposed to be emitted unmodified. Additionally, add an assertion at the start of fixBranches().
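The guard described above can be illustrated in isolation. The following is only a minimal sketch and not BOLT code: `ToyFunction`, `LayoutSize`, `FixBranchesCalls`, and `toyCalculateEmittedSize` are hypothetical names invented for this example, and the "re-layout" size is a plain stored number rather than the result of actually emitting code. It assumes the only behavior of interest is the early return plus the assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a BOLT function object; NOT the real BinaryFunction.
struct ToyFunction {
  bool Simple = true;           // false once a pass has invalidated the CFG
  bool Ignored = false;         // true if the function is emitted untouched
  std::size_t OriginalSize = 0; // size of the function in the input binary
  std::size_t LayoutSize = 0;   // pretend size after re-layout (assumption)
  int FixBranchesCalls = 0;     // counts calls so the guard can be observed

  bool isSimple() const { return Simple; }
  bool isIgnored() const { return Ignored; }
  std::size_t getSize() const { return OriginalSize; }

  void fixBranches() {
    // Mirrors the assertion added by the patch: a valid CFG is required.
    assert(isSimple() && "Expected function with valid CFG.");
    ++FixBranchesCalls;
  }
};

// Hypothetical analogue of calculateEmittedSize(): returns {hot, cold} sizes.
std::pair<std::size_t, std::size_t>
toyCalculateEmittedSize(ToyFunction &F, bool FixBranches = true) {
  // Early return: never touch branches of a function whose CFG may be broken.
  if (!F.isSimple() || F.isIgnored())
    return std::make_pair(F.getSize(), std::size_t{0});

  if (FixBranches)
    F.fixBranches();
  return std::make_pair(F.LayoutSize, std::size_t{0});
}
```

With this sketch, a `ToyFunction` whose `Simple` flag is false reports its `OriginalSize` and never reaches `fixBranches()`, while a simple function goes through `fixBranches()` once and reports its `LayoutSize`.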
From c0aeb952be7eb8cb603e93c67eab69986e80f9c1 Mon Sep 17 00:00:00 2001
From: Maksim Panchenko <maks at meta.com>
Date: Wed, 21 May 2025 19:33:37 -0700
Subject: [PATCH] [BOLT] Never call fixBranches() on non-simple functions
We should never call fixBranches() on a function with invalid CFG. E.g.,
ValidateInternalCalls modifies CFG for its internal analysis purposes.
At the same time, it marks the function as non-simple with an assumption
that fixBranches() will never run on that function.
However, calculateEmittedSize() by default calls fixBranches() which can
lead to all sorts of issues, including assertions firing in
fixBranches().
The fix is to use the original size for non-simple functions in
calculateEmittedSize() since we are supposed to emit the function
unmodified. Additionally, add an assertion at the start of
fixBranches().
---
bolt/lib/Core/BinaryContext.cpp | 4 +++
bolt/lib/Core/BinaryFunction.cpp | 2 ++
bolt/test/X86/fix-branches-broken-cfg.s | 42 +++++++++++++++++++++++++
3 files changed, 48 insertions(+)
create mode 100644 bolt/test/X86/fix-branches-broken-cfg.s
diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index 73059ef23775a..36b789d429363 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -2425,6 +2425,10 @@ BinaryContext::createInstructionPatch(uint64_t Address,
std::pair<size_t, size_t>
BinaryContext::calculateEmittedSize(BinaryFunction &BF, bool FixBranches) {
+ // Use the original size for non-simple functions.
+ if (!BF.isSimple() || BF.isIgnored())
+ return std::make_pair(BF.getSize(), 0);
+
// Adjust branch instruction to match the current layout.
if (FixBranches)
BF.fixBranches();
diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 22bd8f8ddb110..ceb9390b546e1 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -3579,6 +3579,8 @@ bool BinaryFunction::validateCFG() const {
}
void BinaryFunction::fixBranches() {
+ assert(isSimple() && "Expected function with valid CFG.");
+
auto &MIB = BC.MIB;
MCContext *Ctx = BC.Ctx.get();
diff --git a/bolt/test/X86/fix-branches-broken-cfg.s b/bolt/test/X86/fix-branches-broken-cfg.s
new file mode 100644
index 0000000000000..0de1d7ea92f9b
--- /dev/null
+++ b/bolt/test/X86/fix-branches-broken-cfg.s
@@ -0,0 +1,42 @@
+## Check that fixBranches() is not invoked on a broken CFG which could lead to
+## unintended consequences including a firing assertion.
+
+# RUN: llvm-mc --filetype=obj --triple x86_64-unknown-unknown %s -o %t.o
+# RUN: link_fdata %s %t.o %t.fdata
+# RUN: llvm-strip --strip-unneeded %t.o
+# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q
+# RUN: llvm-bolt %t.exe -o %t.bolt --split-functions --split-strategy=cdsplit \
+# RUN: --data=%t.fdata --reorder-blocks=ext-tsp 2>&1 | FileCheck %s
+
+# CHECK: internal call detected
+
+ .text
+
+ .globl foo
+ .type foo, @function
+foo:
+ ret
+ .size foo, .-foo
+
+## main contains an internal call. ValidateInternalCalls pass will modify CFG
+## (making it invalid) and mark the function as non-simple. After that, we
+## cannot make any assumption about the CFG.
+
+ .globl main
+ .type main, @function
+main:
+ call .L1
+ ret
+.L1:
+ pushq %rbp
+ movq %rsp, %rbp
+ movl $1, %edi
+LLmain_foo1:
+ call foo
+# FDATA: 1 main #LLmain_foo1# 1 foo 0 0 600
+ movl $4, %edi
+ xorl %eax, %eax
+ popq %rbp
+ retq
+.Lmain_end:
+ .size main, .Lmain_end-main