[llvm] [BOLT] Never call fixBranches() on non-simple functions (PR #141112)

Thu May 22 10:57:44 PDT 2025

llvmbot wrote:

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

<details>
<summary>Changes</summary>

fixBranches() must never be called on a function with an invalid CFG. For example, ValidateInternalCalls modifies the CFG for its own analysis purposes and marks the function as non-simple, on the assumption that fixBranches() will never run on that function.

However, calculateEmittedSize() calls fixBranches() by default, which can cause a range of problems on such a function, including assertion failures inside fixBranches().

The fix makes calculateEmittedSize() return the original size for non-simple (or ignored) functions, since such functions are supposed to be emitted unmodified. Additionally, an assertion at the start of fixBranches() now checks that the function is simple.
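
To illustrate the guard described above, here is a minimal standalone sketch. It does not use BOLT's real classes: `MockFunction` and its fields (`Simple`, `Ignored`, `OriginalSize`, `HotSize`, `ColdSize`, `FixBranchesCalls`) are hypothetical stand-ins, and the final return value only pretends to be the result of re-emitting the function. The early return and the assertion mirror the two changes in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for BOLT's BinaryFunction; only what the sketch needs.
struct MockFunction {
  bool Simple = true;            // false once a pass has invalidated the CFG
  bool Ignored = false;
  std::size_t OriginalSize = 0;  // size of the function in the input binary
  std::size_t HotSize = 0;       // pretend size of the re-emitted hot part
  std::size_t ColdSize = 0;      // pretend size of the re-emitted cold part
  int FixBranchesCalls = 0;      // counts calls so the guard can be observed

  bool isSimple() const { return Simple; }
  bool isIgnored() const { return Ignored; }
  std::size_t getSize() const { return OriginalSize; }

  void fixBranches() {
    // Same precondition as the assertion added by the patch.
    assert(isSimple() && "Expected function with valid CFG.");
    ++FixBranchesCalls;
  }
};

// Sketch of the guarded size computation: non-simple or ignored functions
// report their original size and never reach fixBranches().
std::pair<std::size_t, std::size_t>
calculateEmittedSize(MockFunction &BF, bool FixBranches = true) {
  if (!BF.isSimple() || BF.isIgnored())
    return std::make_pair(BF.getSize(), std::size_t{0});

  if (FixBranches)
    BF.fixBranches();

  // The real code emits the function to measure it; the sketch returns
  // precomputed placeholder sizes instead.
  return std::make_pair(BF.HotSize, BF.ColdSize);
}
```

With this guard, calling `calculateEmittedSize()` on a `MockFunction` whose `Simple` flag is false yields `{OriginalSize, 0}` and leaves `FixBranchesCalls` at zero, so the assertion in `fixBranches()` is never reached.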

---
Full diff: https://github.com/llvm/llvm-project/pull/141112.diff


3 Files Affected:

- (modified) bolt/lib/Core/BinaryContext.cpp (+4) 
- (modified) bolt/lib/Core/BinaryFunction.cpp (+2) 
- (added) bolt/test/X86/fix-branches-broken-cfg.s (+42) 


``````````diff

diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index 73059ef23775a..36b789d429363 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -2425,6 +2425,10 @@ BinaryContext::createInstructionPatch(uint64_t Address,
 
 std::pair<size_t, size_t>
 BinaryContext::calculateEmittedSize(BinaryFunction &BF, bool FixBranches) {
+  // Use the original size for non-simple functions.
+  if (!BF.isSimple() || BF.isIgnored())
+    return std::make_pair(BF.getSize(), 0);
+
   // Adjust branch instruction to match the current layout.
   if (FixBranches)
     BF.fixBranches();
diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 22bd8f8ddb110..ceb9390b546e1 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -3579,6 +3579,8 @@ bool BinaryFunction::validateCFG() const {
 }
 
 void BinaryFunction::fixBranches() {
+  assert(isSimple() && "Expected function with valid CFG.");
+
   auto &MIB = BC.MIB;
   MCContext *Ctx = BC.Ctx.get();
 
diff --git a/bolt/test/X86/fix-branches-broken-cfg.s b/bolt/test/X86/fix-branches-broken-cfg.s
new file mode 100644
index 0000000000000..0de1d7ea92f9b
--- /dev/null
+++ b/bolt/test/X86/fix-branches-broken-cfg.s
@@ -0,0 +1,42 @@
+## Check that fixBranches() is not invoked on a broken CFG which could lead to
+## unintended consequences including a firing assertion.
+
+# RUN: llvm-mc --filetype=obj --triple x86_64-unknown-unknown %s -o %t.o
+# RUN: link_fdata %s %t.o %t.fdata
+# RUN: llvm-strip --strip-unneeded %t.o
+# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q
+# RUN: llvm-bolt %t.exe -o %t.bolt --split-functions --split-strategy=cdsplit \
+# RUN:         --data=%t.fdata --reorder-blocks=ext-tsp 2>&1 | FileCheck %s
+
+# CHECK: internal call detected
+
+        .text
+
+        .globl  foo
+        .type   foo, @function
+foo:
+        ret
+        .size foo, .-foo
+
+## main contains an internal call. ValidateInternalCalls pass will modify CFG
+## (making it invalid) and mark the function as non-simple. After that, we
+## cannot make any assumption about the CFG.
+
+        .globl  main
+        .type   main, @function
+main:
+        call .L1
+        ret
+.L1:
+        pushq   %rbp
+        movq    %rsp, %rbp
+        movl    $1, %edi
+LLmain_foo1:
+        call    foo
+# FDATA: 1 main #LLmain_foo1# 1 foo 0 0 600
+        movl    $4, %edi
+        xorl    %eax, %eax
+        popq    %rbp
+        retq
+.Lmain_end:
+        .size   main, .Lmain_end-main

``````````

</details>


https://github.com/llvm/llvm-project/pull/141112