[llvm] [BOLT] Improve DWARF CFI generation for pac-ret binaries (PR #163381)

Gergely Bálint via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 28 04:11:11 PST 2025


https://github.com/bgergely0 updated https://github.com/llvm/llvm-project/pull/163381

>From dda35e36d59a703dde03e0541f029387acad07a2 Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Tue, 7 Oct 2025 14:01:47 +0000
Subject: [PATCH 01/13] [BOLT] Improve
 InsertNegateRAStatePass::inferUnknownStates

Previous implementation used a simple heuristic. This can be improved in
several ways:
- If a BasicBlock has instruction both with known RAState and unknown RAState,
  use the known states to work out the unknown ones.
- If a BasicBlock only consists of instructions with unknown RAState,
  use the last known RAState from its predecessors, or the first known
  from its successors to set the RAStates in the BasicBlock. This includes
  error checking: all predecessors/successors should have the same RAState.
- Some BasicBlocks may only contain instructions with unknown RAState,
  and have no CFG neighbors. These already have incorrect unwind info.
  For these, we copy the last known RAState based on the layout order.

Updated bolt/docs/PacRetDesign.md to reflect changes.
---
 bolt/docs/PacRetDesign.md                     |  23 +-
 .../bolt/Passes/InsertNegateRAStatePass.h     |  34 ++-
 bolt/lib/Passes/InsertNegateRAStatePass.cpp   | 221 ++++++++++++++++--
 3 files changed, 252 insertions(+), 26 deletions(-)

diff --git a/bolt/docs/PacRetDesign.md b/bolt/docs/PacRetDesign.md
index f3fe5fbd522cb..c7c76cac3a100 100644
--- a/bolt/docs/PacRetDesign.md
+++ b/bolt/docs/PacRetDesign.md
@@ -200,16 +200,29 @@ This pass runs after optimizations. It performns the _inverse_ of MarkRAState pa
 Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have
 to know what RA state these have.
 
-The current solution has the `inferUnknownStates` function to cover these, using
-a fairly simple strategy: unknown states inherit the last known state.
-
-This will be updated to a more robust solution.
-
 > [!important]
 > As issue #160989 describes, unwind info is incorrect in stubs with multiple callers.
 > For this same reason, we cannot generate correct pac-specific unwind info: the signess
 > of the _incorrect_ return address is meaningless.
 
+Assignment of RAStates to newly generated instructions is done in `inferUnknownStates`.
+We have three different cases to cover:
+
+1. If a BasicBlock has some instructions with known RA state, and some without, we
+   can copy the RAState of known instructions to the unknown ones. As the control
+   flow only changes between BasicBlocks, instructions in the same BasicBlock have the
+   same return address.
+
+2. If all instructions in a BasicBlock are unknown, we can look at all CFG neighbors
+   (that is predecessors/successors). The RAState should be the same as of the
+   neighboring blocks. Conflicting RAStates in neighbors indicate an error. Such
+   functions should be ignored.
+
+3. If a BasicBlock has no CFG neighbors, we have to copy the RAState of the previous
+    BasicBlock in layout order.
+
+If any BasicBlocks remain with unknown instructions, the function will be ignored.
+
 ### Optimizations requiring special attention
 
 Marking states before optimizations ensure that instructions can be moved around
diff --git a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
index 836948bf5e9c0..b4b428207b657 100644
--- a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
+++ b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
@@ -1,4 +1,4 @@
-//===- bolt/Passes/InsertNegateRAStatePass.cpp ----------------------------===//
+//===- bolt/Passes/InsertNegateRAStatePass.h ------------------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -30,9 +30,39 @@ class InsertNegateRAState : public BinaryFunctionPass {
 private:
   /// Because states are tracked as MCAnnotations on individual instructions,
   /// newly inserted instructions do not have a state associated with them.
-  /// New states are "inherited" from the last known state.
   void inferUnknownStates(BinaryFunction &BF);
 
+  /// Simple case: copy RAStates to unknown insts from previous inst.
+  /// Account for signing and authenticating insts.
+  void fillUnknownStateInBB(BinaryContext &BC, BinaryBasicBlock &BB);
+
+  /// Fill unknown RAStates in BBs with no successors/predecessors. These are
+  /// Stubs inserted by LongJmp. As of #160989, we have to copy the RAState from
+  /// the previous BB in the layout, because CFIs are already incorrect here.
+  void fillUnknownStubs(BinaryFunction &BF);
+
+  /// Fills unknowns RAStates of BBs with successors/predecessors. Uses
+  /// getRAStateByCFG to determine the RAState. Does more than one iteration if
+  /// needed. Reports an error, if it cannot find the RAState for all BBs with
+  /// predecessors/successors.
+  void fillUnknownBlocksInCFG(BinaryFunction &BF);
+
+  /// For BBs which only hold instructions with unknown RAState, we check
+  /// CFG neighbors (successors, predecessors) of the BB. If they have different
+  /// RAStates, we report an inconsistency. Otherwise, we return the found
+  /// RAState.
+  std::optional<bool> getRAStateByCFG(BinaryBasicBlock &BB, BinaryFunction &BF);
+  /// Returns the first known RAState from \p BB, or std::nullopt if all are
+  /// unknown.
+  std::optional<bool> getFirstKnownRAState(BinaryContext &BC,
+                                           BinaryBasicBlock &BB);
+
+  /// \p Return true if all instructions have unknown RAState.
+  bool isUnknownBlock(BinaryContext &BC, BinaryBasicBlock &BB);
+
+  /// Set all instructions in \p BB to \p State.
+  void markUnknownBlock(BinaryContext &BC, BinaryBasicBlock &BB, bool State);
+
   /// Support for function splitting:
   /// if two consecutive BBs with Signed state are going to end up in different
   /// functions (so are held by different FunctionFragments), we have to add a
diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index 775b7795e77c5..1b1fbf7f20e9d 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -74,6 +74,20 @@ void InsertNegateRAState::runOnFunction(BinaryFunction &BF) {
   }
 }
 
+void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
+  BinaryContext &BC = BF.getBinaryContext();
+
+  // Fill in missing RAStates in simple cases (inside BBs).
+  for (BinaryBasicBlock &BB : BF) {
+    fillUnknownStateInBB(BC, BB);
+  }
+  // Some stubs have no predecessors. For those, we iterate once in the layout
+  // order to fill their RAState.
+  fillUnknownStubs(BF);
+
+  fillUnknownBlocksInCFG(BF);
+}
+
 void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
                                                      FunctionFragment &FF) {
   BinaryContext &BC = BF.getBinaryContext();
@@ -104,32 +118,201 @@ void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
                          MCCFIInstruction::createNegateRAState(nullptr));
 }
 
-void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
+std::optional<bool>
+InsertNegateRAState::getFirstKnownRAState(BinaryContext &BC,
+                                          BinaryBasicBlock &BB) {
+  for (const MCInst &Inst : BB) {
+    if (BC.MIB->isCFI(Inst))
+      continue;
+    auto RAStateOpt = BC.MIB->getRAState(Inst);
+    if (RAStateOpt)
+      return RAStateOpt;
+  }
+  return std::nullopt;
+}
+
+void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
+                                               BinaryBasicBlock &BB) {
+
+  auto First = BB.getFirstNonPseudo();
+  if (First == BB.end())
+    return;
+  // If the first instruction has unknown RAState, we should copy the first
+  // known RAState.
+  auto RAStateOpt = BC.MIB->getRAState(*First);
+  if (!RAStateOpt) {
+    auto FirstRAState = getFirstKnownRAState(BC, BB);
+    if (!FirstRAState)
+      // We fill unknown BBs later.
+      return;
+
+    BC.MIB->setRAState(*First, *FirstRAState);
+  }
+
+  // At this point we know the RAState of the first instruction,
+  // so we can propagate the RAStates to all subsequent unknown instructions.
+  MCInst Prev = *First;
+  for (auto It = BB.begin() + 1; It != BB.end(); ++It) {
+    MCInst &Inst = *It;
+    if (BC.MIB->isCFI(Inst))
+      continue;
+
+    auto PrevRAState = BC.MIB->getRAState(Prev);
+    if (!PrevRAState)
+      llvm_unreachable("Previous Instruction has no RAState.");
+
+    auto RAState = BC.MIB->getRAState(Inst);
+    if (!RAState) {
+      if (BC.MIB->isPSignOnLR(Prev))
+        PrevRAState = true;
+      else if (BC.MIB->isPAuthOnLR(Prev))
+        PrevRAState = false;
+      BC.MIB->setRAState(Inst, *PrevRAState);
+    }
+    Prev = Inst;
+  }
+}
+
+bool InsertNegateRAState::isUnknownBlock(BinaryContext &BC,
+                                         BinaryBasicBlock &BB) {
+  for (const MCInst &Inst : BB) {
+    if (BC.MIB->isCFI(Inst))
+      continue;
+    auto RAState = BC.MIB->getRAState(Inst);
+    if (RAState)
+      return false;
+  }
+  return true;
+}
+
+void InsertNegateRAState::markUnknownBlock(BinaryContext &BC,
+                                           BinaryBasicBlock &BB, bool State) {
+  // If we call this when an Instruction has either kRASigned or kRAUnsigned
+  // annotation, setRASigned or setRAUnsigned would fail.
+  assert(isUnknownBlock(BC, BB) &&
+         "markUnknownBlock should only be called on unknown blocks");
+  for (MCInst &Inst : BB) {
+    if (BC.MIB->isCFI(Inst))
+      continue;
+    BC.MIB->setRAState(Inst, State);
+  }
+}
+
+std::optional<bool> InsertNegateRAState::getRAStateByCFG(BinaryBasicBlock &BB,
+                                                         BinaryFunction &BF) {
   BinaryContext &BC = BF.getBinaryContext();
-  bool FirstIter = true;
-  MCInst PrevInst;
-  for (BinaryBasicBlock &BB : BF) {
-    for (MCInst &Inst : BB) {
-      if (BC.MIB->isCFI(Inst))
+
+  auto checkRAState = [&](std::optional<bool> &NeighborRAState, MCInst &Inst) {
+    auto RAState = BC.MIB->getRAState(Inst);
+    if (!RAState)
+      return;
+    if (!NeighborRAState) {
+      NeighborRAState = *RAState;
+      return;
+    }
+    if (NeighborRAState != *RAState) {
+      BC.outs() << "BOLT-WARNING: Conflicting RAState found in function "
+                << BF.getPrintName() << ". Function will not be optimized.\n";
+      BF.setIgnored();
+    }
+  };
+
+  // Holds the first found RAState from CFG neighbors.
+  std::optional<bool> NeighborRAState = std::nullopt;
+  if (BB.pred_size() != 0) {
+    for (BinaryBasicBlock *PredBB : BB.predecessors()) {
+      //  find last inst of Predecessor with known RA State.
+      auto LI = PredBB->getLastNonPseudo();
+      if (LI == PredBB->rend())
+        continue;
+      MCInst &LastInst = *LI;
+      checkRAState(NeighborRAState, LastInst);
+    }
+  } else if (BB.succ_size() != 0) {
+    for (BinaryBasicBlock *SuccBB : BB.successors()) {
+      //  find first inst of Successor with known RA State.
+      auto FI = SuccBB->getFirstNonPseudo();
+      if (FI == SuccBB->end())
         continue;
+      MCInst &FirstInst = *FI;
+      checkRAState(NeighborRAState, FirstInst);
+    }
+  } else {
+    llvm_unreachable("Called getRAStateByCFG on a BB with no preds or succs.");
+  }
+
+  return NeighborRAState;
+}
 
-      auto RAState = BC.MIB->getRAState(Inst);
-      if (!FirstIter && !RAState) {
-        if (BC.MIB->isPSignOnLR(PrevInst))
-          RAState = true;
-        else if (BC.MIB->isPAuthOnLR(PrevInst))
-          RAState = false;
-        else {
+void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
+  BinaryContext &BC = BF.getBinaryContext();
+  bool FirstIter = true;
+  MCInst PrevInst;
+  for (FunctionFragment &FF : BF.getLayout().fragments()) {
+    for (BinaryBasicBlock *BB : FF) {
+      if (!FirstIter && isUnknownBlock(BC, *BB)) {
+        // If we have no predecessors or successors, current BB is a Stub called
+        // from another BinaryFunction. As of #160989, we have to copy the
+        // PrevInst's RAState, because CFIs are already incorrect here.
+        if (BB->pred_size() == 0 && BB->succ_size() == 0) {
           auto PrevRAState = BC.MIB->getRAState(PrevInst);
-          RAState = PrevRAState ? *PrevRAState : false;
+          if (!PrevRAState) {
+            llvm_unreachable(
+                "Previous Instruction has no RAState in fillUnknownStubs.");
+            continue;
+          }
+
+          if (BC.MIB->isPSignOnLR(PrevInst)) {
+            PrevRAState = true;
+          } else if (BC.MIB->isPAuthOnLR(PrevInst)) {
+            PrevRAState = false;
+          }
+          markUnknownBlock(BC, *BB, *PrevRAState);
         }
-        BC.MIB->setRAState(Inst, *RAState);
-      } else {
+      }
+      if (FirstIter) {
         FirstIter = false;
-        if (!RAState)
-          BC.MIB->setRAState(Inst, BF.getInitialRAState());
+        if (isUnknownBlock(BC, *BB))
+          markUnknownBlock(BC, *BB, false);
       }
-      PrevInst = Inst;
+      auto Last = BB->getLastNonPseudo();
+      if (Last != BB->rend())
+        PrevInst = *Last;
+    }
+  }
+}
+void InsertNegateRAState::fillUnknownBlocksInCFG(BinaryFunction &BF) {
+  BinaryContext &BC = BF.getBinaryContext();
+
+  auto fillUnknowns = [&](BinaryFunction &BF) -> std::pair<int, bool> {
+    int Unknowns = 0;
+    bool Updated = false;
+    for (BinaryBasicBlock &BB : BF) {
+      // Only try to iterate if the BB has either predecessors or successors.
+      if (isUnknownBlock(BC, BB) &&
+          (BB.pred_size() != 0 || BB.succ_size() != 0)) {
+        auto RAStateOpt = getRAStateByCFG(BB, BF);
+        if (RAStateOpt) {
+          markUnknownBlock(BC, BB, *RAStateOpt);
+          Updated = true;
+        } else {
+          Unknowns++;
+        }
+      }
+    }
+    return std::pair<int, bool>{Unknowns, Updated};
+  };
+
+  while (true) {
+    std::pair<int, bool> Iter = fillUnknowns(BF);
+    if (Iter.first == 0)
+      break;
+    if (!Iter.second) {
+      BC.errs() << "BOLT-WARNING: Could not infer RAState for " << Iter.first
+                << " BBs in function " << BF.getPrintName()
+                << ". Function will not be optimized.\n";
+      BF.setIgnored();
+      break;
     }
   }
 }

>From f330e22e1baa5105011547abfc4285cf1fc7638d Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Fri, 31 Oct 2025 13:39:01 +0000
Subject: [PATCH 02/13] [BOLT] Single-pass unittest for InsertNegateRAState

This commit creates a new directory: bolt/unittests/Passes, to be used
by unittests that need to register and run passes with the
BinaryFunctionPassManager.

An example test is created for InsertNegateRAState pass. Actual tests
will be added in followup commits.
---
 bolt/unittests/CMakeLists.txt                 |   1 +
 bolt/unittests/Passes/CMakeLists.txt          |  30 ++++
 bolt/unittests/Passes/InsertNegateRAState.cpp | 164 ++++++++++++++++++
 3 files changed, 195 insertions(+)
 create mode 100644 bolt/unittests/Passes/CMakeLists.txt
 create mode 100644 bolt/unittests/Passes/InsertNegateRAState.cpp

diff --git a/bolt/unittests/CMakeLists.txt b/bolt/unittests/CMakeLists.txt
index 64414b83d39fe..d47ddc46b7388 100644
--- a/bolt/unittests/CMakeLists.txt
+++ b/bolt/unittests/CMakeLists.txt
@@ -7,3 +7,4 @@ endfunction()
 
 add_subdirectory(Core)
 add_subdirectory(Profile)
+add_subdirectory(Passes)
diff --git a/bolt/unittests/Passes/CMakeLists.txt b/bolt/unittests/Passes/CMakeLists.txt
new file mode 100644
index 0000000000000..02dcd2c8417e1
--- /dev/null
+++ b/bolt/unittests/Passes/CMakeLists.txt
@@ -0,0 +1,30 @@
+set(LLVM_LINK_COMPONENTS
+  DebugInfoDWARF
+  Object
+  MC
+  ${LLVM_TARGETS_TO_BUILD}
+  )
+
+add_bolt_unittest(PassTests
+  InsertNegateRAState.cpp
+
+  DISABLE_LLVM_LINK_LLVM_DYLIB
+  )
+
+target_link_libraries(PassTests
+  PRIVATE
+  LLVMBOLTCore
+  LLVMBOLTRewrite
+  LLVMBOLTPasses
+  LLVMBOLTProfile
+  LLVMBOLTUtils
+  )
+
+foreach (tgt ${BOLT_TARGETS_TO_BUILD})
+  include_directories(
+    ${LLVM_MAIN_SRC_DIR}/lib/Target/${tgt}
+    ${LLVM_BINARY_DIR}/lib/Target/${tgt}
+  )
+  string(TOUPPER "${tgt}" upper)
+  target_compile_definitions(PassTests PRIVATE "${upper}_AVAILABLE")
+endforeach()
diff --git a/bolt/unittests/Passes/InsertNegateRAState.cpp b/bolt/unittests/Passes/InsertNegateRAState.cpp
new file mode 100644
index 0000000000000..024b696eb0534
--- /dev/null
+++ b/bolt/unittests/Passes/InsertNegateRAState.cpp
@@ -0,0 +1,164 @@
+//===- bolt/unittest/Passes/InsertNegateRAState.cpp -----------------------===//
+//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifdef AARCH64_AVAILABLE
+#include "AArch64Subtarget.h"
+#include "MCTargetDesc/AArch64MCTargetDesc.h"
+#endif // AARCH64_AVAILABLE
+
+#include "bolt/Core/BinaryBasicBlock.h"
+#include "bolt/Core/BinaryFunction.h"
+#include "bolt/Passes/InsertNegateRAStatePass.h"
+#include "bolt/Rewrite/BinaryPassManager.h"
+#include "bolt/Rewrite/RewriteInstance.h"
+#include "llvm/BinaryFormat/ELF.h"
+#include "llvm/MC/MCDwarf.h"
+#include "llvm/MC/MCInstBuilder.h"
+#include "llvm/Support/TargetSelect.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+using namespace llvm::object;
+using namespace llvm::ELF;
+using namespace bolt;
+
+namespace {
+struct PassTester : public testing::TestWithParam<Triple::ArchType> {
+  void SetUp() override {
+    initalizeLLVM();
+    prepareElf();
+    initializeBolt();
+  }
+
+protected:
+  void initalizeLLVM() {
+#define BOLT_TARGET(target)                                                    \
+  LLVMInitialize##target##TargetInfo();                                        \
+  LLVMInitialize##target##TargetMC();                                          \
+  LLVMInitialize##target##AsmParser();                                         \
+  LLVMInitialize##target##Disassembler();                                      \
+  LLVMInitialize##target##Target();                                            \
+  LLVMInitialize##target##AsmPrinter();
+
+#include "bolt/Core/TargetConfig.def"
+  }
+
+#define PREPARE_FUNC(name)                                                     \
+  constexpr uint64_t FunctionAddress = 0x1000;                                 \
+  BinaryFunction *BF = BC->createBinaryFunction(                               \
+      name, *TextSection, FunctionAddress, /*Size=*/0, /*SymbolSize=*/0,       \
+      /*Alignment=*/16);                                                       \
+  /* Make sure the pass runs on the BF.*/                                      \
+  BF->updateState(BinaryFunction::State::CFG);                                 \
+  BF->setContainedNegateRAState();                                             \
+  /* All tests need at least one BB. */                                        \
+  BinaryBasicBlock *BB = BF->addBasicBlock();                                  \
+  BF->addEntryPoint(*BB);                                                      \
+  BB->setCFIState(0);
+
+  void prepareElf() {
+    memcpy(ElfBuf, "\177ELF", 4);
+    ELF64LE::Ehdr *EHdr = reinterpret_cast<typename ELF64LE::Ehdr *>(ElfBuf);
+    EHdr->e_ident[llvm::ELF::EI_CLASS] = llvm::ELF::ELFCLASS64;
+    EHdr->e_ident[llvm::ELF::EI_DATA] = llvm::ELF::ELFDATA2LSB;
+    EHdr->e_machine = GetParam() == Triple::aarch64 ? EM_AARCH64 : EM_X86_64;
+    MemoryBufferRef Source(StringRef(ElfBuf, sizeof(ElfBuf)), "ELF");
+    ObjFile = cantFail(ObjectFile::createObjectFile(Source));
+  }
+  void initializeBolt() {
+    Relocation::Arch = ObjFile->makeTriple().getArch();
+    BC = cantFail(BinaryContext::createBinaryContext(
+        ObjFile->makeTriple(), std::make_shared<orc::SymbolStringPool>(),
+        ObjFile->getFileName(), nullptr, true, DWARFContext::create(*ObjFile),
+        {llvm::outs(), llvm::errs()}));
+    ASSERT_FALSE(!BC);
+    BC->initializeTarget(std::unique_ptr<MCPlusBuilder>(
+        createMCPlusBuilder(GetParam(), BC->MIA.get(), BC->MII.get(),
+                            BC->MRI.get(), BC->STI.get())));
+
+    PassManager = std::make_unique<BinaryFunctionPassManager>(*BC);
+    PassManager->registerPass(std::make_unique<InsertNegateRAState>());
+
+    TextSection = &BC->registerOrUpdateSection(
+        ".text", ELF::SHT_PROGBITS, ELF::SHF_ALLOC | ELF::SHF_EXECINSTR,
+        /*Data=*/nullptr, /*Size=*/0,
+        /*Alignment=*/16);
+  }
+
+  std::vector<int> findCFIOffsets(BinaryFunction &BF) {
+    std::vector<int> Locations;
+    int Idx = 0;
+    int InstSize = 4; // AArch64
+    for (BinaryBasicBlock &BB : BF) {
+      for (MCInst &Inst : BB) {
+        if (BC->MIB->isCFI(Inst)) {
+          const MCCFIInstruction *CFI = BF.getCFIFor(Inst);
+          if (CFI->getOperation() == MCCFIInstruction::OpNegateRAState)
+            Locations.push_back(Idx * InstSize);
+        }
+        Idx++;
+      }
+    }
+    return Locations;
+  }
+
+  char ElfBuf[sizeof(typename ELF64LE::Ehdr)] = {};
+  std::unique_ptr<ObjectFile> ObjFile;
+  std::unique_ptr<BinaryContext> BC;
+  std::unique_ptr<BinaryFunctionPassManager> PassManager;
+  BinarySection *TextSection;
+};
+} // namespace
+
+TEST_P(PassTester, ExampleTest) {
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+
+  ASSERT_NE(TextSection, nullptr);
+
+  PREPARE_FUNC("ExampleFunction");
+
+  MCInst UnsignedInst = MCInstBuilder(AArch64::ADDSXri)
+                            .addReg(AArch64::X0)
+                            .addReg(AArch64::X0)
+                            .addImm(0)
+                            .addImm(0);
+  BC->MIB->setRAState(UnsignedInst, false);
+  BB->addInstruction(UnsignedInst);
+
+  MCInst SignedInst = MCInstBuilder(AArch64::ADDSXri)
+                          .addReg(AArch64::X0)
+                          .addReg(AArch64::X0)
+                          .addImm(1)
+                          .addImm(0);
+  BC->MIB->setRAState(SignedInst, true);
+  BB->addInstruction(SignedInst);
+
+  Error E = PassManager->runPasses();
+  EXPECT_FALSE(E);
+
+  /* Expected layout of BF after the pass:
+
+   .LBB0 (3 instructions, align : 1)
+      Entry Point
+      CFI State : 0
+        00000000:   adds    x0, x0, #0x0
+        00000004:   !CFI    $0      ; OpNegateRAState
+        00000004:   adds    x0, x0, #0x1
+      CFI State: 0
+   */
+  auto CFILoc = findCFIOffsets(*BF);
+  EXPECT_EQ(CFILoc.size(), 1u);
+  EXPECT_EQ(CFILoc[0], 4);
+}
+
+#ifdef AARCH64_AVAILABLE
+INSTANTIATE_TEST_SUITE_P(AArch64, PassTester,
+                         ::testing::Values(Triple::aarch64));
+#endif

>From e42db94d10dd5fa2ad0e650683ec03489238a26d Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Fri, 31 Oct 2025 16:18:08 +0000
Subject: [PATCH 03/13] [BOLT] Test fillUnknownRAStateInBB

---
 bolt/unittests/Passes/InsertNegateRAState.cpp | 63 +++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/bolt/unittests/Passes/InsertNegateRAState.cpp b/bolt/unittests/Passes/InsertNegateRAState.cpp
index 024b696eb0534..29ee7b196603a 100644
--- a/bolt/unittests/Passes/InsertNegateRAState.cpp
+++ b/bolt/unittests/Passes/InsertNegateRAState.cpp
@@ -158,6 +158,69 @@ TEST_P(PassTester, ExampleTest) {
   EXPECT_EQ(CFILoc[0], 4);
 }
 
+TEST_P(PassTester, fillUnknownStateInBBTest) {
+  /* Check that a if BB starts with unknown RAState, we can fill the unknown
+   states based on following instructions with known RAStates.
+   *
+   * .LBB0 (1 instructions, align : 1)
+        Entry Point
+        CFI State : 0
+          00000000:   adds    x0, x0, #0x0
+        CFI State: 0
+
+     .LBB1 (4 instructions, align : 1)
+        CFI State : 0
+          00000004:   !CFI    $0      ; OpNegateRAState
+          00000004:   adds    x0, x0, #0x1
+          00000008:   adds    x0, x0, #0x2
+          0000000c:   adds    x0, x0, #0x3
+        CFI State: 0
+   */
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+
+  ASSERT_NE(TextSection, nullptr);
+
+  PREPARE_FUNC("FuncWithUnknownStateInBB");
+  BinaryBasicBlock *BB2 = BF->addBasicBlock();
+  BB2->setCFIState(0);
+
+  MCInst Unsigned = MCInstBuilder(AArch64::ADDSXri)
+                        .addReg(AArch64::X0)
+                        .addReg(AArch64::X0)
+                        .addImm(0)
+                        .addImm(0);
+  BC->MIB->setRAState(Unsigned, false);
+  BB->addInstruction(Unsigned);
+
+  MCInst Unknown = MCInstBuilder(AArch64::ADDSXri)
+                       .addReg(AArch64::X0)
+                       .addReg(AArch64::X0)
+                       .addImm(1)
+                       .addImm(0);
+  MCInst Unknown1 = MCInstBuilder(AArch64::ADDSXri)
+                        .addReg(AArch64::X0)
+                        .addReg(AArch64::X0)
+                        .addImm(2)
+                        .addImm(0);
+  MCInst Signed = MCInstBuilder(AArch64::ADDSXri)
+                      .addReg(AArch64::X0)
+                      .addReg(AArch64::X0)
+                      .addImm(3)
+                      .addImm(0);
+  BC->MIB->setRAState(Signed, true);
+  BB2->addInstruction(Unknown);
+  BB2->addInstruction(Unknown1);
+  BB2->addInstruction(Signed);
+
+  Error E = PassManager->runPasses();
+  EXPECT_FALSE(E);
+
+  auto CFILoc = findCFIOffsets(*BF);
+  EXPECT_EQ(CFILoc.size(), 1u);
+  EXPECT_EQ(CFILoc[0], 4);
+}
+
 #ifdef AARCH64_AVAILABLE
 INSTANTIATE_TEST_SUITE_P(AArch64, PassTester,
                          ::testing::Values(Triple::aarch64));

>From b1730fbcbe73f96b19e2980dfcc0d2d0daa8fa00 Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Mon, 3 Nov 2025 12:52:30 +0000
Subject: [PATCH 04/13] [BOLT] Test fillUnknownStubs

---
 bolt/unittests/Passes/InsertNegateRAState.cpp | 61 +++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/bolt/unittests/Passes/InsertNegateRAState.cpp b/bolt/unittests/Passes/InsertNegateRAState.cpp
index 29ee7b196603a..939257341a698 100644
--- a/bolt/unittests/Passes/InsertNegateRAState.cpp
+++ b/bolt/unittests/Passes/InsertNegateRAState.cpp
@@ -221,6 +221,67 @@ TEST_P(PassTester, fillUnknownStateInBBTest) {
   EXPECT_EQ(CFILoc[0], 4);
 }
 
+TEST_P(PassTester, fillUnknownStubs) {
+  /*
+   * Stubs that are not part of the function's CFG should inherit the RAState of
+   the BasicBlock before it.
+   *
+   * LBB1 is not part of the CFG: LBB0 jumps unconditionally to LBB2.
+   * LBB1 would be a stub inserted in LongJmp in real code.
+   * We do not add any NegateRAState CFIs, as other CFIs are not added either.
+   * See issue #160989 for more details.
+   *
+   *  .LBB0 (1 instructions, align : 1)
+       Entry Point
+         00000000:   b       .LBB2
+       Successors: .LBB2
+
+     .LBB1 (1 instructions, align : 1)
+         00000004:   ret
+
+     .LBB2 (1 instructions, align : 1)
+       Predecessors: .LBB0
+          00000008:   ret
+   */
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+
+  ASSERT_NE(TextSection, nullptr);
+
+  PREPARE_FUNC("FuncWithStub");
+  BinaryBasicBlock *BB2 = BF->addBasicBlock();
+  BB2->setCFIState(0);
+  BinaryBasicBlock *BB3 = BF->addBasicBlock();
+  BB3->setCFIState(0);
+
+  BB->addSuccessor(BB3);
+
+  // Jumping over BB2, to BB3.
+  MCInst Jump;
+  BC->MIB->createUncondBranch(Jump, BB3->getLabel(), BC->Ctx.get());
+  BB->addInstruction(Jump);
+  BC->MIB->setRAState(Jump, false);
+
+  // BB2, in real code it would be a ShortJmp.
+  // Unknown RAState.
+  MCInst StubInst;
+  BC->MIB->createReturn(StubInst);
+  BB2->addInstruction(StubInst);
+
+  // Can be any instruction.
+  MCInst Ret;
+  BC->MIB->createReturn(Ret);
+  BB3->addInstruction(Ret);
+  BC->MIB->setRAState(Ret, false);
+
+  Error E = PassManager->runPasses();
+  EXPECT_FALSE(E);
+
+  // Check that we did not generate any NegateRAState CFIs.
+  auto CFILoc = findCFIOffsets(*BF);
+  EXPECT_EQ(CFILoc.size(), 0u);
+}
+
 #ifdef AARCH64_AVAILABLE
 INSTANTIATE_TEST_SUITE_P(AArch64, PassTester,
                          ::testing::Values(Triple::aarch64));

>From 5f46186b4050d67ee68bbde31e116e8417bd29b2 Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Wed, 12 Nov 2025 13:07:40 +0000
Subject: [PATCH 05/13] [BOLT] Review changes

---
 .../bolt/Passes/InsertNegateRAStatePass.h     |  2 +
 bolt/lib/Passes/InsertNegateRAStatePass.cpp   | 83 ++++++++++---------
 2 files changed, 46 insertions(+), 39 deletions(-)

diff --git a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
index b4b428207b657..c819b1c4914b4 100644
--- a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
+++ b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
@@ -30,6 +30,8 @@ class InsertNegateRAState : public BinaryFunctionPass {
 private:
   /// Because states are tracked as MCAnnotations on individual instructions,
   /// newly inserted instructions do not have a state associated with them.
+  /// Uses fillUnknownStateInBB, fillUnknownBlocksInCFG and fillUnknownStubs in
+  /// this order of priority.
   void inferUnknownStates(BinaryFunction &BF);
 
   /// Simple case: copy RAStates to unknown insts from previous inst.
diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index 1b1fbf7f20e9d..be9bb3bc6e22d 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -157,9 +157,10 @@ void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
     if (BC.MIB->isCFI(Inst))
       continue;
 
+    // No need to check for nullopt: we only entered this loop after the first
+    // instruction had its RAState set, and RAState is always set for the
+    // previous instruction in the previous iteration of the loop.
     auto PrevRAState = BC.MIB->getRAState(Prev);
-    if (!PrevRAState)
-      llvm_unreachable("Previous Instruction has no RAState.");
 
     auto RAState = BC.MIB->getRAState(Inst);
     if (!RAState) {
@@ -198,6 +199,47 @@ void InsertNegateRAState::markUnknownBlock(BinaryContext &BC,
   }
 }
 
+void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
+  BinaryContext &BC = BF.getBinaryContext();
+  bool FirstIter = true;
+  MCInst PrevInst;
+  for (FunctionFragment &FF : BF.getLayout().fragments()) {
+    for (BinaryBasicBlock *BB : FF) {
+      if (!FirstIter && isUnknownBlock(BC, *BB)) {
+        // If we have no predecessors or successors, current BB is a Stub called
+        // from another BinaryFunction. As of #160989, we have to copy the
+        // PrevInst's RAState, because CFIs are already incorrect here.
+        if (BB->pred_size() == 0 && BB->succ_size() == 0) {
+          auto PrevRAState = BC.MIB->getRAState(PrevInst);
+          if (!PrevRAState) {
+            llvm_unreachable(
+                "Previous Instruction has no RAState in fillUnknownStubs.");
+            continue;
+          }
+
+          if (BC.MIB->isPSignOnLR(PrevInst)) {
+            PrevRAState = true;
+          } else if (BC.MIB->isPAuthOnLR(PrevInst)) {
+            PrevRAState = false;
+          }
+          markUnknownBlock(BC, *BB, *PrevRAState);
+        }
+      }
+      if (FirstIter) {
+        FirstIter = false;
+        if (isUnknownBlock(BC, *BB))
+          // Set block to unsigned, because this is the first block of the
+          // function, and all instruction in it are new instructions generated
+          // by BOLT optimizations.
+          markUnknownBlock(BC, *BB, false);
+      }
+      auto Last = BB->getLastNonPseudo();
+      if (Last != BB->rend())
+        PrevInst = *Last;
+    }
+  }
+}
+
 std::optional<bool> InsertNegateRAState::getRAStateByCFG(BinaryBasicBlock &BB,
                                                          BinaryFunction &BF) {
   BinaryContext &BC = BF.getBinaryContext();
@@ -244,43 +286,6 @@ std::optional<bool> InsertNegateRAState::getRAStateByCFG(BinaryBasicBlock &BB,
   return NeighborRAState;
 }
 
-void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
-  BinaryContext &BC = BF.getBinaryContext();
-  bool FirstIter = true;
-  MCInst PrevInst;
-  for (FunctionFragment &FF : BF.getLayout().fragments()) {
-    for (BinaryBasicBlock *BB : FF) {
-      if (!FirstIter && isUnknownBlock(BC, *BB)) {
-        // If we have no predecessors or successors, current BB is a Stub called
-        // from another BinaryFunction. As of #160989, we have to copy the
-        // PrevInst's RAState, because CFIs are already incorrect here.
-        if (BB->pred_size() == 0 && BB->succ_size() == 0) {
-          auto PrevRAState = BC.MIB->getRAState(PrevInst);
-          if (!PrevRAState) {
-            llvm_unreachable(
-                "Previous Instruction has no RAState in fillUnknownStubs.");
-            continue;
-          }
-
-          if (BC.MIB->isPSignOnLR(PrevInst)) {
-            PrevRAState = true;
-          } else if (BC.MIB->isPAuthOnLR(PrevInst)) {
-            PrevRAState = false;
-          }
-          markUnknownBlock(BC, *BB, *PrevRAState);
-        }
-      }
-      if (FirstIter) {
-        FirstIter = false;
-        if (isUnknownBlock(BC, *BB))
-          markUnknownBlock(BC, *BB, false);
-      }
-      auto Last = BB->getLastNonPseudo();
-      if (Last != BB->rend())
-        PrevInst = *Last;
-    }
-  }
-}
 void InsertNegateRAState::fillUnknownBlocksInCFG(BinaryFunction &BF) {
   BinaryContext &BC = BF.getBinaryContext();
 

>From 339f10c838f2307cfda8ea39b6af30ed54cc837c Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Mon, 24 Nov 2025 13:36:31 +0000
Subject: [PATCH 06/13] [BOLT] Simplify

As fully unknown blocks (stubs) don't have correct unwind info, guessing
their RA state is also futile. Remove code implementing this.
---
 bolt/lib/Passes/InsertNegateRAStatePass.cpp | 107 ++------------------
 1 file changed, 11 insertions(+), 96 deletions(-)

diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index be9bb3bc6e22d..9fed477ffd5fb 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -81,11 +81,11 @@ void InsertNegateRAState::inferUnknownStates(BinaryFunction &BF) {
   for (BinaryBasicBlock &BB : BF) {
     fillUnknownStateInBB(BC, BB);
   }
-  // Some stubs have no predecessors. For those, we iterate once in the layout
-  // order to fill their RAState.
+  // BasicBlocks which are made entirely of "new instructions" (instructions
+  // without RAState annotation) are stubs, and do not have correct unwind info.
+  // We should iterate in layout order and fill them based on previous known
+  // RAState.
   fillUnknownStubs(BF);
-
-  fillUnknownBlocksInCFG(BF);
 }
 
 void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
@@ -206,16 +206,14 @@ void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
   for (FunctionFragment &FF : BF.getLayout().fragments()) {
     for (BinaryBasicBlock *BB : FF) {
       if (!FirstIter && isUnknownBlock(BC, *BB)) {
-        // If we have no predecessors or successors, current BB is a Stub called
-        // from another BinaryFunction. As of #160989, we have to copy the
+        // As of #160989, we have to copy the
         // PrevInst's RAState, because CFIs are already incorrect here.
-        if (BB->pred_size() == 0 && BB->succ_size() == 0) {
-          auto PrevRAState = BC.MIB->getRAState(PrevInst);
-          if (!PrevRAState) {
-            llvm_unreachable(
-                "Previous Instruction has no RAState in fillUnknownStubs.");
-            continue;
-          }
+        auto PrevRAState = BC.MIB->getRAState(PrevInst);
+        if (!PrevRAState) {
+          llvm_unreachable(
+              "Previous Instruction has no RAState in fillUnknownStubs.");
+          continue;
+        }
 
           if (BC.MIB->isPSignOnLR(PrevInst)) {
             PrevRAState = true;
@@ -223,7 +221,6 @@ void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
             PrevRAState = false;
           }
           markUnknownBlock(BC, *BB, *PrevRAState);
-        }
       }
       if (FirstIter) {
         FirstIter = false;
@@ -240,88 +237,6 @@ void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
   }
 }
 
-std::optional<bool> InsertNegateRAState::getRAStateByCFG(BinaryBasicBlock &BB,
-                                                         BinaryFunction &BF) {
-  BinaryContext &BC = BF.getBinaryContext();
-
-  auto checkRAState = [&](std::optional<bool> &NeighborRAState, MCInst &Inst) {
-    auto RAState = BC.MIB->getRAState(Inst);
-    if (!RAState)
-      return;
-    if (!NeighborRAState) {
-      NeighborRAState = *RAState;
-      return;
-    }
-    if (NeighborRAState != *RAState) {
-      BC.outs() << "BOLT-WARNING: Conflicting RAState found in function "
-                << BF.getPrintName() << ". Function will not be optimized.\n";
-      BF.setIgnored();
-    }
-  };
-
-  // Holds the first found RAState from CFG neighbors.
-  std::optional<bool> NeighborRAState = std::nullopt;
-  if (BB.pred_size() != 0) {
-    for (BinaryBasicBlock *PredBB : BB.predecessors()) {
-      //  find last inst of Predecessor with known RA State.
-      auto LI = PredBB->getLastNonPseudo();
-      if (LI == PredBB->rend())
-        continue;
-      MCInst &LastInst = *LI;
-      checkRAState(NeighborRAState, LastInst);
-    }
-  } else if (BB.succ_size() != 0) {
-    for (BinaryBasicBlock *SuccBB : BB.successors()) {
-      //  find first inst of Successor with known RA State.
-      auto FI = SuccBB->getFirstNonPseudo();
-      if (FI == SuccBB->end())
-        continue;
-      MCInst &FirstInst = *FI;
-      checkRAState(NeighborRAState, FirstInst);
-    }
-  } else {
-    llvm_unreachable("Called getRAStateByCFG on a BB with no preds or succs.");
-  }
-
-  return NeighborRAState;
-}
-
-void InsertNegateRAState::fillUnknownBlocksInCFG(BinaryFunction &BF) {
-  BinaryContext &BC = BF.getBinaryContext();
-
-  auto fillUnknowns = [&](BinaryFunction &BF) -> std::pair<int, bool> {
-    int Unknowns = 0;
-    bool Updated = false;
-    for (BinaryBasicBlock &BB : BF) {
-      // Only try to iterate if the BB has either predecessors or successors.
-      if (isUnknownBlock(BC, BB) &&
-          (BB.pred_size() != 0 || BB.succ_size() != 0)) {
-        auto RAStateOpt = getRAStateByCFG(BB, BF);
-        if (RAStateOpt) {
-          markUnknownBlock(BC, BB, *RAStateOpt);
-          Updated = true;
-        } else {
-          Unknowns++;
-        }
-      }
-    }
-    return std::pair<int, bool>{Unknowns, Updated};
-  };
-
-  while (true) {
-    std::pair<int, bool> Iter = fillUnknowns(BF);
-    if (Iter.first == 0)
-      break;
-    if (!Iter.second) {
-      BC.errs() << "BOLT-WARNING: Could not infer RAState for " << Iter.first
-                << " BBs in function " << BF.getPrintName()
-                << ". Function will not be optimized.\n";
-      BF.setIgnored();
-      break;
-    }
-  }
-}
-
 Error InsertNegateRAState::runOnFunctions(BinaryContext &BC) {
   std::atomic<uint64_t> FunctionsModified{0};
   ParallelUtilities::WorkFuncTy WorkFun = [&](BinaryFunction &BF) {

>From c4dab2f53830773fa6c6a1a78b1c1080c51c7373 Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Mon, 24 Nov 2025 13:54:10 +0000
Subject: [PATCH 07/13] [BOLT] Fix format && style

---
 bolt/lib/Passes/InsertNegateRAStatePass.cpp | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index 9fed477ffd5fb..122337e3cb30b 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -215,12 +215,11 @@ void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
           continue;
         }
 
-          if (BC.MIB->isPSignOnLR(PrevInst)) {
-            PrevRAState = true;
-          } else if (BC.MIB->isPAuthOnLR(PrevInst)) {
-            PrevRAState = false;
-          }
-          markUnknownBlock(BC, *BB, *PrevRAState);
+        if (BC.MIB->isPSignOnLR(PrevInst))
+          PrevRAState = true;
+        else if (BC.MIB->isPAuthOnLR(PrevInst))
+          PrevRAState = false;
+        markUnknownBlock(BC, *BB, *PrevRAState);
       }
       if (FirstIter) {
         FirstIter = false;

>From d8d501045b836bdc82146783ce1a1dc27ce3479b Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Mon, 24 Nov 2025 14:46:11 +0000
Subject: [PATCH 08/13] [BOLT] Review

- don't use auto instead of std::optional
- add explicit .has_value() to checking for nullopt (as the
  optional contains a bool)
- fix comment
---
 bolt/lib/Passes/InsertNegateRAStatePass.cpp | 32 +++++++++++----------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index 122337e3cb30b..603c7a81545bc 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -52,8 +52,8 @@ void InsertNegateRAState::runOnFunction(BinaryFunction &BF) {
         MCInst &Inst = *It;
         if (BC.MIB->isCFI(Inst))
           continue;
-        auto RAState = BC.MIB->getRAState(Inst);
-        if (!RAState) {
+        std::optional<bool> RAState = BC.MIB->getRAState(Inst);
+        if (!RAState.has_value()) {
           BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates "
                     << " in function " << BF.getPrintName() << "\n";
           PassFailed = true;
@@ -106,8 +106,8 @@ void InsertNegateRAState::coverFunctionFragmentStart(BinaryFunction &BF,
   // If a function is already split in the input, the first FF can also start
   // with Signed state. This covers that scenario as well.
   auto II = (*FirstNonEmpty)->getFirstNonPseudo();
-  auto RAState = BC.MIB->getRAState(*II);
-  if (!RAState) {
+  std::optional<bool> RAState = BC.MIB->getRAState(*II);
+  if (!RAState.has_value()) {
     BC.errs() << "BOLT-ERROR: unknown RAState after inferUnknownStates "
               << " in function " << BF.getPrintName() << "\n";
     PassFailed = true;
@@ -124,8 +124,8 @@ InsertNegateRAState::getFirstKnownRAState(BinaryContext &BC,
   for (const MCInst &Inst : BB) {
     if (BC.MIB->isCFI(Inst))
       continue;
-    auto RAStateOpt = BC.MIB->getRAState(Inst);
-    if (RAStateOpt)
+    std::optional<bool> RAStateOpt = BC.MIB->getRAState(Inst);
+    if (RAStateOpt.has_value())
       return RAStateOpt;
   }
   return std::nullopt;
@@ -139,8 +139,8 @@ void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
     return;
   // If the first instruction has unknown RAState, we should copy the first
   // known RAState.
-  auto RAStateOpt = BC.MIB->getRAState(*First);
-  if (!RAStateOpt) {
+  std::optional<bool> RAStateOpt = BC.MIB->getRAState(*First);
+  if (!RAStateOpt.has_value()) {
     auto FirstRAState = getFirstKnownRAState(BC, BB);
     if (!FirstRAState)
       // We fill unknown BBs later.
@@ -160,7 +160,7 @@ void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
     // No need to check for nullopt: we only entered this loop after the first
     // instruction had its RAState set, and RAState is always set for the
     // previous instruction in the previous iteration of the loop.
-    auto PrevRAState = BC.MIB->getRAState(Prev);
+    std::optional<bool> PrevRAState = BC.MIB->getRAState(Prev);
 
     auto RAState = BC.MIB->getRAState(Inst);
     if (!RAState) {
@@ -179,8 +179,8 @@ bool InsertNegateRAState::isUnknownBlock(BinaryContext &BC,
   for (const MCInst &Inst : BB) {
     if (BC.MIB->isCFI(Inst))
       continue;
-    auto RAState = BC.MIB->getRAState(Inst);
-    if (RAState)
+    std::optional<bool> RAState = BC.MIB->getRAState(Inst);
+    if (RAState.has_value())
       return false;
   }
   return true;
@@ -206,10 +206,12 @@ void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
   for (FunctionFragment &FF : BF.getLayout().fragments()) {
     for (BinaryBasicBlock *BB : FF) {
       if (!FirstIter && isUnknownBlock(BC, *BB)) {
-        // As of #160989, we have to copy the
-        // PrevInst's RAState, because CFIs are already incorrect here.
-        auto PrevRAState = BC.MIB->getRAState(PrevInst);
-        if (!PrevRAState) {
+        // As exlained in issue #160989, the unwind info is incorrect for stubs.
+        // Indicating the correct RAState without the rest of the unwind info
+        // being correct is not useful. Instead, we copy the RAState from the
+        // previous instruction.
+        std::optional<bool> PrevRAState = BC.MIB->getRAState(PrevInst);
+        if (!PrevRAState.has_value()) {
           llvm_unreachable(
               "Previous Instruction has no RAState in fillUnknownStubs.");
           continue;

>From 9cccd4a933a596e40fd4600f48e2aabba9374bb7 Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Mon, 24 Nov 2025 15:16:57 +0000
Subject: [PATCH 09/13] [BOLT] Fixes

---
 bolt/lib/Passes/InsertNegateRAStatePass.cpp | 26 ++++++++++-----------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index 603c7a81545bc..c2caa6a1ee5e7 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -124,9 +124,9 @@ InsertNegateRAState::getFirstKnownRAState(BinaryContext &BC,
   for (const MCInst &Inst : BB) {
     if (BC.MIB->isCFI(Inst))
       continue;
-    std::optional<bool> RAStateOpt = BC.MIB->getRAState(Inst);
-    if (RAStateOpt.has_value())
-      return RAStateOpt;
+    std::optional<bool> RAState = BC.MIB->getRAState(Inst);
+    if (RAState.has_value())
+      return RAState;
   }
   return std::nullopt;
 }
@@ -139,10 +139,10 @@ void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
     return;
   // If the first instruction has unknown RAState, we should copy the first
   // known RAState.
-  std::optional<bool> RAStateOpt = BC.MIB->getRAState(*First);
-  if (!RAStateOpt.has_value()) {
-    auto FirstRAState = getFirstKnownRAState(BC, BB);
-    if (!FirstRAState)
+  std::optional<bool> RAState = BC.MIB->getRAState(*First);
+  if (!RAState.has_value()) {
+    std::optional<bool> FirstRAState = getFirstKnownRAState(BC, BB);
+    if (!FirstRAState.has_value())
       // We fill unknown BBs later.
       return;
 
@@ -162,8 +162,8 @@ void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
     // previous instruction in the previous iteration of the loop.
     std::optional<bool> PrevRAState = BC.MIB->getRAState(Prev);
 
-    auto RAState = BC.MIB->getRAState(Inst);
-    if (!RAState) {
+    std::optional<bool> RAState = BC.MIB->getRAState(Inst);
+    if (!RAState.has_value()) {
       if (BC.MIB->isPSignOnLR(Prev))
         PrevRAState = true;
       else if (BC.MIB->isPAuthOnLR(Prev))
@@ -206,10 +206,10 @@ void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
   for (FunctionFragment &FF : BF.getLayout().fragments()) {
     for (BinaryBasicBlock *BB : FF) {
       if (!FirstIter && isUnknownBlock(BC, *BB)) {
-        // As exlained in issue #160989, the unwind info is incorrect for stubs.
-        // Indicating the correct RAState without the rest of the unwind info
-        // being correct is not useful. Instead, we copy the RAState from the
-        // previous instruction.
+        // As explained in issue #160989, the unwind info is incorrect for
+        // stubs. Indicating the correct RAState without the rest of the unwind
+        // info being correct is not useful. Instead, we copy the RAState from
+        // the previous instruction.
         std::optional<bool> PrevRAState = BC.MIB->getRAState(PrevInst);
         if (!PrevRAState.has_value()) {
           llvm_unreachable(

>From 4f6e96affdc35c3074713a6fc9dc43728b6a5058 Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Wed, 26 Nov 2025 12:28:44 +0000
Subject: [PATCH 10/13] [BOLT] Update doc

---
 bolt/docs/PacRetDesign.md | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/bolt/docs/PacRetDesign.md b/bolt/docs/PacRetDesign.md
index c7c76cac3a100..2e3cb7b91e0ce 100644
--- a/bolt/docs/PacRetDesign.md
+++ b/bolt/docs/PacRetDesign.md
@@ -201,27 +201,21 @@ Some BOLT passes can add new Instructions. In InsertNegateRAStatePass, we have
 to know what RA state these have.
 
 > [!important]
-> As issue #160989 describes, unwind info is incorrect in stubs with multiple callers.
-> For this same reason, we cannot generate correct pac-specific unwind info: the signess
-> of the _incorrect_ return address is meaningless.
+> As issue #160989 explains, unwind info is missing from stubs.
+> For this same reason, we cannot generate correct pac-specific unwind info: the
+> signedness of the _incorrect_ return address is meaningless.
 
 Assignment of RAStates to newly generated instructions is done in `inferUnknownStates`.
-We have three different cases to cover:
+We have two different cases to cover:
 
 1. If a BasicBlock has some instructions with known RA state, and some without, we
    can copy the RAState of known instructions to the unknown ones. As the control
-   flow only changes between BasicBlocks, instructions in the same BasicBlock have the
-   same return address.
+   flow only changes between BasicBlocks, instructions in the same BasicBlock have
+   the same return address. (The exception is noreturn calls, but these would only
+   cause problems, if the newly inserted instruction is right after the call.)
 
-2. If all instructions in a BasicBlock are unknown, we can look at all CFG neighbors
-   (that is predecessors/successors). The RAState should be the same as of the
-   neighboring blocks. Conflicting RAStates in neighbors indicate an error. Such
-   functions should be ignored.
-
-3. If a BasicBlock has no CFG neighbors, we have to copy the RAState of the previous
-    BasicBlock in layout order.
-
-If any BasicBlocks remain with unknown instructions, the function will be ignored.
+2. If a BasicBlock has no instructions with known RAState, we have to copy the
+   RAState of the previous BasicBlock in layout order.
 
 ### Optimizations requiring special attention
 

>From 7acba333ee14e8eaac2f7f59ab04ee9aa76deeb9 Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Wed, 26 Nov 2025 12:34:18 +0000
Subject: [PATCH 11/13] [BOLT] Fix: skip unneeded iterations

---
 bolt/lib/Passes/InsertNegateRAStatePass.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index c2caa6a1ee5e7..663266f28b878 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -152,7 +152,7 @@ void InsertNegateRAState::fillUnknownStateInBB(BinaryContext &BC,
   // At this point we know the RAState of the first instruction,
   // so we can propagate the RAStates to all subsequent unknown instructions.
   MCInst Prev = *First;
-  for (auto It = BB.begin() + 1; It != BB.end(); ++It) {
+  for (auto It = First + 1; It != BB.end(); ++It) {
     MCInst &Inst = *It;
     if (BC.MIB->isCFI(Inst))
       continue;

>From b23fbd1723ce1af2e872b048aef031c8a804308b Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Wed, 26 Nov 2025 12:51:04 +0000
Subject: [PATCH 12/13] [BOLT] Update InsertNegateRAStatePass.h

- remove unused declarations
- update comments
---
 .../bolt/Passes/InsertNegateRAStatePass.h     | 23 +++++--------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
index c819b1c4914b4..3f003af96162d 100644
--- a/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
+++ b/bolt/include/bolt/Passes/InsertNegateRAStatePass.h
@@ -30,30 +30,19 @@ class InsertNegateRAState : public BinaryFunctionPass {
 private:
   /// Because states are tracked as MCAnnotations on individual instructions,
   /// newly inserted instructions do not have a state associated with them.
-  /// Uses fillUnknownStateInBB, fillUnknownBlocksInCFG and fillUnknownStubs in
-  /// this order of priority.
+  /// Uses fillUnknownStateInBB and fillUnknownStubs.
   void inferUnknownStates(BinaryFunction &BF);
 
   /// Simple case: copy RAStates to unknown insts from previous inst.
-  /// Account for signing and authenticating insts.
+  /// If the first inst has unknown state, copy set it to the first known state.
+  /// Accounts for signing and authenticating insts.
   void fillUnknownStateInBB(BinaryContext &BC, BinaryBasicBlock &BB);
 
-  /// Fill unknown RAStates in BBs with no successors/predecessors. These are
-  /// Stubs inserted by LongJmp. As of #160989, we have to copy the RAState from
-  /// the previous BB in the layout, because CFIs are already incorrect here.
+  /// Fill in RAState in BasicBlocks consisting entirely of new instructions.
+  /// As of #160989, we have to copy the RAState from the previous BB in the
+  /// layout, because CFIs are already incorrect here.
   void fillUnknownStubs(BinaryFunction &BF);
 
-  /// Fills unknowns RAStates of BBs with successors/predecessors. Uses
-  /// getRAStateByCFG to determine the RAState. Does more than one iteration if
-  /// needed. Reports an error, if it cannot find the RAState for all BBs with
-  /// predecessors/successors.
-  void fillUnknownBlocksInCFG(BinaryFunction &BF);
-
-  /// For BBs which only hold instructions with unknown RAState, we check
-  /// CFG neighbors (successors, predecessors) of the BB. If they have different
-  /// RAStates, we report an inconsistency. Otherwise, we return the found
-  /// RAState.
-  std::optional<bool> getRAStateByCFG(BinaryBasicBlock &BB, BinaryFunction &BF);
   /// Returns the first known RAState from \p BB, or std::nullopt if all are
   /// unknown.
   std::optional<bool> getFirstKnownRAState(BinaryContext &BC,

>From 28fa1ebdc25dafb191d852742f4db1df49d6fe8b Mon Sep 17 00:00:00 2001
From: Gergely Balint <gergely.balint at arm.com>
Date: Fri, 28 Nov 2025 11:34:44 +0000
Subject: [PATCH 13/13] Fix for cases where empty BBs are preceding unknown BBs

also added regression test for this.
---
 bolt/lib/Passes/InsertNegateRAStatePass.cpp   | 16 +++++---
 bolt/unittests/Passes/InsertNegateRAState.cpp | 37 +++++++++++++++++++
 2 files changed, 47 insertions(+), 6 deletions(-)

diff --git a/bolt/lib/Passes/InsertNegateRAStatePass.cpp b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
index 663266f28b878..37004aae94937 100644
--- a/bolt/lib/Passes/InsertNegateRAStatePass.cpp
+++ b/bolt/lib/Passes/InsertNegateRAStatePass.cpp
@@ -212,8 +212,9 @@ void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
         // the previous instruction.
         std::optional<bool> PrevRAState = BC.MIB->getRAState(PrevInst);
         if (!PrevRAState.has_value()) {
-          llvm_unreachable(
-              "Previous Instruction has no RAState in fillUnknownStubs.");
+          // No non-cfi instruction encountered in the function yet.
+          // This means the RAState is the same as at the function entry.
+          markUnknownBlock(BC, *BB, BF.getInitialRAState());
           continue;
         }
 
@@ -226,11 +227,14 @@ void InsertNegateRAState::fillUnknownStubs(BinaryFunction &BF) {
       if (FirstIter) {
         FirstIter = false;
         if (isUnknownBlock(BC, *BB))
-          // Set block to unsigned, because this is the first block of the
-          // function, and all instruction in it are new instructions generated
-          // by BOLT optimizations.
-          markUnknownBlock(BC, *BB, false);
+          // If the first BasicBlock is unknown, the function's entry RAState
+          // should be used.
+          markUnknownBlock(BC, *BB, BF.getInitialRAState());
       }
+      // This function iterates on BasicBlocks, so the PrevInst has to be
+      // updated to the last instruction of the current BasicBlock. If the
+      // BasicBlock is empty, or only has PseudoInstructions, PrevInst will not
+      // be updated.
       auto Last = BB->getLastNonPseudo();
       if (Last != BB->rend())
         PrevInst = *Last;
diff --git a/bolt/unittests/Passes/InsertNegateRAState.cpp b/bolt/unittests/Passes/InsertNegateRAState.cpp
index 939257341a698..123f6a0f1f767 100644
--- a/bolt/unittests/Passes/InsertNegateRAState.cpp
+++ b/bolt/unittests/Passes/InsertNegateRAState.cpp
@@ -282,6 +282,43 @@ TEST_P(PassTester, fillUnknownStubs) {
   EXPECT_EQ(CFILoc.size(), 0u);
 }
 
+TEST_P(PassTester, fillUnknownStubsEmpty) {
+  /*
+   * This test checks that BOLT can set the RAState of unknown BBs,
+   * even if all previous BBs are empty, hence no PrevInst gets set.
+   *
+   * As this means that the current (empty) BB is the first with non-pseudo
+   * instructions, the function's initialRAState should be used.
+   */
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+
+  ASSERT_NE(TextSection, nullptr);
+
+  PREPARE_FUNC("FuncWithStub");
+  BF->setInitialRAState(false);
+  BinaryBasicBlock *BB2 = BF->addBasicBlock();
+  BB2->setCFIState(0);
+
+  // BB is empty.
+  BB->addSuccessor(BB2);
+
+  // BB2, in real code it would be a ShortJmp.
+  // Unknown RAState.
+  MCInst StubInst;
+  BC->MIB->createReturn(StubInst);
+  BB2->addInstruction(StubInst);
+
+  Error E = PassManager->runPasses();
+  EXPECT_FALSE(E);
+
+  // Check that BOLT added an RAState to BB2.
+  std::optional<bool> RAState = BC->MIB->getRAState(*(BB2->begin()));
+  ASSERT_TRUE(RAState.has_value());
+  // BB2 should be set to BF.initialRAState (false).
+  ASSERT_FALSE(*RAState);
+}
+
 #ifdef AARCH64_AVAILABLE
 INSTANTIATE_TEST_SUITE_P(AArch64, PassTester,
                          ::testing::Values(Triple::aarch64));



More information about the llvm-commits mailing list