[llvm] [BOLT][binary-analysis] Add initial pac-ret gadget scanner (PR #122304)

Kristof Beyls via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 17 05:16:21 PST 2025


https://github.com/kbeyls updated https://github.com/llvm/llvm-project/pull/122304

>From 0b0137b8feab04db089297cb0efc288255f54298 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 9 Jan 2025 11:06:57 +0000
Subject: [PATCH 01/14] [BOLT][binary-analysis] Add initial pac-ret gadget
 scanner

This adds an initial pac-ret gadget scanner to the
llvm-bolt-binary-analysis-tool.

The scanner is taken from the prototype that was published last year at
https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype,
and has been discussed in RFC
https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
and in the EuroLLVM 2024 keynote "Does LLVM implement security
hardenings correctly? A BOLT-based static analyzer to the rescue?"
[Video](https://youtu.be/Sn_Fxa0tdpY)
[Slides](https://llvm.org/devmtg/2024-04/slides/Keynote/Beyls_EuroLLVM2024_security_hardening_keynote.pdf)

In the spirit of incremental development, this PR aims to add a minimal
implementation that is "fully working" on its own, but has major
limitations, as described in the bolt/docs/BinaryAnalysis.md
documentation in this proposed commit.  These and other limitations will
be fixed in follow-on PRs, mostly based on code already existing in the
prototype branch. I hope incrementally upstreaming will make it easier
to review the code.

That being said, this patch isn't really small.

I believe that this patch needs 2 "kinds" of reviewers:
1. People familiar with BOLT internals.
2. People familiar with the pac-ret hardening scheme and AArch64.

I expect people familiar with the pac-ret hardening scheme might mostly
need to focus on reviewing `NonPacProtectedRetAnalysis.cpp`,
`AArch64MCPlusBuilder.cpp` and the unit tests.

I expect people familiar with BOLT might mainly need to focus on the
other parts, and the high-level structure of
`NonPacProtectedRetAnalysis.cpp`.

The patch is not very polished currently, but I think it's in the right
shape to start getting it reviewed.

Note that I believe that this could also form the basis of a scanner to
analyze correct implementation of PAuthABI.
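
To make the checked property easier to picture for reviewers who are less
familiar with the analysis, here is a minimal, self-contained sketch of the
idea. It is not the code in this patch: `Inst`, `Kind`, `RegSet`, `step`,
`join` and `countUnprotectedRets` are hypothetical names made up for this
illustration, registers are plain integers, and register aliasing and the
tracking of the last-writing instructions are left out.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy model for illustration only; these are not BOLT types.
enum class Kind { Write, Auth, Ret };
struct Inst {
  Kind K;
  int Reg; // register written, authenticated, or used as the return target
};

// Registers written since they were last authenticated.
using RegSet = std::set<int>;

// Transfer function: apply one instruction to the state. Returns true if the
// instruction is a return through a register that is currently in the set.
bool step(RegSet &Clobbered, const Inst &I) {
  switch (I.K) {
  case Kind::Write:
    Clobbered.insert(I.Reg);
    return false;
  case Kind::Auth:
    Clobbered.erase(I.Reg);
    return false;
  case Kind::Ret:
    return Clobbered.count(I.Reg) != 0;
  }
  return false;
}

// Confluence at a CFG join: a register is untrusted if it is untrusted on
// any incoming path, so the states are merged by set union.
RegSet join(const RegSet &A, const RegSet &B) {
  RegSet Out = A;
  Out.insert(B.begin(), B.end());
  return Out;
}

// Runs a straight-line block starting from state In and counts the flagged
// returns. The state at the end of the block is written back to In.
std::size_t countUnprotectedRets(RegSet &In, const std::vector<Inst> &Block) {
  std::size_t N = 0;
  for (const Inst &I : Block)
    if (step(In, I))
      ++N;
  return N;
}
```

In this model, a block that restores x30 with `ldp` and then returns is
flagged, while the same block with an `autiasp` before the `ret` is not.
Merging a path that authenticates x30 with one that does not still leaves
x30 in the set at the join, which is why a `ret` after such a join gets
reported.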
---
 bolt/docs/BinaryAnalysis.md                   | 163 ++++-
 bolt/include/bolt/Core/MCPlusBuilder.h        |  16 +
 .../bolt/Passes/NonPacProtectedRetAnalysis.h  | 209 ++++++
 bolt/include/bolt/Utils/CommandLineOpts.h     |   4 +
 bolt/lib/Passes/CMakeLists.txt                |   1 +
 .../lib/Passes/NonPacProtectedRetAnalysis.cpp | 520 +++++++++++++
 bolt/lib/Rewrite/RewriteInstance.cpp          |  25 +-
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  63 ++
 .../binary-analysis/AArch64/cmdline-args.test |   8 +-
 .../AArch64/gs-pacret-autiasp.s               | 681 ++++++++++++++++++
 .../AArch64/gs-pacret-multi-bb.s              |  46 ++
 .../binary-analysis/AArch64/lit.local.cfg     |   2 +-
 12 files changed, 1733 insertions(+), 5 deletions(-)
 create mode 100644 bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
 create mode 100644 bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s

diff --git a/bolt/docs/BinaryAnalysis.md b/bolt/docs/BinaryAnalysis.md
index f91b77d046de8f..e247ce20b4625c 100644
--- a/bolt/docs/BinaryAnalysis.md
+++ b/bolt/docs/BinaryAnalysis.md
@@ -9,9 +9,168 @@ analyses implemented in the BOLT libraries.
 
 ## Which binary analyses are implemented?
 
-At the moment, no binary analyses are implemented.
+* [Security scanners](#security-scanners)
+  * [pac-ret analysis](#pac-ret-analysis)
 
-The goal is to make it easy using a plug-in framework to add your own analyses.
+### Security scanners
+
+For the past 25 years, a large number of exploits have been built and used in
+the wild to undermine computer security. The majority of these exploits abuse
+memory vulnerabilities in programs, see evidence from
+[Microsoft](https://youtu.be/PjbGojjnBZQ?si=oCHCa0SHgaSNr6Gr&t=836),
+[Chromium](https://www.chromium.org/Home/chromium-security/memory-safety/) and
+[Android](https://security.googleblog.com/2021/01/data-driven-security-hardening-in.html).
+
+It is therefore not surprising that a large number of mitigations have been
+added to instruction sets and toolchains to make it harder to build an exploit
+using a memory vulnerability. Examples are: stack canaries, stack clash,
+pac-ret, shadow stacks, arm64e, and many more.
+
+These mitigations guarantee a so-called "security property" on the binaries they
+produce. For example, for stack canaries, the security property is roughly that
+a canary is located on the stack between the set of saved registers and the set
+of local variables. For pac-ret, it is roughly that there are no writes to the
+register containing the return address after either an authenticating
+instruction or a Branch-and-link instruction.
+
+From time to time, however, a bug gets found in the implementation of such
+mitigations in toolchains. Also, code that is written in assembler by hand
+requires the developer to ensure these security properties by hand.
+
+In short, it is sometimes found that a few places in the binary code are not
+protected as expected given the requested mitigations. Attackers could make use
+of those places (sometimes called gadgets) to circumvent the protection that the
+mitigation should give.
+
+One of the reasons that such gadgets, or holes in the mitigation implementation,
+exist is that typically the amount of testing and verification for these
+security properties is pretty limited.
+
+In comparison, for testing functional correctness, or for testing performance,
+toolchains and software in general typically get tested with large test suites,
+benchmarks, and testing and benchmarking plans. In contrast, this typically does
+not get done for testing the security properties of binary code.
+
+The security scanners implemented in `llvm-bolt-binary-analysis` aim to enable
+better testing of security hardening implementations that require compilers to
+somehow generate assembly code with specific security properties.
+
+#### pac-ret analysis
+
+`pac-ret` protection is a security hardening scheme implemented in compilers
+such as gcc and clang, using the command line option
+`-mbranch-protection=pac-ret`. This option is enabled by default on most widely
+used Linux distributions.
+
+The hardening scheme mitigates
+[Return-Oriented Programming (ROP)](https://llsoftsec.github.io/llsoftsecbook/#return-oriented-programming)
+attacks by making sure that return addresses are only ever stored to memory with
+a cryptographic hash, called a
+["Pointer Authentication Code" (PAC)](https://llsoftsec.github.io/llsoftsecbook/#pointer-authentication),
+in the upper bits of the pointer. This makes it substantially harder for
+attackers to divert control flow by overwriting a return address with a
+different value.
+
+The hardening scheme relies on compilers producing different code sequences when
+processing return addresses, especially when these are stored to and retrieved
+from memory.
+
+The 'pac-ret' binary analysis can be invoked using the command line option
+`--scanners=pac-ret`. It makes `llvm-bolt-binary-analysis` scan through the
+provided binary for violations of a single security property.
+
+The security property that is checked is:
+
+When a register is used as the address to jump to in a return instruction,
+that register must either:
+
+1. never be changed within this function, i.e. have the same value as when the
+   function started, or
+2. have been last written by an authenticating instruction.
+
+##### Example 1
+
+For example, a typical non-pac-ret-protected function looks as follows:
+
+```
+stp     x29, x30, [sp, #-0x10]!
+mov     x29, sp
+bl      g@PLT
+add     x0, x0, #0x3
+ldp     x29, x30, [sp], #0x10
+ret
+```
+
+The return instruction `ret` uses register `x30` as containing the address to
+return to. Register `x30` was last written by instruction `ldp`, which is not an
+authenticating instruction. `llvm-bolt-binary-analysis --scanners=pac-ret` will
+report this as follows:
+
+```
+GS-PACRET: non-protected ret found in function f1, basic block .LBB00, at address 10310
+  The return instruction is     00010310:       ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.LBB00:6>, Overwriting:[MCInstBBRef<BB:.LBB00:5> ]>
+  The 1 instructions that write to the return register after any authentication are:
+  1.     0001030c:      ldp     x29, x30, [sp], #0x10
+  This happens in the following basic block:
+    000102fc:   stp     x29, x30, [sp, #-0x10]!
+    00010300:   mov     x29, sp
+    00010304:   bl      g@PLT
+    00010308:   add     x0, x0, #0x3
+    0001030c:   ldp     x29, x30, [sp], #0x10
+    00010310:   ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.LBB00:6>, Overwriting:[MCInstBBRef<BB:.LBB00:5> ]>
+```
+
+The exact format of how `llvm-bolt-binary-analysis` reports this is expected to
+evolve over time.
+
+##### Example 2: multiple "last-overwriting" instructions
+
+A simple example that shows how there can be a set of "last overwriting"
+instructions of a register is the following:
+
+```
+        paciasp
+        stp     x29, x30, [sp, #-16]!
+        ldp     x29, x30, [sp], #16
+        cbnz    x0, 1f
+        autiasp
+1:
+        ret
+```
+
+This will produce the following diagnostic:
+
+```
+GS-PACRET: non-protected ret found in function f_crossbb1, basic block .Ltmp0, at address 102dc
+  The return instruction is     000102dc:       ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.Ltmp0:0>, Overwriting:[MCInstBBRef<BB:.LFT0:0> MCInstBBRef<BB:.LBB00:2> ]>
+  The 2 instructions that write to the return register after any authentication are:
+  1.     000102d0:      ldp     x29, x30, [sp], #0x10
+  2.     000102d8:      autiasp
+```
+
+(Yes, this diagnostic could be improved because the second "overwriting"
+instruction, `autiasp`, is an authenticating instruction...)
+
+##### Known false positives or negatives
+
+The following are current known cases of false positives:
+
+1. Not handling "no-return" functions. See issue
+   [#115154](https://github.com/llvm/llvm-project/issues/115154) for details and
+   pointers to open PRs to fix this.
+
+The following are current known cases of false negatives:
+
+1. Not handling functions for which the CFG cannot be reconstructed by BOLT. The
+   plan is to implement support for this, picking up the implementation from the
+   [prototype branch](
+   https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype).
+
+BOLT cannot currently handle functions with `cfi_negate_ra_state` correctly,
+i.e. any binaries built with `-mbranch-protection=pac-ret`. The scanner is meant
+to be used on exactly such binaries, so this is a major limitation! Work is
+going on in PR [#120064](https://github.com/llvm/llvm-project/pull/120064) to
+fix this.
 
 ## How to add your own binary analysis
 
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 3634fed9757ceb..9351107e66bdf6 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -550,6 +550,22 @@ class MCPlusBuilder {
     return Analysis->isReturn(Inst);
   }
 
+  virtual MCPhysReg getAuthenticatedReg(const MCInst &Inst) const {
+    llvm_unreachable("not implemented");
+    return getNoRegister();
+  }
+
+  virtual bool isAuthenticationOfReg(const MCInst &Inst,
+                                     const unsigned RegAuthenticated) const {
+    llvm_unreachable("not implemented");
+    return false;
+  }
+
+  virtual llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
+    llvm_unreachable("not implemented");
+    return getNoRegister();
+  }
+
   virtual bool isTerminator(const MCInst &Inst) const;
 
   virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
new file mode 100644
index 00000000000000..a9740896369fa2
--- /dev/null
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -0,0 +1,209 @@
+//===- bolt/Passes/NonPacProtectedRetAnalysis.h -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef BOLT_PASSES_NONPACPROTECTEDRETANALYSIS_H
+#define BOLT_PASSES_NONPACPROTECTEDRETANALYSIS_H
+
+#include "bolt/Core/BinaryContext.h"
+#include "bolt/Core/BinaryFunction.h"
+#include "bolt/Passes/BinaryPasses.h"
+#include "llvm/ADT/SmallSet.h"
+
+namespace llvm {
+namespace bolt {
+
+/// @brief  MCInstReference represents a reference to an MCInst as stored either
+/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock
+/// (after a CFG is created). It aims to store the necessary information to be
+/// able to find the specific MCInst in either the BinaryFunction or
+/// BinaryBasicBlock data structures later, so that e.g. the InputAddress of
+/// the corresponding instruction can be computed.
+
+struct MCInstInBBReference {
+  BinaryBasicBlock *BB;
+  int64_t BBIndex;
+  MCInstInBBReference(BinaryBasicBlock *BB, int64_t BBIndex)
+      : BB(BB), BBIndex(BBIndex) {}
+  MCInstInBBReference() : BB(nullptr), BBIndex(0) {}
+  static MCInstInBBReference get(const MCInst *Inst, BinaryFunction &BF) {
+    for (BinaryBasicBlock& BB : BF)
+      for (size_t I = 0; I < BB.size(); ++I)
+        if (Inst == &(BB.getInstructionAtIndex(I)))
+          return MCInstInBBReference(&BB, I);
+    return {};
+  }
+  bool operator==(const MCInstInBBReference &RHS) const {
+    return BB == RHS.BB && BBIndex == RHS.BBIndex;
+  }
+  bool operator<(const MCInstInBBReference &RHS) const {
+    if (BB != RHS.BB)
+      return BB < RHS.BB;
+    return BBIndex < RHS.BBIndex;
+  }
+  operator MCInst &() const {
+    assert(BB != nullptr);
+    return BB->getInstructionAtIndex(BBIndex);
+  }
+  uint64_t getAddress() const {
+    // 4 bytes per instruction on AArch64;
+    return BB->getFunction()->getAddress() + BB->getOffset() + BBIndex * 4;
+  }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBBReference &);
+
+struct MCInstInBFReference {
+  BinaryFunction *BF;
+  uint32_t Offset;
+  MCInstInBFReference(BinaryFunction *BF, uint32_t Offset)
+      : BF(BF), Offset(Offset) {}
+  MCInstInBFReference() : BF(nullptr) {}
+  bool operator==(const MCInstInBFReference &RHS) const {
+    return BF == RHS.BF && Offset == RHS.Offset;
+  }
+  bool operator<(const MCInstInBFReference &RHS) const {
+    if (BF != RHS.BF)
+      return BF < RHS.BF;
+    return Offset < RHS.Offset;
+  }
+  operator MCInst &() const {
+    assert(BF != nullptr);
+    return *(BF->getInstructionAtOffset(Offset));
+  }
+
+  uint64_t getOffset() const { return Offset; }
+
+  uint64_t getAddress() const {
+    return BF->getAddress() + getOffset();
+  }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
+
+struct MCInstReference {
+  enum StoredIn { _BinaryFunction, _BinaryBasicBlock };
+  StoredIn CurrentLocation;
+  union U {
+    MCInstInBBReference BBRef;
+    MCInstInBFReference BFRef;
+    U(MCInstInBBReference BBRef) : BBRef(BBRef) {}
+    U(MCInstInBFReference BFRef) : BFRef(BFRef) {}
+  } U;
+  MCInstReference(MCInstInBBReference BBRef)
+      : CurrentLocation(_BinaryBasicBlock), U(BBRef) {}
+  MCInstReference(MCInstInBFReference BFRef)
+      : CurrentLocation(_BinaryFunction), U(BFRef) {}
+  MCInstReference(BinaryBasicBlock *BB, int64_t BBIndex)
+      : MCInstReference(MCInstInBBReference(BB, BBIndex)) {}
+  MCInstReference(BinaryFunction *BF, uint32_t Offset)
+      : MCInstReference(MCInstInBFReference(BF, Offset)) {}
+
+  bool operator<(const MCInstReference &RHS) const {
+    if (CurrentLocation != RHS.CurrentLocation)
+      return CurrentLocation < RHS.CurrentLocation;
+    switch (CurrentLocation) {
+    case _BinaryBasicBlock:
+      return U.BBRef < RHS.U.BBRef;
+    case _BinaryFunction:
+      return U.BFRef < RHS.U.BFRef;
+    }
+    llvm_unreachable("");
+  }
+
+  bool operator==(const MCInstReference &RHS) const {
+    if (CurrentLocation != RHS.CurrentLocation)
+      return false;
+    switch (CurrentLocation) {
+    case _BinaryBasicBlock:
+      return U.BBRef == RHS.U.BBRef;
+    case _BinaryFunction:
+      return U.BFRef == RHS.U.BFRef;
+    }
+    llvm_unreachable("");
+  }
+
+  operator MCInst &() const {
+    switch (CurrentLocation) {
+    case _BinaryBasicBlock:
+      return U.BBRef;
+    case _BinaryFunction:
+      return U.BFRef;
+    }
+    llvm_unreachable("");
+  }
+
+  uint64_t getAddress() const {
+    switch (CurrentLocation) {
+    case _BinaryBasicBlock:
+      return U.BBRef.getAddress();
+    case _BinaryFunction:
+      return U.BFRef.getAddress();
+    }
+    llvm_unreachable("");
+  }
+
+  BinaryFunction *getFunction() const {
+    switch (CurrentLocation) {
+    case _BinaryFunction:
+      return U.BFRef.BF;
+    case _BinaryBasicBlock:
+      return U.BBRef.BB->getFunction();
+    }
+    llvm_unreachable("");
+  }
+
+  BinaryBasicBlock *getBasicBlock() const {
+    switch (CurrentLocation) {
+    case _BinaryFunction:
+      return nullptr;
+    case _BinaryBasicBlock:
+      return U.BBRef.BB;
+    }
+    llvm_unreachable("");
+  }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &);
+
+struct NonPacProtectedRetGadget {
+  MCInstReference RetInst;
+  std::vector<MCInstReference> OverwritingRetRegInst;
+  bool operator==(const NonPacProtectedRetGadget &RHS) const {
+    return RetInst == RHS.RetInst &&
+           OverwritingRetRegInst == RHS.OverwritingRetRegInst;
+  }
+  NonPacProtectedRetGadget(
+      MCInstReference RetInst,
+      const std::vector<MCInstReference>& OverwritingRetRegInst)
+      : RetInst(RetInst), OverwritingRetRegInst(OverwritingRetRegInst) {}
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetGadget &NPPRG);
+class PacRetAnalysis;
+
+class NonPacProtectedRetAnalysis : public BinaryFunctionPass {
+  void runOnFunction(BinaryFunction &Function,
+                     MCPlusBuilder::AllocatorIdTy AllocatorId);
+  SmallSet<MCPhysReg, 1>
+  computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF,
+                 MCPlusBuilder::AllocatorIdTy AllocatorId);
+  unsigned GadgetAnnotationIndex;
+
+public:
+  explicit NonPacProtectedRetAnalysis() : BinaryFunctionPass(false) {}
+
+  const char *getName() const override { return "non-pac-protected-rets"; }
+
+  /// Pass entry point
+  Error runOnFunctions(BinaryContext &BC) override;
+};
+
+} // namespace bolt
+} // namespace llvm
+
+#endif
\ No newline at end of file
diff --git a/bolt/include/bolt/Utils/CommandLineOpts.h b/bolt/include/bolt/Utils/CommandLineOpts.h
index 111eb650c37465..fefd7969ef6f4f 100644
--- a/bolt/include/bolt/Utils/CommandLineOpts.h
+++ b/bolt/include/bolt/Utils/CommandLineOpts.h
@@ -80,6 +80,10 @@ extern llvm::cl::opt<unsigned> Verbosity;
 /// Return true if we should process all functions in the binary.
 bool processAllFunctions();
 
+enum GadgetScannerKind { GS_PACRET, GS_ALL };
+
+extern llvm::cl::list<GadgetScannerKind> GadgetScannersToRun;
+
 } // namespace opts
 
 namespace llvm {
diff --git a/bolt/lib/Passes/CMakeLists.txt b/bolt/lib/Passes/CMakeLists.txt
index 1c1273b3d2420d..b3fbb0ba0108f4 100644
--- a/bolt/lib/Passes/CMakeLists.txt
+++ b/bolt/lib/Passes/CMakeLists.txt
@@ -23,6 +23,7 @@ add_llvm_library(LLVMBOLTPasses
   LoopInversionPass.cpp
   LivenessAnalysis.cpp
   MCF.cpp
+  NonPacProtectedRetAnalysis.cpp
   PatchEntries.cpp
   PettisAndHansen.cpp
   PLTCall.cpp
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
new file mode 100644
index 00000000000000..d6bc902f663fc9
--- /dev/null
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -0,0 +1,520 @@
+//===- bolt/Passes/NonPacProtectedRetAnalysis.cpp -------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements a pass that looks for any AArch64 return instructions
+// that may not be protected by PAuth authentication instructions when needed.
+//
+// When needed = the register used to return (almost always X30), is potentially
+// written to between the AUThentication instruction and the RETurn instruction.
+//
+//===----------------------------------------------------------------------===//
+
+#include "bolt/Passes/NonPacProtectedRetAnalysis.h"
+#include "bolt/Core/ParallelUtilities.h"
+#include "bolt/Passes/DataflowAnalysis.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/Support/Format.h"
+
+#define DEBUG_TYPE "bolt-nonpacprotectedret"
+
+namespace llvm {
+namespace bolt {
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBBReference &Ref) {
+  OS << "MCInstBBRef<";
+  if (Ref.BB == nullptr)
+    OS << "BB:(null)";
+  else
+    OS << "BB:" << Ref.BB->getName() << ":" << Ref.BBIndex;
+  OS << ">";
+  return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &Ref) {
+  OS << "MCInstBFRef<";
+  if (Ref.BF == nullptr)
+    OS << "BF:(null)";
+  else
+    OS << "BF:" << Ref.BF->getPrintName() << ":" << Ref.getOffset();
+  OS << ">";
+  return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
+  switch (Ref.CurrentLocation) {
+  case MCInstReference::_BinaryBasicBlock:
+    OS << Ref.U.BBRef;
+    return OS;
+  case MCInstReference::_BinaryFunction:
+    OS << Ref.U.BFRef;
+    return OS;
+  }
+  llvm_unreachable("");
+}
+
+raw_ostream &operator<<(raw_ostream &OS,
+                        const NonPacProtectedRetGadget &NPPRG) {
+  OS << "pac-ret-gadget<";
+  OS << "Ret:" << NPPRG.RetInst << ", ";
+  OS << "Overwriting:[";
+  for (auto Ref : NPPRG.OverwritingRetRegInst)
+    OS << Ref << " ";
+  OS << "]>";
+  return OS;
+}
+
+// The security property that is checked is:
+// When a register is used as the address to jump to in a return instruction,
+// that register must either:
+// (a) never be changed within this function, i.e. have the same value as when
+//     the function started, or
+// (b) have been last written by an authentication instruction.
+
+// This property is checked by using data flow analysis to keep track of which
+// registers have been written (def-ed), since last authenticated. Those are
+// exactly the registers containing values that should not be trusted (as they
+// could have changed since the last time they were authenticated). For pac-ret,
+// any return instruction using such a register is a gadget to be reported. For
+// PAuthABI, any indirect control flow using such a register should be reported?
+
+// This security property is verified using a dataflow analysis.
+
+// Furthermore, when producing a diagnostic for a found non-pac-ret protected
+// return, the analysis also lists the last instructions that wrote to the
+// register used in the return instruction.
+// The total set of registers used in return instructions in a given function is
+// small. It almost always is just `X30`.
+// In order to reduce the memory consumption of storing this additional state
+// during the dataflow analysis, this is computed by running the dataflow
+// analysis twice:
+// 1. In the first run, the dataflow analysis only keeps track of the security
+//    property: i.e. which registers have been overwritten since the last
+//    time they've been authenticated.
+// 2. If the first run finds any return instructions using a register that was
+//    last written by a non-authenticating instruction, the data flow
+//    analysis will be run a second time. The first run will return which
+//    registers are used in the gadgets to be reported. This information is
+//    used in the second run to also track which instructions last wrote to
+//    those registers.
+
+struct State {
+  /// A BitVector containing the registers that have been clobbered, and
+  /// not authenticated.
+  BitVector NonAutClobRegs;
+  /// A vector of sets, only used in the second data flow run.
+  /// Each element in the vector represents one register for which we
+  /// track the set of last instructions that wrote to this register.
+  /// For pac-ret analysis, the expectation is that almost all return
+  /// instructions only use register `X30`, and therefore, this vector
+  /// will have length 1 in the second run.
+  std::vector<SmallPtrSet<const MCInst *, 4>> LastInstWritingReg;
+  State() {}
+  State(uint16_t NumRegs, uint16_t NumRegsToTrack)
+      : NonAutClobRegs(NumRegs), LastInstWritingReg(NumRegsToTrack) {}
+  State &operator|=(const State &StateIn) {
+    NonAutClobRegs |= StateIn.NonAutClobRegs;
+    for (unsigned I = 0; I < LastInstWritingReg.size(); ++I)
+      for (const MCInst *J : StateIn.LastInstWritingReg[I])
+        LastInstWritingReg[I].insert(J);
+    return *this;
+  }
+  bool operator==(const State &RHS) const {
+    return NonAutClobRegs == RHS.NonAutClobRegs &&
+           LastInstWritingReg == RHS.LastInstWritingReg;
+  }
+  bool operator!=(const State &RHS) const { return !((*this) == RHS); }
+};
+
+static void printLastInsts(
+    raw_ostream &OS,
+    const std::vector<SmallPtrSet<const MCInst *, 4>> &LastInstWritingReg) {
+  OS << "Insts: ";
+  for (unsigned I = 0; I < LastInstWritingReg.size(); ++I) {
+    auto Set = LastInstWritingReg[I];
+    OS << "[" << I << "](";
+    for (const MCInst *MCInstP : Set)
+      OS << MCInstP << " ";
+    OS << ")";
+  }
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const State &S) {
+  OS << "pacret-state<";
+  OS << "NonAutClobRegs: " << S.NonAutClobRegs;
+  OS << "Insts: ";
+  printLastInsts(OS, S.LastInstWritingReg);
+  OS << ">";
+  return OS;
+}
+
+class PacStatePrinter {
+public:
+  void print(raw_ostream &OS, const State &State) const;
+  explicit PacStatePrinter(const BinaryContext &BC) : BC(BC) {}
+
+private:
+  const BinaryContext &BC;
+};
+
+void PacStatePrinter::print(raw_ostream &OS, const State &S) const {
+  RegStatePrinter RegStatePrinter(BC);
+  OS << "pacret-state<";
+  OS << "NonAutClobRegs: ";
+  RegStatePrinter.print(OS, S.NonAutClobRegs);
+  OS << "Insts: ";
+  printLastInsts(OS, S.LastInstWritingReg);
+  OS << ">";
+}
+
+class PacRetAnalysis
+    : public DataflowAnalysis<PacRetAnalysis, State, false /*Backward*/,
+                              PacStatePrinter> {
+  using Parent =
+      DataflowAnalysis<PacRetAnalysis, State, false, PacStatePrinter>;
+  friend Parent;
+
+public:
+  PacRetAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId,
+                 const std::vector<MCPhysReg> &RegsToTrackInstsFor)
+      : Parent(BF, AllocId), NumRegs(BF.getBinaryContext().MRI->getNumRegs()),
+        RegsToTrackInstsFor(RegsToTrackInstsFor),
+        TrackingLastInsts(RegsToTrackInstsFor.size() != 0),
+        Reg2StateIdx(RegsToTrackInstsFor.size() == 0
+                         ? 0
+                         : *std::max_element(RegsToTrackInstsFor.begin(),
+                                             RegsToTrackInstsFor.end()) +
+                               1,
+                     -1) {
+    for (unsigned I = 0; I < RegsToTrackInstsFor.size(); ++I)
+      Reg2StateIdx[RegsToTrackInstsFor[I]] = I;
+  }
+  virtual ~PacRetAnalysis() {}
+
+  void run() { Parent::run(); }
+
+protected:
+  const uint16_t NumRegs;
+  /// RegsToTrackInstsFor is the set of registers for which the dataflow
+  /// analysis must compute the set of instructions that last wrote to them.
+  const std::vector<MCPhysReg> RegsToTrackInstsFor;
+  const bool TrackingLastInsts;
+  /// Reg2StateIdx maps Register to the index in the vector used in State to
+  /// track which instructions last wrote to this register.
+  std::vector<uint16_t> Reg2StateIdx;
+
+  bool trackReg(MCPhysReg Reg) {
+    for (auto R : RegsToTrackInstsFor)
+      if (R == Reg)
+        return true;
+    return false;
+  }
+  uint16_t reg2StateIdx(MCPhysReg Reg) const {
+    assert(Reg < Reg2StateIdx.size());
+    return Reg2StateIdx[Reg];
+  }
+
+  void preflight() {}
+
+  State getStartingStateAtBB(const BinaryBasicBlock &BB) {
+    return State(NumRegs, RegsToTrackInstsFor.size());
+  }
+
+  State getStartingStateAtPoint(const MCInst &Point) {
+    return State(NumRegs, RegsToTrackInstsFor.size());
+  }
+
+  void doConfluence(State &StateOut, const State &StateIn) {
+    PacStatePrinter P(BC);
+    LLVM_DEBUG({
+      dbgs() << " PacRetAnalysis::Confluence(\n";
+      dbgs() << "   State 1: ";
+      P.print(dbgs(), StateOut);
+      dbgs() << "\n";
+      dbgs() << "   State 2: ";
+      P.print(dbgs(), StateIn);
+      dbgs() << ")\n";
+    });
+
+    StateOut |= StateIn;
+
+    LLVM_DEBUG({
+      dbgs() << "   merged state: ";
+      P.print(dbgs(), StateOut);
+      dbgs() << "\n";
+    });
+  }
+
+  State computeNext(const MCInst &Point, const State &Cur) {
+    PacStatePrinter P(BC);
+    LLVM_DEBUG({
+      dbgs() << " PacRetAnalysis::ComputeNext(";
+      BC.InstPrinter->printInst(&const_cast<MCInst &>(Point), 0, "", *BC.STI,
+                                dbgs());
+      dbgs() << ", ";
+      P.print(dbgs(), Cur);
+      dbgs() << ")\n";
+    });
+
+    State Next = Cur;
+    BitVector Written = BitVector(NumRegs, false);
+    BC.MIB->getWrittenRegs(Point, Written);
+    Next.NonAutClobRegs |= Written;
+    // Keep track of this instruction if it writes to any of the registers we
+    // need to track that for:
+    if (TrackingLastInsts) {
+      for (auto WrittenReg : Written.set_bits()) {
+        if (trackReg(WrittenReg)) {
+          Next.LastInstWritingReg[reg2StateIdx(WrittenReg)] = {};
+          Next.LastInstWritingReg[reg2StateIdx(WrittenReg)].insert(&Point);
+        }
+      }
+    }
+
+    MCPhysReg AutReg = BC.MIB->getAuthenticatedReg(Point);
+    if (AutReg != BC.MIB->getNoRegister()) {
+      Next.NonAutClobRegs.reset(
+          BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
+      if (TrackingLastInsts && trackReg(AutReg))
+        Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
+    }
+
+    LLVM_DEBUG({
+      dbgs() << "  .. result: (";
+      P.print(dbgs(), Next);
+      dbgs() << ")\n";
+    });
+
+    return Next;
+  }
+
+  StringRef getAnnotationName() const { return StringRef("PacRetAnalysis"); }
+
+public:
+  std::vector<MCInstReference>
+  getLastClobberingInsts(const MCInst &Ret, BinaryFunction &BF,
+                         const BitVector &DirtyRawRegs) const {
+    if (!TrackingLastInsts)
+      return {};
+    if (auto MaybeState = getStateAt(Ret)) {
+      const State &S = *MaybeState;
+      // Due to aliasing registers, multiple registers may have
+      // been tracked.
+      std::set<const MCInst *> LastWritingInsts;
+      for (MCPhysReg TrackedReg : DirtyRawRegs.set_bits()) {
+        for (const MCInst *LastInstWriting :
+             S.LastInstWritingReg[reg2StateIdx(TrackedReg)])
+          LastWritingInsts.insert(LastInstWriting);
+      }
+      std::vector<MCInstReference> Result;
+      for (const MCInst *LastInstWriting : LastWritingInsts) {
+        MCInstInBBReference Ref = MCInstInBBReference::get(LastInstWriting, BF);
+        assert(Ref.BB != nullptr && "Expected Inst to be found");
+        Result.push_back(MCInstReference(Ref));
+      }
+      return Result;
+    }
+    llvm_unreachable("Expected State to be present");
+  }
+};
+
+SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
+    PacRetAnalysis &PRA, BinaryFunction &BF,
+    MCPlusBuilder::AllocatorIdTy AllocatorId) {
+  PRA.run();
+  LLVM_DEBUG({
+    dbgs() << " After PacRetAnalysis:\n";
+    BF.dump();
+  });
+  // Now scan the CFG for non-authenticating return instructions that use an
+  // overwritten, non-authenticated register as return address.
+  SmallSet<MCPhysReg, 1> RetRegsWithGadgets;
+  BinaryContext &BC = BF.getBinaryContext();
+  for (BinaryBasicBlock &BB : BF) {
+    for (int64_t I = BB.size() - 1; I >= 0; --I) {
+      MCInst &Inst = BB.getInstructionAtIndex(I);
+      if (BC.MIB->isReturn(Inst)) {
+        MCPhysReg RetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+        LLVM_DEBUG({
+          dbgs() << "  Found RET inst: ";
+          BC.printInstruction(dbgs(), Inst);
+          dbgs() << "    RetReg: " << BC.MRI->getName(RetReg)
+                 << "; authenticatesReg: "
+                 << BC.MIB->isAuthenticationOfReg(Inst, RetReg) << "\n";
+        });
+        if (BC.MIB->isAuthenticationOfReg(Inst, RetReg))
+          break;
+        BitVector DirtyRawRegs = PRA.getStateAt(Inst)->NonAutClobRegs;
+        LLVM_DEBUG({
+          dbgs() << "  DirtyRawRegs at Ret: ";
+          RegStatePrinter RSP(BC);
+          RSP.print(dbgs(), DirtyRawRegs);
+          dbgs() << "\n";
+        });
+        DirtyRawRegs &= BC.MIB->getAliases(RetReg, /*OnlySmaller=*/true);
+        LLVM_DEBUG({
+          dbgs() << "  Intersection with RetReg: ";
+          RegStatePrinter RSP(BC);
+          RSP.print(dbgs(), DirtyRawRegs);
+          dbgs() << "\n";
+        });
+        if (DirtyRawRegs.any()) {
+          // This return instruction needs to be reported.
+          // First remove the annotation that the first, fast run of
+          // the dataflow analysis may have added.
+          if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex))
+            BC.MIB->removeAnnotation(Inst, GadgetAnnotationIndex);
+          BC.MIB->addAnnotation(
+              Inst, GadgetAnnotationIndex,
+              NonPacProtectedRetGadget(
+                  MCInstInBBReference(&BB, I),
+                  PRA.getLastClobberingInsts(Inst, BF, DirtyRawRegs)),
+              AllocatorId);
+          LLVM_DEBUG({
+            dbgs() << "  Added gadget info annotation: ";
+            BC.printInstruction(dbgs(), Inst, 0, &BF);
+            dbgs() << "\n";
+          });
+          for (MCPhysReg RetRegWithGadget : DirtyRawRegs.set_bits())
+            RetRegsWithGadgets.insert(RetRegWithGadget);
+        }
+      }
+    }
+  }
+  return RetRegsWithGadgets;
+}
+
+void NonPacProtectedRetAnalysis::runOnFunction(
+    BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) {
+  LLVM_DEBUG({
+    dbgs() << "Analyzing in function " << BF.getPrintName() << ", AllocatorId "
+           << AllocatorId << "\n";
+  });
+  LLVM_DEBUG({ BF.dump(); });
+
+  if (BF.hasCFG()) {
+    PacRetAnalysis PRA(BF, AllocatorId, {});
+    SmallSet<MCPhysReg, 1> RetRegsWithGadgets =
+        computeDfState(PRA, BF, AllocatorId);
+    if (!RetRegsWithGadgets.empty()) {
+      // Redo the analysis, but now also track which instructions last wrote
+      // to any of the registers in RetRegsWithGadgets, so that better
+      // diagnostics can be produced.
+      std::vector<MCPhysReg> RegsToTrack;
+      for (MCPhysReg R : RetRegsWithGadgets)
+        RegsToTrack.push_back(R);
+      PacRetAnalysis PRWIA(BF, AllocatorId, RegsToTrack);
+      // Only the annotations added by this second run are needed.
+      computeDfState(PRWIA, BF, AllocatorId);
+    }
+  }
+}
+
+void printBB(const BinaryContext &BC, const BinaryBasicBlock *BB,
+             size_t StartIndex = 0, size_t EndIndex = -1) {
+  if (EndIndex == (size_t)-1)
+    EndIndex = BB->size() - 1;
+  const BinaryFunction *BF = BB->getFunction();
+  for (size_t I = StartIndex; I <= EndIndex; ++I) {
+    uint64_t Address = BB->getOffset() + BF->getAddress() + 4 * I;
+    const MCInst &Inst = BB->getInstructionAtIndex(I);
+    if (BC.MIB->isCFI(Inst))
+      continue;
+    BC.printInstruction(outs(), Inst, Address, BF);
+  }
+}
+
+void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
+                                                const MCInstReference OverwInst,
+                                                const MCInstReference RetInst) {
+  BinaryBasicBlock *BB = RetInst.getBasicBlock();
+  assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+  assert(RetInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+  MCInstInBBReference OverwInstBB = OverwInst.U.BBRef;
+  if (BB == OverwInstBB.BB) {
+    // The overwriting and return instructions are in the same basic block.
+    assert(OverwInstBB.BBIndex < RetInst.U.BBRef.BBIndex);
+    outs() << "  This happens in the following basic block:\n";
+    printBB(BC, BB);
+  }
+}
+
+void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
+                       unsigned int GadgetAnnotationIndex) {
+  auto NPPRG = BC.MIB->getAnnotationAs<NonPacProtectedRetGadget>(
+      Inst, GadgetAnnotationIndex);
+  MCInstReference RetInst = NPPRG.RetInst;
+  BinaryFunction *BF = RetInst.getFunction();
+  BinaryBasicBlock *BB = RetInst.getBasicBlock();
+
+  outs() << "\nGS-PACRET: " << "non-protected ret found in function "
+         << BF->getPrintName();
+  if (BB)
+    outs() << ", basic block " << BB->getName();
+  outs() << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
+  outs() << "  The return instruction is ";
+  BC.printInstruction(outs(), RetInst, RetInst.getAddress(), BF);
+  outs() << "  The " << NPPRG.OverwritingRetRegInst.size()
+         << " instructions that write to the return register after any "
+            "authentication are:\n";
+  // Sort by address to ensure output is deterministic.
+  std::sort(std::begin(NPPRG.OverwritingRetRegInst),
+            std::end(NPPRG.OverwritingRetRegInst),
+            [](const MCInstReference &A, const MCInstReference &B) {
+              return A.getAddress() < B.getAddress();
+            });
+  for (unsigned I = 0; I < NPPRG.OverwritingRetRegInst.size(); ++I) {
+    MCInstReference InstRef = NPPRG.OverwritingRetRegInst[I];
+    outs() << "  " << (I + 1) << ". ";
+    BC.printInstruction(outs(), InstRef, InstRef.getAddress(), BF);
+  }
+  LLVM_DEBUG({
+    dbgs() << "  .. OverWritingRetRegInst:\n";
+    for (MCInstReference Ref : NPPRG.OverwritingRetRegInst) {
+      dbgs() << "    " << Ref << "\n";
+    }
+  });
+  if (NPPRG.OverwritingRetRegInst.size() == 1) {
+    const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
+    assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+    reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
+  }
+}
+
+Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
+  GadgetAnnotationIndex = BC.MIB->getOrCreateAnnotationIndex("pacret-gadget");
+
+  ParallelUtilities::WorkFuncWithAllocTy WorkFun =
+      [&](BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) {
+        runOnFunction(BF, AllocatorId);
+      };
+
+  ParallelUtilities::PredicateTy SkipFunc = [&](const BinaryFunction &BF) {
+    return false; // BF.shouldPreserveNops();
+  };
+
+  ParallelUtilities::runOnEachFunctionWithUniqueAllocId(
+      BC, ParallelUtilities::SchedulingPolicy::SP_INST_LINEAR, WorkFun,
+      SkipFunc, "NonPacProtectedRetAnalysis");
+
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions())
+    if (BF->hasCFG()) {
+      for (BinaryBasicBlock &BB : *BF) {
+        for (int64_t I = BB.size() - 1; I >= 0; --I) {
+          MCInst &Inst = BB.getInstructionAtIndex(I);
+          if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex)) {
+            reportFoundGadget(BC, Inst, GadgetAnnotationIndex);
+          }
+        }
+      }
+    }
+  return Error::success();
+}
+
+} // namespace bolt
+} // namespace llvm
\ No newline at end of file
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 4329235d470497..0a45c35f02da32 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -20,6 +20,7 @@
 #include "bolt/Passes/BinaryPasses.h"
 #include "bolt/Passes/CacheMetrics.h"
 #include "bolt/Passes/IdenticalCodeFolding.h"
+#include "bolt/Passes/NonPacProtectedRetAnalysis.h"
 #include "bolt/Passes/ReorderFunctions.h"
 #include "bolt/Profile/BoltAddressTranslation.h"
 #include "bolt/Profile/DataAggregator.h"
@@ -245,6 +246,12 @@ static cl::opt<bool> WriteBoltInfoSection(
     "bolt-info", cl::desc("write bolt info section in the output binary"),
     cl::init(true), cl::Hidden, cl::cat(BoltOutputCategory));
 
+cl::list<GadgetScannerKind> GadgetScannersToRun(
+    "scanners", cl::desc("which gadget scanners to run"),
+    cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
+               clEnumValN(GS_ALL, "all", "all")),
+    cl::ZeroOrMore, cl::CommaSeparated, cl::cat(BinaryAnalysisCategory));
+
 } // namespace opts
 
 // FIXME: implement a better way to mark sections for replacement.
@@ -3490,7 +3497,23 @@ void RewriteInstance::runOptimizationPasses() {
   BC->logBOLTErrorsAndQuitOnFatal(BinaryFunctionPassManager::runAllPasses(*BC));
 }
 
-void RewriteInstance::runBinaryAnalyses() {}
+void RewriteInstance::runBinaryAnalyses() {
+  NamedRegionTimer T("runBinaryAnalyses", "run binary analysis passes",
+                     TimerGroupName, TimerGroupDesc, opts::TimeRewrite);
+  BinaryFunctionPassManager Manager(*BC);
+  // FIXME: add a pass that warns about functions that do not have a CFG, as
+  // the analysis is likely to be less accurate for those.
+  using GSK = opts::GadgetScannerKind;
+  // If no command line option was given, act as if "all" was specified.
+  if (opts::GadgetScannersToRun.size() == 0)
+    opts::GadgetScannersToRun.addValue(GSK::GS_ALL);
+  for (GSK ScannerToRun : opts::GadgetScannersToRun) {
+    if (ScannerToRun == GSK::GS_PACRET || ScannerToRun == GSK::GS_ALL)
+      Manager.registerPass(std::make_unique<NonPacProtectedRetAnalysis>());
+  }
+
+  BC->logBOLTErrorsAndQuitOnFatal(Manager.runPasses());
+}
 
 void RewriteInstance::preregisterSections() {
   // Preregister sections before emission to set their order in the output.
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 679c9774c767f7..1840750822aa3f 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -148,6 +148,69 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     return false;
   }
 
+  MCPhysReg getAuthenticatedReg(const MCInst &Inst) const override {
+    switch (Inst.getOpcode()) {
+    case AArch64::AUTIAZ:
+    case AArch64::AUTIBZ:
+    case AArch64::AUTIASP:
+    case AArch64::AUTIBSP:
+    case AArch64::RETAA:
+    case AArch64::RETAB:
+      return AArch64::LR;
+    case AArch64::AUTIA1716:
+    case AArch64::AUTIB1716:
+      return AArch64::X17;
+    case AArch64::ERETAA:
+    case AArch64::ERETAB:
+      return AArch64::LR;
+
+    case AArch64::AUTIA:
+    case AArch64::AUTIB:
+    case AArch64::AUTDA:
+    case AArch64::AUTDB:
+    case AArch64::AUTIZA:
+    case AArch64::AUTIZB:
+    case AArch64::AUTDZA:
+    case AArch64::AUTDZB:
+      assert(Inst.getOperand(0).isReg());
+      return Inst.getOperand(0).getReg();
+
+    default:
+      return getNoRegister();
+    }
+  }
+
+  bool isAuthenticationOfReg(const MCInst &Inst,
+                             const unsigned RegAuthenticated) const override {
+    if (RegAuthenticated == getNoRegister())
+      return false;
+    return getAuthenticatedReg(Inst) == RegAuthenticated;
+  }
+
+  llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
+    assert(isReturn(Inst));
+    switch (Inst.getOpcode()) {
+    case AArch64::RET:
+      // RET reads exactly one register, which holds the return target, so
+      // the first register operand is the one used as the jump destination.
+      for (unsigned OpIdx = 0, EndIdx = Inst.getNumOperands(); OpIdx < EndIdx;
+           ++OpIdx) {
+        const MCOperand &MO = Inst.getOperand(OpIdx);
+        if (MO.isReg())
+          return MO.getReg();
+      }
+      return getNoRegister();
+    case AArch64::RETAA:
+    case AArch64::RETAB:
+    case AArch64::ERETAA:
+    case AArch64::ERETAB:
+    case AArch64::ERET:
+      return AArch64::LR;
+    default:
+      llvm_unreachable("Unhandled return instruction");
+    }
+  }
+
   bool isADRP(const MCInst &Inst) const override {
     return Inst.getOpcode() == AArch64::ADRP;
   }
diff --git a/bolt/test/binary-analysis/AArch64/cmdline-args.test b/bolt/test/binary-analysis/AArch64/cmdline-args.test
index e414818644a3b0..1204d5b1289af4 100644
--- a/bolt/test/binary-analysis/AArch64/cmdline-args.test
+++ b/bolt/test/binary-analysis/AArch64/cmdline-args.test
@@ -13,7 +13,7 @@ NONEXISTINGFILEARG:       llvm-bolt-binary-analysis: 'non-existing-file': No suc
 RUN: not llvm-bolt-binary-analysis %p/Inputs/dummy.txt 2>&1 | FileCheck -check-prefix=NOELFFILEARG %s
 NOELFFILEARG:       llvm-bolt-binary-analysis: '{{.*}}/Inputs/dummy.txt': The file was not recognized as a valid object file.
 
-RUN: %clang %cflags %p/../../Inputs/asm_foo.s %p/../../Inputs/asm_main.c -o %t.exe
+RUN: %clang %cflags -Wl,--emit-relocs %p/../../Inputs/asm_foo.s %p/../../Inputs/asm_main.c -o %t.exe
 RUN: llvm-bolt-binary-analysis %t.exe 2>&1 | FileCheck -check-prefix=VALIDELFFILEARG --allow-empty %s
 # Check that there are no BOLT-WARNING or BOLT-ERROR output lines
 VALIDELFFILEARG:     BOLT-INFO:
@@ -30,4 +30,10 @@ HELP-NEXT:  USAGE: llvm-bolt-binary-analysis [options] <executable>
 HELP-EMPTY:
 HELP-NEXT:  OPTIONS:
 HELP-EMPTY:
+HELP-NEXT:  BinaryAnalysis options:
+HELP-EMPTY:
+HELP-NEXT:   --scanners=<value> - which gadget scanners to run
+HELP-NEXT:   =pacret - pac-ret
+HELP-NEXT:   =all - all
+HELP-EMPTY:
 HELP-NEXT:  Generic Options:
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
new file mode 100644
index 00000000000000..8731bb46362867
--- /dev/null
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -0,0 +1,681 @@
+// RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck -check-prefix=CHECK --allow-empty %s
+
+        .text
+
+        .globl  f1
+        .type   f1,@function
+f1:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        // autiasp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f1, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f1, .-f1
+
+
+        .globl  f_intermediate_overwrite1
+        .type   f_intermediate_overwrite1,@function
+f_intermediate_overwrite1:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        autiasp
+        ldp     x29, x30, [sp], #16
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite1, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_intermediate_overwrite1, .-f_intermediate_overwrite1
+
+        .globl  f_intermediate_overwrite2
+        .type   f_intermediate_overwrite2,@function
+f_intermediate_overwrite2:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autiasp
+        mov     x30, x0
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite2, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: mov     x30, x0
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x30, x0
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_intermediate_overwrite2, .-f_intermediate_overwrite2
+
+        .globl  f_intermediate_read
+        .type   f_intermediate_read,@function
+f_intermediate_read:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autiasp
+        mov     x0, x30
+// CHECK-NOT: function f_intermediate_read
+        ret
+        .size f_intermediate_read, .-f_intermediate_read
+
+        .globl  f_intermediate_overwrite3
+        .type   f_intermediate_overwrite3,@function
+f_intermediate_overwrite3:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autiasp
+        mov     w30, w0
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite3, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: mov     w30, w0
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     w30, w0
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_intermediate_overwrite3, .-f_intermediate_overwrite3
+
+        .globl  f_nonx30_ret
+        .type   f_nonx30_ret,@function
+f_nonx30_ret:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        mov     x16, x30
+        autiasp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_nonx30_ret, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret     x16
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: mov     x16, x30
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x16, x30
+// CHECK-NEXT: {{[0-9a-f]+}}:   autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret     x16
+        ret     x16
+        .size f_nonx30_ret, .-f_nonx30_ret
+
+
+/// Now do a basic sanity check on every different Authentication instruction:
+
+        .globl  f_autiasp
+        .type   f_autiasp,@function
+f_autiasp:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autiasp
+// CHECK-NOT: function f_autiasp
+        ret
+        .size f_autiasp, .-f_autiasp
+
+        .globl  f_autibsp
+        .type   f_autibsp,@function
+f_autibsp:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autibsp
+// CHECK-NOT: function f_autibsp
+        ret
+        .size f_autibsp, .-f_autibsp
+
+        .globl  f_autiaz
+        .type   f_autiaz,@function
+f_autiaz:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autiaz
+// CHECK-NOT: function f_autiaz
+        ret
+        .size f_autiaz, .-f_autiaz
+
+        .globl  f_autibz
+        .type   f_autibz,@function
+f_autibz:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autibz
+// CHECK-NOT: function f_autibz
+        ret
+        .size f_autibz, .-f_autibz
+
+        .globl  f_autia1716
+        .type   f_autia1716,@function
+f_autia1716:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autia1716
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autia1716, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autia1716
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autia1716, .-f_autia1716
+
+        .globl  f_autib1716
+        .type   f_autib1716,@function
+f_autib1716:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autib1716
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autib1716, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autib1716
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autib1716, .-f_autib1716
+
+        .globl  f_autiax12
+        .type   f_autiax12,@function
+f_autiax12:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autia   x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autiax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autia   x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autiax12, .-f_autiax12
+
+        .globl  f_autibx12
+        .type   f_autibx12,@function
+f_autibx12:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autib   x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autibx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autib   x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autibx12, .-f_autibx12
+
+        .globl  f_autiax30
+        .type   f_autiax30,@function
+f_autiax30:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autia   x30, sp
+// CHECK-NOT: function f_autiax30
+        ret
+        .size f_autiax30, .-f_autiax30
+
+        .globl  f_autibx30
+        .type   f_autibx30,@function
+f_autibx30:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autib   x30, sp
+// CHECK-NOT: function f_autibx30
+        ret
+        .size f_autibx30, .-f_autibx30
+
+
+        .globl  f_autdax12
+        .type   f_autdax12,@function
+f_autdax12:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autda   x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autda   x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autdax12, .-f_autdax12
+
+        .globl  f_autdbx12
+        .type   f_autdbx12,@function
+f_autdbx12:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autdb   x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autdb   x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autdbx12, .-f_autdbx12
+
+        .globl  f_autdax30
+        .type   f_autdax30,@function
+f_autdax30:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autda   x30, sp
+// CHECK-NOT: function f_autdax30
+        ret
+        .size f_autdax30, .-f_autdax30
+
+        .globl  f_autdbx30
+        .type   f_autdbx30,@function
+f_autdbx30:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autdb   x30, sp
+// CHECK-NOT: function f_autdbx30
+        ret
+        .size f_autdbx30, .-f_autdbx30
+
+
+        .globl  f_autizax12
+        .type   f_autizax12,@function
+f_autizax12:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autiza  x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autizax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autiza  x12
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autizax12, .-f_autizax12
+
+        .globl  f_autizbx12
+        .type   f_autizbx12, at function
+f_autizbx12:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autizb  x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autizbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g at PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autizb  x12
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autizbx12, .-f_autizbx12
+
+        .globl  f_autizax30
+        .type   f_autizax30, at function
+f_autizax30:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autiza  x30
+// CHECK-NOT: function f_autizax30
+        ret
+        .size f_autizax30, .-f_autizax30
+
+        .globl  f_autizbx30
+        .type   f_autizbx30, at function
+f_autizbx30:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autizb  x30
+// CHECK-NOT: function f_autizbx30
+        ret
+        .size f_autizbx30, .-f_autizbx30
+
+
+        .globl  f_autdzax12
+        .type   f_autdzax12, at function
+f_autdzax12:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autdza  x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdzax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g at PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autdza  x12
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autdzax12, .-f_autdzax12
+
+        .globl  f_autdzbx12
+        .type   f_autdzbx12, at function
+f_autdzbx12:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autdzb  x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdzbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g at PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autdzb  x12
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autdzbx12, .-f_autdzbx12
+
+        .globl  f_autdzax30
+        .type   f_autdzax30, at function
+f_autdzax30:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autdza  x30
+// CHECK-NOT: function f_autdzax30
+        ret
+        .size f_autdzax30, .-f_autdzax30
+
+        .globl  f_autdzbx30
+        .type   f_autdzbx30, at function
+f_autdzbx30:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autdzb  x30
+// CHECK-NOT: function f_autdzbx30
+        ret
+        .size f_autdzbx30, .-f_autdzbx30
+
+        .globl  f_retaa
+        .type   f_retaa, at function
+f_retaa:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+// CHECK-NOT: function f_retaa
+        retaa
+        .size f_retaa, .-f_retaa
+
+        .globl  f_retab
+        .type   f_retab, at function
+f_retab:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+// CHECK-NOT: function f_retab
+        retab
+        .size f_retab, .-f_retab
+
+        .globl  f_eretaa
+        .type   f_eretaa, at function
+f_eretaa:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+// CHECK-NOT: function f_eretaa
+        eretaa
+        .size f_eretaa, .-f_eretaa
+
+        .globl  f_eretab
+        .type   f_eretab, at function
+f_eretab:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+// CHECK-NOT: function f_eretab
+        eretab
+        .size f_eretab, .-f_eretab
+
+        .globl  f_eret
+        .type   f_eret, at function
+f_eret:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_eret, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       eret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g at PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   eret
+        eret
+        .size f_eret, .-f_eret
+
+        .globl f_movx30reg
+        .type   f_movx30reg, at function
+f_movx30reg:
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_movx30reg, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: mov x30, x22
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x30, x22
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        mov     x30, x22
+        ret
+        .size f_movx30reg, .-f_movx30reg
+
+// TODO: add test to see if registers clobbered by a call are picked up.
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
new file mode 100644
index 00000000000000..402bdf6591dcb1
--- /dev/null
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
@@ -0,0 +1,46 @@
+// RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck -check-prefix=CHECK --allow-empty %s
+
+
+// Verify that we can also detect gadgets across basic blocks
+
+        .globl f_crossbb1
+        .type   f_crossbb1, at function
+f_crossbb1:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        ldp     x29, x30, [sp], #16
+        cbnz    x0, 1f
+        autiasp
+1:
+        ret
+        .size f_crossbb1, .-f_crossbb1
+// CHECK-LABEL:     GS-PACRET: non-protected ret found in function f_crossbb1, basic block .L{{[^,]+}}, at address
+// CHECK-NEXT:  The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:  The 2 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:  1.     {{[0-9a-f]+}}:      ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  2.     {{[0-9a-f]+}}:      autiasp
+
+// A test that checks that the dataflow state tracking seems to work when
+// merging BBs:
+        .globl f_mergebb1
+        .type   f_mergebb1, at function
+f_mergebb1:
+        hint    #25
+2:
+        stp     x29, x30, [sp, #-16]!
+        ldp     x29, x30, [sp], #16
+        sub     x0, x0, #1
+        cbnz    x0, 1f
+        autiasp
+        b       2b
+1:
+        ret
+        .size f_mergebb1, .-f_mergebb1
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_mergebb1, basic block .L{{[^,]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1.     {{[0-9a-f]+}}:      ldp     x29, x30, [sp], #0x10
+
+// TODO: also verify that false negatives exist in across-BB gadgets in functions
+// for which bolt cannot reconstruct the call graph.
diff --git a/bolt/test/binary-analysis/AArch64/lit.local.cfg b/bolt/test/binary-analysis/AArch64/lit.local.cfg
index 6f247dd52e82f8..6ac7d3cc0fec71 100644
--- a/bolt/test/binary-analysis/AArch64/lit.local.cfg
+++ b/bolt/test/binary-analysis/AArch64/lit.local.cfg
@@ -1,7 +1,7 @@
 if "AArch64" not in config.root.targets:
     config.unsupported = True
 
-flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding -Wl,--emit-relocs"
+flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding"
 
 config.substitutions.insert(0, ("%cflags", f"%cflags {flags}"))
 config.substitutions.insert(0, ("%cxxflags", f"%cxxflags {flags}"))
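
The cross-basic-block tests in gs-pacret-multi-bb.s depend on how state is merged where two paths meet. The sketch below is a toy model, not the BOLT implementation: `RegSet`, `join`, `afterWrite`, `afterAuth` and `retIsProtected` are hypothetical names, and the merge is assumed to be a plain union of "clobbered after authentication" sets, which is what the expected report for `f_crossbb1` suggests.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical state: bit N is set when register xN was last written by a
// non-authenticating instruction on at least one path reaching this point.
using RegSet = std::bitset<32>;
constexpr int X30 = 30;

// Conservative merge at a join point: clobbered on any incoming path means
// clobbered after the merge.
RegSet join(const RegSet &A, const RegSet &B) { return A | B; }

// A plain write (e.g. ldp x29, x30, ...) marks the register as clobbered.
RegSet afterWrite(RegSet S, int Reg) {
  S.set(Reg);
  return S;
}

// An authenticating instruction (e.g. autiasp) clears the clobbered mark.
RegSet afterAuth(RegSet S, int Reg) {
  S.reset(Reg);
  return S;
}

// A return is considered protected only if its destination register is not
// marked as clobbered in the state that reaches it.
bool retIsProtected(const RegSet &StateAtRet, int RetReg) {
  return !StateAtRet.test(RetReg);
}
```

In `f_crossbb1`, the `cbnz` path reaches `ret` with x30 still clobbered by `ldp`, while the fall-through path passes through `autiasp`. Joining the two states keeps x30 marked, so the `ret` is reported.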

>From 35e1c44f36869c0378e932dd4d9a2562b9404d0f Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 9 Jan 2025 15:53:57 +0000
Subject: [PATCH 02/14] format cleanup

---
 bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h |  8 +++-----
 bolt/lib/Rewrite/RewriteInstance.cpp                  | 11 ++++++-----
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index a9740896369fa2..21d089ed7b8506 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -31,7 +31,7 @@ struct MCInstInBBReference {
       : BB(BB), BBIndex(BBIndex) {}
   MCInstInBBReference() : BB(nullptr), BBIndex(0) {}
   static MCInstInBBReference get(const MCInst *Inst, BinaryFunction &BF) {
-    for (BinaryBasicBlock& BB : BF)
+    for (BinaryBasicBlock &BB : BF)
       for (size_t I = 0; I < BB.size(); ++I)
         if (Inst == &(BB.getInstructionAtIndex(I)))
           return MCInstInBBReference(&BB, I);
@@ -78,9 +78,7 @@ struct MCInstInBFReference {
 
   uint64_t getOffset() const { return Offset; }
 
-  uint64_t getAddress() const {
-    return BF->getAddress() + getOffset();
-  }
+  uint64_t getAddress() const { return BF->getAddress() + getOffset(); }
 };
 
 raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
@@ -179,7 +177,7 @@ struct NonPacProtectedRetGadget {
   }
   NonPacProtectedRetGadget(
       MCInstReference RetInst,
-      const std::vector<MCInstReference>& OverwritingRetRegInst)
+      const std::vector<MCInstReference> &OverwritingRetRegInst)
       : RetInst(RetInst), OverwritingRetRegInst(OverwritingRetRegInst) {}
 };
 
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 0a45c35f02da32..135cfc63de4d37 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -246,11 +246,12 @@ static cl::opt<bool> WriteBoltInfoSection(
     "bolt-info", cl::desc("write bolt info section in the output binary"),
     cl::init(true), cl::Hidden, cl::cat(BoltOutputCategory));
 
-cl::list<GadgetScannerKind> GadgetScannersToRun(
-    "scanners", cl::desc("which gadget scanners to run"),
-    cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
-               clEnumValN(GS_ALL, "all", "all")),
-    cl::ZeroOrMore, cl::CommaSeparated, cl::cat(BinaryAnalysisCategory));
+cl::list<GadgetScannerKind>
+    GadgetScannersToRun("scanners", cl::desc("which gadget scanners to run"),
+                        cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
+                                   clEnumValN(GS_ALL, "all", "all")),
+                        cl::ZeroOrMore, cl::CommaSeparated,
+                        cl::cat(BinaryAnalysisCategory));
 
 } // namespace opts
 

>From 3b1bd2458e4bc3fe9e05c88da1b2c20ac65f8bea Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Fri, 10 Jan 2025 14:32:42 +0000
Subject: [PATCH 03/14] Address first round of review feedback by Peter Smith

---
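
For reviewers less familiar with pac-ret, the security property that BinaryAnalysis.md describes can be modelled on a single straight-line block. This is only a minimal sketch under stated assumptions: `Kind`, `Inst` and `allReturnsProtected` are hypothetical names rather than BOLT APIs, control flow is ignored, and registers are plain indices.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified instruction model; this is not BOLT's MCInst.
enum class Kind { Write, Authenticate, Ret, AuthRet, Other };

struct Inst {
  Kind K;
  int Reg; // register written, authenticated, or returned through (0..31)
};

constexpr int X12 = 12, X30 = 30;

// Returns true if every plain return in this straight-line sequence uses a
// destination register that was either never written, or whose last write
// was by an authenticating instruction.
bool allReturnsProtected(const std::vector<Inst> &Insts) {
  bool Clobbered[32] = {}; // true: last write was non-authenticating
  for (const Inst &I : Insts) {
    switch (I.K) {
    case Kind::Write:
      Clobbered[I.Reg] = true;
      break;
    case Kind::Authenticate:
      Clobbered[I.Reg] = false;
      break;
    case Kind::Ret:
      if (Clobbered[I.Reg])
        return false;
      break;
    case Kind::AuthRet: // e.g. RETAA: authenticates and returns in one step
    case Kind::Other:
      break;
    }
  }
  return true;
}
```

Authenticating a different register does not help: a sequence that reloads x30, authenticates x12 and then returns through x30 is still rejected, which mirrors the `autiza x12` style test cases.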
 bolt/docs/BinaryAnalysis.md | 53 ++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/bolt/docs/BinaryAnalysis.md b/bolt/docs/BinaryAnalysis.md
index e247ce20b4625c..17e45513462af2 100644
--- a/bolt/docs/BinaryAnalysis.md
+++ b/bolt/docs/BinaryAnalysis.md
@@ -28,39 +28,46 @@ pac-ret, shadow stacks, arm64e, and many more.
 
 These mitigations guarantee a so-called "security property" on the binaries they
 produce. For example, for stack canaries, the security property is roughly that
-a canary is located on the stack between the set of saved variables and set of
-local variables. For pac-ret, it is roughly that there are no writes to the
-register containing the return address after either an authenticating
-instruction or a Branch-and-link instruction.
+a canary is located on the stack between the set of saved variables and the set
+of local variables. For pac-ret, it is roughly that either the return address is
+never stored/retrieved to/from memory; or, there are no writes to the register
+containing the return address between an instruction authenticating it and a
+return instruction using it.
 
 From time to time, however, a bug gets found in the implementation of such
 mitigations in toolchains. Also, code that is written in assembler by hand
 requires the developer to ensure these security properties by hand.
 
 In short, it is sometimes found that a few places in the binary code are not
-protected as expected given the requested mitigations. Attackers could make use
-of those places (sometimes called gadgets) to circumvent to protection that the
-mitigation should give.
+protected as well as expected given the requested mitigations. Attackers could
+make use of those places (sometimes called gadgets) to circumvent the protection
+that the mitigation should give.
 
 One of the reasons that such gadgets, or holes in the mitigation implementation,
 exist is that typically the amount of testing and verification for these
-security properties is pretty limited.
+security properties is limited to checking results on specific examples.
 
 In comparison, for testing functional correctness, or for testing performance,
 toolchain and software in general typically get tested with large test suites
-and benchmarks and testing and benchmarking plans. In contrast, this typically
-does not get done for testing the security properties of binary code.
+and benchmarks. In contrast, this typically does not get done for testing the
+security properties of binary code.
+
+Unlike functional correctness where compilation errors result in test failures,
+and performance where speed and size differences are measurable, broken security
+properties cannot be easily observed using existing testing and benchmarking
+tools.
 
 The security scanners implemented in `llvm-bolt-binary-analysis` aim to enable
-better testing of security hardening implementations that require compilers to
-somehow generate assembly code with specific security properties.
+the testing of security hardening in arbitrary programs and not just specific
+examples.
+
 
 #### pac-ret analysis
 
 `pac-ret` protection is a security hardening scheme implemented in compilers
 such as gcc and clang, using the command line option
 `-mbranch-protection=pac-ret`. This option is enabled by default on most widely
-used distributions.
+used Linux distributions.
 
 The hardening scheme mitigates
 [Return-Oriented Programming (ROP)](https://llsoftsec.github.io/llsoftsecbook/#return-oriented-programming)
@@ -77,16 +84,14 @@ from memory.
 
 The 'pac-ret' binary analysis can be invoked using the command line option
 `--scanners=pac-ret`. It makes `llvm-bolt-binary-analysis` scan through the
-provided binary and check the following property:
-
-The security property that is checked is:
+provided binary, checking each function for the following security property:
 
-When a register is used as the address to jump to in a return instruction,
-that register must either:
+For each procedure and exception return instruction, the destination register
+must have one of the following properties:
 
-1. never be changed within this function, i.e. have the same value as when the
-   function started, or
-2. the last write to the register must be by an authenticating instruction.
+1. it is immutable within the function, or
+2. its last write is by an authenticating instruction. This includes combined
+   authentication and return instructions such as `RETAA`.
 
 ##### Example 1
 
@@ -101,7 +106,7 @@ ldp     x29, x30, [sp], #0x10
 ret
 ```
 
-The return instruction `ret` uses register `x30` as containing the address to
+The return instruction `ret` implicitly uses register `x30` as the address to
 return to. Register `x30` was last written by instruction `ldp`, which is not an
 authenticating instruction. `llvm-bolt-binary-analysis --scanners=pac-ret` will
 report this as follows:
@@ -125,8 +130,8 @@ evolve over time.
 
 ##### Example 2: multiple "last-overwriting" instructions
 
-A simple example that show how there can be a set of "last overwriting"
-instructions of a register is the following:
+A simple example that shows how there can be a set of "last overwriting"
+instructions of a register follows:
 
 ```
         paciasp

>From 39d0e54d9ec0e1c9f9a52cb03f2dab4a571498ca Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Fri, 10 Jan 2025 15:41:25 +0000
Subject: [PATCH 04/14] Address first round of review feedback by @atrosinenko

---
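
One detail in this patch that may look odd: after renaming the enumerators to `BinaryFunction` and `BinaryBasicBlock`, the struct has to spell the types as `class BinaryFunction` and `class BinaryBasicBlock`. The sketch below shows why, using stand-in classes and a hypothetical `InstRef` struct (none of this is the actual BOLT code): inside the struct the plain names resolve to the enumerators, and the `class` keyword makes name lookup find the types instead.

```cpp
#include <cassert>

// Stand-ins for the real BOLT classes; only the names matter here.
class BinaryFunction {
public:
  int Id = 0;
};
class BinaryBasicBlock {
public:
  BinaryFunction *Parent = nullptr;
};

struct InstRef {
  // Unscoped enumerators named after the classes: inside InstRef, the plain
  // names BinaryFunction and BinaryBasicBlock now refer to these enumerators.
  enum ParentKind { BinaryFunction, BinaryBasicBlock };
  ParentKind Kind;
  union {
    // The `class` keyword makes lookup skip the enumerators and find the
    // types declared at namespace scope.
    class BinaryFunction *BF;
    class BinaryBasicBlock *BB;
  };

  explicit InstRef(class BinaryFunction *F) : Kind(BinaryFunction), BF(F) {}
  explicit InstRef(class BinaryBasicBlock *B) : Kind(BinaryBasicBlock), BB(B) {}

  class BinaryFunction *getFunction() const {
    switch (Kind) {
    case BinaryFunction:
      return BF;
    case BinaryBasicBlock:
      return BB->Parent;
    }
    return nullptr;
  }
};
```

Outside the struct nothing changes: `BinaryFunction` still names the class, and the enumerators are reached as `InstRef::BinaryFunction` and `InstRef::BinaryBasicBlock`.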
 bolt/include/bolt/Core/MCPlusBuilder.h        |  7 +--
 .../bolt/Passes/NonPacProtectedRetAnalysis.h  | 45 ++++++++++---------
 .../lib/Passes/NonPacProtectedRetAnalysis.cpp | 43 +++++++++---------
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   | 10 ++---
 4 files changed, 53 insertions(+), 52 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 9351107e66bdf6..79f0176b47776a 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -27,6 +27,7 @@
 #include "llvm/MC/MCInstrAnalysis.h"
 #include "llvm/MC/MCInstrDesc.h"
 #include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCRegister.h"
 #include "llvm/Support/Allocator.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/ErrorHandling.h"
@@ -552,16 +553,16 @@ class MCPlusBuilder {
 
   virtual MCPhysReg getAuthenticatedReg(const MCInst &Inst) const {
     llvm_unreachable("not implemented");
-    return false;
+    return getNoRegister();
   }
 
   virtual bool isAuthenticationOfReg(const MCInst &Inst,
-                                     const unsigned RegAuthenticated) const {
+                                     MCPhysReg AuthenticatedReg) const {
     llvm_unreachable("not implemented");
     return false;
   }
 
-  virtual llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
+  virtual MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
     llvm_unreachable("not implemented");
     return getNoRegister();
   }
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index 21d089ed7b8506..b98a99a18d3425 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -51,6 +51,9 @@ struct MCInstInBBReference {
   }
   uint64_t getAddress() const {
     // 4 bytes per instruction on AArch64;
+    // FIXME: the assumption of 4 bytes per instruction needs to be fixed before
+    // this method gets used on any non-AArch64 binaries (but should be fine for
+    // pac-ret analysis, as that is an AArch64-specific feature).
     return BB->getFunction()->getAddress() + BB->getOffset() + BBIndex * 4;
   }
 };
@@ -84,8 +87,8 @@ struct MCInstInBFReference {
 raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
 
 struct MCInstReference {
-  enum StoredIn { _BinaryFunction, _BinaryBasicBlock };
-  StoredIn CurrentLocation;
+  enum ParentKind { BinaryFunction, BinaryBasicBlock };
+  ParentKind CurrentLocation;
   union U {
     MCInstInBBReference BBRef;
     MCInstInBFReference BFRef;
@@ -93,21 +96,21 @@ struct MCInstReference {
     U(MCInstInBFReference BFRef) : BFRef(BFRef) {}
   } U;
   MCInstReference(MCInstInBBReference BBRef)
-      : CurrentLocation(_BinaryBasicBlock), U(BBRef) {}
+      : CurrentLocation(BinaryBasicBlock), U(BBRef) {}
   MCInstReference(MCInstInBFReference BFRef)
-      : CurrentLocation(_BinaryFunction), U(BFRef) {}
-  MCInstReference(BinaryBasicBlock *BB, int64_t BBIndex)
+      : CurrentLocation(BinaryFunction), U(BFRef) {}
+  MCInstReference(class BinaryBasicBlock *BB, int64_t BBIndex)
       : MCInstReference(MCInstInBBReference(BB, BBIndex)) {}
-  MCInstReference(BinaryFunction *BF, uint32_t Offset)
+  MCInstReference(class BinaryFunction *BF, uint32_t Offset)
       : MCInstReference(MCInstInBFReference(BF, Offset)) {}
 
   bool operator<(const MCInstReference &RHS) const {
     if (CurrentLocation != RHS.CurrentLocation)
       return CurrentLocation < RHS.CurrentLocation;
     switch (CurrentLocation) {
-    case _BinaryBasicBlock:
+    case BinaryBasicBlock:
       return U.BBRef < RHS.U.BBRef;
-    case _BinaryFunction:
+    case BinaryFunction:
       return U.BFRef < RHS.U.BFRef;
     }
     llvm_unreachable("");
@@ -117,9 +120,9 @@ struct MCInstReference {
     if (CurrentLocation != RHS.CurrentLocation)
       return false;
     switch (CurrentLocation) {
-    case _BinaryBasicBlock:
+    case BinaryBasicBlock:
       return U.BBRef == RHS.U.BBRef;
-    case _BinaryFunction:
+    case BinaryFunction:
       return U.BFRef == RHS.U.BFRef;
     }
     llvm_unreachable("");
@@ -127,9 +130,9 @@ struct MCInstReference {
 
   operator MCInst &() const {
     switch (CurrentLocation) {
-    case _BinaryBasicBlock:
+    case BinaryBasicBlock:
       return U.BBRef;
-    case _BinaryFunction:
+    case BinaryFunction:
       return U.BFRef;
     }
     llvm_unreachable("");
@@ -137,29 +140,29 @@ struct MCInstReference {
 
   uint64_t getAddress() const {
     switch (CurrentLocation) {
-    case _BinaryBasicBlock:
+    case BinaryBasicBlock:
       return U.BBRef.getAddress();
-    case _BinaryFunction:
+    case BinaryFunction:
       return U.BFRef.getAddress();
     }
     llvm_unreachable("");
   }
 
-  BinaryFunction *getFunction() const {
+  class BinaryFunction *getFunction() const {
     switch (CurrentLocation) {
-    case _BinaryFunction:
+    case BinaryFunction:
       return U.BFRef.BF;
-    case _BinaryBasicBlock:
+    case BinaryBasicBlock:
       return U.BBRef.BB->getFunction();
     }
     llvm_unreachable("");
   }
 
-  BinaryBasicBlock *getBasicBlock() const {
+  class BinaryBasicBlock *getBasicBlock() const {
     switch (CurrentLocation) {
-    case _BinaryFunction:
+    case BinaryFunction:
       return nullptr;
-    case _BinaryBasicBlock:
+    case BinaryBasicBlock:
       return U.BBRef.BB;
     }
     llvm_unreachable("");
@@ -204,4 +207,4 @@ class NonPacProtectedRetAnalysis : public BinaryFunctionPass {
 } // namespace bolt
 } // namespace llvm
 
-#endif
\ No newline at end of file
+#endif
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index d6bc902f663fc9..fe456d8fcaa503 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -48,10 +48,10 @@ raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &Ref) {
 
 raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
   switch (Ref.CurrentLocation) {
-  case MCInstReference::_BinaryBasicBlock:
+  case MCInstReference::BinaryBasicBlock:
     OS << Ref.U.BBRef;
     return OS;
-  case MCInstReference::_BinaryFunction:
+  case MCInstReference::BinaryFunction:
     OS << Ref.U.BFRef;
     return OS;
   }
@@ -173,7 +173,7 @@ void PacStatePrinter::print(raw_ostream &OS, const State &S) const {
 }
 
 class PacRetAnalysis
-    : public DataflowAnalysis<PacRetAnalysis, State, false /*Backward*/,
+    : public DataflowAnalysis<PacRetAnalysis, State, /*Backward=*/false,
                               PacStatePrinter> {
   using Parent =
       DataflowAnalysis<PacRetAnalysis, State, false, PacStatePrinter>;
@@ -187,17 +187,13 @@ class PacRetAnalysis
         TrackingLastInsts(RegsToTrackInstsFor.size() != 0),
         Reg2StateIdx(RegsToTrackInstsFor.size() == 0
                          ? 0
-                         : *std::max_element(RegsToTrackInstsFor.begin(),
-                                             RegsToTrackInstsFor.end()) +
-                               1,
+                         : *llvm::max_element(RegsToTrackInstsFor) + 1,
                      -1) {
     for (unsigned I = 0; I < RegsToTrackInstsFor.size(); ++I)
       Reg2StateIdx[RegsToTrackInstsFor[I]] = I;
   }
   virtual ~PacRetAnalysis() {}
 
-  void run() { Parent::run(); }
-
 protected:
   const uint16_t NumRegs;
   /// RegToTrackInstsFor is the set of registers for which the dataflow analysis
@@ -208,7 +204,7 @@ class PacRetAnalysis
   /// track which instructions last wrote to this register.
   std::vector<uint16_t> Reg2StateIdx;
 
-  bool trackReg(MCPhysReg Reg) {
+  bool doTrackReg(MCPhysReg Reg) const {
     for (auto R : RegsToTrackInstsFor)
       if (R == Reg)
         return true;
@@ -269,7 +265,7 @@ class PacRetAnalysis
     // need to track that for:
     if (TrackingLastInsts) {
       for (auto WrittenReg : Written.set_bits()) {
-        if (trackReg(WrittenReg)) {
+        if (doTrackReg(WrittenReg)) {
           Next.LastInstWritingReg[reg2StateIdx(WrittenReg)] = {};
           Next.LastInstWritingReg[reg2StateIdx(WrittenReg)].insert(&Point);
         }
@@ -280,7 +276,7 @@ class PacRetAnalysis
     if (AutReg != BC.MIB->getNoRegister()) {
       Next.NonAutClobRegs.reset(
           BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
-      if (TrackingLastInsts && trackReg(AutReg))
+      if (TrackingLastInsts && doTrackReg(AutReg))
         Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
     }
 
@@ -394,8 +390,8 @@ void NonPacProtectedRetAnalysis::runOnFunction(
   LLVM_DEBUG({
     dbgs() << "Analyzing in function " << BF.getPrintName() << ", AllocatorId "
            << AllocatorId << "\n";
+    BF.dump();
   });
-  LLVM_DEBUG({ BF.dump(); });
 
   if (BF.hasCFG()) {
     PacRetAnalysis PRA(BF, AllocatorId, {});
@@ -421,6 +417,9 @@ void printBB(const BinaryContext &BC, const BinaryBasicBlock *BB,
     EndIndex = BB->size() - 1;
   const BinaryFunction *BF = BB->getFunction();
   for (unsigned I = StartIndex; I <= EndIndex; ++I) {
+    // FIXME: this assumes all instructions are 4 bytes in size. This is true
+    // for AArch64, but it might be good to extract this function so it can be
+    // used elsewhere and for other targets too.
     uint64_t Address = BB->getOffset() + BF->getAddress() + 4 * I;
     const MCInst &Inst = BB->getInstructionAtIndex(I);
     if (BC.MIB->isCFI(Inst))
@@ -433,8 +432,8 @@ void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
                                                 const MCInstReference OverwInst,
                                                 const MCInstReference RetInst) {
   BinaryBasicBlock *BB = RetInst.getBasicBlock();
-  assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
-  assert(RetInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+  assert(OverwInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
+  assert(RetInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
   MCInstInBBReference OverwInstBB = OverwInst.U.BBRef;
   if (BB == OverwInstBB.BB) {
     // overwriting inst and ret instruction are in the same basic block.
@@ -463,11 +462,10 @@ void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
          << " instructions that write to the return register after any "
             "authentication are:\n";
   // Sort by address to ensure output is deterministic.
-  std::sort(std::begin(NPPRG.OverwritingRetRegInst),
-            std::end(NPPRG.OverwritingRetRegInst),
-            [](const MCInstReference &A, const MCInstReference &B) {
-              return A.getAddress() < B.getAddress();
-            });
+  llvm::sort(NPPRG.OverwritingRetRegInst,
+             [](const MCInstReference &A, const MCInstReference &B) {
+               return A.getAddress() < B.getAddress();
+             });
   for (unsigned I = 0; I < NPPRG.OverwritingRetRegInst.size(); ++I) {
     MCInstReference InstRef = NPPRG.OverwritingRetRegInst[I];
     outs() << "  " << (I + 1) << ". ";
@@ -481,7 +479,7 @@ void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
   });
   if (NPPRG.OverwritingRetRegInst.size() == 1) {
     const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
-    assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+    assert(OverwInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
     reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
   }
 }
@@ -505,8 +503,7 @@ Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
   for (BinaryFunction *BF : BC.getAllBinaryFunctions())
     if (BF->hasCFG()) {
       for (BinaryBasicBlock &BB : *BF) {
-        for (int64_t I = BB.size() - 1; I >= 0; --I) {
-          MCInst &Inst = BB.getInstructionAtIndex(I);
+        for (MCInst &Inst : BB) {
           if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex)) {
             reportFoundGadget(BC, Inst, GadgetAnnotationIndex);
           }
@@ -517,4 +514,4 @@ Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
 }
 
 } // namespace bolt
-} // namespace llvm
\ No newline at end of file
+} // namespace llvm
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 1840750822aa3f..7bbcb1faff3689 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -23,6 +23,7 @@
 #include "llvm/MC/MCFixupKindInfo.h"
 #include "llvm/MC/MCInstBuilder.h"
 #include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCRegister.h"
 #include "llvm/MC/MCRegisterInfo.h"
 #include "llvm/Support/DataExtractor.h"
 #include "llvm/Support/Debug.h"
@@ -172,7 +173,6 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     case AArch64::AUTIZB:
     case AArch64::AUTDZA:
     case AArch64::AUTDZB:
-      assert(Inst.getOperand(0).isReg());
       return Inst.getOperand(0).getReg();
 
     default:
@@ -181,13 +181,13 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
   }
 
   bool isAuthenticationOfReg(const MCInst &Inst,
-                             const unsigned RegAuthenticated) const override {
-    if (RegAuthenticated == getNoRegister())
+                             MCPhysReg AuthenticatedReg) const override {
+    if (AuthenticatedReg == getNoRegister())
       return false;
-    return getAuthenticatedReg(Inst) == RegAuthenticated;
+    return getAuthenticatedReg(Inst) == AuthenticatedReg;
   }
 
-  llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
+  MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
     assert(isReturn(Inst));
     switch (Inst.getOpcode()) {
     case AArch64::RET:

>From 6edaa012f9d727866beb5bd7c1f1a6e4a0bf4e20 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Fri, 10 Jan 2025 16:17:48 +0000
Subject: [PATCH 05/14] doTrackReg -> isTrackingReg

---
 bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index fe456d8fcaa503..7053fd7c579c26 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -204,7 +204,7 @@ class PacRetAnalysis
   /// track which instructions last wrote to this register.
   std::vector<uint16_t> Reg2StateIdx;
 
-  bool doTrackReg(MCPhysReg Reg) const {
+  bool isTrackingReg(MCPhysReg Reg) const {
     for (auto R : RegsToTrackInstsFor)
       if (R == Reg)
         return true;
@@ -265,7 +265,7 @@ class PacRetAnalysis
     // need to track that for:
     if (TrackingLastInsts) {
       for (auto WrittenReg : Written.set_bits()) {
-        if (doTrackReg(WrittenReg)) {
+        if (isTrackingReg(WrittenReg)) {
           Next.LastInstWritingReg[reg2StateIdx(WrittenReg)] = {};
           Next.LastInstWritingReg[reg2StateIdx(WrittenReg)].insert(&Point);
         }
@@ -276,7 +276,7 @@ class PacRetAnalysis
     if (AutReg != BC.MIB->getNoRegister()) {
       Next.NonAutClobRegs.reset(
           BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
-      if (TrackingLastInsts && doTrackReg(AutReg))
+      if (TrackingLastInsts && isTrackingReg(AutReg))
         Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
     }
 

>From dd20d447c63259a5269fdc5b1d7e0edf598c2af3 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Mon, 13 Jan 2025 10:27:49 +0000
Subject: [PATCH 06/14] ParentKind->Kind; CurrentLocation->ParentKind

---
 .../bolt/Passes/NonPacProtectedRetAnalysis.h  | 54 +++++++++----------
 .../lib/Passes/NonPacProtectedRetAnalysis.cpp | 12 ++---
 2 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index b98a99a18d3425..1a9b2c41dbaf84 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -87,8 +87,8 @@ struct MCInstInBFReference {
 raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
 
 struct MCInstReference {
-  enum ParentKind { BinaryFunction, BinaryBasicBlock };
-  ParentKind CurrentLocation;
+  enum Kind { FunctionParent, BasicBlockParent };
+  Kind ParentKind;
   union U {
     MCInstInBBReference BBRef;
     MCInstInBFReference BFRef;
@@ -96,73 +96,73 @@ struct MCInstReference {
     U(MCInstInBFReference BFRef) : BFRef(BFRef) {}
   } U;
   MCInstReference(MCInstInBBReference BBRef)
-      : CurrentLocation(BinaryBasicBlock), U(BBRef) {}
+      : ParentKind(BasicBlockParent), U(BBRef) {}
   MCInstReference(MCInstInBFReference BFRef)
-      : CurrentLocation(BinaryFunction), U(BFRef) {}
+      : ParentKind(FunctionParent), U(BFRef) {}
   MCInstReference(class BinaryBasicBlock *BB, int64_t BBIndex)
       : MCInstReference(MCInstInBBReference(BB, BBIndex)) {}
   MCInstReference(class BinaryFunction *BF, uint32_t Offset)
       : MCInstReference(MCInstInBFReference(BF, Offset)) {}
 
   bool operator<(const MCInstReference &RHS) const {
-    if (CurrentLocation != RHS.CurrentLocation)
-      return CurrentLocation < RHS.CurrentLocation;
-    switch (CurrentLocation) {
-    case BinaryBasicBlock:
+    if (ParentKind != RHS.ParentKind)
+      return ParentKind < RHS.ParentKind;
+    switch (ParentKind) {
+    case BasicBlockParent:
       return U.BBRef < RHS.U.BBRef;
-    case BinaryFunction:
+    case FunctionParent:
       return U.BFRef < RHS.U.BFRef;
     }
     llvm_unreachable("");
   }
 
   bool operator==(const MCInstReference &RHS) const {
-    if (CurrentLocation != RHS.CurrentLocation)
+    if (ParentKind != RHS.ParentKind)
       return false;
-    switch (CurrentLocation) {
-    case BinaryBasicBlock:
+    switch (ParentKind) {
+    case BasicBlockParent:
       return U.BBRef == RHS.U.BBRef;
-    case BinaryFunction:
+    case FunctionParent:
       return U.BFRef == RHS.U.BFRef;
     }
     llvm_unreachable("");
   }
 
   operator MCInst &() const {
-    switch (CurrentLocation) {
-    case BinaryBasicBlock:
+    switch (ParentKind) {
+    case BasicBlockParent:
       return U.BBRef;
-    case BinaryFunction:
+    case FunctionParent:
       return U.BFRef;
     }
     llvm_unreachable("");
   }
 
   uint64_t getAddress() const {
-    switch (CurrentLocation) {
-    case BinaryBasicBlock:
+    switch (ParentKind) {
+    case BasicBlockParent:
       return U.BBRef.getAddress();
-    case BinaryFunction:
+    case FunctionParent:
       return U.BFRef.getAddress();
     }
     llvm_unreachable("");
   }
 
-  class BinaryFunction *getFunction() const {
-    switch (CurrentLocation) {
-    case BinaryFunction:
+  BinaryFunction *getFunction() const {
+    switch (ParentKind) {
+    case FunctionParent:
       return U.BFRef.BF;
-    case BinaryBasicBlock:
+    case BasicBlockParent:
       return U.BBRef.BB->getFunction();
     }
     llvm_unreachable("");
   }
 
-  class BinaryBasicBlock *getBasicBlock() const {
-    switch (CurrentLocation) {
-    case BinaryFunction:
+  BinaryBasicBlock *getBasicBlock() const {
+    switch (ParentKind) {
+    case FunctionParent:
       return nullptr;
-    case BinaryBasicBlock:
+    case BasicBlockParent:
       return U.BBRef.BB;
     }
     llvm_unreachable("");
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index 7053fd7c579c26..c91e56ac941167 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -47,11 +47,11 @@ raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &Ref) {
 }
 
 raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
-  switch (Ref.CurrentLocation) {
-  case MCInstReference::BinaryBasicBlock:
+  switch (Ref.ParentKind) {
+  case MCInstReference::BasicBlockParent:
     OS << Ref.U.BBRef;
     return OS;
-  case MCInstReference::BinaryFunction:
+  case MCInstReference::FunctionParent:
     OS << Ref.U.BFRef;
     return OS;
   }
@@ -432,8 +432,8 @@ void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
                                                 const MCInstReference OverwInst,
                                                 const MCInstReference RetInst) {
   BinaryBasicBlock *BB = RetInst.getBasicBlock();
-  assert(OverwInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
-  assert(RetInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
+  assert(OverwInst.ParentKind == MCInstReference::BasicBlockParent);
+  assert(RetInst.ParentKind == MCInstReference::BasicBlockParent);
   MCInstInBBReference OverwInstBB = OverwInst.U.BBRef;
   if (BB == OverwInstBB.BB) {
     // overwriting inst and ret instruction are in the same basic block.
@@ -479,7 +479,7 @@ void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
   });
   if (NPPRG.OverwritingRetRegInst.size() == 1) {
     const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
-    assert(OverwInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
+    assert(OverwInst.ParentKind == MCInstReference::BasicBlockParent);
     reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
   }
 }

>From a30116dfb526b852244f2bd13f3e96049569ae8e Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Mon, 13 Jan 2025 11:00:28 +0000
Subject: [PATCH 07/14] Address the easy-to-address feedback by Jacob

---
 .../lib/Passes/NonPacProtectedRetAnalysis.cpp | 20 +++++++------------
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   | 10 +---------
 2 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index c91e56ac941167..bb95c9f31f3106 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -9,9 +9,6 @@
 // This file implements a pass that looks for any AArch64 return instructions
 // that may not be protected by PAuth authentication instructions when needed.
 //
-// When needed = the register used to return (almost always X30), is potentially
-// written to between the AUThentication instruction and the RETurn instruction.
-//
 //===----------------------------------------------------------------------===//
 
 #include "bolt/Passes/NonPacProtectedRetAnalysis.h"
@@ -76,15 +73,13 @@ raw_ostream &operator<<(raw_ostream &OS,
 //     the function started, or
 // (b) the last write to the register must be by an authentication instruction.
 
-// This property is checked by using data flow analysis to keep track of which
+// This property is checked by using dataflow analysis to keep track of which
 // registers have been written (def-ed), since last authenticated. Those are
 // exactly the registers containing values that should not be trusted (as they
 // could have changed since the last time they were authenticated). For pac-ret,
 // any return instruction using such a register is a gadget to be reported. For
 // PAuthABI, any indirect control flow using such a register should be reported?
 
-// This security property is verified using a dataflow analysis.
-
 // Furthermore, when producing a diagnostic for a found non-pac-ret protected
 // return, the analysis also lists the last instructions that wrote to the
 // register used in the return instruction.
@@ -96,12 +91,11 @@ raw_ostream &operator<<(raw_ostream &OS,
 // 1. In the first run, the dataflow analysis only keeps track of the security
 //    property: i.e. which registers have been overwritten since the last
 //    time they've been authenticated.
-// 2. If the first run finds any return instructions using a
-//    written-to-last-by-an-non-authenticating instruction, the data flow
-//    analysis will be run a second time. The first run will return which
-//    registers are used in the gadgets to be reported. This information is
-//    used in the second run to also track with instructions last wrote to
-//    those registers.
+// 2. If the first run finds any return instructions using a register last
+//    written by a non-authenticating instruction, the dataflow analysis will
+//    be run a second time. The first run will return which registers are used
+//    in the gadgets to be reported. This information is used in the second run
+//    to also track which instructions last wrote to those registers.
 
 struct State {
   /// A BitVector containing the registers that have been clobbered, and
@@ -110,7 +104,7 @@ struct State {
   /// A vector of sets, only used in the second data flow run.
   /// Each element in the vector represent one registers for which we
   /// track the set of last instructions that wrote to this register.
-  /// For pac-ret analysis, the expectations is that almost all return
+  /// For pac-ret analysis, the expectation is that almost all return
   /// instructions only use register `X30`, and therefore, this vector
   /// will have length 1 in the second run.
   std::vector<SmallPtrSet<const MCInst *, 4>> LastInstWritingReg;
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 7bbcb1faff3689..01bb53d9610e98 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -191,15 +191,7 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     assert(isReturn(Inst));
     switch (Inst.getOpcode()) {
     case AArch64::RET:
-      // There should be one register that the return reads, and
-      // that's the one being used as the jump target?
-      for (unsigned OpIdx = 0, EndIdx = Inst.getNumOperands(); OpIdx < EndIdx;
-           ++OpIdx) {
-        const MCOperand &MO = Inst.getOperand(OpIdx);
-        if (MO.isReg())
-          return MO.getReg();
-      }
-      return getNoRegister();
+      return Inst.getOperand(0).getReg();
     case AArch64::RETAA:
     case AArch64::RETAB:
     case AArch64::ERETAA:

>From f44f9bfb53b8e4c18581fb2855b9d19bdeef95e2 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Mon, 13 Jan 2025 13:44:05 +0000
Subject: [PATCH 08/14] Handle erets by producing diagnostic that analysis
 cannot analyze them correctly.

ERETAA/ERETAB/ERET instructions return to the address stored in one of
the system registers ELR_EL1, ELR_EL2 or ELR_EL3, depending on the
current "Exception Level" state.

I'm not aware of any examples of use of ERETA{A|B} in open source
software currently. My understanding is that they could be used as
follows to create a pac-ret-like hardening for ERET:

       exception_entry:
         MRS x0, elr_el1 // Could be elr_el2, elr_el3 depending on whether
                         // this code is designed to run at EL1, EL2 or EL3.
         PACIA x0, SP
         STR   x0, [some place]
         ...
         LDR   x0, [some place]
         MSR   elr_el1, x0
         ERETAA

Assuming my understanding above is correct, this makes it impossible to
write a fully static "pac-ret"-like checker for exception returns,
because the static analyzer doesn't know whether the code will run at
EL1, EL2 or EL3.  Therefore, the static analyzer can't know whether to
check for writes to elr_el1, elr_el2 or elr_el3.

Therefore, in this commit, changes are made to produce a warning when
any of the ERET* instructions are encountered, stating that the analyzer
can't analyze them.
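
As a rough illustration of that behaviour, here is a minimal standalone
sketch. It does not use the BOLT or LLVM APIs: `Opcode`, `Inst`,
`regUsedAsRetDest` and `collectWarnings` are hypothetical names, and
`std::optional` stands in for the `ErrorOr<MCPhysReg>` return type used in
the patch. The idea is only that the register lookup reports "unknown" for
the ERET* forms and the caller turns that into a warning instead of trying
to analyze the return.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical opcodes and register numbering, for illustration only.
enum class Opcode { RET, RETAA, RETAB, ERET, ERETAA, ERETAB };
constexpr unsigned X30 = 30;

struct Inst {
  Opcode Op;
  unsigned Reg = X30; // register operand; only meaningful for RET
};

// Returns the general-purpose register holding the return address, or
// std::nullopt when the destination is held in ELR_ELx (the ERET* forms),
// which this sketch cannot resolve statically.
std::optional<unsigned> regUsedAsRetDest(const Inst &I) {
  switch (I.Op) {
  case Opcode::RET:
    assert(I.Reg <= X30 && "expected one of X0..X30");
    return I.Reg;
  case Opcode::RETAA:
  case Opcode::RETAB:
    return X30;
  case Opcode::ERET:
  case Opcode::ERETAA:
  case Opcode::ERETAB:
    return std::nullopt;
  }
  return std::nullopt;
}

// Collects one warning per return whose destination cannot be determined.
std::vector<std::string> collectWarnings(const std::vector<Inst> &Insts) {
  std::vector<std::string> Warnings;
  for (const Inst &I : Insts)
    if (!regUsedAsRetDest(I))
      Warnings.push_back("Warning: pac-ret analysis could not analyze this "
                         "return instruction");
  return Warnings;
}
```

With a sequence such as RET, ERETAA, RETAB, ERET, only the two ERET* entries
would produce a warning in this sketch; the other returns would go on to
the normal pac-ret check.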

This in turn results in the non-pac-ret-protected-return analyzer now
needing to produce 2 different kinds of diagnostics.

I found that this was most nicely modelled by creating a class hierarchy
for these diagnostics.

However, it seems that MCAnnotations currently do not work (well or at
all) when trying to store any object from a class hierarchy, rather than
an object with fixed type.
At which point I realized that it was a bit strange that the diagnostics
that need to be produced are stored using MCAnnotations at all...
So, this patch redesigns that so that the diagnostics that need to be
produced are not stored as MCAnnotations on an MCInst at all...

Overall, it seems like this is a cleaner design and a better example for
any future binary analyses.
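
To make the shape of that design concrete, here is a simplified standalone
sketch, not the code in this patch. All names (`Diagnostic`,
`GadgetDiagnostic`, `TextDiagnostic`, `ResultStore`) are hypothetical
stand-ins, addresses are plain integers and functions are keyed by name. It
only shows the two ideas: diagnostics form a small class hierarchy with a
virtual `generateReport`, and they are stored per function in a
mutex-guarded map rather than as annotations on instructions.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not the BOLT types.
struct Diagnostic {
  unsigned long RetAddress;
  explicit Diagnostic(unsigned long RetAddress) : RetAddress(RetAddress) {}
  virtual ~Diagnostic() = default;
  virtual void generateReport(std::ostream &OS) const = 0;
};

// A return whose address register was overwritten after authentication.
struct GadgetDiagnostic : Diagnostic {
  std::vector<unsigned long> OverwritingInsts;
  GadgetDiagnostic(unsigned long RetAddress, std::vector<unsigned long> Insts)
      : Diagnostic(RetAddress), OverwritingInsts(std::move(Insts)) {}
  void generateReport(std::ostream &OS) const override {
    OS << "non-protected ret at " << RetAddress << ", "
       << OverwritingInsts.size() << " overwriting instruction(s)\n";
  }
};

// A free-form message, e.g. for returns that cannot be analyzed.
struct TextDiagnostic : Diagnostic {
  std::string Text;
  TextDiagnostic(unsigned long RetAddress, std::string Msg)
      : Diagnostic(RetAddress), Text(std::move(Msg)) {
    assert(!Text.empty() && "diagnostic text expected");
  }
  void generateReport(std::ostream &OS) const override {
    OS << Text << " at " << RetAddress << "\n";
  }
};

using DiagList = std::vector<std::shared_ptr<Diagnostic>>;

class ResultStore {
  std::map<std::string, DiagList> Results; // keyed by function name
  std::mutex ResultsMutex;

public:
  // May be called concurrently from several worker threads.
  void record(const std::string &Function, DiagList Diags) {
    std::lock_guard<std::mutex> Lock(ResultsMutex);
    Results[Function] = std::move(Diags);
  }

  // Intended to be called after all workers have finished; the map keeps
  // functions in key order, so the report order is deterministic.
  std::string report() {
    std::ostringstream OS;
    for (const auto &[Function, Diags] : Results)
      for (const auto &D : Diags) {
        OS << Function << ": ";
        D->generateReport(OS);
      }
    return OS.str();
  }
};
```

In this sketch the per-function workers only call `record`, and the
reporting loop runs afterwards on one thread, so the lock is only needed
around the map update.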
---
 bolt/include/bolt/Core/MCPlusBuilder.h        |   4 +-
 .../bolt/Passes/NonPacProtectedRetAnalysis.h  |  68 ++++++-
 .../lib/Passes/NonPacProtectedRetAnalysis.cpp | 175 ++++++++++--------
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  42 ++++-
 .../AArch64/gs-pacret-autiasp.s               |  23 +--
 5 files changed, 208 insertions(+), 104 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 79f0176b47776a..fd62f2e22d3e1b 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -551,7 +551,7 @@ class MCPlusBuilder {
     return Analysis->isReturn(Inst);
   }
 
-  virtual MCPhysReg getAuthenticatedReg(const MCInst &Inst) const {
+  virtual ErrorOr<MCPhysReg> getAuthenticatedReg(const MCInst &Inst) const {
     llvm_unreachable("not implemented");
     return getNoRegister();
   }
@@ -562,7 +562,7 @@ class MCPlusBuilder {
     return false;
   }
 
-  virtual MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
+  virtual ErrorOr<MCPhysReg> getRegUsedAsRetDest(const MCInst &Inst) const {
     llvm_unreachable("not implemented");
     return getNoRegister();
   }
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index 1a9b2c41dbaf84..702709c0fba9ab 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -13,6 +13,9 @@
 #include "bolt/Core/BinaryFunction.h"
 #include "bolt/Passes/BinaryPasses.h"
 #include "llvm/ADT/SmallSet.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Testing/Annotations/Annotations.h"
+#include <memory>
 
 namespace llvm {
 namespace bolt {
@@ -171,29 +174,82 @@ struct MCInstReference {
 
 raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &);
 
-struct NonPacProtectedRetGadget {
+struct GeneralDiagnostic {
+  std::string Text;
+  GeneralDiagnostic(const std::string &Text) : Text(Text) {}
+  bool operator==(const GeneralDiagnostic &RHS) const {
+    return Text == RHS.Text;
+  }
+};
+
+struct NonPacProtectedRetAnnotation {
   MCInstReference RetInst;
+  NonPacProtectedRetAnnotation(MCInstReference RetInst) : RetInst(RetInst) {}
+  virtual bool operator==(const NonPacProtectedRetAnnotation &RHS) const {
+    return RetInst == RHS.RetInst;
+  }
+  NonPacProtectedRetAnnotation &
+  operator=(const NonPacProtectedRetAnnotation &Other) {
+    if (this == &Other)
+      return *this;
+    RetInst = Other.RetInst;
+    return *this;
+  }
+  virtual ~NonPacProtectedRetAnnotation() {}
+  virtual void generateReport(raw_ostream &OS,
+                              const BinaryContext &BC) const = 0;
+  virtual void print(raw_ostream &OS) const;
+};
+
+struct NonPacProtectedRetGadget : public NonPacProtectedRetAnnotation {
   std::vector<MCInstReference> OverwritingRetRegInst;
-  bool operator==(const NonPacProtectedRetGadget &RHS) const {
-    return RetInst == RHS.RetInst &&
+  virtual bool operator==(const NonPacProtectedRetGadget &RHS) const {
+    return NonPacProtectedRetAnnotation::operator==(RHS) &&
            OverwritingRetRegInst == RHS.OverwritingRetRegInst;
   }
   NonPacProtectedRetGadget(
       MCInstReference RetInst,
       const std::vector<MCInstReference> &OverwritingRetRegInst)
-      : RetInst(RetInst), OverwritingRetRegInst(OverwritingRetRegInst) {}
+      : NonPacProtectedRetAnnotation(RetInst),
+        OverwritingRetRegInst(OverwritingRetRegInst) {}
+  virtual void generateReport(raw_ostream &OS,
+                              const BinaryContext &BC) const override;
+  virtual void print(raw_ostream &OS) const override;
+};
+
+struct NonPacProtectedRetGenDiag : public NonPacProtectedRetAnnotation {
+  GeneralDiagnostic Diag;
+  virtual bool operator==(const NonPacProtectedRetGenDiag &RHS) const {
+    return NonPacProtectedRetAnnotation::operator==(RHS) && Diag == RHS.Diag;
+  }
+  NonPacProtectedRetGenDiag(MCInstReference RetInst, const std::string &Text)
+      : NonPacProtectedRetAnnotation(RetInst), Diag(Text) {}
+  virtual void generateReport(raw_ostream &OS,
+                              const BinaryContext &BC) const override;
+  virtual void print(raw_ostream &OS) const override;
 };
 
 raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetGadget &NPPRG);
+raw_ostream &operator<<(raw_ostream &OS, const GeneralDiagnostic &Diag);
+raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetGenDiag &Diag);
+raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetAnnotation &A);
+
 class PacRetAnalysis;
 
+struct FunctionAnalysisResult {
+  SmallSet<MCPhysReg, 1> RegistersAffected;
+  std::vector<std::shared_ptr<NonPacProtectedRetAnnotation>> Diagnostics;
+};
+
 class NonPacProtectedRetAnalysis : public BinaryFunctionPass {
   void runOnFunction(BinaryFunction &Function,
                      MCPlusBuilder::AllocatorIdTy AllocatorId);
-  SmallSet<MCPhysReg, 1>
+  FunctionAnalysisResult
   computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF,
                  MCPlusBuilder::AllocatorIdTy AllocatorId);
-  unsigned GadgetAnnotationIndex;
+
+  std::map<const BinaryFunction *, FunctionAnalysisResult> AnalysisResults;
+  std::mutex AnalysisResultsMutex;
 
 public:
   explicit NonPacProtectedRetAnalysis() : BinaryFunctionPass(false) {}
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index bb95c9f31f3106..e46469a4fd68c4 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -17,6 +17,7 @@
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/MC/MCInst.h"
 #include "llvm/Support/Format.h"
+#include <memory>
 
 #define DEBUG_TYPE "bolt-nonpacprotectedret"
 
@@ -57,15 +58,37 @@ raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
 
 raw_ostream &operator<<(raw_ostream &OS,
                         const NonPacProtectedRetGadget &NPPRG) {
-  OS << "pac-ret-gadget<";
-  OS << "Ret:" << NPPRG.RetInst << ", ";
+  OS << "pac-ret<Ret:" << NPPRG.RetInst << ", ";
+  OS << "gadget<";
   OS << "Overwriting:[";
   for (auto Ref : NPPRG.OverwritingRetRegInst)
     OS << Ref << " ";
   OS << "]>";
+  OS << ">";
+  return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const GeneralDiagnostic &Diag) {
+  OS << "diag<'" << Diag.Text << "'>";
+  return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS,
+                        const NonPacProtectedRetGenDiag &Diag) {
+  OS << "pac-ret<Ret:" << Diag.RetInst << ", " << Diag.Diag << ">";
   return OS;
 }
 
+raw_ostream &operator<<(raw_ostream &OS,
+                        const NonPacProtectedRetAnnotation &Diag) {
+  OS << "pac-ret<Ret:" << Diag.RetInst << ">";
+  return OS;
+}
+
+void NonPacProtectedRetAnnotation::print(raw_ostream &OS) const { OS << *this; }
+void NonPacProtectedRetGadget::print(raw_ostream &OS) const { OS << *this; }
+void NonPacProtectedRetGenDiag::print(raw_ostream &OS) const { OS << *this; }
+
 // The security property that is checked is:
 // When a register is used as the address to jump to in a return instruction,
 // that register must either:
@@ -106,7 +129,7 @@ struct State {
   /// track the set of last instructions that wrote to this register.
   /// For pac-ret analysis, the expectation is that almost all return
   /// instructions only use register `X30`, and therefore, this vector
-  /// will have length 1 in the second run.
+  /// will probably have length 1 in the second run.
   std::vector<SmallPtrSet<const MCInst *, 4>> LastInstWritingReg;
   State() {}
   State(uint16_t NumRegs, uint16_t NumRegsToTrack)
@@ -266,12 +289,12 @@ class PacRetAnalysis
       }
     }
 
-    MCPhysReg AutReg = BC.MIB->getAuthenticatedReg(Point);
-    if (AutReg != BC.MIB->getNoRegister()) {
+    ErrorOr<MCPhysReg> AutReg = BC.MIB->getAuthenticatedReg(Point);
+    if (AutReg && *AutReg != BC.MIB->getNoRegister()) {
       Next.NonAutClobRegs.reset(
-          BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
-      if (TrackingLastInsts && isTrackingReg(AutReg))
-        Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
+          BC.MIB->getAliases(*AutReg, /*OnlySmaller=*/true));
+      if (TrackingLastInsts && isTrackingReg(*AutReg))
+        Next.LastInstWritingReg[reg2StateIdx(*AutReg)] = {};
     }
 
     LLVM_DEBUG({
@@ -313,7 +336,7 @@ class PacRetAnalysis
   }
 };
 
-SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
+FunctionAnalysisResult NonPacProtectedRetAnalysis::computeDfState(
     PacRetAnalysis &PRA, BinaryFunction &BF,
     MCPlusBuilder::AllocatorIdTy AllocatorId) {
   PRA.run();
@@ -321,15 +344,25 @@ SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
     dbgs() << " After PacRetAnalysis:\n";
     BF.dump();
   });
+
+  FunctionAnalysisResult Result;
   // Now scan the CFG for non-authenticating return instructions that use an
   // overwritten, non-authenticated register as return address.
-  SmallSet<MCPhysReg, 1> RetRegsWithGadgets;
   BinaryContext &BC = BF.getBinaryContext();
   for (BinaryBasicBlock &BB : BF) {
     for (int64_t I = BB.size() - 1; I >= 0; --I) {
       MCInst &Inst = BB.getInstructionAtIndex(I);
       if (BC.MIB->isReturn(Inst)) {
-        MCPhysReg RetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+        ErrorOr<MCPhysReg> MaybeRetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+        if (MaybeRetReg.getError()) {
+          Result.Diagnostics.push_back(
+              std::make_shared<NonPacProtectedRetGenDiag>(
+                  MCInstInBBReference(&BB, I),
+                  "Warning: pac-ret analysis could not analyze this return "
+                  "instruction"));
+          continue;
+        }
+        MCPhysReg RetReg = *MaybeRetReg;
         LLVM_DEBUG({
           dbgs() << "  Found RET inst: ";
           BC.printInstruction(dbgs(), Inst);
@@ -355,28 +388,17 @@ SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
         });
         if (DirtyRawRegs.any()) {
           // This return instruction needs to be reported
-          // First remove the annotation that the first, fast run of
-          // the dataflow analysis may have added.
-          if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex))
-            BC.MIB->removeAnnotation(Inst, GadgetAnnotationIndex);
-          BC.MIB->addAnnotation(
-              Inst, GadgetAnnotationIndex,
-              NonPacProtectedRetGadget(
+          Result.Diagnostics.push_back(
+              std::make_shared<NonPacProtectedRetGadget>(
                   MCInstInBBReference(&BB, I),
-                  PRA.getLastClobberingInsts(Inst, BF, DirtyRawRegs)),
-              AllocatorId);
-          LLVM_DEBUG({
-            dbgs() << "  Added gadget info annotation: ";
-            BC.printInstruction(dbgs(), Inst, 0, &BF);
-            dbgs() << "\n";
-          });
+                  PRA.getLastClobberingInsts(Inst, BF, DirtyRawRegs)));
           for (MCPhysReg RetRegWithGadget : DirtyRawRegs.set_bits())
-            RetRegsWithGadgets.insert(RetRegWithGadget);
+            Result.RegistersAffected.insert(RetRegWithGadget);
         }
       }
     }
   }
-  return RetRegsWithGadgets;
+  return Result;
 }
 
 void NonPacProtectedRetAnalysis::runOnFunction(
@@ -389,18 +411,20 @@ void NonPacProtectedRetAnalysis::runOnFunction(
 
   if (BF.hasCFG()) {
     PacRetAnalysis PRA(BF, AllocatorId, {});
-    SmallSet<MCPhysReg, 1> RetRegsWithGadgets =
-        computeDfState(PRA, BF, AllocatorId);
-    if (!RetRegsWithGadgets.empty()) {
+    FunctionAnalysisResult FAR = computeDfState(PRA, BF, AllocatorId);
+    if (!FAR.RegistersAffected.empty()) {
       // Redo the analysis, but now also track which instructions last wrote
       // to any of the registers in RetRegsWithGadgets, so that better
       // diagnostics can be produced.
       std::vector<MCPhysReg> RegsToTrack;
-      for (MCPhysReg R : RetRegsWithGadgets)
+      for (MCPhysReg R : FAR.RegistersAffected)
         RegsToTrack.push_back(R);
       PacRetAnalysis PRWIA(BF, AllocatorId, RegsToTrack);
-      SmallSet<MCPhysReg, 1> RetRegsWithGadgets =
-          computeDfState(PRWIA, BF, AllocatorId);
+      FAR = computeDfState(PRWIA, BF, AllocatorId);
+    }
+    {
+      std::lock_guard<std::mutex> Lock(AnalysisResultsMutex);
+      AnalysisResults[&BF] = FAR;
     }
   }
 }
@@ -422,9 +446,9 @@ void printBB(const BinaryContext &BC, const BinaryBasicBlock *BB,
   }
 }
 
-void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
-                                                const MCInstReference OverwInst,
-                                                const MCInstReference RetInst) {
+static void reportFoundGadgetInSingleBBSingleOverwInst(
+    raw_ostream &OS, const BinaryContext &BC, const MCInstReference OverwInst,
+    const MCInstReference RetInst) {
   BinaryBasicBlock *BB = RetInst.getBasicBlock();
   assert(OverwInst.ParentKind == MCInstReference::BasicBlockParent);
   assert(RetInst.ParentKind == MCInstReference::BasicBlockParent);
@@ -432,55 +456,64 @@ void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
   if (BB == OverwInstBB.BB) {
     // overwriting inst and ret instruction are in the same basic block.
     assert(OverwInstBB.BBIndex < RetInst.U.BBRef.BBIndex);
-    outs() << "  This happens in the following basic block:\n";
+    OS << "  This happens in the following basic block:\n";
     printBB(BC, BB);
   }
 }
 
-void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
-                       unsigned int GadgetAnnotationIndex) {
-  auto NPPRG = BC.MIB->getAnnotationAs<NonPacProtectedRetGadget>(
-      Inst, GadgetAnnotationIndex);
-  MCInstReference RetInst = NPPRG.RetInst;
+void NonPacProtectedRetGadget::generateReport(raw_ostream &OS,
+                                              const BinaryContext &BC) const {
   BinaryFunction *BF = RetInst.getFunction();
   BinaryBasicBlock *BB = RetInst.getBasicBlock();
 
-  outs() << "\nGS-PACRET: " << "non-protected ret found in function "
-         << BF->getPrintName();
+  OS << "\nGS-PACRET: " << "non-protected ret found in function "
+     << BF->getPrintName();
   if (BB)
-    outs() << ", basic block " << BB->getName();
-  outs() << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
-  outs() << "  The return instruction is ";
-  BC.printInstruction(outs(), RetInst, RetInst.getAddress(), BF);
-  outs() << "  The " << NPPRG.OverwritingRetRegInst.size()
-         << " instructions that write to the return register after any "
-            "authentication are:\n";
+    OS << ", basic block " << BB->getName();
+  OS << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
+  OS << "  The return instruction is ";
+  BC.printInstruction(OS, RetInst, RetInst.getAddress(), BF);
+  OS << "  The " << OverwritingRetRegInst.size()
+     << " instructions that write to the return register after any "
+        "authentication are:\n";
   // Sort by address to ensure output is deterministic.
-  llvm::sort(NPPRG.OverwritingRetRegInst,
-             [](const MCInstReference &A, const MCInstReference &B) {
-               return A.getAddress() < B.getAddress();
-             });
-  for (unsigned I = 0; I < NPPRG.OverwritingRetRegInst.size(); ++I) {
-    MCInstReference InstRef = NPPRG.OverwritingRetRegInst[I];
-    outs() << "  " << (I + 1) << ". ";
-    BC.printInstruction(outs(), InstRef, InstRef.getAddress(), BF);
+  std::vector<MCInstReference> ORRI = OverwritingRetRegInst;
+  llvm::sort(ORRI, [](const MCInstReference &A, const MCInstReference &B) {
+    return A.getAddress() < B.getAddress();
+  });
+  for (unsigned I = 0; I < ORRI.size(); ++I) {
+    MCInstReference InstRef = ORRI[I];
+    OS << "  " << (I + 1) << ". ";
+    BC.printInstruction(OS, InstRef, InstRef.getAddress(), BF);
   };
   LLVM_DEBUG({
     dbgs() << "  .. OverWritingRetRegInst:\n";
-    for (MCInstReference Ref : NPPRG.OverwritingRetRegInst) {
+    for (MCInstReference Ref : OverwritingRetRegInst) {
       dbgs() << "    " << Ref << "\n";
     }
   });
-  if (NPPRG.OverwritingRetRegInst.size() == 1) {
-    const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
+  if (OverwritingRetRegInst.size() == 1) {
+    const MCInstReference OverwInst = OverwritingRetRegInst[0];
     assert(OverwInst.ParentKind == MCInstReference::BasicBlockParent);
-    reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
+    reportFoundGadgetInSingleBBSingleOverwInst(OS, BC, OverwInst, RetInst);
   }
 }
 
-Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
-  GadgetAnnotationIndex = BC.MIB->getOrCreateAnnotationIndex("pacret-gadget");
+void NonPacProtectedRetGenDiag::generateReport(raw_ostream &OS,
+                                               const BinaryContext &BC) const {
+  BinaryFunction *BF = RetInst.getFunction();
+  BinaryBasicBlock *BB = RetInst.getBasicBlock();
 
+  OS << "\nGS-PACRET: " << "" << Diag.Text;
+  OS << " in function " << BF->getPrintName();
+  if (BB)
+    OS << ", basic block " << BB->getName();
+  OS << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
+  OS << "  The return instruction is ";
+  BC.printInstruction(OS, RetInst, RetInst.getAddress(), BF);
+}
+
+Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
   ParallelUtilities::WorkFuncWithAllocTy WorkFun =
       [&](BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) {
         runOnFunction(BF, AllocatorId);
@@ -495,14 +528,10 @@ Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
       SkipFunc, "NonPacProtectedRetAnalysis");
 
   for (BinaryFunction *BF : BC.getAllBinaryFunctions())
-    if (BF->hasCFG()) {
-      for (BinaryBasicBlock &BB : *BF) {
-        for (MCInst &Inst : BB) {
-          if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex)) {
-            reportFoundGadget(BC, Inst, GadgetAnnotationIndex);
-          }
-        }
-      }
+    if (AnalysisResults.count(BF) > 0) {
+      for (const std::shared_ptr<NonPacProtectedRetAnnotation> &A :
+           AnalysisResults[BF].Diagnostics)
+        A->generateReport(outs(), BC);
     }
   return Error::success();
 }
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 01bb53d9610e98..ea37ec140079ae 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -149,21 +149,42 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     return false;
   }
 
-  MCPhysReg getAuthenticatedReg(const MCInst &Inst) const override {
+  ErrorOr<MCPhysReg> getAuthenticatedReg(const MCInst &Inst) const override {
     switch (Inst.getOpcode()) {
     case AArch64::AUTIAZ:
     case AArch64::AUTIBZ:
     case AArch64::AUTIASP:
     case AArch64::AUTIBSP:
+    case AArch64::AUTIASPPCi:
+    case AArch64::AUTIBSPPCi:
+    case AArch64::AUTIASPPCr:
+    case AArch64::AUTIBSPPCr:
     case AArch64::RETAA:
     case AArch64::RETAB:
+    case AArch64::RETAASPPCi:
+    case AArch64::RETABSPPCi:
+    case AArch64::RETAASPPCr:
+    case AArch64::RETABSPPCr:
       return AArch64::LR;
+
     case AArch64::AUTIA1716:
     case AArch64::AUTIB1716:
+    case AArch64::AUTIA171615:
+    case AArch64::AUTIB171615:
       return AArch64::X17;
+
     case AArch64::ERETAA:
     case AArch64::ERETAB:
-      return AArch64::LR;
+      // The ERETA{A,B} instructions use either register ELR_EL1, ELR_EL2 or
+      // ELR_EL3, depending on the current Exception Level at run-time.
+      //
+      // Furthermore, these registers are not modelled by LLVM as a regular
+      // MCPhysReg.... So there is no way to indicate that through the current
+      // API.
+      //
+      // Therefore, return an error to indicate that LLVM/BOLT cannot model
+      // this.
+      return make_error_code(std::errc::result_out_of_range);
 
     case AArch64::AUTIA:
     case AArch64::AUTIB:
@@ -175,29 +196,32 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     case AArch64::AUTDZB:
       return Inst.getOperand(0).getReg();
 
+      // FIXME: BL?RA(A|B)Z? and LDRA(A|B) should probably be handled here too.
+
     default:
       return getNoRegister();
     }
   }
 
-  bool isAuthenticationOfReg(const MCInst &Inst,
-                             MCPhysReg AuthenticatedReg) const override {
-    if (AuthenticatedReg == getNoRegister())
+  bool isAuthenticationOfReg(const MCInst &Inst, MCPhysReg Reg) const override {
+    if (Reg == getNoRegister())
       return false;
-    return getAuthenticatedReg(Inst) == AuthenticatedReg;
+    ErrorOr<MCPhysReg> AuthenticatedReg = getAuthenticatedReg(Inst);
+    return AuthenticatedReg.getError() ? false : *AuthenticatedReg == Reg;
   }
 
-  MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
+  ErrorOr<MCPhysReg> getRegUsedAsRetDest(const MCInst &Inst) const override {
     assert(isReturn(Inst));
     switch (Inst.getOpcode()) {
     case AArch64::RET:
       return Inst.getOperand(0).getReg();
     case AArch64::RETAA:
     case AArch64::RETAB:
+      return AArch64::LR;
+    case AArch64::ERET:
     case AArch64::ERETAA:
     case AArch64::ERETAB:
-    case AArch64::ERET:
-      return AArch64::LR;
+      return make_error_code(std::errc::result_out_of_range);
     default:
       llvm_unreachable("Unhandled return instruction");
     }
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index 8731bb46362867..ac16346fd815bb 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -623,7 +623,8 @@ f_eretaa:
         bl      g
         add     x0, x0, #3
         ldp     x29, x30, [sp], #16
-// CHECK-NOT: function f_eretaa
+// CHECK-LABEL: GS-PACRET: Warning: pac-ret analysis could not analyze this return instruction in function f_eretaa, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:   The return instruction is     {{[0-9a-f]+}}:       eretaa
         eretaa
         .size f_eretaa, .-f_eretaa
 
@@ -636,7 +637,8 @@ f_eretab:
         bl      g
         add     x0, x0, #3
         ldp     x29, x30, [sp], #16
-// CHECK-NOT: function f_eretab
+// CHECK-LABEL: GS-PACRET: Warning: pac-ret analysis could not analyze this return instruction in function f_eretab, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:   The return instruction is     {{[0-9a-f]+}}:       eretab
         eretab
         .size f_eretab, .-f_eretab
 
@@ -649,18 +651,8 @@ f_eret:
         bl      g
         add     x0, x0, #3
         ldp     x29, x30, [sp], #16
-// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_eret, basic block .LBB{{[0-9]+}}, at address
-// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       eret
-// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
-// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
-// CHECK-NEXT:  This happens in the following basic block:
-// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
-// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
-// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
-// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
-// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
-// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
-// CHECK-NEXT: {{[0-9a-f]+}}:   eret
+// CHECK-LABEL: GS-PACRET: Warning: pac-ret analysis could not analyze this return instruction in function f_eret, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:   The return instruction is     {{[0-9a-f]+}}:       eret
         eret
         .size f_eret, .-f_eret
 
@@ -678,4 +670,7 @@ f_movx30reg:
         ret
         .size f_movx30reg, .-f_movx30reg
 
+// FIXME: add regression tests for the instructions added in v9.5: AUTI{A,B}SPPC{i,r}, RETI{A,B}SPPC{i,r}, AUTI{A,B}171615.
+
+
 // TODO: add test to see if registers clobbered by a call are picked up.

>From 89d999462c12d1622c26b8ad0c2453b9718fd8d4 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Wed, 15 Jan 2025 18:26:04 +0000
Subject: [PATCH 09/14] Improve comment related to pauthabi checking

---
 bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index e46469a4fd68c4..f17211168d3cfc 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -101,7 +101,8 @@ void NonPacProtectedRetGenDiag::print(raw_ostream &OS) const { OS << *this; }
 // exactly the registers containing values that should not be trusted (as they
 // could have changed since the last time they were authenticated). For pac-ret,
 // any return instruction using such a register is a gadget to be reported. For
-// PAuthABI, any indirect control flow using such a register should be reported?
+// PAuthABI, probably at least any indirect control flow using such a register
+// should be reported.
 
 // Furthermore, when producing a diagnostic for a found non-pac-ret protected
 // return, the analysis also lists the last instructions that wrote to the

>From 833ef9fa8f10ca3da69ee87776f571bc0607e812 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Wed, 15 Jan 2025 18:42:03 +0000
Subject: [PATCH 10/14] Make a few trivial-to-apply improvements suggested by
 @atrosinenko

---
 bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h  |  6 +++---
 bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp         | 10 ++++------
 bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s  |  4 ++--
 bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s |  2 +-
 4 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index 702709c0fba9ab..c37a802724ece0 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -36,7 +36,7 @@ struct MCInstInBBReference {
   static MCInstInBBReference get(const MCInst *Inst, BinaryFunction &BF) {
     for (BinaryBasicBlock &BB : BF)
       for (size_t I = 0; I < BB.size(); ++I)
-        if (Inst == &(BB.getInstructionAtIndex(I)))
+        if (Inst == &BB.getInstructionAtIndex(I))
           return MCInstInBBReference(&BB, I);
     return {};
   }
@@ -53,7 +53,7 @@ struct MCInstInBBReference {
     return BB->getInstructionAtIndex(BBIndex);
   }
   uint64_t getAddress() const {
-    // 4 bytes per instruction on AArch64;
+    // 4 bytes per instruction on AArch64.
     // FIXME: the assumption of 4 byte per instruction needs to be fixed before
     // this method gets used on any non-AArch64 binaries (but should be fine for
     // pac-ret analysis, as that is an AArch64-specific feature).
@@ -79,7 +79,7 @@ struct MCInstInBFReference {
   }
   operator MCInst &() const {
     assert(BF != nullptr);
-    return *(BF->getInstructionAtOffset(Offset));
+    return *BF->getInstructionAtOffset(Offset);
   }
 
   uint64_t getOffset() const { return Offset; }
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index f17211168d3cfc..467f2ddc1f2942 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -202,8 +202,8 @@ class PacRetAnalysis
                  const std::vector<MCPhysReg> &RegsToTrackInstsFor)
       : Parent(BF, AllocId), NumRegs(BF.getBinaryContext().MRI->getNumRegs()),
         RegsToTrackInstsFor(RegsToTrackInstsFor),
-        TrackingLastInsts(RegsToTrackInstsFor.size() != 0),
-        Reg2StateIdx(RegsToTrackInstsFor.size() == 0
+        TrackingLastInsts(!RegsToTrackInstsFor.empty()),
+        Reg2StateIdx(RegsToTrackInstsFor.empty()
                          ? 0
                          : *llvm::max_element(RegsToTrackInstsFor) + 1,
                      -1) {
@@ -223,11 +223,9 @@ class PacRetAnalysis
   std::vector<uint16_t> Reg2StateIdx;
 
   bool isTrackingReg(MCPhysReg Reg) const {
-    for (auto R : RegsToTrackInstsFor)
-      if (R == Reg)
-        return true;
-    return false;
+    return llvm::is_contained(RegsToTrackInstsFor, Reg);
   }
+
   uint16_t reg2StateIdx(MCPhysReg Reg) const {
     assert(Reg < Reg2StateIdx.size());
     return Reg2StateIdx[Reg];
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index ac16346fd815bb..9a20fd38636612 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -1,5 +1,5 @@
 // RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
-// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck -check-prefix=CHECK --allow-empty %s
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck %s
 
         .text
 
@@ -208,7 +208,7 @@ f_autibz:
         bl      g
         add     x0, x0, #3
         ldp     x29, x30, [sp], #16
-        autiaz
+        autibz
 // CHECK-NOT: function f_autibz
         ret
         .size f_autibz, .-f_autibz
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
index 402bdf6591dcb1..aaf82b5d188495 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
@@ -1,5 +1,5 @@
 // RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
-// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck -check-prefix=CHECK --allow-empty %s
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck %s
 
 
 // Verify that we can also detect gadgets across basic blocks

>From fcd288c4daffe56c85b4ade0c16f320e519f3a0e Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 16 Jan 2025 11:10:00 +0000
Subject: [PATCH 11/14] Add regression tests for the instructions added in
 pauth-lr: AUTI{A,B}SPPC{i,r}, RETA{A,B}SPPC{i,r}, AUTI{A,B}171615.

---
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |   4 +
 .../AArch64/gs-pacret-autiasp.s               | 179 +++++++++++++++++-
 llvm/lib/Target/AArch64/AArch64InstrInfo.td   |   2 +
 3 files changed, 183 insertions(+), 2 deletions(-)

diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index ea37ec140079ae..167ad2e34c5918 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -217,6 +217,10 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
       return Inst.getOperand(0).getReg();
     case AArch64::RETAA:
     case AArch64::RETAB:
+    case AArch64::RETAASPPCi:
+    case AArch64::RETABSPPCi:
+    case AArch64::RETAASPPCr:
+    case AArch64::RETABSPPCr:
       return AArch64::LR;
     case AArch64::ERET:
     case AArch64::ERETAA:
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index 9a20fd38636612..ed7189b501f6a4 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -1,4 +1,4 @@
-// RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
+// RUN: %clang %cflags -march=armv9.5-a+pauth-lr -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
 // RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck %s
 
         .text
@@ -670,7 +670,182 @@ f_movx30reg:
         ret
         .size f_movx30reg, .-f_movx30reg
 
-// FIXME: add regression tests for the instructions added in v9.5: AUTI{A,B}SPPC{i,r}, RETI{A,B}SPPC{i,r}, AUTI{A,B}171615.
+        .globl  f_autiasppci
+        .type   f_autiasppci,@function
+f_autiasppci:
+0:
+        pacnbiasppc
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autiasppc 0b
+// CHECK-NOT: function f_autiasppci
+        ret
+        .size f_autiasppci, .-f_autiasppci
+
+        .globl  f_autibsppci
+        .type   f_autibsppci,@function
+f_autibsppci:
+0:
+        pacnbibsppc
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autibsppc 0b
+// CHECK-NOT: function f_autibsppci
+        ret
+        .size f_autibsppci, .-f_autibsppci
+
+        .globl  f_autiasppcr
+        .type   f_autiasppcr,@function
+
+f_autiasppcr:
+0:
+        pacnbiasppc
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        adr     x28, 0b
+        autiasppcr x28
+// CHECK-NOT: function f_autiasppcr
+        ret
+        .size f_autiasppcr, .-f_autiasppcr
+
+f_autibsppcr:
+0:
+        pacnbibsppc
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        adr     x28, 0b
+        autibsppcr x28
+// CHECK-NOT: function f_autibsppcr
+        ret
+        .size f_autibsppcr, .-f_autibsppcr
+
+        .globl  f_retaasppci
+        .type   f_retaasppci,@function
+f_retaasppci:
+0:
+        pacnbiasppc
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+// CHECK-NOT: function f_retaasppci
+        retaasppc 0b
+        .size f_retaasppci, .-f_retaasppci
+
+        .globl  f_retabsppci
+        .type   f_retabsppci,@function
+f_retabsppci:
+0:
+        pacnbibsppc
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+// CHECK-NOT: function f_retabsppci
+        retabsppc 0b
+        .size f_retabsppci, .-f_retabsppci
+
+        .globl  f_retaasppcr
+        .type   f_retaasppcr,@function
+
+f_retaasppcr:
+0:
+        pacnbiasppc
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        adr     x28, 0b
+// CHECK-NOT: function f_retaasppcr
+        retaasppcr x28
+        .size f_retaasppcr, .-f_retaasppcr
 
+f_retabsppcr:
+0:
+        pacnbibsppc
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        adr     x28, 0b
+// CHECK-NOT: function f_retabsppcr
+        retabsppcr x28
+        .size f_retabsppcr, .-f_retabsppcr
+
+        .globl  f_autia171615
+        .type   f_autia171615,@function
+f_autia171615:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autia171615
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autia171615, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autia171615
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autia171615, .-f_autia171615
+
+        .globl  f_autib171615
+        .type   f_autib171615,@function
+f_autib171615:
+        hint    #25
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        autib171615
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autib171615, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT:    The return instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT:    1. {{[0-9a-f]+}}: ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT:  This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}:   paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}:   stp     x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}:   mov     x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}:   bl      g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}:   add     x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}:   ldp     x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}:   autib171615
+// CHECK-NEXT: {{[0-9a-f]+}}:   ret
+        ret
+        .size f_autib171615, .-f_autib171615
 
 // TODO: add test to see if registers clobbered by a call are picked up.
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index ec891ea4bac85e..aa4aa56cdc45b4 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -2022,6 +2022,8 @@ let Predicates = [HasPAuthLR] in {
     //                              opcode2, opcode,   asm
     def AUTIASPPCr : SignAuthOneReg<0b00001, 0b100100, "autiasppcr">;
     def AUTIBSPPCr : SignAuthOneReg<0b00001, 0b100101, "autibsppcr">;
+  }
+  let Defs = [X17], Uses = [X15, X16, X17] in {
     //                                  opcode2, opcode,   asm
     def PACIA171615 : SignAuthFixedRegs<0b00001, 0b100010, "pacia171615">;
     def PACIB171615 : SignAuthFixedRegs<0b00001, 0b100011, "pacib171615">;

>From 0480ce79a05d88f942aaf72a426c96cccbc93eff Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 16 Jan 2025 15:31:17 +0000
Subject: [PATCH 12/14] Introduce lastWritingInsts helper function to improve
 readability

---
 .../lib/Passes/NonPacProtectedRetAnalysis.cpp | 32 +++++++++----------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index 467f2ddc1f2942..9c7b676efe30fb 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -222,13 +222,19 @@ class PacRetAnalysis
   /// track which instructions last wrote to this register.
   std::vector<uint16_t> Reg2StateIdx;
 
-  bool isTrackingReg(MCPhysReg Reg) const {
-    return llvm::is_contained(RegsToTrackInstsFor, Reg);
+  SmallPtrSet<const MCInst *, 4> &lastWritingInsts(State &S,
+                                                   MCPhysReg Reg) const {
+    assert(Reg < Reg2StateIdx.size());
+    return S.LastInstWritingReg[Reg2StateIdx[Reg]];
   }
-
-  uint16_t reg2StateIdx(MCPhysReg Reg) const {
+  const SmallPtrSet<const MCInst *, 4> &lastWritingInsts(const State &S,
+                                                         MCPhysReg Reg) const {
     assert(Reg < Reg2StateIdx.size());
-    return Reg2StateIdx[Reg];
+    return S.LastInstWritingReg[Reg2StateIdx[Reg]];
+  }
+
+  bool isTrackingReg(MCPhysReg Reg) const {
+    return llvm::is_contained(RegsToTrackInstsFor, Reg);
   }
 
   void preflight() {}
@@ -279,21 +285,16 @@ class PacRetAnalysis
     Next.NonAutClobRegs |= Written;
     // Keep track of this instruction if it writes to any of the registers we
     // need to track that for:
-    if (TrackingLastInsts) {
-      for (auto WrittenReg : Written.set_bits()) {
-        if (isTrackingReg(WrittenReg)) {
-          Next.LastInstWritingReg[reg2StateIdx(WrittenReg)] = {};
-          Next.LastInstWritingReg[reg2StateIdx(WrittenReg)].insert(&Point);
-        }
-      }
-    }
+    for (MCPhysReg Reg : RegsToTrackInstsFor)
+      if (Written[Reg])
+        lastWritingInsts(Next, Reg) = {&Point};
 
     ErrorOr<MCPhysReg> AutReg = BC.MIB->getAuthenticatedReg(Point);
     if (AutReg && *AutReg != BC.MIB->getNoRegister()) {
       Next.NonAutClobRegs.reset(
           BC.MIB->getAliases(*AutReg, /*OnlySmaller=*/true));
       if (TrackingLastInsts && isTrackingReg(*AutReg))
-        Next.LastInstWritingReg[reg2StateIdx(*AutReg)] = {};
+        lastWritingInsts(Next, *AutReg).clear();
     }
 
     LLVM_DEBUG({
@@ -319,8 +320,7 @@ class PacRetAnalysis
       // been tracked.
       std::set<const MCInst *> LastWritingInsts;
       for (MCPhysReg TrackedReg : DirtyRawRegs.set_bits()) {
-        for (const MCInst *LastInstWriting :
-             S.LastInstWritingReg[reg2StateIdx(TrackedReg)])
+        for (const MCInst *LastInstWriting : lastWritingInsts(S, TrackedReg))
           LastWritingInsts.insert(LastInstWriting);
       }
       std::vector<MCInstReference> Result;

>From a05ae624bad01175f9dea943a59da9b30d80f7e0 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 16 Jan 2025 18:57:09 +0000
Subject: [PATCH 13/14] Add a few more regression tests

---
 .../AArch64/gs-pacret-autiasp.s               | 86 +++++++++----------
 .../AArch64/gs-pacret-multi-bb.s              | 33 ++++++-
 2 files changed, 74 insertions(+), 45 deletions(-)

diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index ed7189b501f6a4..7b73bc9cebe855 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -6,7 +6,7 @@
         .globl  f1
         .type   f1,@function
 f1:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -32,7 +32,7 @@ f1:
         .globl  f_intermediate_overwrite1
         .type   f_intermediate_overwrite1,@function
 f_intermediate_overwrite1:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -58,7 +58,7 @@ f_intermediate_overwrite1:
         .globl  f_intermediate_overwrite2
         .type   f_intermediate_overwrite2,@function
 f_intermediate_overwrite2:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -86,7 +86,7 @@ f_intermediate_overwrite2:
         .globl  f_intermediate_read
         .type   f_intermediate_read,@function
 f_intermediate_read:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -101,7 +101,7 @@ f_intermediate_read:
         .globl  f_intermediate_overwrite3
         .type   f_intermediate_overwrite3,@function
 f_intermediate_overwrite3:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -129,7 +129,7 @@ f_intermediate_overwrite3:
         .globl  f_nonx30_ret
         .type   f_nonx30_ret,@function
 f_nonx30_ret:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -160,7 +160,7 @@ f_nonx30_ret:
         .globl  f_autiasp
         .type   f_autiasp,@function
 f_autiasp:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -174,7 +174,7 @@ f_autiasp:
         .globl  f_autibsp
         .type   f_autibsp,@function
 f_autibsp:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -188,7 +188,7 @@ f_autibsp:
         .globl  f_autiaz
         .type   f_autiaz,@function
 f_autiaz:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -202,7 +202,7 @@ f_autiaz:
         .globl  f_autibz
         .type   f_autibz,@function
 f_autibz:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -216,7 +216,7 @@ f_autibz:
         .globl  f_autia1716
         .type   f_autia1716,@function
 f_autia1716:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -242,7 +242,7 @@ f_autia1716:
         .globl  f_autib1716
         .type   f_autib1716,@function
 f_autib1716:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -268,7 +268,7 @@ f_autib1716:
         .globl  f_autiax12
         .type   f_autiax12,@function
 f_autiax12:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -294,7 +294,7 @@ f_autiax12:
         .globl  f_autibx12
         .type   f_autibx12,@function
 f_autibx12:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -320,7 +320,7 @@ f_autibx12:
         .globl  f_autiax30
         .type   f_autiax30,@function
 f_autiax30:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -334,7 +334,7 @@ f_autiax30:
         .globl  f_autibx30
         .type   f_autibx30,@function
 f_autibx30:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -349,7 +349,7 @@ f_autibx30:
         .globl  f_autdax12
         .type   f_autdax12,@function
 f_autdax12:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -375,7 +375,7 @@ f_autdax12:
         .globl  f_autdbx12
         .type   f_autdbx12,@function
 f_autdbx12:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -401,7 +401,7 @@ f_autdbx12:
         .globl  f_autdax30
         .type   f_autdax30,@function
 f_autdax30:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -415,7 +415,7 @@ f_autdax30:
         .globl  f_autdbx30
         .type   f_autdbx30,@function
 f_autdbx30:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -430,7 +430,7 @@ f_autdbx30:
         .globl  f_autizax12
         .type   f_autizax12,@function
 f_autizax12:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -456,7 +456,7 @@ f_autizax12:
         .globl  f_autizbx12
         .type   f_autizbx12,@function
 f_autizbx12:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -482,7 +482,7 @@ f_autizbx12:
         .globl  f_autizax30
         .type   f_autizax30,@function
 f_autizax30:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -496,7 +496,7 @@ f_autizax30:
         .globl  f_autizbx30
         .type   f_autizbx30,@function
 f_autizbx30:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -511,7 +511,7 @@ f_autizbx30:
         .globl  f_autdzax12
         .type   f_autdzax12,@function
 f_autdzax12:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -537,7 +537,7 @@ f_autdzax12:
         .globl  f_autdzbx12
         .type   f_autdzbx12,@function
 f_autdzbx12:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -563,7 +563,7 @@ f_autdzbx12:
         .globl  f_autdzax30
         .type   f_autdzax30,@function
 f_autdzax30:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -577,7 +577,7 @@ f_autdzax30:
         .globl  f_autdzbx30
         .type   f_autdzbx30,@function
 f_autdzbx30:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -591,7 +591,7 @@ f_autdzbx30:
         .globl  f_retaa
         .type   f_retaa,@function
 f_retaa:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -604,7 +604,7 @@ f_retaa:
         .globl  f_retab
         .type   f_retab,@function
 f_retab:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -617,7 +617,7 @@ f_retab:
         .globl  f_eretaa
         .type   f_eretaa,@function
 f_eretaa:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -631,7 +631,7 @@ f_eretaa:
         .globl  f_eretab
         .type   f_eretab,@function
 f_eretab:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -645,7 +645,7 @@ f_eretab:
         .globl  f_eret
         .type   f_eret,@function
 f_eret:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -675,7 +675,7 @@ f_movx30reg:
 f_autiasppci:
 0:
         pacnbiasppc
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -691,7 +691,7 @@ f_autiasppci:
 f_autibsppci:
 0:
         pacnbibsppc
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -708,7 +708,7 @@ f_autibsppci:
 f_autiasppcr:
 0:
         pacnbiasppc
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -723,7 +723,7 @@ f_autiasppcr:
 f_autibsppcr:
 0:
         pacnbibsppc
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -740,7 +740,7 @@ f_autibsppcr:
 f_retaasppci:
 0:
         pacnbiasppc
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -755,7 +755,7 @@ f_retaasppci:
 f_retabsppci:
 0:
         pacnbibsppc
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -771,7 +771,7 @@ f_retabsppci:
 f_retaasppcr:
 0:
         pacnbiasppc
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -785,7 +785,7 @@ f_retaasppcr:
 f_retabsppcr:
 0:
         pacnbibsppc
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -799,7 +799,7 @@ f_retabsppcr:
         .globl  f_autia171615
         .type   f_autia171615,@function
 f_autia171615:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
@@ -825,7 +825,7 @@ f_autia171615:
         .globl  f_autib171615
         .type   f_autib171615,@function
 f_autib171615:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         mov     x29, sp
         bl      g
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
index aaf82b5d188495..e9401e48239270 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
@@ -7,7 +7,7 @@
         .globl f_crossbb1
         .type   f_crossbb1,@function
 f_crossbb1:
-        hint    #25
+        paciasp
         stp     x29, x30, [sp, #-16]!
         ldp     x29, x30, [sp], #16
         cbnz    x0, 1f
@@ -26,7 +26,7 @@ f_crossbb1:
         .globl f_mergebb1
         .type   f_mergebb1,@function
 f_mergebb1:
-        hint    #25
+        paciasp
 2:
         stp     x29, x30, [sp, #-16]!
         ldp     x29, x30, [sp], #16
@@ -42,5 +42,34 @@ f_mergebb1:
 // CHECK-NEXT:    The 1 instructions that write to the return register after any authentication are:
 // CHECK-NEXT:    1.     {{[0-9a-f]+}}:      ldp     x29, x30, [sp], #0x10
 
+        .globl f_shrinkwrapping
+        .type   f_shrinkwrapping,@function
+f_shrinkwrapping:
+        cbz     x0, 1f
+        paciasp
+        stp     x29, x30, [sp, #-16]!
+        ldp     x29, x30, [sp], #16
+        autiasp
+1:
+        ret
+        .size f_shrinkwrapping, .-f_shrinkwrapping
+// CHECK-NOT: f_shrinkwrapping
+
+        .globl f_multi_auth_insts
+        .type   f_multi_auth_insts,@function
+f_multi_auth_insts:
+        paciasp
+        stp     x29, x30, [sp, #-16]!
+        ldp     x29, x30, [sp], #16
+        cbnz x0, 1f
+        autibsp
+        b 2f
+1:
+        autiasp
+2:
+        ret
+        .size f_multi_auth_insts, .-f_multi_auth_insts
+// CHECK-NOT: f_multi_auth_insts
+
 // TODO: also verify that false negatives exist in across-BB gadgets in functions
 // for which bolt cannot reconstruct the call graph.

>From 35dfd8def3ff4fe86c7a4bc585bdddfb9a2dcde7 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Fri, 17 Jan 2025 13:15:10 +0000
Subject: [PATCH 14/14] Add a test for correct uses of non-LR return. This also
 adds a FIXME for a false positive category

---
 bolt/docs/BinaryAnalysis.md                   |  9 ++++++
 .../AArch64/gs-pacret-autiasp.s               | 29 +++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/bolt/docs/BinaryAnalysis.md b/bolt/docs/BinaryAnalysis.md
index 17e45513462af2..0a3510caabbd01 100644
--- a/bolt/docs/BinaryAnalysis.md
+++ b/bolt/docs/BinaryAnalysis.md
@@ -163,6 +163,15 @@ The following are current known cases of false positives:
 1. Not handling "no-return" functions. See issue
    [#115154](https://github.com/llvm/llvm-project/issues/115154) for details and
    pointers to open PRs to fix this.
+2. Not recognizing that moving a properly authenticated value between registers
+   results in the destination register holding a properly authenticated value.
+   For example, the scanner currently produces a false positive for the following
+   code sequence:
+   ```
+   autiasp
+   mov     x16, x30
+   ret     x16
+   ```
 
 The folowing are current known cases of false negatives:
 
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index 7b73bc9cebe855..ae6c98bb2b99ee 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -154,6 +154,35 @@ f_nonx30_ret:
         ret     x16
         .size f_nonx30_ret, .-f_nonx30_ret
 
+        .globl  f_nonx30_ret_ok
+        .type   f_nonx30_ret_ok,@function
+f_nonx30_ret_ok:
+        paciasp
+        stp     x29, x30, [sp, #-16]!
+        mov     x29, sp
+        bl      g
+        add     x0, x0, #3
+        ldp     x29, x30, [sp], #16
+        // FIXME: Should the scanner understand that an authenticated register (below: x30,
+        //        after the autiasp instruction) can safely be moved to another register,
+        //        and that the other register can then be used to return?
+        //        This respects the pac-ret hardening intent, but the scanner currently
+        //        produces a false positive for it.
+        //        Is it worthwhile to make the scanner more complex for this case?
+        //        So far, scanning many millions of instructions across a Linux distro,
+        //        I haven't encountered such an example.
+        //        The ".if 0" block below covers this case; it is disabled because it
+        //        currently fails.
+.if 0
+        autiasp
+        mov     x16, x30
+.else
+        mov     x16, x30
+        autia   x16, sp
+.endif
+// CHECK-NOT: function f_nonx30_ret_ok
+        ret     x16
+        .size f_nonx30_ret_ok, .-f_nonx30_ret_ok
+
 
 /// Now do a basic sanity check on every different Authentication instruction:
 
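
Editor's note on the false-positive category documented in PATCH 14/14: the toy model below is a minimal sketch of why the two variants in `f_nonx30_ret_ok` behave differently. It is not BOLT's data-flow analysis. The instruction tuples, the function name `flags_return`, and the `propagate_moves` switch are all hypothetical, chosen only to show the effect of tracking (or not tracking) register moves of authenticated values on straight-line code.

```python
# Toy model only: NOT the BOLT implementation. All names here are hypothetical.
def flags_return(insts, propagate_moves):
    """Return True if the final `ret` uses a register not known to be authenticated."""
    authenticated = set()
    for op, *args in insts:
        if op == "autiasp":        # authenticates x30 in place
            authenticated.add("x30")
        elif op == "autia":        # autia Xd, Xn: authenticates Xd in place
            authenticated.add(args[0])
        elif op == "mov":          # mov Xd, Xm
            dst, src = args
            if propagate_moves and src in authenticated:
                authenticated.add(dst)      # destination inherits the property
            else:
                authenticated.discard(dst)  # destination treated as overwritten
        elif op == "ldp":          # ldp Xt1, Xt2, ...: both destinations overwritten
            authenticated.difference_update(args[:2])
        elif op == "ret":          # ret Xn
            return args[0] not in authenticated
    return False


# The disabled ".if 0" variant: authenticate x30, then move it to x16.
variant_if0 = [("ldp", "x29", "x30"), ("autiasp",),
               ("mov", "x16", "x30"), ("ret", "x16")]
# The enabled ".else" variant: move first, then authenticate x16 directly.
variant_else = [("ldp", "x29", "x30"), ("mov", "x16", "x30"),
                ("autia", "x16", "sp"), ("ret", "x16")]

report_without_moves = flags_return(variant_if0, propagate_moves=False)
report_with_moves = flags_return(variant_if0, propagate_moves=True)
report_else = flags_return(variant_else, propagate_moves=False)
```

In this model, the `.if 0` variant is only accepted when moves propagate the authenticated property, while the `.else` variant is accepted either way because the returned register itself is the one authenticated.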


