[llvm] [BOLT][binary-analysis] Add initial pac-ret gadget scanner (PR #122304)
Kristof Beyls via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 9 07:56:21 PST 2025
https://github.com/kbeyls updated https://github.com/llvm/llvm-project/pull/122304
From 0b0137b8feab04db089297cb0efc288255f54298 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 9 Jan 2025 11:06:57 +0000
Subject: [PATCH 1/2] [BOLT][binary-analysis] Add initial pac-ret gadget
scanner
This adds an initial pac-ret gadget scanner to the
`llvm-bolt-binary-analysis` tool.
The scanner is taken from the prototype that was published last year at
https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype,
and has been discussed in RFC
https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
and in the EuroLLVM 2024 keynote "Does LLVM implement security
hardenings correctly? A BOLT-based static analyzer to the rescue?"
[Video](https://youtu.be/Sn_Fxa0tdpY)
[Slides](https://llvm.org/devmtg/2024-04/slides/Keynote/Beyls_EuroLLVM2024_security_hardening_keynote.pdf)
In the spirit of incremental development, this PR aims to add a minimal
implementation that is "fully working" on its own, but has major
limitations, as described in the bolt/docs/BinaryAnalysis.md
documentation in this proposed commit. These and other limitations will
be fixed in follow-on PRs, mostly based on code already existing in the
prototype branch. I hope incrementally upstreaming will make it easier
to review the code.
That being said, this patch isn't really small.
I believe that this patch needs 2 "kinds" of reviewers:
1. People familiar with BOLT internals.
2. People familiar with the pac-ret hardening scheme and AArch64.
I expect people familiar with the pac-ret hardening scheme might need to
focus mostly on reviewing `NonPacProtectedRetAnalysis.cpp`,
`AArch64MCPlusBuilder.cpp` and the unit tests.
I expect people familiar with BOLT might mainly need to focus on the
other parts, and the high-level structure of
`NonPacProtectedRetAnalysis.cpp`.
The patch is not very polished currently, but I think it's in the right
shape to start getting it reviewed.
Note that I believe that this could also form the basis of a scanner to
analyze correct implementation of PAuthABI.
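For reviewers less familiar with the scheme, the property the scanner checks
(stated in bolt/docs/BinaryAnalysis.md in this patch) can be sketched on a
single straight-line instruction list. This is a simplified, self-contained
illustration and not code from the patch: `Inst`, `findUnprotectedRets`,
`unprotectedExample` and `protectedExample` are hypothetical names, the
instruction model is invented for the sketch, and control flow between basic
blocks is ignored (the patch handles that with a dataflow analysis).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified instruction model. The patch works on MCInst.
struct Inst {
  std::string Mnemonic;         // e.g. "ldp", "autiasp", "ret"
  std::vector<int> WrittenRegs; // registers written (only those relevant here)
  int AuthenticatedReg;         // register authenticated, or -1 if none
  int RetReg;                   // register used as return target, or -1
};

constexpr int X30 = 30;

// Returns the indices of return instructions whose target register has been
// written since function entry without being authenticated afterwards.
// Combined authenticate-and-return instructions are not modeled here.
std::vector<std::size_t> findUnprotectedRets(const std::vector<Inst> &Insts) {
  std::set<int> Dirty; // registers written and not authenticated since
  std::vector<std::size_t> Gadgets;
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    const Inst &In = Insts[I];
    if (In.RetReg != -1 && Dirty.count(In.RetReg))
      Gadgets.push_back(I);
    for (int R : In.WrittenRegs)
      Dirty.insert(R);
    // An authentication makes the register trusted again, even though the
    // authenticating instruction itself also writes it.
    if (In.AuthenticatedReg != -1)
      Dirty.erase(In.AuthenticatedReg);
  }
  return Gadgets;
}

// Mirrors "Example 1" from the documentation: no pac-ret protection.
std::vector<Inst> unprotectedExample() {
  return {
      {"stp", {}, -1, -1},        // stp x29, x30, [sp, #-0x10]!
      {"mov", {29}, -1, -1},      // mov x29, sp
      {"bl", {X30}, -1, -1},      // bl g
      {"add", {0}, -1, -1},       // add x0, x0, #0x3
      {"ldp", {29, X30}, -1, -1}, // ldp x29, x30, [sp], #0x10
      {"ret", {}, -1, X30},       // ret
  };
}

// The same function with paciasp/autiasp added.
std::vector<Inst> protectedExample() {
  return {
      {"paciasp", {X30}, -1, -1},
      {"stp", {}, -1, -1},
      {"bl", {X30}, -1, -1},
      {"ldp", {29, X30}, -1, -1},
      {"autiasp", {X30}, X30, -1},
      {"ret", {}, -1, X30},
  };
}
```

In this sketch, the unprotected sequence reports the `ret` at index 5 because
`ldp` was the last write to `x30`, while the protected sequence reports
nothing because `autiasp` is the last write before the `ret`.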
---
bolt/docs/BinaryAnalysis.md | 163 ++++-
bolt/include/bolt/Core/MCPlusBuilder.h | 16 +
.../bolt/Passes/NonPacProtectedRetAnalysis.h | 209 ++++++
bolt/include/bolt/Utils/CommandLineOpts.h | 4 +
bolt/lib/Passes/CMakeLists.txt | 1 +
.../lib/Passes/NonPacProtectedRetAnalysis.cpp | 520 +++++++++++++
bolt/lib/Rewrite/RewriteInstance.cpp | 25 +-
.../Target/AArch64/AArch64MCPlusBuilder.cpp | 63 ++
.../binary-analysis/AArch64/cmdline-args.test | 8 +-
.../AArch64/gs-pacret-autiasp.s | 681 ++++++++++++++++++
.../AArch64/gs-pacret-multi-bb.s | 46 ++
.../binary-analysis/AArch64/lit.local.cfg | 2 +-
12 files changed, 1733 insertions(+), 5 deletions(-)
create mode 100644 bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
create mode 100644 bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
create mode 100644 bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
create mode 100644 bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
diff --git a/bolt/docs/BinaryAnalysis.md b/bolt/docs/BinaryAnalysis.md
index f91b77d046de8f..e247ce20b4625c 100644
--- a/bolt/docs/BinaryAnalysis.md
+++ b/bolt/docs/BinaryAnalysis.md
@@ -9,9 +9,168 @@ analyses implemented in the BOLT libraries.
## Which binary analyses are implemented?
-At the moment, no binary analyses are implemented.
+* [Security scanners](#security-scanners)
+ * [pac-ret analysis](#pac-ret-analysis)
-The goal is to make it easy using a plug-in framework to add your own analyses.
+### Security scanners
+
+For the past 25 years, a large number of exploits have been built and used in
+the wild to undermine computer security. The majority of these exploits abuse
+memory vulnerabilities in programs, see evidence from
+[Microsoft](https://youtu.be/PjbGojjnBZQ?si=oCHCa0SHgaSNr6Gr&t=836),
+[Chromium](https://www.chromium.org/Home/chromium-security/memory-safety/) and
+[Android](https://security.googleblog.com/2021/01/data-driven-security-hardening-in.html).
+
+It is not surprising therefore, that a large number of mitigations have been
+added to instruction sets and toolchains to make it harder to build an exploit
+using a memory vulnerability. Examples are: stack canaries, stack clash,
+pac-ret, shadow stacks, arm64e, and many more.
+
+These mitigations guarantee a so-called "security property" on the binaries they
+produce. For example, for stack canaries, the security property is roughly that
+a canary is located on the stack between the set of saved registers and set of
+local variables. For pac-ret, it is roughly that there are no writes to the
+register containing the return address after either an authenticating
+instruction or a Branch-and-link instruction.
+
+From time to time, however, a bug gets found in the implementation of such
+mitigations in toolchains. Also, code that is written in assembler by hand
+requires the developer to ensure these security properties by hand.
+
+In short, it is sometimes found that a few places in the binary code are not
+protected as expected given the requested mitigations. Attackers could make use
+of those places (sometimes called gadgets) to circumvent the protection that the
+mitigation should give.
+
+One of the reasons that such gadgets, or holes in the mitigation implementation,
+exist is that typically the amount of testing and verification for these
+security properties is pretty limited.
+
+By comparison, functional correctness and performance of toolchains and
+software in general are typically tested with large test suites, benchmarks,
+and dedicated test plans. The security properties of binary code typically
+do not receive that level of testing.
+
+The security scanners implemented in `llvm-bolt-binary-analysis` aim to enable
+better testing of security hardening implementations that require compilers to
+somehow generate assembly code with specific security properties.
+
+#### pac-ret analysis
+
+`pac-ret` protection is a security hardening scheme implemented in compilers
+such as gcc and clang, using the command line option
+`-mbranch-protection=pac-ret`. This option is enabled by default on most widely
+used Linux distributions.
+
+The hardening scheme mitigates
+[Return-Oriented Programming (ROP)](https://llsoftsec.github.io/llsoftsecbook/#return-oriented-programming)
+attacks by making sure that return addresses are only ever stored to memory with
+a cryptographic hash, called a
+["Pointer Authentication Code" (PAC)](https://llsoftsec.github.io/llsoftsecbook/#pointer-authentication),
+in the upper bits of the pointer. This makes it substantially harder for
+attackers to divert control flow by overwriting a return address with a
+different value.
+
+The hardening scheme relies on compilers producing different code sequences when
+processing return addresses, especially when these are stored to and retrieved
+from memory.
+
+The 'pac-ret' binary analysis can be invoked using the command line option
+`--scanners=pac-ret`. It makes `llvm-bolt-binary-analysis` scan through the
+provided binary and check one security property for every return instruction.
+
+The security property that is checked is:
+
+When a register is used as the address to jump to in a return instruction,
+that register must either:
+
+1. never be changed within this function, i.e. have the same value as when the
+ function started, or
+2. the last write to the register must be by an authenticating instruction.
+
+##### Example 1
+
+For example, a typical non-pac-ret-protected function looks as follows:
+
+```
+stp x29, x30, [sp, #-0x10]!
+mov x29, sp
+bl g@PLT
+add x0, x0, #0x3
+ldp x29, x30, [sp], #0x10
+ret
+```
+
+The return instruction `ret` returns to the address held in register `x30`.
+Register `x30` was last written by instruction `ldp`, which is not an
+authenticating instruction. `llvm-bolt-binary-analysis --scanners=pac-ret` will
+report this as follows:
+
+```
+GS-PACRET: non-protected ret found in function f1, basic block .LBB00, at address 10310
+ The return instruction is 00010310: ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.LBB00:6>, Overwriting:[MCInstBBRef<BB:.LBB00:5> ]>
+ The 1 instructions that write to the return register after any authentication are:
+ 1. 0001030c: ldp x29, x30, [sp], #0x10
+ This happens in the following basic block:
+ 000102fc: stp x29, x30, [sp, #-0x10]!
+ 00010300: mov x29, sp
+ 00010304: bl g@PLT
+ 00010308: add x0, x0, #0x3
+ 0001030c: ldp x29, x30, [sp], #0x10
+ 00010310: ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.LBB00:6>, Overwriting:[MCInstBBRef<BB:.LBB00:5> ]>
+```
+
+The exact format of how `llvm-bolt-binary-analysis` reports this is expected to
+evolve over time.
+
+##### Example 2: multiple "last-overwriting" instructions
+
+A simple example that shows how there can be a set of "last overwriting"
+instructions of a register is the following:
+
+```
+ paciasp
+ stp x29, x30, [sp, #-16]!
+ ldp x29, x30, [sp], #16
+ cbnz x0, 1f
+ autiasp
+1:
+ ret
+```
+
+This will produce the following diagnostic:
+
+```
+GS-PACRET: non-protected ret found in function f_crossbb1, basic block .Ltmp0, at address 102dc
+ The return instruction is 000102dc: ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.Ltmp0:0>, Overwriting:[MCInstBBRef<BB:.LFT0:0> MCInstBBRef<BB:.LBB00:2> ]>
+ The 2 instructions that write to the return register after any authentication are:
+ 1. 000102d0: ldp x29, x30, [sp], #0x10
+ 2. 000102d8: autiasp
+```
+
+(Yes, this diagnostic could be improved because the second "overwriting"
+instruction, `autiasp`, is an authenticating instruction...)
+
+##### Known false positives or negatives
+
+The following are current known cases of false positives:
+
+1. Not handling "no-return" functions. See issue
+ [#115154](https://github.com/llvm/llvm-project/issues/115154) for details and
+ pointers to open PRs to fix this.
+
+The following are current known cases of false negatives:
+
+1. Not handling functions for which the CFG cannot be reconstructed by BOLT. The
+ plan is to implement support for this, picking up the implementation from the
+ [prototype branch](
+ https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype).
+
+BOLT cannot currently handle functions with `cfi_negate_ra_state` correctly,
+i.e. any binaries built with `-mbranch-protection=pac-ret`. The scanner is meant
+to be used on specifically such binaries, so this is a major limitation! Work is
+going on in PR [#120064](https://github.com/llvm/llvm-project/pull/120064) to
+fix this.
## How to add your own binary analysis
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 3634fed9757ceb..9351107e66bdf6 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -550,6 +550,22 @@ class MCPlusBuilder {
return Analysis->isReturn(Inst);
}
+ virtual MCPhysReg getAuthenticatedReg(const MCInst &Inst) const {
+ llvm_unreachable("not implemented");
+ return getNoRegister();
+ }
+
+ virtual bool isAuthenticationOfReg(const MCInst &Inst,
+ const unsigned RegAuthenticated) const {
+ llvm_unreachable("not implemented");
+ return false;
+ }
+
+ virtual llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
+ llvm_unreachable("not implemented");
+ return getNoRegister();
+ }
+
virtual bool isTerminator(const MCInst &Inst) const;
virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
new file mode 100644
index 00000000000000..a9740896369fa2
--- /dev/null
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -0,0 +1,209 @@
+//===- bolt/Passes/NonPacProtectedRetAnalysis.h -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef BOLT_PASSES_NONPACPROTECTEDRETANALYSIS_H
+#define BOLT_PASSES_NONPACPROTECTEDRETANALYSIS_H
+
+#include "bolt/Core/BinaryContext.h"
+#include "bolt/Core/BinaryFunction.h"
+#include "bolt/Passes/BinaryPasses.h"
+#include "llvm/ADT/SmallSet.h"
+
+namespace llvm {
+namespace bolt {
+
+/// @brief MCInstReference represents a reference to an MCInst as stored either
+/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock
+/// (after a CFG is created). It aims to store the necessary information to be
+/// able to find the specific MCInst in either the BinaryFunction or
+/// BinaryBasicBlock data structures later, so that e.g. the InputAddress of
+/// the corresponding instruction can be computed.
+
+struct MCInstInBBReference {
+ BinaryBasicBlock *BB;
+ int64_t BBIndex;
+ MCInstInBBReference(BinaryBasicBlock *BB, int64_t BBIndex)
+ : BB(BB), BBIndex(BBIndex) {}
+ MCInstInBBReference() : BB(nullptr), BBIndex(0) {}
+ static MCInstInBBReference get(const MCInst *Inst, BinaryFunction &BF) {
+ for (BinaryBasicBlock& BB : BF)
+ for (size_t I = 0; I < BB.size(); ++I)
+ if (Inst == &(BB.getInstructionAtIndex(I)))
+ return MCInstInBBReference(&BB, I);
+ return {};
+ }
+ bool operator==(const MCInstInBBReference &RHS) const {
+ return BB == RHS.BB && BBIndex == RHS.BBIndex;
+ }
+ bool operator<(const MCInstInBBReference &RHS) const {
+ if (BB != RHS.BB)
+ return BB < RHS.BB;
+ return BBIndex < RHS.BBIndex;
+ }
+ operator MCInst &() const {
+ assert(BB != nullptr);
+ return BB->getInstructionAtIndex(BBIndex);
+ }
+ uint64_t getAddress() const {
+ // 4 bytes per instruction on AArch64;
+ return BB->getFunction()->getAddress() + BB->getOffset() + BBIndex * 4;
+ }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBBReference &);
+
+struct MCInstInBFReference {
+ BinaryFunction *BF;
+ uint32_t Offset;
+ MCInstInBFReference(BinaryFunction *BF, uint32_t Offset)
+ : BF(BF), Offset(Offset) {}
+ MCInstInBFReference() : BF(nullptr), Offset(0) {}
+ bool operator==(const MCInstInBFReference &RHS) const {
+ return BF == RHS.BF && Offset == RHS.Offset;
+ }
+ bool operator<(const MCInstInBFReference &RHS) const {
+ if (BF != RHS.BF)
+ return BF < RHS.BF;
+ return Offset < RHS.Offset;
+ }
+ operator MCInst &() const {
+ assert(BF != nullptr);
+ return *(BF->getInstructionAtOffset(Offset));
+ }
+
+ uint64_t getOffset() const { return Offset; }
+
+ uint64_t getAddress() const {
+ return BF->getAddress() + getOffset();
+ }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
+
+struct MCInstReference {
+ enum StoredIn { _BinaryFunction, _BinaryBasicBlock };
+ StoredIn CurrentLocation;
+ union U {
+ MCInstInBBReference BBRef;
+ MCInstInBFReference BFRef;
+ U(MCInstInBBReference BBRef) : BBRef(BBRef) {}
+ U(MCInstInBFReference BFRef) : BFRef(BFRef) {}
+ } U;
+ MCInstReference(MCInstInBBReference BBRef)
+ : CurrentLocation(_BinaryBasicBlock), U(BBRef) {}
+ MCInstReference(MCInstInBFReference BFRef)
+ : CurrentLocation(_BinaryFunction), U(BFRef) {}
+ MCInstReference(BinaryBasicBlock *BB, int64_t BBIndex)
+ : MCInstReference(MCInstInBBReference(BB, BBIndex)) {}
+ MCInstReference(BinaryFunction *BF, uint32_t Offset)
+ : MCInstReference(MCInstInBFReference(BF, Offset)) {}
+
+ bool operator<(const MCInstReference &RHS) const {
+ if (CurrentLocation != RHS.CurrentLocation)
+ return CurrentLocation < RHS.CurrentLocation;
+ switch (CurrentLocation) {
+ case _BinaryBasicBlock:
+ return U.BBRef < RHS.U.BBRef;
+ case _BinaryFunction:
+ return U.BFRef < RHS.U.BFRef;
+ }
+ llvm_unreachable("");
+ }
+
+ bool operator==(const MCInstReference &RHS) const {
+ if (CurrentLocation != RHS.CurrentLocation)
+ return false;
+ switch (CurrentLocation) {
+ case _BinaryBasicBlock:
+ return U.BBRef == RHS.U.BBRef;
+ case _BinaryFunction:
+ return U.BFRef == RHS.U.BFRef;
+ }
+ llvm_unreachable("");
+ }
+
+ operator MCInst &() const {
+ switch (CurrentLocation) {
+ case _BinaryBasicBlock:
+ return U.BBRef;
+ case _BinaryFunction:
+ return U.BFRef;
+ }
+ llvm_unreachable("");
+ }
+
+ uint64_t getAddress() const {
+ switch (CurrentLocation) {
+ case _BinaryBasicBlock:
+ return U.BBRef.getAddress();
+ case _BinaryFunction:
+ return U.BFRef.getAddress();
+ }
+ llvm_unreachable("");
+ }
+
+ BinaryFunction *getFunction() const {
+ switch (CurrentLocation) {
+ case _BinaryFunction:
+ return U.BFRef.BF;
+ case _BinaryBasicBlock:
+ return U.BBRef.BB->getFunction();
+ }
+ llvm_unreachable("");
+ }
+
+ BinaryBasicBlock *getBasicBlock() const {
+ switch (CurrentLocation) {
+ case _BinaryFunction:
+ return nullptr;
+ case _BinaryBasicBlock:
+ return U.BBRef.BB;
+ }
+ llvm_unreachable("");
+ }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &);
+
+struct NonPacProtectedRetGadget {
+ MCInstReference RetInst;
+ std::vector<MCInstReference> OverwritingRetRegInst;
+ bool operator==(const NonPacProtectedRetGadget &RHS) const {
+ return RetInst == RHS.RetInst &&
+ OverwritingRetRegInst == RHS.OverwritingRetRegInst;
+ }
+ NonPacProtectedRetGadget(
+ MCInstReference RetInst,
+ const std::vector<MCInstReference>& OverwritingRetRegInst)
+ : RetInst(RetInst), OverwritingRetRegInst(OverwritingRetRegInst) {}
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetGadget &NPPRG);
+class PacRetAnalysis;
+
+class NonPacProtectedRetAnalysis : public BinaryFunctionPass {
+ void runOnFunction(BinaryFunction &Function,
+ MCPlusBuilder::AllocatorIdTy AllocatorId);
+ SmallSet<MCPhysReg, 1>
+ computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF,
+ MCPlusBuilder::AllocatorIdTy AllocatorId);
+ unsigned GadgetAnnotationIndex;
+
+public:
+ explicit NonPacProtectedRetAnalysis() : BinaryFunctionPass(false) {}
+
+ const char *getName() const override { return "non-pac-protected-rets"; }
+
+ /// Pass entry point
+ Error runOnFunctions(BinaryContext &BC) override;
+};
+
+} // namespace bolt
+} // namespace llvm
+
+#endif
\ No newline at end of file
diff --git a/bolt/include/bolt/Utils/CommandLineOpts.h b/bolt/include/bolt/Utils/CommandLineOpts.h
index 111eb650c37465..fefd7969ef6f4f 100644
--- a/bolt/include/bolt/Utils/CommandLineOpts.h
+++ b/bolt/include/bolt/Utils/CommandLineOpts.h
@@ -80,6 +80,10 @@ extern llvm::cl::opt<unsigned> Verbosity;
/// Return true if we should process all functions in the binary.
bool processAllFunctions();
+enum GadgetScannerKind { GS_PACRET, GS_ALL };
+
+extern llvm::cl::list<GadgetScannerKind> GadgetScannersToRun;
+
} // namespace opts
namespace llvm {
diff --git a/bolt/lib/Passes/CMakeLists.txt b/bolt/lib/Passes/CMakeLists.txt
index 1c1273b3d2420d..b3fbb0ba0108f4 100644
--- a/bolt/lib/Passes/CMakeLists.txt
+++ b/bolt/lib/Passes/CMakeLists.txt
@@ -23,6 +23,7 @@ add_llvm_library(LLVMBOLTPasses
LoopInversionPass.cpp
LivenessAnalysis.cpp
MCF.cpp
+ NonPacProtectedRetAnalysis.cpp
PatchEntries.cpp
PettisAndHansen.cpp
PLTCall.cpp
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
new file mode 100644
index 00000000000000..d6bc902f663fc9
--- /dev/null
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -0,0 +1,520 @@
+//===- bolt/Passes/NonPacProtectedRetAnalysis.cpp -------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements a pass that looks for any AArch64 return instructions
+// that may not be protected by PAuth authentication instructions when needed.
+//
+// When needed = the register used to return (almost always X30), is potentially
+// written to between the AUThentication instruction and the RETurn instruction.
+//
+//===----------------------------------------------------------------------===//
+
+#include "bolt/Passes/NonPacProtectedRetAnalysis.h"
+#include "bolt/Core/ParallelUtilities.h"
+#include "bolt/Passes/DataflowAnalysis.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/Support/Format.h"
+
+#define DEBUG_TYPE "bolt-nonpacprotectedret"
+
+namespace llvm {
+namespace bolt {
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBBReference &Ref) {
+ OS << "MCInstBBRef<";
+ if (Ref.BB == nullptr)
+ OS << "BB:(null)";
+ else
+ OS << "BB:" << Ref.BB->getName() << ":" << Ref.BBIndex;
+ OS << ">";
+ return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &Ref) {
+ OS << "MCInstBFRef<";
+ if (Ref.BF == nullptr)
+ OS << "BF:(null)";
+ else
+ OS << "BF:" << Ref.BF->getPrintName() << ":" << Ref.getOffset();
+ OS << ">";
+ return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
+ switch (Ref.CurrentLocation) {
+ case MCInstReference::_BinaryBasicBlock:
+ OS << Ref.U.BBRef;
+ return OS;
+ case MCInstReference::_BinaryFunction:
+ OS << Ref.U.BFRef;
+ return OS;
+ }
+ llvm_unreachable("");
+}
+
+raw_ostream &operator<<(raw_ostream &OS,
+ const NonPacProtectedRetGadget &NPPRG) {
+ OS << "pac-ret-gadget<";
+ OS << "Ret:" << NPPRG.RetInst << ", ";
+ OS << "Overwriting:[";
+ for (auto Ref : NPPRG.OverwritingRetRegInst)
+ OS << Ref << " ";
+ OS << "]>";
+ return OS;
+}
+
+// The security property that is checked is:
+// When a register is used as the address to jump to in a return instruction,
+// that register must either:
+// (a) never be changed within this function, i.e. have the same value as when
+// the function started, or
+// (b) the last write to the register must be by an authentication instruction.
+
+// This property is checked by using data flow analysis to keep track of which
+// registers have been written (def-ed), since last authenticated. Those are
+// exactly the registers containing values that should not be trusted (as they
+// could have changed since the last time they were authenticated). For pac-ret,
+// any return instruction using such a register is a gadget to be reported. For
+// PAuthABI, any indirect control flow using such a register should be reported?
+
+// This security property is verified using a dataflow analysis.
+
+// Furthermore, when producing a diagnostic for a found non-pac-ret protected
+// return, the analysis also lists the last instructions that wrote to the
+// register used in the return instruction.
+// The total set of registers used in return instructions in a given function is
+// small. It almost always is just `X30`.
+// In order to reduce the memory consumption of storing this additional state
+// during the dataflow analysis, this is computed by running the dataflow
+// analysis twice:
+// 1. In the first run, the dataflow analysis only keeps track of the security
+// property: i.e. which registers have been overwritten since the last
+// time they've been authenticated.
+// 2. If the first run finds any return instructions using a register
+// last written by a non-authenticating instruction, the data flow
+// analysis will be run a second time. The first run will return which
+// registers are used in the gadgets to be reported. This information is
+// used in the second run to also track which instructions last wrote to
+// those registers.
+
+struct State {
+ /// A BitVector containing the registers that have been clobbered, and
+ /// not authenticated.
+ BitVector NonAutClobRegs;
+ /// A vector of sets, only used in the second data flow run.
+ /// Each element in the vector represents one register for which we
+ /// track the set of last instructions that wrote to this register.
+ /// For pac-ret analysis, the expectation is that almost all return
+ /// instructions only use register `X30`, and therefore, this vector
+ /// will have length 1 in the second run.
+ std::vector<SmallPtrSet<const MCInst *, 4>> LastInstWritingReg;
+ State() {}
+ State(uint16_t NumRegs, uint16_t NumRegsToTrack)
+ : NonAutClobRegs(NumRegs), LastInstWritingReg(NumRegsToTrack) {}
+ State &operator|=(const State &StateIn) {
+ NonAutClobRegs |= StateIn.NonAutClobRegs;
+ for (unsigned I = 0; I < LastInstWritingReg.size(); ++I)
+ for (const MCInst *J : StateIn.LastInstWritingReg[I])
+ LastInstWritingReg[I].insert(J);
+ return *this;
+ }
+ bool operator==(const State &RHS) const {
+ return NonAutClobRegs == RHS.NonAutClobRegs &&
+ LastInstWritingReg == RHS.LastInstWritingReg;
+ }
+ bool operator!=(const State &RHS) const { return !((*this) == RHS); }
+};
+
+static void printLastInsts(
+ raw_ostream &OS,
+ const std::vector<SmallPtrSet<const MCInst *, 4>> &LastInstWritingReg) {
+ OS << "Insts: ";
+ for (unsigned I = 0; I < LastInstWritingReg.size(); ++I) {
+ auto Set = LastInstWritingReg[I];
+ OS << "[" << I << "](";
+ for (const MCInst *MCInstP : Set)
+ OS << MCInstP << " ";
+ OS << ")";
+ }
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const State &S) {
+ OS << "pacret-state<";
+ OS << "NonAutClobRegs: " << S.NonAutClobRegs;
+ OS << ", ";
+ printLastInsts(OS, S.LastInstWritingReg);
+ OS << ">";
+ return OS;
+}
+
+class PacStatePrinter {
+public:
+ void print(raw_ostream &OS, const State &State) const;
+ explicit PacStatePrinter(const BinaryContext &BC) : BC(BC) {}
+
+private:
+ const BinaryContext &BC;
+};
+
+void PacStatePrinter::print(raw_ostream &OS, const State &S) const {
+ RegStatePrinter RegStatePrinter(BC);
+ OS << "pacret-state<";
+ OS << "NonAutClobRegs: ";
+ RegStatePrinter.print(OS, S.NonAutClobRegs);
+ OS << ", ";
+ printLastInsts(OS, S.LastInstWritingReg);
+ OS << ">";
+}
+
+class PacRetAnalysis
+ : public DataflowAnalysis<PacRetAnalysis, State, false /*Backward*/,
+ PacStatePrinter> {
+ using Parent =
+ DataflowAnalysis<PacRetAnalysis, State, false, PacStatePrinter>;
+ friend Parent;
+
+public:
+ PacRetAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId,
+ const std::vector<MCPhysReg> &RegsToTrackInstsFor)
+ : Parent(BF, AllocId), NumRegs(BF.getBinaryContext().MRI->getNumRegs()),
+ RegsToTrackInstsFor(RegsToTrackInstsFor),
+ TrackingLastInsts(RegsToTrackInstsFor.size() != 0),
+ Reg2StateIdx(RegsToTrackInstsFor.size() == 0
+ ? 0
+ : *std::max_element(RegsToTrackInstsFor.begin(),
+ RegsToTrackInstsFor.end()) +
+ 1,
+ -1) {
+ for (unsigned I = 0; I < RegsToTrackInstsFor.size(); ++I)
+ Reg2StateIdx[RegsToTrackInstsFor[I]] = I;
+ }
+ virtual ~PacRetAnalysis() {}
+
+ void run() { Parent::run(); }
+
+protected:
+ const uint16_t NumRegs;
+ /// RegsToTrackInstsFor is the set of registers for which the dataflow
+ /// analysis must compute the set of instructions that last wrote to them.
+ const std::vector<MCPhysReg> RegsToTrackInstsFor;
+ const bool TrackingLastInsts;
+ /// Reg2StateIdx maps Register to the index in the vector used in State to
+ /// track which instructions last wrote to this register.
+ std::vector<uint16_t> Reg2StateIdx;
+
+ bool trackReg(MCPhysReg Reg) {
+ for (auto R : RegsToTrackInstsFor)
+ if (R == Reg)
+ return true;
+ return false;
+ }
+ uint16_t reg2StateIdx(MCPhysReg Reg) const {
+ assert(Reg < Reg2StateIdx.size());
+ return Reg2StateIdx[Reg];
+ }
+
+ void preflight() {}
+
+ State getStartingStateAtBB(const BinaryBasicBlock &BB) {
+ return State(NumRegs, RegsToTrackInstsFor.size());
+ }
+
+ State getStartingStateAtPoint(const MCInst &Point) {
+ return State(NumRegs, RegsToTrackInstsFor.size());
+ }
+
+ void doConfluence(State &StateOut, const State &StateIn) {
+ PacStatePrinter P(BC);
+ LLVM_DEBUG({
+ dbgs() << " PacRetAnalysis::Confluence(\n";
+ dbgs() << " State 1: ";
+ P.print(dbgs(), StateOut);
+ dbgs() << "\n";
+ dbgs() << " State 2: ";
+ P.print(dbgs(), StateIn);
+ dbgs() << ")\n";
+ });
+
+ StateOut |= StateIn;
+
+ LLVM_DEBUG({
+ dbgs() << " merged state: ";
+ P.print(dbgs(), StateOut);
+ dbgs() << "\n";
+ });
+ }
+
+ State computeNext(const MCInst &Point, const State &Cur) {
+ PacStatePrinter P(BC);
+ LLVM_DEBUG({
+ dbgs() << " PacRetAnalysis::ComputeNext(";
+ BC.InstPrinter->printInst(&const_cast<MCInst &>(Point), 0, "", *BC.STI,
+ dbgs());
+ dbgs() << ", ";
+ P.print(dbgs(), Cur);
+ dbgs() << ")\n";
+ });
+
+ State Next = Cur;
+ BitVector Written = BitVector(NumRegs, false);
+ BC.MIB->getWrittenRegs(Point, Written);
+ Next.NonAutClobRegs |= Written;
+ // Keep track of this instruction if it writes to any of the registers we
+ // need to track that for:
+ if (TrackingLastInsts) {
+ for (auto WrittenReg : Written.set_bits()) {
+ if (trackReg(WrittenReg)) {
+ Next.LastInstWritingReg[reg2StateIdx(WrittenReg)] = {};
+ Next.LastInstWritingReg[reg2StateIdx(WrittenReg)].insert(&Point);
+ }
+ }
+ }
+
+ MCPhysReg AutReg = BC.MIB->getAuthenticatedReg(Point);
+ if (AutReg != BC.MIB->getNoRegister()) {
+ Next.NonAutClobRegs.reset(
+ BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
+ if (TrackingLastInsts && trackReg(AutReg))
+ Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
+ }
+
+ LLVM_DEBUG({
+ dbgs() << " .. result: (";
+ P.print(dbgs(), Next);
+ dbgs() << ")\n";
+ });
+
+ return Next;
+ }
+
+ StringRef getAnnotationName() const { return StringRef("PacRetAnalysis"); }
+
+public:
+ std::vector<MCInstReference>
+ getLastClobberingInsts(const MCInst &Ret, BinaryFunction &BF,
+ const BitVector &DirtyRawRegs) const {
+ if (!TrackingLastInsts)
+ return {};
+ if (auto MaybeState = getStateAt(Ret)) {
+ const State &S = *MaybeState;
+ // Due to aliasing registers, multiple registers may have
+ // been tracked.
+ std::set<const MCInst *> LastWritingInsts;
+ for (MCPhysReg TrackedReg : DirtyRawRegs.set_bits()) {
+ for (const MCInst *LastInstWriting :
+ S.LastInstWritingReg[reg2StateIdx(TrackedReg)])
+ LastWritingInsts.insert(LastInstWriting);
+ }
+ std::vector<MCInstReference> Result;
+ for (const MCInst *LastInstWriting : LastWritingInsts) {
+ MCInstInBBReference Ref = MCInstInBBReference::get(LastInstWriting, BF);
+ assert(Ref.BB != nullptr && "Expected Inst to be found");
+ Result.push_back(MCInstReference(Ref));
+ }
+ return Result;
+ }
+ llvm_unreachable("Expected State to be present");
+ }
+};
+
+SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
+ PacRetAnalysis &PRA, BinaryFunction &BF,
+ MCPlusBuilder::AllocatorIdTy AllocatorId) {
+ PRA.run();
+ LLVM_DEBUG({
+ dbgs() << " After PacRetAnalysis:\n";
+ BF.dump();
+ });
+ // Now scan the CFG for non-authenticating return instructions that use an
+ // overwritten, non-authenticated register as return address.
+ SmallSet<MCPhysReg, 1> RetRegsWithGadgets;
+ BinaryContext &BC = BF.getBinaryContext();
+ for (BinaryBasicBlock &BB : BF) {
+ for (int64_t I = BB.size() - 1; I >= 0; --I) {
+ MCInst &Inst = BB.getInstructionAtIndex(I);
+ if (BC.MIB->isReturn(Inst)) {
+ MCPhysReg RetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+ LLVM_DEBUG({
+ dbgs() << " Found RET inst: ";
+ BC.printInstruction(dbgs(), Inst);
+ dbgs() << " RetReg: " << BC.MRI->getName(RetReg)
+ << "; authenticatesReg: "
+ << BC.MIB->isAuthenticationOfReg(Inst, RetReg) << "\n";
+ });
+ if (BC.MIB->isAuthenticationOfReg(Inst, RetReg))
+ break;
+ BitVector DirtyRawRegs = PRA.getStateAt(Inst)->NonAutClobRegs;
+ LLVM_DEBUG({
+ dbgs() << " DirtyRawRegs at Ret: ";
+ RegStatePrinter RSP(BC);
+ RSP.print(dbgs(), DirtyRawRegs);
+ dbgs() << "\n";
+ });
+ DirtyRawRegs &= BC.MIB->getAliases(RetReg, /*OnlySmaller=*/true);
+ LLVM_DEBUG({
+ dbgs() << " Intersection with RetReg: ";
+ RegStatePrinter RSP(BC);
+ RSP.print(dbgs(), DirtyRawRegs);
+ dbgs() << "\n";
+ });
+ if (DirtyRawRegs.any()) {
+ // This return instruction needs to be reported
+ // First remove the annotation that the first, fast run of
+ // the dataflow analysis may have added.
+ if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex))
+ BC.MIB->removeAnnotation(Inst, GadgetAnnotationIndex);
+ BC.MIB->addAnnotation(
+ Inst, GadgetAnnotationIndex,
+ NonPacProtectedRetGadget(
+ MCInstInBBReference(&BB, I),
+ PRA.getLastClobberingInsts(Inst, BF, DirtyRawRegs)),
+ AllocatorId);
+ LLVM_DEBUG({
+ dbgs() << " Added gadget info annotation: ";
+ BC.printInstruction(dbgs(), Inst, 0, &BF);
+ dbgs() << "\n";
+ });
+ for (MCPhysReg RetRegWithGadget : DirtyRawRegs.set_bits())
+ RetRegsWithGadgets.insert(RetRegWithGadget);
+ }
+ }
+ }
+ }
+ return RetRegsWithGadgets;
+}
+
+void NonPacProtectedRetAnalysis::runOnFunction(
+ BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) {
+ LLVM_DEBUG({
+ dbgs() << "Analyzing in function " << BF.getPrintName() << ", AllocatorId "
+ << AllocatorId << "\n";
+ });
+ LLVM_DEBUG({ BF.dump(); });
+
+ if (BF.hasCFG()) {
+ PacRetAnalysis PRA(BF, AllocatorId, {});
+ SmallSet<MCPhysReg, 1> RetRegsWithGadgets =
+ computeDfState(PRA, BF, AllocatorId);
+ if (!RetRegsWithGadgets.empty()) {
+ // Redo the analysis, but now also track which instructions last wrote
+ // to any of the registers in RetRegsWithGadgets, so that better
+ // diagnostics can be produced.
+ std::vector<MCPhysReg> RegsToTrack;
+ for (MCPhysReg R : RetRegsWithGadgets)
+ RegsToTrack.push_back(R);
+ PacRetAnalysis PRWIA(BF, AllocatorId, RegsToTrack);
+ // The returned register set is not needed after this second run.
+ computeDfState(PRWIA, BF, AllocatorId);
+ }
+ }
+}
+
+void printBB(const BinaryContext &BC, const BinaryBasicBlock *BB,
+ size_t StartIndex = 0, size_t EndIndex = -1) {
+ if (EndIndex == (size_t)-1)
+ EndIndex = BB->size() - 1;
+ const BinaryFunction *BF = BB->getFunction();
+ for (unsigned I = StartIndex; I <= EndIndex; ++I) {
+ uint64_t Address = BB->getOffset() + BF->getAddress() + 4 * I;
+ const MCInst &Inst = BB->getInstructionAtIndex(I);
+ if (BC.MIB->isCFI(Inst))
+ continue;
+ BC.printInstruction(outs(), Inst, Address, BF);
+ }
+}
+
+void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
+ const MCInstReference OverwInst,
+ const MCInstReference RetInst) {
+ BinaryBasicBlock *BB = RetInst.getBasicBlock();
+ assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+ assert(RetInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+ MCInstInBBReference OverwInstBB = OverwInst.U.BBRef;
+ if (BB == OverwInstBB.BB) {
+ // overwriting inst and ret instruction are in the same basic block.
+ assert(OverwInstBB.BBIndex < RetInst.U.BBRef.BBIndex);
+ outs() << " This happens in the following basic block:\n";
+ printBB(BC, BB);
+ }
+}
+
+void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
+ unsigned int GadgetAnnotationIndex) {
+ auto NPPRG = BC.MIB->getAnnotationAs<NonPacProtectedRetGadget>(
+ Inst, GadgetAnnotationIndex);
+ MCInstReference RetInst = NPPRG.RetInst;
+ BinaryFunction *BF = RetInst.getFunction();
+ BinaryBasicBlock *BB = RetInst.getBasicBlock();
+
+ outs() << "\nGS-PACRET: " << "non-protected ret found in function "
+ << BF->getPrintName();
+ if (BB)
+ outs() << ", basic block " << BB->getName();
+ outs() << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
+ outs() << " The return instruction is ";
+ BC.printInstruction(outs(), RetInst, RetInst.getAddress(), BF);
+ outs() << " The " << NPPRG.OverwritingRetRegInst.size()
+ << " instructions that write to the return register after any "
+ "authentication are:\n";
+ // Sort by address to ensure output is deterministic.
+ std::sort(std::begin(NPPRG.OverwritingRetRegInst),
+ std::end(NPPRG.OverwritingRetRegInst),
+ [](const MCInstReference &A, const MCInstReference &B) {
+ return A.getAddress() < B.getAddress();
+ });
+ for (unsigned I = 0; I < NPPRG.OverwritingRetRegInst.size(); ++I) {
+ MCInstReference InstRef = NPPRG.OverwritingRetRegInst[I];
+ outs() << " " << (I + 1) << ". ";
+ BC.printInstruction(outs(), InstRef, InstRef.getAddress(), BF);
+ }
+ LLVM_DEBUG({
+ dbgs() << " .. OverWritingRetRegInst:\n";
+ for (MCInstReference Ref : NPPRG.OverwritingRetRegInst) {
+ dbgs() << " " << Ref << "\n";
+ }
+ });
+ if (NPPRG.OverwritingRetRegInst.size() == 1) {
+ const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
+ assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+ reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
+ }
+}
+
+Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
+ GadgetAnnotationIndex = BC.MIB->getOrCreateAnnotationIndex("pacret-gadget");
+
+ ParallelUtilities::WorkFuncWithAllocTy WorkFun =
+ [&](BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) {
+ runOnFunction(BF, AllocatorId);
+ };
+
+ ParallelUtilities::PredicateTy SkipFunc = [&](const BinaryFunction &BF) {
+ return false; // BF.shouldPreserveNops();
+ };
+
+ ParallelUtilities::runOnEachFunctionWithUniqueAllocId(
+ BC, ParallelUtilities::SchedulingPolicy::SP_INST_LINEAR, WorkFun,
+ SkipFunc, "NonPacProtectedRetAnalysis");
+
+ for (BinaryFunction *BF : BC.getAllBinaryFunctions())
+ if (BF->hasCFG()) {
+ for (BinaryBasicBlock &BB : *BF) {
+ for (int64_t I = BB.size() - 1; I >= 0; --I) {
+ MCInst &Inst = BB.getInstructionAtIndex(I);
+ if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex)) {
+ reportFoundGadget(BC, Inst, GadgetAnnotationIndex);
+ }
+ }
+ }
+ }
+ return Error::success();
+}
+
+} // namespace bolt
+} // namespace llvm
\ No newline at end of file
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 4329235d470497..0a45c35f02da32 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -20,6 +20,7 @@
#include "bolt/Passes/BinaryPasses.h"
#include "bolt/Passes/CacheMetrics.h"
#include "bolt/Passes/IdenticalCodeFolding.h"
+#include "bolt/Passes/NonPacProtectedRetAnalysis.h"
#include "bolt/Passes/ReorderFunctions.h"
#include "bolt/Profile/BoltAddressTranslation.h"
#include "bolt/Profile/DataAggregator.h"
@@ -245,6 +246,12 @@ static cl::opt<bool> WriteBoltInfoSection(
"bolt-info", cl::desc("write bolt info section in the output binary"),
cl::init(true), cl::Hidden, cl::cat(BoltOutputCategory));
+cl::list<GadgetScannerKind> GadgetScannersToRun(
+ "scanners", cl::desc("which gadget scanners to run"),
+ cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
+ clEnumValN(GS_ALL, "all", "all")),
+ cl::ZeroOrMore, cl::CommaSeparated, cl::cat(BinaryAnalysisCategory));
+
} // namespace opts
// FIXME: implement a better way to mark sections for replacement.
@@ -3490,7 +3497,23 @@ void RewriteInstance::runOptimizationPasses() {
BC->logBOLTErrorsAndQuitOnFatal(BinaryFunctionPassManager::runAllPasses(*BC));
}
-void RewriteInstance::runBinaryAnalyses() {}
+void RewriteInstance::runBinaryAnalyses() {
+ NamedRegionTimer T("runBinaryAnalyses", "run binary analysis passes",
+ TimerGroupName, TimerGroupDesc, opts::TimeRewrite);
+ BinaryFunctionPassManager Manager(*BC);
+ // FIXME: add a pass that warns about functions that do not have a CFG,
+ // since the analysis of those functions is likely to be less accurate.
+ using GSK = opts::GadgetScannerKind;
+ // If no command line option was given, act as if "all" was specified.
+ if (opts::GadgetScannersToRun.size() == 0)
+ opts::GadgetScannersToRun.addValue(GSK::GS_ALL);
+ for (GSK ScannerToRun : opts::GadgetScannersToRun) {
+ if (ScannerToRun == GSK::GS_PACRET || ScannerToRun == GSK::GS_ALL)
+ Manager.registerPass(std::make_unique<NonPacProtectedRetAnalysis>());
+ }
+
+ BC->logBOLTErrorsAndQuitOnFatal(Manager.runPasses());
+}
void RewriteInstance::preregisterSections() {
// Preregister sections before emission to set their order in the output.
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 679c9774c767f7..1840750822aa3f 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -148,6 +148,69 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
return false;
}
+ MCPhysReg getAuthenticatedReg(const MCInst &Inst) const override {
+ switch (Inst.getOpcode()) {
+ case AArch64::AUTIAZ:
+ case AArch64::AUTIBZ:
+ case AArch64::AUTIASP:
+ case AArch64::AUTIBSP:
+ case AArch64::RETAA:
+ case AArch64::RETAB:
+ return AArch64::LR;
+ case AArch64::AUTIA1716:
+ case AArch64::AUTIB1716:
+ return AArch64::X17;
+ case AArch64::ERETAA:
+ case AArch64::ERETAB:
+ return AArch64::LR;
+
+ case AArch64::AUTIA:
+ case AArch64::AUTIB:
+ case AArch64::AUTDA:
+ case AArch64::AUTDB:
+ case AArch64::AUTIZA:
+ case AArch64::AUTIZB:
+ case AArch64::AUTDZA:
+ case AArch64::AUTDZB:
+ assert(Inst.getOperand(0).isReg());
+ return Inst.getOperand(0).getReg();
+
+ default:
+ return getNoRegister();
+ }
+ }
+
+ bool isAuthenticationOfReg(const MCInst &Inst,
+ const unsigned RegAuthenticated) const override {
+ if (RegAuthenticated == getNoRegister())
+ return false;
+ return getAuthenticatedReg(Inst) == RegAuthenticated;
+ }
+
+ llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
+ assert(isReturn(Inst));
+ switch (Inst.getOpcode()) {
+ case AArch64::RET:
+ // There should be one register that the return reads, and
+ // that's the one being used as the jump target?
+ for (unsigned OpIdx = 0, EndIdx = Inst.getNumOperands(); OpIdx < EndIdx;
+ ++OpIdx) {
+ const MCOperand &MO = Inst.getOperand(OpIdx);
+ if (MO.isReg())
+ return MO.getReg();
+ }
+ return getNoRegister();
+ case AArch64::RETAA:
+ case AArch64::RETAB:
+ case AArch64::ERETAA:
+ case AArch64::ERETAB:
+ case AArch64::ERET:
+ return AArch64::LR;
+ default:
+ llvm_unreachable("Unhandled return instruction");
+ }
+ }
+
bool isADRP(const MCInst &Inst) const override {
return Inst.getOpcode() == AArch64::ADRP;
}
diff --git a/bolt/test/binary-analysis/AArch64/cmdline-args.test b/bolt/test/binary-analysis/AArch64/cmdline-args.test
index e414818644a3b0..1204d5b1289af4 100644
--- a/bolt/test/binary-analysis/AArch64/cmdline-args.test
+++ b/bolt/test/binary-analysis/AArch64/cmdline-args.test
@@ -13,7 +13,7 @@ NONEXISTINGFILEARG: llvm-bolt-binary-analysis: 'non-existing-file': No suc
RUN: not llvm-bolt-binary-analysis %p/Inputs/dummy.txt 2>&1 | FileCheck -check-prefix=NOELFFILEARG %s
NOELFFILEARG: llvm-bolt-binary-analysis: '{{.*}}/Inputs/dummy.txt': The file was not recognized as a valid object file.
-RUN: %clang %cflags %p/../../Inputs/asm_foo.s %p/../../Inputs/asm_main.c -o %t.exe
+RUN: %clang %cflags -Wl,--emit-relocs %p/../../Inputs/asm_foo.s %p/../../Inputs/asm_main.c -o %t.exe
RUN: llvm-bolt-binary-analysis %t.exe 2>&1 | FileCheck -check-prefix=VALIDELFFILEARG --allow-empty %s
# Check that there are no BOLT-WARNING or BOLT-ERROR output lines
VALIDELFFILEARG: BOLT-INFO:
@@ -30,4 +30,10 @@ HELP-NEXT: USAGE: llvm-bolt-binary-analysis [options] <executable>
HELP-EMPTY:
HELP-NEXT: OPTIONS:
HELP-EMPTY:
+HELP-NEXT: BinaryAnalysis options:
+HELP-EMPTY:
+HELP-NEXT: --scanners=<value> - which gadget scanners to run
+HELP-NEXT: =pacret - pac-ret
+HELP-NEXT: =all - all
+HELP-EMPTY:
HELP-NEXT: Generic Options:
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
new file mode 100644
index 00000000000000..8731bb46362867
--- /dev/null
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -0,0 +1,681 @@
+// RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck -check-prefix=CHECK --allow-empty %s
+
+ .text
+
+ .globl f1
+ .type f1,@function
+f1:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ // autiasp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f1, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f1, .-f1
+
+
+ .globl f_intermediate_overwrite1
+ .type f_intermediate_overwrite1,@function
+f_intermediate_overwrite1:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ autiasp
+ ldp x29, x30, [sp], #16
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite1, basic block .LBB
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_intermediate_overwrite1, .-f_intermediate_overwrite1
+
+ .globl f_intermediate_overwrite2
+ .type f_intermediate_overwrite2,@function
+f_intermediate_overwrite2:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiasp
+ mov x30, x0
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite2, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: mov x30, x0
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x30, x0
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_intermediate_overwrite2, .-f_intermediate_overwrite2
+
+ .globl f_intermediate_read
+ .type f_intermediate_read,@function
+f_intermediate_read:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiasp
+ mov x0, x30
+// CHECK-NOT: function f_intermediate_read
+ ret
+ .size f_intermediate_read, .-f_intermediate_read
+
+ .globl f_intermediate_overwrite3
+ .type f_intermediate_overwrite3,@function
+f_intermediate_overwrite3:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiasp
+ mov w30, w0
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite3, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: mov w30, w0
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}: mov w30, w0
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_intermediate_overwrite3, .-f_intermediate_overwrite3
+
+ .globl f_nonx30_ret
+ .type f_nonx30_ret,@function
+f_nonx30_ret:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ mov x16, x30
+ autiasp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_nonx30_ret, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret x16
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: mov x16, x30
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x16, x30
+// CHECK-NEXT: {{[0-9a-f]+}}: autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret x16
+ ret x16
+ .size f_nonx30_ret, .-f_nonx30_ret
+
+
+/// Now do a basic sanity check on every different Authentication instruction:
+
+ .globl f_autiasp
+ .type f_autiasp,@function
+f_autiasp:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiasp
+// CHECK-NOT: function f_autiasp
+ ret
+ .size f_autiasp, .-f_autiasp
+
+ .globl f_autibsp
+ .type f_autibsp,@function
+f_autibsp:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autibsp
+// CHECK-NOT: function f_autibsp
+ ret
+ .size f_autibsp, .-f_autibsp
+
+ .globl f_autiaz
+ .type f_autiaz,@function
+f_autiaz:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiaz
+// CHECK-NOT: function f_autiaz
+ ret
+ .size f_autiaz, .-f_autiaz
+
+ .globl f_autibz
+ .type f_autibz,@function
+f_autibz:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autibz
+// CHECK-NOT: function f_autibz
+ ret
+ .size f_autibz, .-f_autibz
+
+ .globl f_autia1716
+ .type f_autia1716,@function
+f_autia1716:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autia1716
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autia1716, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autia1716
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autia1716, .-f_autia1716
+
+ .globl f_autib1716
+ .type f_autib1716,@function
+f_autib1716:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autib1716
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autib1716, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autib1716
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autib1716, .-f_autib1716
+
+ .globl f_autiax12
+ .type f_autiax12,@function
+f_autiax12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autia x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autiax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autia x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autiax12, .-f_autiax12
+
+ .globl f_autibx12
+ .type f_autibx12,@function
+f_autibx12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autib x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autibx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autib x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autibx12, .-f_autibx12
+
+ .globl f_autiax30
+ .type f_autiax30,@function
+f_autiax30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autia x30, sp
+// CHECK-NOT: function f_autiax30
+ ret
+ .size f_autiax30, .-f_autiax30
+
+ .globl f_autibx30
+ .type f_autibx30,@function
+f_autibx30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autib x30, sp
+// CHECK-NOT: function f_autibx30
+ ret
+ .size f_autibx30, .-f_autibx30
+
+
+ .globl f_autdax12
+ .type f_autdax12,@function
+f_autdax12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autda x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autda x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autdax12, .-f_autdax12
+
+ .globl f_autdbx12
+ .type f_autdbx12,@function
+f_autdbx12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdb x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autdb x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autdbx12, .-f_autdbx12
+
+ .globl f_autdax30
+ .type f_autdax30,@function
+f_autdax30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autda x30, sp
+// CHECK-NOT: function f_autdax30
+ ret
+ .size f_autdax30, .-f_autdax30
+
+ .globl f_autdbx30
+ .type f_autdbx30,@function
+f_autdbx30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdb x30, sp
+// CHECK-NOT: function f_autdbx30
+ ret
+ .size f_autdbx30, .-f_autdbx30
+
+
+ .globl f_autizax12
+ .type f_autizax12,@function
+f_autizax12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiza x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autizax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autiza x12
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autizax12, .-f_autizax12
+
+ .globl f_autizbx12
+ .type f_autizbx12,@function
+f_autizbx12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autizb x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autizbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autizb x12
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autizbx12, .-f_autizbx12
+
+ .globl f_autizax30
+ .type f_autizax30,@function
+f_autizax30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiza x30
+// CHECK-NOT: function f_autizax30
+ ret
+ .size f_autizax30, .-f_autizax30
+
+ .globl f_autizbx30
+ .type f_autizbx30,@function
+f_autizbx30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autizb x30
+// CHECK-NOT: function f_autizbx30
+ ret
+ .size f_autizbx30, .-f_autizbx30
+
+
+ .globl f_autdzax12
+ .type f_autdzax12,@function
+f_autdzax12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdza x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdzax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autdza x12
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autdzax12, .-f_autdzax12
+
+ .globl f_autdzbx12
+ .type f_autdzbx12,@function
+f_autdzbx12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdzb x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdzbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autdzb x12
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autdzbx12, .-f_autdzbx12
+
+ .globl f_autdzax30
+ .type f_autdzax30,@function
+f_autdzax30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdza x30
+// CHECK-NOT: function f_autdzax30
+ ret
+ .size f_autdzax30, .-f_autdzax30
+
+ .globl f_autdzbx30
+ .type f_autdzbx30,@function
+f_autdzbx30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdzb x30
+// CHECK-NOT: function f_autdzbx30
+ ret
+ .size f_autdzbx30, .-f_autdzbx30
+
+ .globl f_retaa
+ .type f_retaa,@function
+f_retaa:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-NOT: function f_retaa
+ retaa
+ .size f_retaa, .-f_retaa
+
+ .globl f_retab
+ .type f_retab,@function
+f_retab:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-NOT: function f_retab
+ retab
+ .size f_retab, .-f_retab
+
+ .globl f_eretaa
+ .type f_eretaa,@function
+f_eretaa:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-NOT: function f_eretaa
+ eretaa
+ .size f_eretaa, .-f_eretaa
+
+ .globl f_eretab
+ .type f_eretab,@function
+f_eretab:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-NOT: function f_eretab
+ eretab
+ .size f_eretab, .-f_eretab
+
+ .globl f_eret
+ .type f_eret,@function
+f_eret:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_eret, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: eret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: eret
+ eret
+ .size f_eret, .-f_eret
+
+ .globl f_movx30reg
+ .type f_movx30reg,@function
+f_movx30reg:
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_movx30reg, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: mov x30, x22
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x30, x22
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ mov x30, x22
+ ret
+ .size f_movx30reg, .-f_movx30reg
+
+// TODO: add test to see if registers clobbered by a call are picked up.
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
new file mode 100644
index 00000000000000..402bdf6591dcb1
--- /dev/null
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
@@ -0,0 +1,46 @@
+// RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck -check-prefix=CHECK --allow-empty %s
+
+
+// Verify that we can also detect gadgets across basic blocks
+
+ .globl f_crossbb1
+ .type f_crossbb1,@function
+f_crossbb1:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ ldp x29, x30, [sp], #16
+ cbnz x0, 1f
+ autiasp
+1:
+ ret
+ .size f_crossbb1, .-f_crossbb1
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_crossbb1, basic block .L{{[^,]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 2 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: 2. {{[0-9a-f]+}}: autiasp
+
+// A test that checks that dataflow state tracking seems to work when merging
+// BBs:
+ .globl f_mergebb1
+ .type f_mergebb1,@function
+f_mergebb1:
+ hint #25
+2:
+ stp x29, x30, [sp, #-16]!
+ ldp x29, x30, [sp], #16
+ sub x0, x0, #1
+ cbnz x0, 1f
+ autiasp
+ b 2b
+1:
+ ret
+ .size f_mergebb1, .-f_mergebb1
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_mergebb1, basic block .L{{[^,]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+
+// TODO: also verify that false negatives exist in cross-BB gadgets in functions
+// for which BOLT cannot reconstruct the call graph.
diff --git a/bolt/test/binary-analysis/AArch64/lit.local.cfg b/bolt/test/binary-analysis/AArch64/lit.local.cfg
index 6f247dd52e82f8..6ac7d3cc0fec71 100644
--- a/bolt/test/binary-analysis/AArch64/lit.local.cfg
+++ b/bolt/test/binary-analysis/AArch64/lit.local.cfg
@@ -1,7 +1,7 @@
if "AArch64" not in config.root.targets:
config.unsupported = True
-flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding -Wl,--emit-relocs"
+flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding"
config.substitutions.insert(0, ("%cflags", f"%cflags {flags}"))
config.substitutions.insert(0, ("%cxxflags", f"%cxxflags {flags}"))
>From 35e1c44f36869c0378e932dd4d9a2562b9404d0f Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 9 Jan 2025 15:53:57 +0000
Subject: [PATCH 2/2] format cleanup
---
bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h | 8 +++-----
bolt/lib/Rewrite/RewriteInstance.cpp | 11 ++++++-----
2 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index a9740896369fa2..21d089ed7b8506 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -31,7 +31,7 @@ struct MCInstInBBReference {
: BB(BB), BBIndex(BBIndex) {}
MCInstInBBReference() : BB(nullptr), BBIndex(0) {}
static MCInstInBBReference get(const MCInst *Inst, BinaryFunction &BF) {
- for (BinaryBasicBlock& BB : BF)
+ for (BinaryBasicBlock &BB : BF)
for (size_t I = 0; I < BB.size(); ++I)
if (Inst == &(BB.getInstructionAtIndex(I)))
return MCInstInBBReference(&BB, I);
@@ -78,9 +78,7 @@ struct MCInstInBFReference {
uint64_t getOffset() const { return Offset; }
- uint64_t getAddress() const {
- return BF->getAddress() + getOffset();
- }
+ uint64_t getAddress() const { return BF->getAddress() + getOffset(); }
};
raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
@@ -179,7 +177,7 @@ struct NonPacProtectedRetGadget {
}
NonPacProtectedRetGadget(
MCInstReference RetInst,
- const std::vector<MCInstReference>& OverwritingRetRegInst)
+ const std::vector<MCInstReference> &OverwritingRetRegInst)
: RetInst(RetInst), OverwritingRetRegInst(OverwritingRetRegInst) {}
};
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 0a45c35f02da32..135cfc63de4d37 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -246,11 +246,12 @@ static cl::opt<bool> WriteBoltInfoSection(
"bolt-info", cl::desc("write bolt info section in the output binary"),
cl::init(true), cl::Hidden, cl::cat(BoltOutputCategory));
-cl::list<GadgetScannerKind> GadgetScannersToRun(
- "scanners", cl::desc("which gadget scanners to run"),
- cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
- clEnumValN(GS_ALL, "all", "all")),
- cl::ZeroOrMore, cl::CommaSeparated, cl::cat(BinaryAnalysisCategory));
+cl::list<GadgetScannerKind>
+ GadgetScannersToRun("scanners", cl::desc("which gadget scanners to run"),
+ cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
+ clEnumValN(GS_ALL, "all", "all")),
+ cl::ZeroOrMore, cl::CommaSeparated,
+ cl::cat(BinaryAnalysisCategory));
} // namespace opts
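
For readers of the gs-pacret-multi-bb.s tests in this patch, the cross-basic-block behaviour they exercise can be pictured as a small forward dataflow: each block starts from the union (logical OR) of the states flowing in from its predecessors, and a return is flagged if x30 may have been written since the last authentication. The following is a minimal sketch under stated assumptions, not the code in `NonPacProtectedRetAnalysis.cpp`. The names `Kind`, `Block` and `hasUnprotectedRet` are hypothetical, instructions are reduced to four kinds, x30 is assumed trusted on function entry, and, unlike the real scanner, the sketch does not record which instructions performed the write.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction kinds; these are not BOLT's MCInst.
enum class Kind { Other, WritesLR, AuthLR, Ret };

struct Block {
  std::vector<Kind> Insts;        // instructions in program order
  std::vector<std::size_t> Succs; // indices of successor blocks
};

// Returns true if some path from block 0 reaches a Ret while the link
// register may have been written after the last authentication.
bool hasUnprotectedRet(const std::vector<Block> &CFG) {
  if (CFG.empty())
    return false;
  // In[B]: the link register may be unauthenticated on entry to block B.
  std::vector<bool> In(CFG.size(), false), Reached(CFG.size(), false);
  std::vector<std::size_t> Worklist{0};
  Reached[0] = true;
  bool Found = false;
  while (!Worklist.empty()) {
    std::size_t B = Worklist.back();
    Worklist.pop_back();
    bool Dirty = In[B];
    for (Kind K : CFG[B].Insts) {
      if (K == Kind::WritesLR)
        Dirty = true;
      else if (K == Kind::AuthLR)
        Dirty = false;
      else if (K == Kind::Ret && Dirty)
        Found = true;
    }
    for (std::size_t S : CFG[B].Succs) {
      // Merge step: a successor is dirty if any predecessor leaves it dirty.
      bool NewIn = In[S] || Dirty;
      if (!Reached[S] || NewIn != In[S]) {
        Reached[S] = true;
        In[S] = NewIn;
        Worklist.push_back(S);
      }
    }
  }
  return Found;
}
```

Modelling `f_crossbb1` as three blocks (`stp`/`ldp`/`cbnz` with successors to the `ret` block and the `autiasp` block, then `autiasp`, then `ret`) makes `hasUnprotectedRet` return true, because the branch-taken path skips the authentication. Since `In` only ever changes from false to true, the worklist loop terminates even for loops such as the one in `f_mergebb1`.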