[llvm] [BOLT][binary-analysis] Add initial pac-ret gadget scanner (PR #122304)
Kristof Beyls via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 15 05:42:43 PST 2025
https://github.com/kbeyls updated https://github.com/llvm/llvm-project/pull/122304
From 0b0137b8feab04db089297cb0efc288255f54298 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 9 Jan 2025 11:06:57 +0000
Subject: [PATCH 1/8] [BOLT][binary-analysis] Add initial pac-ret gadget
scanner
This adds an initial pac-ret gadget scanner to the
llvm-bolt-binary-analysis-tool.
The scanner is taken from the prototype that was published last year at
https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype,
and has been discussed in RFC
https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
and in the EuroLLVM 2024 keynote "Does LLVM implement security
hardenings correctly? A BOLT-based static analyzer to the rescue?"
[Video](https://youtu.be/Sn_Fxa0tdpY)
[Slides](https://llvm.org/devmtg/2024-04/slides/Keynote/Beyls_EuroLLVM2024_security_hardening_keynote.pdf)
In the spirit of incremental development, this PR aims to add a minimal
implementation that is "fully working" on its own, but has major
limitations, as described in the bolt/docs/BinaryAnalysis.md
documentation in this proposed commit. These and other limitations will
be fixed in follow-on PRs, mostly based on code already existing in the
prototype branch. I hope incrementally upstreaming will make it easier
to review the code.
That being said, this patch isn't really small.
I believe that this patch needs 2 "kinds" of reviewers:
1. People familiar with BOLT internals.
2. People familiar with the pac-ret hardening scheme and AArch64.
I expect people familiar with the pac-ret hardening scheme might need to
focus on reviewing mostly what is in files
`NonPacProtectedRetAnalysis.cpp`, `AArch64MCPlusBuilder.cpp` and the
unit tests.
I expect people familiar with BOLT might mainly need to focus on the
other parts, and the high-level structure of
`NonPacProtectedRetAnalysis.cpp`.
The patch is not very polished currently, but I think it's in the right
shape to start getting it reviewed.
Note that I believe that this could also form the basis of a scanner to
analyze correct implementation of PAuthABI.
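
As a rough illustration for reviewers, the property the scanner checks can be
sketched on a single straight-line instruction sequence as shown below. This
is a simplified standalone model under stated assumptions, not the code in
this patch: `Kind`, `Inst` and `findUnprotectedRets` are hypothetical names,
registers are plain strings, and there is no CFG, so the confluence step of
the real dataflow analysis is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified instruction model (not BOLT's MCInst).
enum class Kind { Write, Authenticate, Return };

struct Inst {
  Kind K;
  std::string Reg; // register written, authenticated, or returned through
};

// Forward scan over one straight-line block. Returns the indices of the
// return instructions whose register was written by a non-authenticating
// instruction since it was last authenticated.
std::vector<std::size_t> findUnprotectedRets(const std::vector<Inst> &Block) {
  std::set<std::string> Clobbered; // written and not (re-)authenticated since
  std::vector<std::size_t> Gadgets;
  for (std::size_t I = 0; I < Block.size(); ++I) {
    const Inst &In = Block[I];
    assert(!In.Reg.empty() && "every modelled instruction names a register");
    switch (In.K) {
    case Kind::Write:
      Clobbered.insert(In.Reg);
      break;
    case Kind::Authenticate:
      // An authenticating instruction makes the register trusted again.
      Clobbered.erase(In.Reg);
      break;
    case Kind::Return:
      if (Clobbered.count(In.Reg))
        Gadgets.push_back(I);
      break;
    }
  }
  return Gadgets;
}
```

In this model, a write to `x30` followed directly by a return through `x30` is
reported, while the same sequence with an authentication of `x30` in between
is not. A return through a register that is never written is also accepted.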
---
bolt/docs/BinaryAnalysis.md | 163 ++++-
bolt/include/bolt/Core/MCPlusBuilder.h | 16 +
.../bolt/Passes/NonPacProtectedRetAnalysis.h | 209 ++++++
bolt/include/bolt/Utils/CommandLineOpts.h | 4 +
bolt/lib/Passes/CMakeLists.txt | 1 +
.../lib/Passes/NonPacProtectedRetAnalysis.cpp | 520 +++++++++++++
bolt/lib/Rewrite/RewriteInstance.cpp | 25 +-
.../Target/AArch64/AArch64MCPlusBuilder.cpp | 63 ++
.../binary-analysis/AArch64/cmdline-args.test | 8 +-
.../AArch64/gs-pacret-autiasp.s | 681 ++++++++++++++++++
.../AArch64/gs-pacret-multi-bb.s | 46 ++
.../binary-analysis/AArch64/lit.local.cfg | 2 +-
12 files changed, 1733 insertions(+), 5 deletions(-)
create mode 100644 bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
create mode 100644 bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
create mode 100644 bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
create mode 100644 bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
diff --git a/bolt/docs/BinaryAnalysis.md b/bolt/docs/BinaryAnalysis.md
index f91b77d046de8f..e247ce20b4625c 100644
--- a/bolt/docs/BinaryAnalysis.md
+++ b/bolt/docs/BinaryAnalysis.md
@@ -9,9 +9,168 @@ analyses implemented in the BOLT libraries.
## Which binary analyses are implemented?
-At the moment, no binary analyses are implemented.
+* [Security scanners](#security-scanners)
+ * [pac-ret analysis](#pac-ret-analysis)
-The goal is to make it easy using a plug-in framework to add your own analyses.
+### Security scanners
+
+For the past 25 years, a large number of exploits have been built and used in
+the wild to undermine computer security. The majority of these exploits abuse
+memory vulnerabilities in programs, see evidence from
+[Microsoft](https://youtu.be/PjbGojjnBZQ?si=oCHCa0SHgaSNr6Gr&t=836),
+[Chromium](https://www.chromium.org/Home/chromium-security/memory-safety/) and
+[Android](https://security.googleblog.com/2021/01/data-driven-security-hardening-in.html).
+
+It is not surprising therefore, that a large number of mitigations have been
+added to instruction sets and toolchains to make it harder to build an exploit
+using a memory vulnerability. Examples are: stack canaries, stack clash,
+pac-ret, shadow stacks, arm64e, and many more.
+
+These mitigations guarantee a so-called "security property" on the binaries they
+produce. For example, for stack canaries, the security property is roughly that
+a canary is located on the stack between the set of saved variables and set of
+local variables. For pac-ret, it is roughly that there are no writes to the
+register containing the return address after either an authenticating
+instruction or a Branch-and-link instruction.
+
+From time to time, however, a bug gets found in the implementation of such
+mitigations in toolchains. Also, code that is written in assembler by hand
+requires the developer to ensure these security properties by hand.
+
+In short, it is sometimes found that a few places in the binary code are not
+protected as expected given the requested mitigations. Attackers could make use
+of those places (sometimes called gadgets) to circumvent the protection that the
+mitigation should give.
+
+One of the reasons that such gadgets, or holes in the mitigation implementation,
+exist is that typically the amount of testing and verification for these
+security properties is pretty limited.
+
+In comparison, for testing functional correctness, or for testing performance,
+toolchain and software in general typically get tested with large test suites
+and benchmarks and testing and benchmarking plans. In contrast, this typically
+does not get done for testing the security properties of binary code.
+
+The security scanners implemented in `llvm-bolt-binary-analysis` aim to enable
+better testing of security hardening implementations that require compilers to
+somehow generate assembly code with specific security properties.
+
+#### pac-ret analysis
+
+`pac-ret` protection is a security hardening scheme implemented in compilers
+such as gcc and clang, using the command line option
+`-mbranch-protection=pac-ret`. This option is enabled by default on most widely
+used distributions.
+
+The hardening scheme mitigates
+[Return-Oriented Programming (ROP)](https://llsoftsec.github.io/llsoftsecbook/#return-oriented-programming)
+attacks by making sure that return addresses are only ever stored to memory with
+a cryptographic hash, called a
+["Pointer Authentication Code" (PAC)](https://llsoftsec.github.io/llsoftsecbook/#pointer-authentication),
+in the upper bits of the pointer. This makes it substantially harder for
+attackers to divert control flow by overwriting a return address with a
+different value.
+
+The hardening scheme relies on compilers producing different code sequences when
+processing return addresses, especially when these are stored to and retrieved
+from memory.
+
+The 'pac-ret' binary analysis can be invoked using the command line option
+`--scanners=pac-ret`. It makes `llvm-bolt-binary-analysis` scan through the
+provided binary and check every return instruction it finds.
+
+The security property that is checked is:
+
+When a register is used as the address to jump to in a return instruction,
+that register must either:
+
+1. never be changed within this function, i.e. have the same value as when the
+ function started, or
+2. the last write to the register must be by an authenticating instruction.
+
+##### Example 1
+
+For example, a typical non-pac-ret-protected function looks as follows:
+
+```
+stp x29, x30, [sp, #-0x10]!
+mov x29, sp
+bl g at PLT
+add x0, x0, #0x3
+ldp x29, x30, [sp], #0x10
+ret
+```
+
+The return instruction `ret` uses register `x30` as containing the address to
+return to. Register `x30` was last written by instruction `ldp`, which is not an
+authenticating instruction. `llvm-bolt-binary-analysis --scanners=pac-ret` will
+report this as follows:
+
+```
+GS-PACRET: non-protected ret found in function f1, basic block .LBB00, at address 10310
+ The return instruction is 00010310: ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.LBB00:6>, Overwriting:[MCInstBBRef<BB:.LBB00:5> ]>
+ The 1 instructions that write to the return register after any authentication are:
+ 1. 0001030c: ldp x29, x30, [sp], #0x10
+ This happens in the following basic block:
+ 000102fc: stp x29, x30, [sp, #-0x10]!
+ 00010300: mov x29, sp
+ 00010304: bl g at PLT
+ 00010308: add x0, x0, #0x3
+ 0001030c: ldp x29, x30, [sp], #0x10
+ 00010310: ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.LBB00:6>, Overwriting:[MCInstBBRef<BB:.LBB00:5> ]>
+```
+
+The exact format of how `llvm-bolt-binary-analysis` reports this is expected to
+evolve over time.
+
+##### Example 2: multiple "last-overwriting" instructions
+
+A simple example that shows how there can be a set of "last overwriting"
+instructions of a register is the following:
+
+```
+ paciasp
+ stp x29, x30, [sp, #-16]!
+ ldp x29, x30, [sp], #16
+ cbnz x0, 1f
+ autiasp
+1:
+ ret
+```
+
+This will produce the following diagnostic:
+
+```
+GS-PACRET: non-protected ret found in function f_crossbb1, basic block .Ltmp0, at address 102dc
+ The return instruction is 000102dc: ret # pacret-gadget: pac-ret-gadget<Ret:MCInstBBRef<BB:.Ltmp0:0>, Overwriting:[MCInstBBRef<BB:.LFT0:0> MCInstBBRef<BB:.LBB00:2> ]>
+ The 2 instructions that write to the return register after any authentication are:
+ 1. 000102d0: ldp x29, x30, [sp], #0x10
+ 2. 000102d8: autiasp
+```
+
+(Yes, this diagnostic could be improved because the second "overwriting"
+instruction, `autiasp`, is an authenticating instruction...)
+
+##### Known false positives or negatives
+
+The following are current known cases of false positives:
+
+1. Not handling "no-return" functions. See issue
+ [#115154](https://github.com/llvm/llvm-project/issues/115154) for details and
+ pointers to open PRs to fix this.
+
+The following are current known cases of false negatives:
+
+1. Not handling functions for which the CFG cannot be reconstructed by BOLT. The
+ plan is to implement support for this, picking up the implementation from the
+ [prototype branch](
+ https://github.com/llvm/llvm-project/compare/main...kbeyls:llvm-project:bolt-gadget-scanner-prototype).
+
+BOLT cannot currently handle functions with `cfi_negate_ra_state` correctly,
+i.e. any binaries built with `-mbranch-protection=pac-ret`. The scanner is meant
+to be used on specifically such binaries, so this is a major limitation! Work is
+going on in PR [#120064](https://github.com/llvm/llvm-project/pull/120064) to
+fix this.
## How to add your own binary analysis
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 3634fed9757ceb..9351107e66bdf6 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -550,6 +550,22 @@ class MCPlusBuilder {
return Analysis->isReturn(Inst);
}
+ virtual MCPhysReg getAuthenticatedReg(const MCInst &Inst) const {
+ llvm_unreachable("not implemented");
+ return getNoRegister();
+ }
+
+ virtual bool isAuthenticationOfReg(const MCInst &Inst,
+ const unsigned RegAuthenticated) const {
+ llvm_unreachable("not implemented");
+ return false;
+ }
+
+ virtual llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
+ llvm_unreachable("not implemented");
+ return getNoRegister();
+ }
+
virtual bool isTerminator(const MCInst &Inst) const;
virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
new file mode 100644
index 00000000000000..a9740896369fa2
--- /dev/null
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -0,0 +1,209 @@
+//===- bolt/Passes/NonPacProtectedRetAnalysis.h -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef BOLT_PASSES_NONPACPROTECTEDRETANALYSIS_H
+#define BOLT_PASSES_NONPACPROTECTEDRETANALYSIS_H
+
+#include "bolt/Core/BinaryContext.h"
+#include "bolt/Core/BinaryFunction.h"
+#include "bolt/Passes/BinaryPasses.h"
+#include "llvm/ADT/SmallSet.h"
+
+namespace llvm {
+namespace bolt {
+
+/// @brief MCInstReference represents a reference to an MCInst as stored either
+/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock
+/// (after a CFG is created). It aims to store the necessary information to be
+/// able to find the specific MCInst in either the BinaryFunction or
+/// BinaryBasicBlock data structures later, so that e.g. the InputAddress of
+/// the corresponding instruction can be computed.
+
+struct MCInstInBBReference {
+ BinaryBasicBlock *BB;
+ int64_t BBIndex;
+ MCInstInBBReference(BinaryBasicBlock *BB, int64_t BBIndex)
+ : BB(BB), BBIndex(BBIndex) {}
+ MCInstInBBReference() : BB(nullptr), BBIndex(0) {}
+ static MCInstInBBReference get(const MCInst *Inst, BinaryFunction &BF) {
+ for (BinaryBasicBlock& BB : BF)
+ for (size_t I = 0; I < BB.size(); ++I)
+ if (Inst == &(BB.getInstructionAtIndex(I)))
+ return MCInstInBBReference(&BB, I);
+ return {};
+ }
+ bool operator==(const MCInstInBBReference &RHS) const {
+ return BB == RHS.BB && BBIndex == RHS.BBIndex;
+ }
+ bool operator<(const MCInstInBBReference &RHS) const {
+ if (BB != RHS.BB)
+ return BB < RHS.BB;
+ return BBIndex < RHS.BBIndex;
+ }
+ operator MCInst &() const {
+ assert(BB != nullptr);
+ return BB->getInstructionAtIndex(BBIndex);
+ }
+ uint64_t getAddress() const {
+ // 4 bytes per instruction on AArch64;
+ return BB->getFunction()->getAddress() + BB->getOffset() + BBIndex * 4;
+ }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBBReference &);
+
+struct MCInstInBFReference {
+ BinaryFunction *BF;
+ uint32_t Offset;
+ MCInstInBFReference(BinaryFunction *BF, uint32_t Offset)
+ : BF(BF), Offset(Offset) {}
+ MCInstInBFReference() : BF(nullptr), Offset(0) {}
+ bool operator==(const MCInstInBFReference &RHS) const {
+ return BF == RHS.BF && Offset == RHS.Offset;
+ }
+ bool operator<(const MCInstInBFReference &RHS) const {
+ if (BF != RHS.BF)
+ return BF < RHS.BF;
+ return Offset < RHS.Offset;
+ }
+ operator MCInst &() const {
+ assert(BF != nullptr);
+ return *(BF->getInstructionAtOffset(Offset));
+ }
+
+ uint64_t getOffset() const { return Offset; }
+
+ uint64_t getAddress() const {
+ return BF->getAddress() + getOffset();
+ }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
+
+struct MCInstReference {
+ enum StoredIn { _BinaryFunction, _BinaryBasicBlock };
+ StoredIn CurrentLocation;
+ union U {
+ MCInstInBBReference BBRef;
+ MCInstInBFReference BFRef;
+ U(MCInstInBBReference BBRef) : BBRef(BBRef) {}
+ U(MCInstInBFReference BFRef) : BFRef(BFRef) {}
+ } U;
+ MCInstReference(MCInstInBBReference BBRef)
+ : CurrentLocation(_BinaryBasicBlock), U(BBRef) {}
+ MCInstReference(MCInstInBFReference BFRef)
+ : CurrentLocation(_BinaryFunction), U(BFRef) {}
+ MCInstReference(BinaryBasicBlock *BB, int64_t BBIndex)
+ : MCInstReference(MCInstInBBReference(BB, BBIndex)) {}
+ MCInstReference(BinaryFunction *BF, uint32_t Offset)
+ : MCInstReference(MCInstInBFReference(BF, Offset)) {}
+
+ bool operator<(const MCInstReference &RHS) const {
+ if (CurrentLocation != RHS.CurrentLocation)
+ return CurrentLocation < RHS.CurrentLocation;
+ switch (CurrentLocation) {
+ case _BinaryBasicBlock:
+ return U.BBRef < RHS.U.BBRef;
+ case _BinaryFunction:
+ return U.BFRef < RHS.U.BFRef;
+ }
+ llvm_unreachable("");
+ }
+
+ bool operator==(const MCInstReference &RHS) const {
+ if (CurrentLocation != RHS.CurrentLocation)
+ return false;
+ switch (CurrentLocation) {
+ case _BinaryBasicBlock:
+ return U.BBRef == RHS.U.BBRef;
+ case _BinaryFunction:
+ return U.BFRef == RHS.U.BFRef;
+ }
+ llvm_unreachable("");
+ }
+
+ operator MCInst &() const {
+ switch (CurrentLocation) {
+ case _BinaryBasicBlock:
+ return U.BBRef;
+ case _BinaryFunction:
+ return U.BFRef;
+ }
+ llvm_unreachable("");
+ }
+
+ uint64_t getAddress() const {
+ switch (CurrentLocation) {
+ case _BinaryBasicBlock:
+ return U.BBRef.getAddress();
+ case _BinaryFunction:
+ return U.BFRef.getAddress();
+ }
+ llvm_unreachable("");
+ }
+
+ BinaryFunction *getFunction() const {
+ switch (CurrentLocation) {
+ case _BinaryFunction:
+ return U.BFRef.BF;
+ case _BinaryBasicBlock:
+ return U.BBRef.BB->getFunction();
+ }
+ llvm_unreachable("");
+ }
+
+ BinaryBasicBlock *getBasicBlock() const {
+ switch (CurrentLocation) {
+ case _BinaryFunction:
+ return nullptr;
+ case _BinaryBasicBlock:
+ return U.BBRef.BB;
+ }
+ llvm_unreachable("");
+ }
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &);
+
+struct NonPacProtectedRetGadget {
+ MCInstReference RetInst;
+ std::vector<MCInstReference> OverwritingRetRegInst;
+ bool operator==(const NonPacProtectedRetGadget &RHS) const {
+ return RetInst == RHS.RetInst &&
+ OverwritingRetRegInst == RHS.OverwritingRetRegInst;
+ }
+ NonPacProtectedRetGadget(
+ MCInstReference RetInst,
+ const std::vector<MCInstReference>& OverwritingRetRegInst)
+ : RetInst(RetInst), OverwritingRetRegInst(OverwritingRetRegInst) {}
+};
+
+raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetGadget &NPPRG);
+class PacRetAnalysis;
+
+class NonPacProtectedRetAnalysis : public BinaryFunctionPass {
+ void runOnFunction(BinaryFunction &Function,
+ MCPlusBuilder::AllocatorIdTy AllocatorId);
+ SmallSet<MCPhysReg, 1>
+ computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF,
+ MCPlusBuilder::AllocatorIdTy AllocatorId);
+ unsigned GadgetAnnotationIndex;
+
+public:
+ explicit NonPacProtectedRetAnalysis() : BinaryFunctionPass(false) {}
+
+ const char *getName() const override { return "non-pac-protected-rets"; }
+
+ /// Pass entry point
+ Error runOnFunctions(BinaryContext &BC) override;
+};
+
+} // namespace bolt
+} // namespace llvm
+
+#endif
\ No newline at end of file
diff --git a/bolt/include/bolt/Utils/CommandLineOpts.h b/bolt/include/bolt/Utils/CommandLineOpts.h
index 111eb650c37465..fefd7969ef6f4f 100644
--- a/bolt/include/bolt/Utils/CommandLineOpts.h
+++ b/bolt/include/bolt/Utils/CommandLineOpts.h
@@ -80,6 +80,10 @@ extern llvm::cl::opt<unsigned> Verbosity;
/// Return true if we should process all functions in the binary.
bool processAllFunctions();
+enum GadgetScannerKind { GS_PACRET, GS_ALL };
+
+extern llvm::cl::list<GadgetScannerKind> GadgetScannersToRun;
+
} // namespace opts
namespace llvm {
diff --git a/bolt/lib/Passes/CMakeLists.txt b/bolt/lib/Passes/CMakeLists.txt
index 1c1273b3d2420d..b3fbb0ba0108f4 100644
--- a/bolt/lib/Passes/CMakeLists.txt
+++ b/bolt/lib/Passes/CMakeLists.txt
@@ -23,6 +23,7 @@ add_llvm_library(LLVMBOLTPasses
LoopInversionPass.cpp
LivenessAnalysis.cpp
MCF.cpp
+ NonPacProtectedRetAnalysis.cpp
PatchEntries.cpp
PettisAndHansen.cpp
PLTCall.cpp
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
new file mode 100644
index 00000000000000..d6bc902f663fc9
--- /dev/null
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -0,0 +1,520 @@
+//===- bolt/Passes/NonPacProtectedRetAnalysis.cpp -------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements a pass that looks for any AArch64 return instructions
+// that may not be protected by PAuth authentication instructions when needed.
+//
+// When needed = the register used to return (almost always X30), is potentially
+// written to between the AUThentication instruction and the RETurn instruction.
+//
+//===----------------------------------------------------------------------===//
+
+#include "bolt/Passes/NonPacProtectedRetAnalysis.h"
+#include "bolt/Core/ParallelUtilities.h"
+#include "bolt/Passes/DataflowAnalysis.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/MC/MCInst.h"
+#include "llvm/Support/Format.h"
+
+#define DEBUG_TYPE "bolt-nonpacprotectedret"
+
+namespace llvm {
+namespace bolt {
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBBReference &Ref) {
+ OS << "MCInstBBRef<";
+ if (Ref.BB == nullptr)
+ OS << "BB:(null)";
+ else
+ OS << "BB:" << Ref.BB->getName() << ":" << Ref.BBIndex;
+ OS << ">";
+ return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &Ref) {
+ OS << "MCInstBFRef<";
+ if (Ref.BF == nullptr)
+ OS << "BF:(null)";
+ else
+ OS << "BF:" << Ref.BF->getPrintName() << ":" << Ref.getOffset();
+ OS << ">";
+ return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
+ switch (Ref.CurrentLocation) {
+ case MCInstReference::_BinaryBasicBlock:
+ OS << Ref.U.BBRef;
+ return OS;
+ case MCInstReference::_BinaryFunction:
+ OS << Ref.U.BFRef;
+ return OS;
+ }
+ llvm_unreachable("");
+}
+
+raw_ostream &operator<<(raw_ostream &OS,
+ const NonPacProtectedRetGadget &NPPRG) {
+ OS << "pac-ret-gadget<";
+ OS << "Ret:" << NPPRG.RetInst << ", ";
+ OS << "Overwriting:[";
+ for (auto Ref : NPPRG.OverwritingRetRegInst)
+ OS << Ref << " ";
+ OS << "]>";
+ return OS;
+}
+
+// The security property that is checked is:
+// When a register is used as the address to jump to in a return instruction,
+// that register must either:
+// (a) never be changed within this function, i.e. have the same value as when
+// the function started, or
+// (b) the last write to the register must be by an authentication instruction.
+
+// This property is checked by using data flow analysis to keep track of which
+// registers have been written (def-ed), since last authenticated. Those are
+// exactly the registers containing values that should not be trusted (as they
+// could have changed since the last time they were authenticated). For pac-ret,
+// any return instruction using such a register is a gadget to be reported. For
+// PAuthABI, any indirect control flow using such a register should be reported?
+
+// This security property is verified using a dataflow analysis.
+
+// Furthermore, when producing a diagnostic for a found non-pac-ret protected
+// return, the analysis also lists the last instructions that wrote to the
+// register used in the return instruction.
+// The total set of registers used in return instructions in a given function is
+// small. It almost always is just `X30`.
+// In order to reduce the memory consumption of storing this additional state
+// during the dataflow analysis, this is computed by running the dataflow
+// analysis twice:
+// 1. In the first run, the dataflow analysis only keeps track of the security
+// property: i.e. which registers have been overwritten since the last
+// time they've been authenticated.
+// 2. If the first run finds any return instructions using a register last
+// written by a non-authenticating instruction, the data flow
+// analysis will be run a second time. The first run will return which
+// registers are used in the gadgets to be reported. This information is
+// used in the second run to also track which instructions last wrote to
+// those registers.
+
+struct State {
+ /// A BitVector containing the registers that have been clobbered, and
+ /// not authenticated.
+ BitVector NonAutClobRegs;
+ /// A vector of sets, only used in the second data flow run.
+ /// Each element in the vector represents one register for which we
+ /// track the set of last instructions that wrote to this register.
+ /// For pac-ret analysis, the expectation is that almost all return
+ /// instructions only use register `X30`, and therefore, this vector
+ /// will have length 1 in the second run.
+ std::vector<SmallPtrSet<const MCInst *, 4>> LastInstWritingReg;
+ State() {}
+ State(uint16_t NumRegs, uint16_t NumRegsToTrack)
+ : NonAutClobRegs(NumRegs), LastInstWritingReg(NumRegsToTrack) {}
+ State &operator|=(const State &StateIn) {
+ NonAutClobRegs |= StateIn.NonAutClobRegs;
+ for (unsigned I = 0; I < LastInstWritingReg.size(); ++I)
+ for (const MCInst *J : StateIn.LastInstWritingReg[I])
+ LastInstWritingReg[I].insert(J);
+ return *this;
+ }
+ bool operator==(const State &RHS) const {
+ return NonAutClobRegs == RHS.NonAutClobRegs &&
+ LastInstWritingReg == RHS.LastInstWritingReg;
+ }
+ bool operator!=(const State &RHS) const { return !((*this) == RHS); }
+};
+
+static void printLastInsts(
+ raw_ostream &OS,
+ const std::vector<SmallPtrSet<const MCInst *, 4>> &LastInstWritingReg) {
+ OS << "Insts: ";
+ for (unsigned I = 0; I < LastInstWritingReg.size(); ++I) {
+ auto Set = LastInstWritingReg[I];
+ OS << "[" << I << "](";
+ for (const MCInst *MCInstP : Set)
+ OS << MCInstP << " ";
+ OS << ")";
+ }
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const State &S) {
+ OS << "pacret-state<";
+ OS << "NonAutClobRegs: " << S.NonAutClobRegs;
+ OS << ", ";
+ printLastInsts(OS, S.LastInstWritingReg);
+ OS << ">";
+ return OS;
+}
+
+class PacStatePrinter {
+public:
+ void print(raw_ostream &OS, const State &State) const;
+ explicit PacStatePrinter(const BinaryContext &BC) : BC(BC) {}
+
+private:
+ const BinaryContext &BC;
+};
+
+void PacStatePrinter::print(raw_ostream &OS, const State &S) const {
+ RegStatePrinter RegStatePrinter(BC);
+ OS << "pacret-state<";
+ OS << "NonAutClobRegs: ";
+ RegStatePrinter.print(OS, S.NonAutClobRegs);
+ OS << ", ";
+ printLastInsts(OS, S.LastInstWritingReg);
+ OS << ">";
+}
+
+class PacRetAnalysis
+ : public DataflowAnalysis<PacRetAnalysis, State, false /*Backward*/,
+ PacStatePrinter> {
+ using Parent =
+ DataflowAnalysis<PacRetAnalysis, State, false, PacStatePrinter>;
+ friend Parent;
+
+public:
+ PacRetAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId,
+ const std::vector<MCPhysReg> &RegsToTrackInstsFor)
+ : Parent(BF, AllocId), NumRegs(BF.getBinaryContext().MRI->getNumRegs()),
+ RegsToTrackInstsFor(RegsToTrackInstsFor),
+ TrackingLastInsts(RegsToTrackInstsFor.size() != 0),
+ Reg2StateIdx(RegsToTrackInstsFor.size() == 0
+ ? 0
+ : *std::max_element(RegsToTrackInstsFor.begin(),
+ RegsToTrackInstsFor.end()) +
+ 1,
+ -1) {
+ for (unsigned I = 0; I < RegsToTrackInstsFor.size(); ++I)
+ Reg2StateIdx[RegsToTrackInstsFor[I]] = I;
+ }
+ virtual ~PacRetAnalysis() {}
+
+ void run() { Parent::run(); }
+
+protected:
+ const uint16_t NumRegs;
+ /// RegsToTrackInstsFor is the set of registers for which the dataflow
+ /// analysis must compute the set of instructions that last wrote to them.
+ const std::vector<MCPhysReg> RegsToTrackInstsFor;
+ const bool TrackingLastInsts;
+ /// Reg2StateIdx maps Register to the index in the vector used in State to
+ /// track which instructions last wrote to this register.
+ std::vector<uint16_t> Reg2StateIdx;
+
+ bool trackReg(MCPhysReg Reg) {
+ for (auto R : RegsToTrackInstsFor)
+ if (R == Reg)
+ return true;
+ return false;
+ }
+ uint16_t reg2StateIdx(MCPhysReg Reg) const {
+ assert(Reg < Reg2StateIdx.size());
+ return Reg2StateIdx[Reg];
+ }
+
+ void preflight() {}
+
+ State getStartingStateAtBB(const BinaryBasicBlock &BB) {
+ return State(NumRegs, RegsToTrackInstsFor.size());
+ }
+
+ State getStartingStateAtPoint(const MCInst &Point) {
+ return State(NumRegs, RegsToTrackInstsFor.size());
+ }
+
+ void doConfluence(State &StateOut, const State &StateIn) {
+ PacStatePrinter P(BC);
+ LLVM_DEBUG({
+ dbgs() << " PacRetAnalysis::Confluence(\n";
+ dbgs() << " State 1: ";
+ P.print(dbgs(), StateOut);
+ dbgs() << "\n";
+ dbgs() << " State 2: ";
+ P.print(dbgs(), StateIn);
+ dbgs() << ")\n";
+ });
+
+ StateOut |= StateIn;
+
+ LLVM_DEBUG({
+ dbgs() << " merged state: ";
+ P.print(dbgs(), StateOut);
+ dbgs() << "\n";
+ });
+ }
+
+ State computeNext(const MCInst &Point, const State &Cur) {
+ PacStatePrinter P(BC);
+ LLVM_DEBUG({
+ dbgs() << " PacRetAnalysis::ComputeNext(";
+ BC.InstPrinter->printInst(&const_cast<MCInst &>(Point), 0, "", *BC.STI,
+ dbgs());
+ dbgs() << ", ";
+ P.print(dbgs(), Cur);
+ dbgs() << ")\n";
+ });
+
+ State Next = Cur;
+ BitVector Written = BitVector(NumRegs, false);
+ BC.MIB->getWrittenRegs(Point, Written);
+ Next.NonAutClobRegs |= Written;
+ // Keep track of this instruction if it writes to any of the registers we
+ // need to track that for:
+ if (TrackingLastInsts) {
+ for (auto WrittenReg : Written.set_bits()) {
+ if (trackReg(WrittenReg)) {
+ Next.LastInstWritingReg[reg2StateIdx(WrittenReg)] = {};
+ Next.LastInstWritingReg[reg2StateIdx(WrittenReg)].insert(&Point);
+ }
+ }
+ }
+
+ MCPhysReg AutReg = BC.MIB->getAuthenticatedReg(Point);
+ if (AutReg != BC.MIB->getNoRegister()) {
+ Next.NonAutClobRegs.reset(
+ BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
+ if (TrackingLastInsts && trackReg(AutReg))
+ Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
+ }
+
+ LLVM_DEBUG({
+ dbgs() << " .. result: (";
+ P.print(dbgs(), Next);
+ dbgs() << ")\n";
+ });
+
+ return Next;
+ }
+
+ StringRef getAnnotationName() const { return StringRef("PacRetAnalysis"); }
+
+public:
+ std::vector<MCInstReference>
+ getLastClobberingInsts(const MCInst &Ret, BinaryFunction &BF,
+ const BitVector &DirtyRawRegs) const {
+ if (!TrackingLastInsts)
+ return {};
+ if (auto MaybeState = getStateAt(Ret)) {
+ const State &S = *MaybeState;
+ // Due to aliasing registers, multiple registers may have
+ // been tracked.
+ std::set<const MCInst *> LastWritingInsts;
+ for (MCPhysReg TrackedReg : DirtyRawRegs.set_bits()) {
+ for (const MCInst *LastInstWriting :
+ S.LastInstWritingReg[reg2StateIdx(TrackedReg)])
+ LastWritingInsts.insert(LastInstWriting);
+ }
+ std::vector<MCInstReference> Result;
+ for (const MCInst *LastInstWriting : LastWritingInsts) {
+ MCInstInBBReference Ref = MCInstInBBReference::get(LastInstWriting, BF);
+ assert(Ref.BB != nullptr && "Expected Inst to be found");
+ Result.push_back(MCInstReference(Ref));
+ }
+ return Result;
+ }
+ llvm_unreachable("Expected State to be present");
+ }
+};
+
+SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
+ PacRetAnalysis &PRA, BinaryFunction &BF,
+ MCPlusBuilder::AllocatorIdTy AllocatorId) {
+ PRA.run();
+ LLVM_DEBUG({
+ dbgs() << " After PacRetAnalysis:\n";
+ BF.dump();
+ });
+ // Now scan the CFG for non-authenticating return instructions that use an
+ // overwritten, non-authenticated register as return address.
+ SmallSet<MCPhysReg, 1> RetRegsWithGadgets;
+ BinaryContext &BC = BF.getBinaryContext();
+ for (BinaryBasicBlock &BB : BF) {
+ for (int64_t I = BB.size() - 1; I >= 0; --I) {
+ MCInst &Inst = BB.getInstructionAtIndex(I);
+ if (BC.MIB->isReturn(Inst)) {
+ MCPhysReg RetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+ LLVM_DEBUG({
+ dbgs() << " Found RET inst: ";
+ BC.printInstruction(dbgs(), Inst);
+ dbgs() << " RetReg: " << BC.MRI->getName(RetReg)
+ << "; authenticatesReg: "
+ << BC.MIB->isAuthenticationOfReg(Inst, RetReg) << "\n";
+ });
+ if (BC.MIB->isAuthenticationOfReg(Inst, RetReg))
+ break;
+ BitVector DirtyRawRegs = PRA.getStateAt(Inst)->NonAutClobRegs;
+ LLVM_DEBUG({
+ dbgs() << " DirtyRawRegs at Ret: ";
+ RegStatePrinter RSP(BC);
+ RSP.print(dbgs(), DirtyRawRegs);
+ dbgs() << "\n";
+ });
+ DirtyRawRegs &= BC.MIB->getAliases(RetReg, /*OnlySmaller=*/true);
+ LLVM_DEBUG({
+ dbgs() << " Intersection with RetReg: ";
+ RegStatePrinter RSP(BC);
+ RSP.print(dbgs(), DirtyRawRegs);
+ dbgs() << "\n";
+ });
+ if (DirtyRawRegs.any()) {
+ // This return instruction needs to be reported
+ // First remove the annotation that the first, fast run of
+ // the dataflow analysis may have added.
+ if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex))
+ BC.MIB->removeAnnotation(Inst, GadgetAnnotationIndex);
+ BC.MIB->addAnnotation(
+ Inst, GadgetAnnotationIndex,
+ NonPacProtectedRetGadget(
+ MCInstInBBReference(&BB, I),
+ PRA.getLastClobberingInsts(Inst, BF, DirtyRawRegs)),
+ AllocatorId);
+ LLVM_DEBUG({
+ dbgs() << " Added gadget info annotation: ";
+ BC.printInstruction(dbgs(), Inst, 0, &BF);
+ dbgs() << "\n";
+ });
+ for (MCPhysReg RetRegWithGadget : DirtyRawRegs.set_bits())
+ RetRegsWithGadgets.insert(RetRegWithGadget);
+ }
+ }
+ }
+ }
+ return RetRegsWithGadgets;
+}
+
+void NonPacProtectedRetAnalysis::runOnFunction(
+ BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) {
+ LLVM_DEBUG({
+ dbgs() << "Analyzing in function " << BF.getPrintName() << ", AllocatorId "
+ << AllocatorId << "\n";
+ });
+ LLVM_DEBUG({ BF.dump(); });
+
+ if (BF.hasCFG()) {
+ PacRetAnalysis PRA(BF, AllocatorId, {});
+ SmallSet<MCPhysReg, 1> RetRegsWithGadgets =
+ computeDfState(PRA, BF, AllocatorId);
+ if (!RetRegsWithGadgets.empty()) {
+ // Redo the analysis, but now also track which instructions last wrote
+ // to any of the registers in RetRegsWithGadgets, so that better
+ // diagnostics can be produced.
+ std::vector<MCPhysReg> RegsToTrack;
+ for (MCPhysReg R : RetRegsWithGadgets)
+ RegsToTrack.push_back(R);
+ PacRetAnalysis PRWIA(BF, AllocatorId, RegsToTrack);
+ // Only the annotations added by this second run are needed.
+ computeDfState(PRWIA, BF, AllocatorId);
+ }
+ }
+}
+
+void printBB(const BinaryContext &BC, const BinaryBasicBlock *BB,
+ size_t StartIndex = 0, size_t EndIndex = -1) {
+ if (EndIndex == (size_t)-1)
+ EndIndex = BB->size() - 1;
+ const BinaryFunction *BF = BB->getFunction();
+ for (unsigned I = StartIndex; I <= EndIndex; ++I) {
+ uint64_t Address = BB->getOffset() + BF->getAddress() + 4 * I;
+ const MCInst &Inst = BB->getInstructionAtIndex(I);
+ if (BC.MIB->isCFI(Inst))
+ continue;
+ BC.printInstruction(outs(), Inst, Address, BF);
+ }
+}
+
+void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
+ const MCInstReference OverwInst,
+ const MCInstReference RetInst) {
+ BinaryBasicBlock *BB = RetInst.getBasicBlock();
+ assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+ assert(RetInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+ MCInstInBBReference OverwInstBB = OverwInst.U.BBRef;
+ if (BB == OverwInstBB.BB) {
+ // overwriting inst and ret instruction are in the same basic block.
+ assert(OverwInstBB.BBIndex < RetInst.U.BBRef.BBIndex);
+ outs() << " This happens in the following basic block:\n";
+ printBB(BC, BB);
+ }
+}
+
+void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
+ unsigned int GadgetAnnotationIndex) {
+ auto NPPRG = BC.MIB->getAnnotationAs<NonPacProtectedRetGadget>(
+ Inst, GadgetAnnotationIndex);
+ MCInstReference RetInst = NPPRG.RetInst;
+ BinaryFunction *BF = RetInst.getFunction();
+ BinaryBasicBlock *BB = RetInst.getBasicBlock();
+
+ outs() << "\nGS-PACRET: " << "non-protected ret found in function "
+ << BF->getPrintName();
+ if (BB)
+ outs() << ", basic block " << BB->getName();
+ outs() << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
+ outs() << " The return instruction is ";
+ BC.printInstruction(outs(), RetInst, RetInst.getAddress(), BF);
+ outs() << " The " << NPPRG.OverwritingRetRegInst.size()
+ << " instructions that write to the return register after any "
+ "authentication are:\n";
+ // Sort by address to ensure output is deterministic.
+ std::sort(std::begin(NPPRG.OverwritingRetRegInst),
+ std::end(NPPRG.OverwritingRetRegInst),
+ [](const MCInstReference &A, const MCInstReference &B) {
+ return A.getAddress() < B.getAddress();
+ });
+ for (unsigned I = 0; I < NPPRG.OverwritingRetRegInst.size(); ++I) {
+ MCInstReference InstRef = NPPRG.OverwritingRetRegInst[I];
+ outs() << " " << (I + 1) << ". ";
+ BC.printInstruction(outs(), InstRef, InstRef.getAddress(), BF);
+ }
+ LLVM_DEBUG({
+ dbgs() << " .. OverWritingRetRegInst:\n";
+ for (MCInstReference Ref : NPPRG.OverwritingRetRegInst) {
+ dbgs() << " " << Ref << "\n";
+ }
+ });
+ if (NPPRG.OverwritingRetRegInst.size() == 1) {
+ const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
+ assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+ reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
+ }
+}
+
+Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
+ GadgetAnnotationIndex = BC.MIB->getOrCreateAnnotationIndex("pacret-gadget");
+
+ ParallelUtilities::WorkFuncWithAllocTy WorkFun =
+ [&](BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) {
+ runOnFunction(BF, AllocatorId);
+ };
+
+ ParallelUtilities::PredicateTy SkipFunc = [&](const BinaryFunction &BF) {
+ return false; // BF.shouldPreserveNops();
+ };
+
+ ParallelUtilities::runOnEachFunctionWithUniqueAllocId(
+ BC, ParallelUtilities::SchedulingPolicy::SP_INST_LINEAR, WorkFun,
+ SkipFunc, "NonPacProtectedRetAnalysis");
+
+ for (BinaryFunction *BF : BC.getAllBinaryFunctions())
+ if (BF->hasCFG()) {
+ for (BinaryBasicBlock &BB : *BF) {
+ for (int64_t I = BB.size() - 1; I >= 0; --I) {
+ MCInst &Inst = BB.getInstructionAtIndex(I);
+ if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex)) {
+ reportFoundGadget(BC, Inst, GadgetAnnotationIndex);
+ }
+ }
+ }
+ }
+ return Error::success();
+}
+
+} // namespace bolt
+} // namespace llvm
\ No newline at end of file
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 4329235d470497..0a45c35f02da32 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -20,6 +20,7 @@
#include "bolt/Passes/BinaryPasses.h"
#include "bolt/Passes/CacheMetrics.h"
#include "bolt/Passes/IdenticalCodeFolding.h"
+#include "bolt/Passes/NonPacProtectedRetAnalysis.h"
#include "bolt/Passes/ReorderFunctions.h"
#include "bolt/Profile/BoltAddressTranslation.h"
#include "bolt/Profile/DataAggregator.h"
@@ -245,6 +246,12 @@ static cl::opt<bool> WriteBoltInfoSection(
"bolt-info", cl::desc("write bolt info section in the output binary"),
cl::init(true), cl::Hidden, cl::cat(BoltOutputCategory));
+cl::list<GadgetScannerKind> GadgetScannersToRun(
+ "scanners", cl::desc("which gadget scanners to run"),
+ cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
+ clEnumValN(GS_ALL, "all", "all")),
+ cl::ZeroOrMore, cl::CommaSeparated, cl::cat(BinaryAnalysisCategory));
+
} // namespace opts
// FIXME: implement a better way to mark sections for replacement.
@@ -3490,7 +3497,23 @@ void RewriteInstance::runOptimizationPasses() {
BC->logBOLTErrorsAndQuitOnFatal(BinaryFunctionPassManager::runAllPasses(*BC));
}
-void RewriteInstance::runBinaryAnalyses() {}
+void RewriteInstance::runBinaryAnalyses() {
+ NamedRegionTimer T("runBinaryAnalyses", "run binary analysis passes",
+ TimerGroupName, TimerGroupDesc, opts::TimeRewrite);
+ BinaryFunctionPassManager Manager(*BC);
+ // FIXME: add a pass that warns about functions that do not have a CFG,
+ // since the analysis is likely to be less accurate for them.
+ using GSK = opts::GadgetScannerKind;
+ // If no command line option was given, act as if "all" was specified.
+ if (opts::GadgetScannersToRun.size() == 0)
+ opts::GadgetScannersToRun.addValue(GSK::GS_ALL);
+ for (GSK ScannerToRun : opts::GadgetScannersToRun) {
+ if (ScannerToRun == GSK::GS_PACRET || ScannerToRun == GSK::GS_ALL)
+ Manager.registerPass(std::make_unique<NonPacProtectedRetAnalysis>());
+ }
+
+ BC->logBOLTErrorsAndQuitOnFatal(Manager.runPasses());
+}
void RewriteInstance::preregisterSections() {
// Preregister sections before emission to set their order in the output.
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 679c9774c767f7..1840750822aa3f 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -148,6 +148,69 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
return false;
}
+ MCPhysReg getAuthenticatedReg(const MCInst &Inst) const override {
+ switch (Inst.getOpcode()) {
+ case AArch64::AUTIAZ:
+ case AArch64::AUTIBZ:
+ case AArch64::AUTIASP:
+ case AArch64::AUTIBSP:
+ case AArch64::RETAA:
+ case AArch64::RETAB:
+ return AArch64::LR;
+ case AArch64::AUTIA1716:
+ case AArch64::AUTIB1716:
+ return AArch64::X17;
+ case AArch64::ERETAA:
+ case AArch64::ERETAB:
+ return AArch64::LR;
+
+ case AArch64::AUTIA:
+ case AArch64::AUTIB:
+ case AArch64::AUTDA:
+ case AArch64::AUTDB:
+ case AArch64::AUTIZA:
+ case AArch64::AUTIZB:
+ case AArch64::AUTDZA:
+ case AArch64::AUTDZB:
+ assert(Inst.getOperand(0).isReg());
+ return Inst.getOperand(0).getReg();
+
+ default:
+ return getNoRegister();
+ }
+ }
+
+ bool isAuthenticationOfReg(const MCInst &Inst,
+ const unsigned RegAuthenticated) const override {
+ if (RegAuthenticated == getNoRegister())
+ return false;
+ return getAuthenticatedReg(Inst) == RegAuthenticated;
+ }
+
+ llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
+ assert(isReturn(Inst));
+ switch (Inst.getOpcode()) {
+ case AArch64::RET:
+ // There should be one register that the return reads, and
+ // that's the one being used as the jump target?
+ for (unsigned OpIdx = 0, EndIdx = Inst.getNumOperands(); OpIdx < EndIdx;
+ ++OpIdx) {
+ const MCOperand &MO = Inst.getOperand(OpIdx);
+ if (MO.isReg())
+ return MO.getReg();
+ }
+ return getNoRegister();
+ case AArch64::RETAA:
+ case AArch64::RETAB:
+ case AArch64::ERETAA:
+ case AArch64::ERETAB:
+ case AArch64::ERET:
+ return AArch64::LR;
+ default:
+ llvm_unreachable("Unhandled return instruction");
+ }
+ }
+
bool isADRP(const MCInst &Inst) const override {
return Inst.getOpcode() == AArch64::ADRP;
}
diff --git a/bolt/test/binary-analysis/AArch64/cmdline-args.test b/bolt/test/binary-analysis/AArch64/cmdline-args.test
index e414818644a3b0..1204d5b1289af4 100644
--- a/bolt/test/binary-analysis/AArch64/cmdline-args.test
+++ b/bolt/test/binary-analysis/AArch64/cmdline-args.test
@@ -13,7 +13,7 @@ NONEXISTINGFILEARG: llvm-bolt-binary-analysis: 'non-existing-file': No suc
RUN: not llvm-bolt-binary-analysis %p/Inputs/dummy.txt 2>&1 | FileCheck -check-prefix=NOELFFILEARG %s
NOELFFILEARG: llvm-bolt-binary-analysis: '{{.*}}/Inputs/dummy.txt': The file was not recognized as a valid object file.
-RUN: %clang %cflags %p/../../Inputs/asm_foo.s %p/../../Inputs/asm_main.c -o %t.exe
+RUN: %clang %cflags -Wl,--emit-relocs %p/../../Inputs/asm_foo.s %p/../../Inputs/asm_main.c -o %t.exe
RUN: llvm-bolt-binary-analysis %t.exe 2>&1 | FileCheck -check-prefix=VALIDELFFILEARG --allow-empty %s
# Check that there are no BOLT-WARNING or BOLT-ERROR output lines
VALIDELFFILEARG: BOLT-INFO:
@@ -30,4 +30,10 @@ HELP-NEXT: USAGE: llvm-bolt-binary-analysis [options] <executable>
HELP-EMPTY:
HELP-NEXT: OPTIONS:
HELP-EMPTY:
+HELP-NEXT: BinaryAnalysis options:
+HELP-EMPTY:
+HELP-NEXT: --scanners=<value> - which gadget scanners to run
+HELP-NEXT: =pacret - pac-ret
+HELP-NEXT: =all - all
+HELP-EMPTY:
HELP-NEXT: Generic Options:
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
new file mode 100644
index 00000000000000..8731bb46362867
--- /dev/null
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -0,0 +1,681 @@
+// RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck -check-prefix=CHECK --allow-empty %s
+
+ .text
+
+ .globl f1
+ .type f1,@function
+f1:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ // autiasp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f1, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f1, .-f1
+
+
+ .globl f_intermediate_overwrite1
+ .type f_intermediate_overwrite1,@function
+f_intermediate_overwrite1:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ autiasp
+ ldp x29, x30, [sp], #16
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite1, basic block .LBB
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_intermediate_overwrite1, .-f_intermediate_overwrite1
+
+ .globl f_intermediate_overwrite2
+ .type f_intermediate_overwrite2,@function
+f_intermediate_overwrite2:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiasp
+ mov x30, x0
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite2, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: mov x30, x0
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x30, x0
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_intermediate_overwrite2, .-f_intermediate_overwrite2
+
+ .globl f_intermediate_read
+ .type f_intermediate_read,@function
+f_intermediate_read:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiasp
+ mov x0, x30
+// CHECK-NOT: function f_intermediate_read
+ ret
+ .size f_intermediate_read, .-f_intermediate_read
+
+ .globl f_intermediate_overwrite3
+ .type f_intermediate_overwrite3,@function
+f_intermediate_overwrite3:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiasp
+ mov w30, w0
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_intermediate_overwrite3, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: mov w30, w0
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}: mov w30, w0
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_intermediate_overwrite3, .-f_intermediate_overwrite3
+
+ .globl f_nonx30_ret
+ .type f_nonx30_ret,@function
+f_nonx30_ret:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ mov x16, x30
+ autiasp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_nonx30_ret, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret x16
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: mov x16, x30
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x16, x30
+// CHECK-NEXT: {{[0-9a-f]+}}: autiasp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret x16
+ ret x16
+ .size f_nonx30_ret, .-f_nonx30_ret
+
+
+/// Now do a basic sanity check on every different Authentication instruction:
+
+ .globl f_autiasp
+ .type f_autiasp,@function
+f_autiasp:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiasp
+// CHECK-NOT: function f_autiasp
+ ret
+ .size f_autiasp, .-f_autiasp
+
+ .globl f_autibsp
+ .type f_autibsp,@function
+f_autibsp:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autibsp
+// CHECK-NOT: function f_autibsp
+ ret
+ .size f_autibsp, .-f_autibsp
+
+ .globl f_autiaz
+ .type f_autiaz,@function
+f_autiaz:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiaz
+// CHECK-NOT: function f_autiaz
+ ret
+ .size f_autiaz, .-f_autiaz
+
+ .globl f_autibz
+ .type f_autibz,@function
+f_autibz:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autibz
+// CHECK-NOT: function f_autibz
+ ret
+ .size f_autibz, .-f_autibz
+
+ .globl f_autia1716
+ .type f_autia1716,@function
+f_autia1716:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autia1716
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autia1716, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autia1716
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autia1716, .-f_autia1716
+
+ .globl f_autib1716
+ .type f_autib1716,@function
+f_autib1716:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autib1716
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autib1716, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autib1716
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autib1716, .-f_autib1716
+
+ .globl f_autiax12
+ .type f_autiax12,@function
+f_autiax12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autia x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autiax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autia x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autiax12, .-f_autiax12
+
+ .globl f_autibx12
+ .type f_autibx12,@function
+f_autibx12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autib x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autibx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autib x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autibx12, .-f_autibx12
+
+ .globl f_autiax30
+ .type f_autiax30,@function
+f_autiax30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autia x30, sp
+// CHECK-NOT: function f_autiax30
+ ret
+ .size f_autiax30, .-f_autiax30
+
+ .globl f_autibx30
+ .type f_autibx30,@function
+f_autibx30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autib x30, sp
+// CHECK-NOT: function f_autibx30
+ ret
+ .size f_autibx30, .-f_autibx30
+
+
+ .globl f_autdax12
+ .type f_autdax12,@function
+f_autdax12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autda x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autda x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autdax12, .-f_autdax12
+
+ .globl f_autdbx12
+ .type f_autdbx12,@function
+f_autdbx12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdb x12, sp
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autdb x12, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autdbx12, .-f_autdbx12
+
+ .globl f_autdax30
+ .type f_autdax30,@function
+f_autdax30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autda x30, sp
+// CHECK-NOT: function f_autdax30
+ ret
+ .size f_autdax30, .-f_autdax30
+
+ .globl f_autdbx30
+ .type f_autdbx30,@function
+f_autdbx30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdb x30, sp
+// CHECK-NOT: function f_autdbx30
+ ret
+ .size f_autdbx30, .-f_autdbx30
+
+
+ .globl f_autizax12
+ .type f_autizax12,@function
+f_autizax12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiza x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autizax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autiza x12
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autizax12, .-f_autizax12
+
+ .globl f_autizbx12
+ .type f_autizbx12,@function
+f_autizbx12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autizb x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autizbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autizb x12
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autizbx12, .-f_autizbx12
+
+ .globl f_autizax30
+ .type f_autizax30,@function
+f_autizax30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autiza x30
+// CHECK-NOT: function f_autizax30
+ ret
+ .size f_autizax30, .-f_autizax30
+
+ .globl f_autizbx30
+ .type f_autizbx30,@function
+f_autizbx30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autizb x30
+// CHECK-NOT: function f_autizbx30
+ ret
+ .size f_autizbx30, .-f_autizbx30
+
+
+ .globl f_autdzax12
+ .type f_autdzax12,@function
+f_autdzax12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdza x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdzax12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autdza x12
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autdzax12, .-f_autdzax12
+
+ .globl f_autdzbx12
+ .type f_autdzbx12,@function
+f_autdzbx12:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdzb x12
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_autdzbx12, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: autdzb x12
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ ret
+ .size f_autdzbx12, .-f_autdzbx12
+
+ .globl f_autdzax30
+ .type f_autdzax30,@function
+f_autdzax30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdza x30
+// CHECK-NOT: function f_autdzax30
+ ret
+ .size f_autdzax30, .-f_autdzax30
+
+ .globl f_autdzbx30
+ .type f_autdzbx30,@function
+f_autdzbx30:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+ autdzb x30
+// CHECK-NOT: function f_autdzbx30
+ ret
+ .size f_autdzbx30, .-f_autdzbx30
+
+ .globl f_retaa
+ .type f_retaa,@function
+f_retaa:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-NOT: function f_retaa
+ retaa
+ .size f_retaa, .-f_retaa
+
+ .globl f_retab
+ .type f_retab,@function
+f_retab:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-NOT: function f_retab
+ retab
+ .size f_retab, .-f_retab
+
+ .globl f_eretaa
+ .type f_eretaa,@function
+f_eretaa:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-NOT: function f_eretaa
+ eretaa
+ .size f_eretaa, .-f_eretaa
+
+ .globl f_eretab
+ .type f_eretab,@function
+f_eretab:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-NOT: function f_eretab
+ eretab
+ .size f_eretab, .-f_eretab
+
+ .globl f_eret
+ .type f_eret,@function
+f_eret:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+ bl g
+ add x0, x0, #3
+ ldp x29, x30, [sp], #16
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_eret, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: eret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
+// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
+// CHECK-NEXT: {{[0-9a-f]+}}: bl g@PLT
+// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
+// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: {{[0-9a-f]+}}: eret
+ eret
+ .size f_eret, .-f_eret
+
+ .globl f_movx30reg
+ .type f_movx30reg,@function
+f_movx30reg:
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_movx30reg, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: mov x30, x22
+// CHECK-NEXT: This happens in the following basic block:
+// CHECK-NEXT: {{[0-9a-f]+}}: mov x30, x22
+// CHECK-NEXT: {{[0-9a-f]+}}: ret
+ mov x30, x22
+ ret
+ .size f_movx30reg, .-f_movx30reg
+
+// TODO: add test to see if registers clobbered by a call are picked up.
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
new file mode 100644
index 00000000000000..402bdf6591dcb1
--- /dev/null
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
@@ -0,0 +1,46 @@
+// RUN: %clang %cflags -march=armv8.3-a -mbranch-protection=pac-ret %s %p/../../Inputs/asm_main.c -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck -check-prefix=CHECK --allow-empty %s
+
+
+// Verify that we can also detect gadgets across basic blocks
+
+ .globl f_crossbb1
+ .type f_crossbb1,@function
+f_crossbb1:
+ hint #25
+ stp x29, x30, [sp, #-16]!
+ ldp x29, x30, [sp], #16
+ cbnz x0, 1f
+ autiasp
+1:
+ ret
+ .size f_crossbb1, .-f_crossbb1
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_crossbb1, basic block .L{{[^,]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 2 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+// CHECK-NEXT: 2. {{[0-9a-f]+}}: autiasp
+
+// A test that checks that the dataflow state tracking seems to work when
+// merging BBs:
+ .globl f_mergebb1
+ .type f_mergebb1,@function
+f_mergebb1:
+ hint #25
+2:
+ stp x29, x30, [sp, #-16]!
+ ldp x29, x30, [sp], #16
+ sub x0, x0, #1
+ cbnz x0, 1f
+ autiasp
+ b 2b
+1:
+ ret
+ .size f_mergebb1, .-f_mergebb1
+// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_mergebb1, basic block .L{{[^,]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: ret
+// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
+// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
+
+// TODO: also verify that false negatives exist in across-BB gadgets in functions
+// for which bolt cannot reconstruct the call graph.
diff --git a/bolt/test/binary-analysis/AArch64/lit.local.cfg b/bolt/test/binary-analysis/AArch64/lit.local.cfg
index 6f247dd52e82f8..6ac7d3cc0fec71 100644
--- a/bolt/test/binary-analysis/AArch64/lit.local.cfg
+++ b/bolt/test/binary-analysis/AArch64/lit.local.cfg
@@ -1,7 +1,7 @@
if "AArch64" not in config.root.targets:
config.unsupported = True
-flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding -Wl,--emit-relocs"
+flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding"
config.substitutions.insert(0, ("%cflags", f"%cflags {flags}"))
config.substitutions.insert(0, ("%cxxflags", f"%cxxflags {flags}"))
>From 35e1c44f36869c0378e932dd4d9a2562b9404d0f Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Thu, 9 Jan 2025 15:53:57 +0000
Subject: [PATCH 2/8] format cleanup
---
bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h | 8 +++-----
bolt/lib/Rewrite/RewriteInstance.cpp | 11 ++++++-----
2 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index a9740896369fa2..21d089ed7b8506 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -31,7 +31,7 @@ struct MCInstInBBReference {
: BB(BB), BBIndex(BBIndex) {}
MCInstInBBReference() : BB(nullptr), BBIndex(0) {}
static MCInstInBBReference get(const MCInst *Inst, BinaryFunction &BF) {
- for (BinaryBasicBlock& BB : BF)
+ for (BinaryBasicBlock &BB : BF)
for (size_t I = 0; I < BB.size(); ++I)
if (Inst == &(BB.getInstructionAtIndex(I)))
return MCInstInBBReference(&BB, I);
@@ -78,9 +78,7 @@ struct MCInstInBFReference {
uint64_t getOffset() const { return Offset; }
- uint64_t getAddress() const {
- return BF->getAddress() + getOffset();
- }
+ uint64_t getAddress() const { return BF->getAddress() + getOffset(); }
};
raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
@@ -179,7 +177,7 @@ struct NonPacProtectedRetGadget {
}
NonPacProtectedRetGadget(
MCInstReference RetInst,
- const std::vector<MCInstReference>& OverwritingRetRegInst)
+ const std::vector<MCInstReference> &OverwritingRetRegInst)
: RetInst(RetInst), OverwritingRetRegInst(OverwritingRetRegInst) {}
};
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 0a45c35f02da32..135cfc63de4d37 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -246,11 +246,12 @@ static cl::opt<bool> WriteBoltInfoSection(
"bolt-info", cl::desc("write bolt info section in the output binary"),
cl::init(true), cl::Hidden, cl::cat(BoltOutputCategory));
-cl::list<GadgetScannerKind> GadgetScannersToRun(
- "scanners", cl::desc("which gadget scanners to run"),
- cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
- clEnumValN(GS_ALL, "all", "all")),
- cl::ZeroOrMore, cl::CommaSeparated, cl::cat(BinaryAnalysisCategory));
+cl::list<GadgetScannerKind>
+ GadgetScannersToRun("scanners", cl::desc("which gadget scanners to run"),
+ cl::values(clEnumValN(GS_PACRET, "pacret", "pac-ret"),
+ clEnumValN(GS_ALL, "all", "all")),
+ cl::ZeroOrMore, cl::CommaSeparated,
+ cl::cat(BinaryAnalysisCategory));
} // namespace opts
>From 3b1bd2458e4bc3fe9e05c88da1b2c20ac65f8bea Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Fri, 10 Jan 2025 14:32:42 +0000
Subject: [PATCH 3/8] Address first round of review feedback by Peter Smith
---
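Note for readers (illustration only, not part of this commit): the security
property as reworded in the documentation below can be sketched for
straight-line code roughly as follows. This is a minimal sketch under stated
assumptions: `Op` and `findUnprotectedReturns` are hypothetical names, not BOLT
APIs, and the actual scanner works on `MCInst` and across basic blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy instruction model; this is not BOLT's MCInst.
enum class Op {
  Other,       // does not touch the return register
  WriteRetReg, // non-authenticating write, e.g. `ldp x29, x30, [sp], #16`
  AuthRetReg,  // authenticating write, e.g. `autiasp`
  Ret,         // plain return, e.g. `ret`
  AuthRet      // combined authenticate-and-return, e.g. `retaa`
};

// Returns the indices of plain returns whose destination register was last
// written by a non-authenticating instruction.
std::vector<std::size_t> findUnprotectedReturns(const std::vector<Op> &Insts) {
  std::vector<std::size_t> Gadgets;
  // At function entry the register has not been modified yet.
  bool Clobbered = false;
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    switch (Insts[I]) {
    case Op::WriteRetReg:
      Clobbered = true;
      break;
    case Op::AuthRetReg:
      Clobbered = false;
      break;
    case Op::Ret:
      if (Clobbered)
        Gadgets.push_back(I);
      break;
    case Op::AuthRet: // authenticates as part of the return itself
    case Op::Other:
      break;
    }
  }
  return Gadgets;
}
```

In this toy model, a sequence ending in `WriteRetReg, Ret` is reported, while
the same sequence ending in `WriteRetReg, AuthRet` is not.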
bolt/docs/BinaryAnalysis.md | 53 ++++++++++++++++++++-----------------
1 file changed, 29 insertions(+), 24 deletions(-)
diff --git a/bolt/docs/BinaryAnalysis.md b/bolt/docs/BinaryAnalysis.md
index e247ce20b4625c..17e45513462af2 100644
--- a/bolt/docs/BinaryAnalysis.md
+++ b/bolt/docs/BinaryAnalysis.md
@@ -28,39 +28,46 @@ pac-ret, shadow stacks, arm64e, and many more.
These mitigations guarantee a so-called "security property" on the binaries they
produce. For example, for stack canaries, the security property is roughly that
-a canary is located on the stack between the set of saved variables and set of
-local variables. For pac-ret, it is roughly that there are no writes to the
-register containing the return address after either an authenticating
-instruction or a Branch-and-link instruction.
+a canary is located on the stack between the set of saved variables and the set
+of local variables. For pac-ret, it is roughly that either the return address is
+never stored/retrieved to/from memory; or, there are no writes to the register
+containing the return address between an instruction authenticating it and a
+return instruction using it.
From time to time, however, a bug gets found in the implementation of such
mitigations in toolchains. Also, code that is written in assembler by hand
requires the developer to ensure these security properties by hand.
In short, it is sometimes found that a few places in the binary code are not
-protected as expected given the requested mitigations. Attackers could make use
-of those places (sometimes called gadgets) to circumvent to protection that the
-mitigation should give.
+protected as well as expected given the requested mitigations. Attackers could
+make use of those places (sometimes called gadgets) to circumvent the protection
+that the mitigation should give.
One of the reasons that such gadgets, or holes in the mitigation implementation,
exist is that typically the amount of testing and verification for these
-security properties is pretty limited.
+security properties is limited to checking results on specific examples.
In comparison, for testing functional correctness, or for testing performance,
toolchain and software in general typically get tested with large test suites
-and benchmarks and testing and benchmarking plans. In contrast, this typically
-does not get done for testing the security properties of binary code.
+and benchmarks. In contrast, this typically does not get done for testing the
+security properties of binary code.
+
+Unlike functional correctness where compilation errors result in test failures,
+and performance where speed and size differences are measurable, broken security
+properties cannot be easily observed using existing testing and benchmarking
+tools.
The security scanners implemented in `llvm-bolt-binary-analysis` aim to enable
-better testing of security hardening implementations that require compilers to
-somehow generate assembly code with specific security properties.
+the testing of security hardening in arbitrary programs and not just specific
+examples.
+
#### pac-ret analysis
`pac-ret` protection is a security hardening scheme implemented in compilers
such as gcc and clang, using the command line option
`-mbranch-protection=pac-ret`. This option is enabled by default on most widely
-used distributions.
+used Linux distributions.
The hardening scheme mitigates
[Return-Oriented Programming (ROP)](https://llsoftsec.github.io/llsoftsecbook/#return-oriented-programming)
@@ -77,16 +84,14 @@ from memory.
The 'pac-ret' binary analysis can be invoked using the command line option
`--scanners=pac-ret`. It makes `llvm-bolt-binary-analysis` scan through the
-provided binary and check the following property:
-
-The security property that is checked is:
+provided binary, checking each function for the following security property:
-When a register is used as the address to jump to in a return instruction,
-that register must either:
+For each procedure and exception return instruction, the destination register
+must have one of the following properties:
-1. never be changed within this function, i.e. have the same value as when the
- function started, or
-2. the last write to the register must be by an authenticating instruction.
+1. be immutable within the function, or
+2. the last write to the register must be by an authenticating instruction. This
+ includes combined authentication and return instructions such as `RETAA`.
##### Example 1
@@ -101,7 +106,7 @@ ldp x29, x30, [sp], #0x10
ret
```
-The return instruction `ret` uses register `x30` as containing the address to
+The return instruction `ret` implicitly uses register `x30` as the address to
return to. Register `x30` was last written by instruction `ldp`, which is not an
authenticating instruction. `llvm-bolt-binary-analysis --scanners=pac-ret` will
report this as follows:
@@ -125,8 +130,8 @@ evolve over time.
##### Example 2: multiple "last-overwriting" instructions
-A simple example that show how there can be a set of "last overwriting"
-instructions of a register is the following:
+A simple example that shows how there can be a set of "last overwriting"
+instructions of a register follows:
```
paciasp
>From 39d0e54d9ec0e1c9f9a52cb03f2dab4a571498ca Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Fri, 10 Jan 2025 15:41:25 +0000
Subject: [PATCH 4/8] Address first round of review feedback by @atrosinenko
---
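Note for readers (illustration only, not part of this commit): this commit
switches the `Reg2StateIdx` size computation to `llvm::max_element`. The idea
behind that table can be sketched with only the standard library as below.
`buildReg2StateIdx` and `NotTracked` are hypothetical names for this sketch;
the patch itself builds the table in the `PacRetAnalysis` constructor and
stores `-1` in a `uint16_t` as the "not tracked" marker.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Marker for "register not tracked"; the same bit pattern as -1 in uint16_t.
constexpr std::uint16_t NotTracked = 0xFFFF;

// Builds a dense table mapping a register number to its index in the list of
// tracked registers. The table has (largest tracked register + 1) entries, or
// none at all when nothing is tracked.
std::vector<std::uint16_t>
buildReg2StateIdx(const std::vector<std::uint16_t> &RegsToTrack) {
  if (RegsToTrack.empty())
    return {};
  const std::uint16_t MaxReg =
      *std::max_element(RegsToTrack.begin(), RegsToTrack.end());
  std::vector<std::uint16_t> Table(static_cast<std::size_t>(MaxReg) + 1,
                                   NotTracked);
  for (std::size_t I = 0; I < RegsToTrack.size(); ++I)
    Table[RegsToTrack[I]] = static_cast<std::uint16_t>(I);
  return Table;
}
```

The table trades a little memory for constant-time lookups from register
number to state index.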
bolt/include/bolt/Core/MCPlusBuilder.h | 7 +--
.../bolt/Passes/NonPacProtectedRetAnalysis.h | 45 ++++++++++---------
.../lib/Passes/NonPacProtectedRetAnalysis.cpp | 43 +++++++++---------
.../Target/AArch64/AArch64MCPlusBuilder.cpp | 10 ++---
4 files changed, 53 insertions(+), 52 deletions(-)
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 9351107e66bdf6..79f0176b47776a 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -27,6 +27,7 @@
#include "llvm/MC/MCInstrAnalysis.h"
#include "llvm/MC/MCInstrDesc.h"
#include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCRegister.h"
#include "llvm/Support/Allocator.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/ErrorHandling.h"
@@ -552,16 +553,16 @@ class MCPlusBuilder {
virtual MCPhysReg getAuthenticatedReg(const MCInst &Inst) const {
llvm_unreachable("not implemented");
- return false;
+ return getNoRegister();
}
virtual bool isAuthenticationOfReg(const MCInst &Inst,
- const unsigned RegAuthenticated) const {
+ MCPhysReg AuthenticatedReg) const {
llvm_unreachable("not implemented");
return false;
}
- virtual llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
+ virtual MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
llvm_unreachable("not implemented");
return getNoRegister();
}
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index 21d089ed7b8506..b98a99a18d3425 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -51,6 +51,9 @@ struct MCInstInBBReference {
}
uint64_t getAddress() const {
// 4 bytes per instruction on AArch64;
+ // FIXME: the assumption of 4 bytes per instruction needs to be fixed before
+ // this method gets used on any non-AArch64 binaries (but should be fine for
+ // pac-ret analysis, as that is an AArch64-specific feature).
return BB->getFunction()->getAddress() + BB->getOffset() + BBIndex * 4;
}
};
@@ -84,8 +87,8 @@ struct MCInstInBFReference {
raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
struct MCInstReference {
- enum StoredIn { _BinaryFunction, _BinaryBasicBlock };
- StoredIn CurrentLocation;
+ enum ParentKind { BinaryFunction, BinaryBasicBlock };
+ ParentKind CurrentLocation;
union U {
MCInstInBBReference BBRef;
MCInstInBFReference BFRef;
@@ -93,21 +96,21 @@ struct MCInstReference {
U(MCInstInBFReference BFRef) : BFRef(BFRef) {}
} U;
MCInstReference(MCInstInBBReference BBRef)
- : CurrentLocation(_BinaryBasicBlock), U(BBRef) {}
+ : CurrentLocation(BinaryBasicBlock), U(BBRef) {}
MCInstReference(MCInstInBFReference BFRef)
- : CurrentLocation(_BinaryFunction), U(BFRef) {}
- MCInstReference(BinaryBasicBlock *BB, int64_t BBIndex)
+ : CurrentLocation(BinaryFunction), U(BFRef) {}
+ MCInstReference(class BinaryBasicBlock *BB, int64_t BBIndex)
: MCInstReference(MCInstInBBReference(BB, BBIndex)) {}
- MCInstReference(BinaryFunction *BF, uint32_t Offset)
+ MCInstReference(class BinaryFunction *BF, uint32_t Offset)
: MCInstReference(MCInstInBFReference(BF, Offset)) {}
bool operator<(const MCInstReference &RHS) const {
if (CurrentLocation != RHS.CurrentLocation)
return CurrentLocation < RHS.CurrentLocation;
switch (CurrentLocation) {
- case _BinaryBasicBlock:
+ case BinaryBasicBlock:
return U.BBRef < RHS.U.BBRef;
- case _BinaryFunction:
+ case BinaryFunction:
return U.BFRef < RHS.U.BFRef;
}
llvm_unreachable("");
@@ -117,9 +120,9 @@ struct MCInstReference {
if (CurrentLocation != RHS.CurrentLocation)
return false;
switch (CurrentLocation) {
- case _BinaryBasicBlock:
+ case BinaryBasicBlock:
return U.BBRef == RHS.U.BBRef;
- case _BinaryFunction:
+ case BinaryFunction:
return U.BFRef == RHS.U.BFRef;
}
llvm_unreachable("");
@@ -127,9 +130,9 @@ struct MCInstReference {
operator MCInst &() const {
switch (CurrentLocation) {
- case _BinaryBasicBlock:
+ case BinaryBasicBlock:
return U.BBRef;
- case _BinaryFunction:
+ case BinaryFunction:
return U.BFRef;
}
llvm_unreachable("");
@@ -137,29 +140,29 @@ struct MCInstReference {
uint64_t getAddress() const {
switch (CurrentLocation) {
- case _BinaryBasicBlock:
+ case BinaryBasicBlock:
return U.BBRef.getAddress();
- case _BinaryFunction:
+ case BinaryFunction:
return U.BFRef.getAddress();
}
llvm_unreachable("");
}
- BinaryFunction *getFunction() const {
+ class BinaryFunction *getFunction() const {
switch (CurrentLocation) {
- case _BinaryFunction:
+ case BinaryFunction:
return U.BFRef.BF;
- case _BinaryBasicBlock:
+ case BinaryBasicBlock:
return U.BBRef.BB->getFunction();
}
llvm_unreachable("");
}
- BinaryBasicBlock *getBasicBlock() const {
+ class BinaryBasicBlock *getBasicBlock() const {
switch (CurrentLocation) {
- case _BinaryFunction:
+ case BinaryFunction:
return nullptr;
- case _BinaryBasicBlock:
+ case BinaryBasicBlock:
return U.BBRef.BB;
}
llvm_unreachable("");
@@ -204,4 +207,4 @@ class NonPacProtectedRetAnalysis : public BinaryFunctionPass {
} // namespace bolt
} // namespace llvm
-#endif
\ No newline at end of file
+#endif
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index d6bc902f663fc9..fe456d8fcaa503 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -48,10 +48,10 @@ raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &Ref) {
raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
switch (Ref.CurrentLocation) {
- case MCInstReference::_BinaryBasicBlock:
+ case MCInstReference::BinaryBasicBlock:
OS << Ref.U.BBRef;
return OS;
- case MCInstReference::_BinaryFunction:
+ case MCInstReference::BinaryFunction:
OS << Ref.U.BFRef;
return OS;
}
@@ -173,7 +173,7 @@ void PacStatePrinter::print(raw_ostream &OS, const State &S) const {
}
class PacRetAnalysis
- : public DataflowAnalysis<PacRetAnalysis, State, false /*Backward*/,
+ : public DataflowAnalysis<PacRetAnalysis, State, /*Backward=*/false,
PacStatePrinter> {
using Parent =
DataflowAnalysis<PacRetAnalysis, State, false, PacStatePrinter>;
@@ -187,17 +187,13 @@ class PacRetAnalysis
TrackingLastInsts(RegsToTrackInstsFor.size() != 0),
Reg2StateIdx(RegsToTrackInstsFor.size() == 0
? 0
- : *std::max_element(RegsToTrackInstsFor.begin(),
- RegsToTrackInstsFor.end()) +
- 1,
+ : *llvm::max_element(RegsToTrackInstsFor) + 1,
-1) {
for (unsigned I = 0; I < RegsToTrackInstsFor.size(); ++I)
Reg2StateIdx[RegsToTrackInstsFor[I]] = I;
}
virtual ~PacRetAnalysis() {}
- void run() { Parent::run(); }
-
protected:
const uint16_t NumRegs;
/// RegToTrackInstsFor is the set of registers for which the dataflow analysis
@@ -208,7 +204,7 @@ class PacRetAnalysis
/// track which instructions last wrote to this register.
std::vector<uint16_t> Reg2StateIdx;
- bool trackReg(MCPhysReg Reg) {
+ bool doTrackReg(MCPhysReg Reg) const {
for (auto R : RegsToTrackInstsFor)
if (R == Reg)
return true;
@@ -269,7 +265,7 @@ class PacRetAnalysis
// need to track that for:
if (TrackingLastInsts) {
for (auto WrittenReg : Written.set_bits()) {
- if (trackReg(WrittenReg)) {
+ if (doTrackReg(WrittenReg)) {
Next.LastInstWritingReg[reg2StateIdx(WrittenReg)] = {};
Next.LastInstWritingReg[reg2StateIdx(WrittenReg)].insert(&Point);
}
@@ -280,7 +276,7 @@ class PacRetAnalysis
if (AutReg != BC.MIB->getNoRegister()) {
Next.NonAutClobRegs.reset(
BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
- if (TrackingLastInsts && trackReg(AutReg))
+ if (TrackingLastInsts && doTrackReg(AutReg))
Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
}
@@ -394,8 +390,8 @@ void NonPacProtectedRetAnalysis::runOnFunction(
LLVM_DEBUG({
dbgs() << "Analyzing in function " << BF.getPrintName() << ", AllocatorId "
<< AllocatorId << "\n";
+ BF.dump();
});
- LLVM_DEBUG({ BF.dump(); });
if (BF.hasCFG()) {
PacRetAnalysis PRA(BF, AllocatorId, {});
@@ -421,6 +417,9 @@ void printBB(const BinaryContext &BC, const BinaryBasicBlock *BB,
EndIndex = BB->size() - 1;
const BinaryFunction *BF = BB->getFunction();
for (unsigned I = StartIndex; I <= EndIndex; ++I) {
+ // FIXME: this assumes all instructions are 4 bytes in size. This is true
+ // for AArch64, but it might be good to extract this function so it can be
+ // used elsewhere and for other targets too.
uint64_t Address = BB->getOffset() + BF->getAddress() + 4 * I;
const MCInst &Inst = BB->getInstructionAtIndex(I);
if (BC.MIB->isCFI(Inst))
@@ -433,8 +432,8 @@ void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
const MCInstReference OverwInst,
const MCInstReference RetInst) {
BinaryBasicBlock *BB = RetInst.getBasicBlock();
- assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
- assert(RetInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+ assert(OverwInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
+ assert(RetInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
MCInstInBBReference OverwInstBB = OverwInst.U.BBRef;
if (BB == OverwInstBB.BB) {
// overwriting inst and ret instruction are in the same basic block.
@@ -463,11 +462,10 @@ void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
<< " instructions that write to the return register after any "
"authentication are:\n";
// Sort by address to ensure output is deterministic.
- std::sort(std::begin(NPPRG.OverwritingRetRegInst),
- std::end(NPPRG.OverwritingRetRegInst),
- [](const MCInstReference &A, const MCInstReference &B) {
- return A.getAddress() < B.getAddress();
- });
+ llvm::sort(NPPRG.OverwritingRetRegInst,
+ [](const MCInstReference &A, const MCInstReference &B) {
+ return A.getAddress() < B.getAddress();
+ });
for (unsigned I = 0; I < NPPRG.OverwritingRetRegInst.size(); ++I) {
MCInstReference InstRef = NPPRG.OverwritingRetRegInst[I];
outs() << " " << (I + 1) << ". ";
@@ -481,7 +479,7 @@ void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
});
if (NPPRG.OverwritingRetRegInst.size() == 1) {
const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
- assert(OverwInst.CurrentLocation == MCInstReference::_BinaryBasicBlock);
+ assert(OverwInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
}
}
@@ -505,8 +503,7 @@ Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
for (BinaryFunction *BF : BC.getAllBinaryFunctions())
if (BF->hasCFG()) {
for (BinaryBasicBlock &BB : *BF) {
- for (int64_t I = BB.size() - 1; I >= 0; --I) {
- MCInst &Inst = BB.getInstructionAtIndex(I);
+ for (MCInst &Inst : BB) {
if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex)) {
reportFoundGadget(BC, Inst, GadgetAnnotationIndex);
}
@@ -517,4 +514,4 @@ Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
}
} // namespace bolt
-} // namespace llvm
\ No newline at end of file
+} // namespace llvm
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 1840750822aa3f..7bbcb1faff3689 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -23,6 +23,7 @@
#include "llvm/MC/MCFixupKindInfo.h"
#include "llvm/MC/MCInstBuilder.h"
#include "llvm/MC/MCInstrInfo.h"
+#include "llvm/MC/MCRegister.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/Support/DataExtractor.h"
#include "llvm/Support/Debug.h"
@@ -172,7 +173,6 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
case AArch64::AUTIZB:
case AArch64::AUTDZA:
case AArch64::AUTDZB:
- assert(Inst.getOperand(0).isReg());
return Inst.getOperand(0).getReg();
default:
@@ -181,13 +181,13 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
}
bool isAuthenticationOfReg(const MCInst &Inst,
- const unsigned RegAuthenticated) const override {
- if (RegAuthenticated == getNoRegister())
+ MCPhysReg AuthenticatedReg) const override {
+ if (AuthenticatedReg == getNoRegister())
return false;
- return getAuthenticatedReg(Inst) == RegAuthenticated;
+ return getAuthenticatedReg(Inst) == AuthenticatedReg;
}
- llvm::MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
+ MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
assert(isReturn(Inst));
switch (Inst.getOpcode()) {
case AArch64::RET:
>From 6edaa012f9d727866beb5bd7c1f1a6e4a0bf4e20 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Fri, 10 Jan 2025 16:17:48 +0000
Subject: [PATCH 5/8] doTrackReg -> isTrackingReg
---
bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index fe456d8fcaa503..7053fd7c579c26 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -204,7 +204,7 @@ class PacRetAnalysis
/// track which instructions last wrote to this register.
std::vector<uint16_t> Reg2StateIdx;
- bool doTrackReg(MCPhysReg Reg) const {
+ bool isTrackingReg(MCPhysReg Reg) const {
for (auto R : RegsToTrackInstsFor)
if (R == Reg)
return true;
@@ -265,7 +265,7 @@ class PacRetAnalysis
// need to track that for:
if (TrackingLastInsts) {
for (auto WrittenReg : Written.set_bits()) {
- if (doTrackReg(WrittenReg)) {
+ if (isTrackingReg(WrittenReg)) {
Next.LastInstWritingReg[reg2StateIdx(WrittenReg)] = {};
Next.LastInstWritingReg[reg2StateIdx(WrittenReg)].insert(&Point);
}
@@ -276,7 +276,7 @@ class PacRetAnalysis
if (AutReg != BC.MIB->getNoRegister()) {
Next.NonAutClobRegs.reset(
BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
- if (TrackingLastInsts && doTrackReg(AutReg))
+ if (TrackingLastInsts && isTrackingReg(AutReg))
Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
}
>From dd20d447c63259a5269fdc5b1d7e0edf598c2af3 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Mon, 13 Jan 2025 10:27:49 +0000
Subject: [PATCH 6/8] ParentKind->Kind; CurrentLocation->ParentKind
---
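Note for readers (illustration only, not part of this commit): once the
enumerators no longer reuse the class names, the accessors can name the types
without an elaborated `class` specifier. A minimal sketch of the resulting
tagged-union shape follows. The stand-in structs, `InstRefSketch`, `Storage`
and `Ref` are hypothetical and much simpler than the real `MCInstReference`.

```cpp
#include <cassert>

// Hypothetical stand-ins for the BOLT classes; only the names matter here.
struct BinaryFunction {
  int Id;
};
struct BinaryBasicBlock {
  BinaryFunction *Parent;
};

struct InstRefSketch {
  // The enumerators differ from the class names, so `BinaryFunction` and
  // `BinaryBasicBlock` still refer to the types inside this struct.
  enum Kind { FunctionParent, BasicBlockParent };
  Kind ParentKind;
  union Storage {
    BinaryBasicBlock *BB;
    BinaryFunction *BF;
    Storage(BinaryBasicBlock *B) : BB(B) {}
    Storage(BinaryFunction *F) : BF(F) {}
  } Ref;

  InstRefSketch(BinaryBasicBlock *B) : ParentKind(BasicBlockParent), Ref(B) {}
  InstRefSketch(BinaryFunction *F) : ParentKind(FunctionParent), Ref(F) {}

  BinaryFunction *getFunction() const {
    switch (ParentKind) {
    case FunctionParent:
      return Ref.BF;
    case BasicBlockParent:
      return Ref.BB->Parent;
    }
    return nullptr; // not reached for valid ParentKind values
  }

  // Only references created from a basic block know their block.
  BinaryBasicBlock *getBasicBlock() const {
    return ParentKind == BasicBlockParent ? Ref.BB : nullptr;
  }
};
```

`ParentKind` records which union member is active, so every accessor switches
on it before reading the union.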
.../bolt/Passes/NonPacProtectedRetAnalysis.h | 54 +++++++++----------
.../lib/Passes/NonPacProtectedRetAnalysis.cpp | 12 ++---
2 files changed, 33 insertions(+), 33 deletions(-)
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index b98a99a18d3425..1a9b2c41dbaf84 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -87,8 +87,8 @@ struct MCInstInBFReference {
raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &);
struct MCInstReference {
- enum ParentKind { BinaryFunction, BinaryBasicBlock };
- ParentKind CurrentLocation;
+ enum Kind { FunctionParent, BasicBlockParent };
+ Kind ParentKind;
union U {
MCInstInBBReference BBRef;
MCInstInBFReference BFRef;
@@ -96,73 +96,73 @@ struct MCInstReference {
U(MCInstInBFReference BFRef) : BFRef(BFRef) {}
} U;
MCInstReference(MCInstInBBReference BBRef)
- : CurrentLocation(BinaryBasicBlock), U(BBRef) {}
+ : ParentKind(BasicBlockParent), U(BBRef) {}
MCInstReference(MCInstInBFReference BFRef)
- : CurrentLocation(BinaryFunction), U(BFRef) {}
+ : ParentKind(FunctionParent), U(BFRef) {}
MCInstReference(class BinaryBasicBlock *BB, int64_t BBIndex)
: MCInstReference(MCInstInBBReference(BB, BBIndex)) {}
MCInstReference(class BinaryFunction *BF, uint32_t Offset)
: MCInstReference(MCInstInBFReference(BF, Offset)) {}
bool operator<(const MCInstReference &RHS) const {
- if (CurrentLocation != RHS.CurrentLocation)
- return CurrentLocation < RHS.CurrentLocation;
- switch (CurrentLocation) {
- case BinaryBasicBlock:
+ if (ParentKind != RHS.ParentKind)
+ return ParentKind < RHS.ParentKind;
+ switch (ParentKind) {
+ case BasicBlockParent:
return U.BBRef < RHS.U.BBRef;
- case BinaryFunction:
+ case FunctionParent:
return U.BFRef < RHS.U.BFRef;
}
llvm_unreachable("");
}
bool operator==(const MCInstReference &RHS) const {
- if (CurrentLocation != RHS.CurrentLocation)
+ if (ParentKind != RHS.ParentKind)
return false;
- switch (CurrentLocation) {
- case BinaryBasicBlock:
+ switch (ParentKind) {
+ case BasicBlockParent:
return U.BBRef == RHS.U.BBRef;
- case BinaryFunction:
+ case FunctionParent:
return U.BFRef == RHS.U.BFRef;
}
llvm_unreachable("");
}
operator MCInst &() const {
- switch (CurrentLocation) {
- case BinaryBasicBlock:
+ switch (ParentKind) {
+ case BasicBlockParent:
return U.BBRef;
- case BinaryFunction:
+ case FunctionParent:
return U.BFRef;
}
llvm_unreachable("");
}
uint64_t getAddress() const {
- switch (CurrentLocation) {
- case BinaryBasicBlock:
+ switch (ParentKind) {
+ case BasicBlockParent:
return U.BBRef.getAddress();
- case BinaryFunction:
+ case FunctionParent:
return U.BFRef.getAddress();
}
llvm_unreachable("");
}
- class BinaryFunction *getFunction() const {
- switch (CurrentLocation) {
- case BinaryFunction:
+ BinaryFunction *getFunction() const {
+ switch (ParentKind) {
+ case FunctionParent:
return U.BFRef.BF;
- case BinaryBasicBlock:
+ case BasicBlockParent:
return U.BBRef.BB->getFunction();
}
llvm_unreachable("");
}
- class BinaryBasicBlock *getBasicBlock() const {
- switch (CurrentLocation) {
- case BinaryFunction:
+ BinaryBasicBlock *getBasicBlock() const {
+ switch (ParentKind) {
+ case FunctionParent:
return nullptr;
- case BinaryBasicBlock:
+ case BasicBlockParent:
return U.BBRef.BB;
}
llvm_unreachable("");
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index 7053fd7c579c26..c91e56ac941167 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -47,11 +47,11 @@ raw_ostream &operator<<(raw_ostream &OS, const MCInstInBFReference &Ref) {
}
raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
- switch (Ref.CurrentLocation) {
- case MCInstReference::BinaryBasicBlock:
+ switch (Ref.ParentKind) {
+ case MCInstReference::BasicBlockParent:
OS << Ref.U.BBRef;
return OS;
- case MCInstReference::BinaryFunction:
+ case MCInstReference::FunctionParent:
OS << Ref.U.BFRef;
return OS;
}
@@ -432,8 +432,8 @@ void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
const MCInstReference OverwInst,
const MCInstReference RetInst) {
BinaryBasicBlock *BB = RetInst.getBasicBlock();
- assert(OverwInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
- assert(RetInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
+ assert(OverwInst.ParentKind == MCInstReference::BasicBlockParent);
+ assert(RetInst.ParentKind == MCInstReference::BasicBlockParent);
MCInstInBBReference OverwInstBB = OverwInst.U.BBRef;
if (BB == OverwInstBB.BB) {
// overwriting inst and ret instruction are in the same basic block.
@@ -479,7 +479,7 @@ void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
});
if (NPPRG.OverwritingRetRegInst.size() == 1) {
const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
- assert(OverwInst.CurrentLocation == MCInstReference::BinaryBasicBlock);
+ assert(OverwInst.ParentKind == MCInstReference::BasicBlockParent);
reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
}
}
>From a30116dfb526b852244f2bd13f3e96049569ae8e Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Mon, 13 Jan 2025 11:00:28 +0000
Subject: [PATCH 7/8] Address the easy-to-address feedback by Jacob
---
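Note for readers (illustration only, not part of this commit): the two-run
scheme described in the source comment touched below can be sketched on a toy
CFG as follows. This is a simplified sketch under stated assumptions: it
tracks a single return register, all type and function names are hypothetical,
and it does not use BOLT's `DataflowAnalysis` framework. The first run only
tracks whether the register may have been clobbered since its last
authentication. The second run, done only if the first finds a gadget, also
tracks the set of last-writing instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy model: one tracked register, instructions identified by Id.
enum class Op { Other, Write, Auth, Ret };
struct Inst {
  int Id;
  Op Kind;
};
struct Block {
  std::vector<Inst> Insts;
  std::vector<std::size_t> Succs; // indices of successor blocks
};
struct State {
  bool Clobbered = false;    // written since last authentication?
  std::set<int> LastWriters; // only filled in when tracking writers
  bool operator==(const State &O) const {
    return Clobbered == O.Clobbered && LastWriters == O.LastWriters;
  }
};
struct Report {
  int RetId;
  std::set<int> LastWriters;
};

// One forward dataflow run to a fixed point; states merge by union.
std::vector<Report> runOnce(const std::vector<Block> &CFG, bool TrackWriters) {
  std::vector<State> In(CFG.size());
  std::vector<Report> Reports;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    Reports.clear(); // only the reports of the final, stable sweep are kept
    for (std::size_t B = 0; B < CFG.size(); ++B) {
      State S = In[B];
      for (const Inst &I : CFG[B].Insts) {
        if (I.Kind == Op::Ret && S.Clobbered)
          Reports.push_back({I.Id, S.LastWriters});
        if (I.Kind == Op::Write) {
          S.Clobbered = true;
          if (TrackWriters)
            S.LastWriters = {I.Id};
        } else if (I.Kind == Op::Auth) {
          S.Clobbered = false;
          S.LastWriters.clear();
        }
      }
      for (std::size_t Succ : CFG[B].Succs) {
        State Merged = In[Succ];
        Merged.Clobbered = Merged.Clobbered || S.Clobbered;
        Merged.LastWriters.insert(S.LastWriters.begin(), S.LastWriters.end());
        if (!(Merged == In[Succ])) {
          In[Succ] = Merged;
          Changed = true;
        }
      }
    }
  }
  return Reports;
}

// Run 1 checks the property only; run 2 happens only when a gadget was found.
std::vector<Report> analyze(const std::vector<Block> &CFG) {
  if (runOnce(CFG, /*TrackWriters=*/false).empty())
    return {};
  return runOnce(CFG, /*TrackWriters=*/true);
}
```

The intent of the split is that the common case, a function without gadgets,
pays only for the cheaper first run.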
.../lib/Passes/NonPacProtectedRetAnalysis.cpp | 20 +++++++------------
.../Target/AArch64/AArch64MCPlusBuilder.cpp | 10 +---------
2 files changed, 8 insertions(+), 22 deletions(-)
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index c91e56ac941167..bb95c9f31f3106 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -9,9 +9,6 @@
// This file implements a pass that looks for any AArch64 return instructions
// that may not be protected by PAuth authentication instructions when needed.
//
-// When needed = the register used to return (almost always X30), is potentially
-// written to between the AUThentication instruction and the RETurn instruction.
-//
//===----------------------------------------------------------------------===//
#include "bolt/Passes/NonPacProtectedRetAnalysis.h"
@@ -76,15 +73,13 @@ raw_ostream &operator<<(raw_ostream &OS,
// the function started, or
// (b) the last write to the register must be by an authentication instruction.
-// This property is checked by using data flow analysis to keep track of which
+// This property is checked by using dataflow analysis to keep track of which
// registers have been written (def-ed), since last authenticated. Those are
// exactly the registers containing values that should not be trusted (as they
// could have changed since the last time they were authenticated). For pac-ret,
// any return instruction using such a register is a gadget to be reported. For
// PAuthABI, any indirect control flow using such a register should be reported?
-// This security property is verified using a dataflow analysis.
-
// Furthermore, when producing a diagnostic for a found non-pac-ret protected
// return, the analysis also lists the last instructions that wrote to the
// register used in the return instruction.
@@ -96,12 +91,11 @@ raw_ostream &operator<<(raw_ostream &OS,
// 1. In the first run, the dataflow analysis only keeps track of the security
// property: i.e. which registers have been overwritten since the last
// time they've been authenticated.
-// 2. If the first run finds any return instructions using a
-// written-to-last-by-an-non-authenticating instruction, the data flow
-// analysis will be run a second time. The first run will return which
-// registers are used in the gadgets to be reported. This information is
-// used in the second run to also track with instructions last wrote to
-// those registers.
+// 2. If the first run finds any return instructions using a register last
+// written by a non-authenticating instruction, the dataflow analysis will
+// be run a second time. The first run will return which registers are used
+// in the gadgets to be reported. This information is used in the second run
+// to also track which instructions last wrote to those registers.
struct State {
/// A BitVector containing the registers that have been clobbered, and
@@ -110,7 +104,7 @@ struct State {
/// A vector of sets, only used in the second data flow run.
/// Each element in the vector represent one registers for which we
/// track the set of last instructions that wrote to this register.
- /// For pac-ret analysis, the expectations is that almost all return
+ /// For pac-ret analysis, the expectation is that almost all return
/// instructions only use register `X30`, and therefore, this vector
/// will have length 1 in the second run.
std::vector<SmallPtrSet<const MCInst *, 4>> LastInstWritingReg;
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 7bbcb1faff3689..01bb53d9610e98 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -191,15 +191,7 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
assert(isReturn(Inst));
switch (Inst.getOpcode()) {
case AArch64::RET:
- // There should be one register that the return reads, and
- // that's the one being used as the jump target?
- for (unsigned OpIdx = 0, EndIdx = Inst.getNumOperands(); OpIdx < EndIdx;
- ++OpIdx) {
- const MCOperand &MO = Inst.getOperand(OpIdx);
- if (MO.isReg())
- return MO.getReg();
- }
- return getNoRegister();
+ return Inst.getOperand(0).getReg();
case AArch64::RETAA:
case AArch64::RETAB:
case AArch64::ERETAA:
>From f44f9bfb53b8e4c18581fb2855b9d19bdeef95e2 Mon Sep 17 00:00:00 2001
From: Kristof Beyls <kristof.beyls at arm.com>
Date: Mon, 13 Jan 2025 13:44:05 +0000
Subject: [PATCH 8/8] Handle erets by producing diagnostic that analysis cannot
analyze them correctly.
ERETAA/ERETAB/ERET instructions return to the address stored in one of
the system registers ELR_EL1, ELR_EL2 or ELR_EL3, depending on the
current "Exception Level" state.
I'm not aware of any examples of use of ERETA{A|B} in open source
software currently. My understanding is that they could be used as
follows to create a pac-ret-like hardening for ERET:
    exception_entry:
        MRS x0, elr_el1   // Could be elr_el2, elr_el3 depending on whether
                          // this code is designed to run at EL1, EL2 or EL3.
        PACIA x0, SP
        STR x0, [some place]
        ...
        LDR x0, [some place]
        MSR elr_el1, x0
        ERETAA
Assuming my understanding above is correct, this makes it impossible to
write a fully static "pac-ret"-like checker for exception returns,
because the static analyzer doesn't know whether the code will run at
EL1, EL2 or EL3, and so can't know whether to check for writes to
elr_el1, elr_el2 or elr_el3.
Therefore, this commit produces a warning when any of the ERET*
instructions are encountered, stating that the analyzer can't analyze
them.
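A minimal, self-contained sketch of the "return a register or an error" idea described above. It does not use the real LLVM/BOLT types: `PhysReg`, `Opcode`, `RegOrError`, `regUsedAsRetDest`, `canAnalyze` and `getReg` are hypothetical stand-ins chosen for illustration, and `std::variant` is used here in place of `llvm::ErrorOr`.

```cpp
#include <cassert>
#include <cstdint>
#include <system_error>
#include <variant>

// Hypothetical stand-ins for the LLVM types; not the real BOLT API.
using PhysReg = std::uint16_t;
constexpr PhysReg X30 = 30;
enum class Opcode { RET, RETAA, RETAB, ERET, ERETAA, ERETAB };

// Either the register a return jumps through, or an error code meaning
// "the analysis cannot model this instruction".
using RegOrError = std::variant<PhysReg, std::error_code>;

RegOrError regUsedAsRetDest(Opcode Op, PhysReg RetOperand = X30) {
  switch (Op) {
  case Opcode::RET:
    return RetOperand; // plain RET names its target register explicitly
  case Opcode::RETAA:
  case Opcode::RETAB:
    return X30; // authenticating returns always use the link register
  case Opcode::ERET:
  case Opcode::ERETAA:
  case Opcode::ERETAB:
    // The target lives in ELR_EL1/2/3, chosen at run time, so no single
    // register can be reported statically.
    return std::make_error_code(std::errc::result_out_of_range);
  }
  return std::make_error_code(std::errc::invalid_argument);
}

// True when the return instruction can be checked by the analysis.
bool canAnalyze(Opcode Op) {
  return std::holds_alternative<PhysReg>(regUsedAsRetDest(Op));
}

// Extracts the register; callers must check for an error first.
PhysReg getReg(const RegOrError &R) {
  assert(std::holds_alternative<PhysReg>(R) && "no register available");
  return std::get<PhysReg>(R);
}
```

A caller would test `canAnalyze` (or inspect the variant directly) and emit a warning diagnostic instead of running the register check when no register is available.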
This in turn results in the non-pac-ret-protected-return analyzer now
needing to produce 2 different kinds of diagnostics.
I found that this was most nicely modelled by creating a class hierarchy
for these diagnostics.
However, it seems that MCAnnotations currently do not work (well or at
all) when trying to store any object from a class hierarchy, rather than
an object with fixed type.
At that point I realized that it was a bit strange that the diagnostics
that need to be produced are stored using MCAnnotations at all.
So, this patch redesigns that so that the diagnostics that need to be
produced are no longer stored as MCAnnotations on an MCInst.
Overall, it seems like this is a cleaner design and a better example for
any future binary analyses.
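The redesign described above can be sketched roughly as follows, as a simplified illustration only. All names here (`Diagnostic`, `GadgetDiagnostic`, `GeneralDiagnostic`, `ResultStore`) are hypothetical and use only the standard library; the actual patch keys results by `BinaryFunction` and carries `MCInstReference` objects.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Base class: every diagnostic knows which function it belongs to and how
// to print itself.
struct Diagnostic {
  std::string Function;
  explicit Diagnostic(std::string Fn) : Function(std::move(Fn)) {}
  virtual ~Diagnostic() = default;
  virtual void generateReport(std::ostream &OS) const = 0;
};

// A found gadget: a return whose register was overwritten after
// authentication.
struct GadgetDiagnostic : Diagnostic {
  unsigned NumOverwritingInsts;
  GadgetDiagnostic(std::string Fn, unsigned N)
      : Diagnostic(std::move(Fn)), NumOverwritingInsts(N) {}
  void generateReport(std::ostream &OS) const override {
    OS << "non-protected ret found in function " << Function << " ("
       << NumOverwritingInsts << " overwriting instruction(s))\n";
  }
};

// A free-form warning, e.g. "could not analyze this return instruction".
struct GeneralDiagnostic : Diagnostic {
  std::string Text;
  GeneralDiagnostic(std::string Fn, std::string T)
      : Diagnostic(std::move(Fn)), Text(std::move(T)) {}
  void generateReport(std::ostream &OS) const override {
    OS << Text << " in function " << Function << "\n";
  }
};

// Per-function results live in a side table instead of on instructions.
// The mutex allows worker threads to add results concurrently.
class ResultStore {
  std::map<std::string, std::vector<std::shared_ptr<Diagnostic>>> Results;
  std::mutex Mutex;

public:
  void add(std::shared_ptr<Diagnostic> D) {
    assert(D && "diagnostic must not be null");
    std::string Key = D->Function;
    std::lock_guard<std::mutex> Lock(Mutex);
    Results[Key].push_back(std::move(D));
  }

  // Reports are emitted in key order, so output is deterministic.
  std::string reportAll() {
    std::lock_guard<std::mutex> Lock(Mutex);
    std::ostringstream OS;
    for (const auto &Entry : Results)
      for (const auto &D : Entry.second)
        D->generateReport(OS);
    return OS.str();
  }
};
```

Storing `std::shared_ptr` to the base class is what lets one container hold both kinds of diagnostic, which is the part that did not fit well with fixed-type annotations.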
---
bolt/include/bolt/Core/MCPlusBuilder.h | 4 +-
.../bolt/Passes/NonPacProtectedRetAnalysis.h | 68 ++++++-
.../lib/Passes/NonPacProtectedRetAnalysis.cpp | 175 ++++++++++--------
.../Target/AArch64/AArch64MCPlusBuilder.cpp | 42 ++++-
.../AArch64/gs-pacret-autiasp.s | 23 +--
5 files changed, 208 insertions(+), 104 deletions(-)
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 79f0176b47776a..fd62f2e22d3e1b 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -551,7 +551,7 @@ class MCPlusBuilder {
return Analysis->isReturn(Inst);
}
- virtual MCPhysReg getAuthenticatedReg(const MCInst &Inst) const {
+ virtual ErrorOr<MCPhysReg> getAuthenticatedReg(const MCInst &Inst) const {
llvm_unreachable("not implemented");
return getNoRegister();
}
@@ -562,7 +562,7 @@ class MCPlusBuilder {
return false;
}
- virtual MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const {
+ virtual ErrorOr<MCPhysReg> getRegUsedAsRetDest(const MCInst &Inst) const {
llvm_unreachable("not implemented");
return getNoRegister();
}
diff --git a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
index 1a9b2c41dbaf84..702709c0fba9ab 100644
--- a/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
+++ b/bolt/include/bolt/Passes/NonPacProtectedRetAnalysis.h
@@ -13,6 +13,9 @@
#include "bolt/Core/BinaryFunction.h"
#include "bolt/Passes/BinaryPasses.h"
#include "llvm/ADT/SmallSet.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Testing/Annotations/Annotations.h"
+#include <memory>
namespace llvm {
namespace bolt {
@@ -171,29 +174,82 @@ struct MCInstReference {
raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &);
-struct NonPacProtectedRetGadget {
+struct GeneralDiagnostic {
+ std::string Text;
+ GeneralDiagnostic(const std::string &Text) : Text(Text) {}
+ bool operator==(const GeneralDiagnostic &RHS) const {
+ return Text == RHS.Text;
+ }
+};
+
+struct NonPacProtectedRetAnnotation {
MCInstReference RetInst;
+ NonPacProtectedRetAnnotation(MCInstReference RetInst) : RetInst(RetInst) {}
+ virtual bool operator==(const NonPacProtectedRetAnnotation &RHS) const {
+ return RetInst == RHS.RetInst;
+ }
+ NonPacProtectedRetAnnotation &
+ operator=(const NonPacProtectedRetAnnotation &Other) {
+ if (this == &Other)
+ return *this;
+ RetInst = Other.RetInst;
+ return *this;
+ }
+ virtual ~NonPacProtectedRetAnnotation() {}
+ virtual void generateReport(raw_ostream &OS,
+ const BinaryContext &BC) const = 0;
+ virtual void print(raw_ostream &OS) const;
+};
+
+struct NonPacProtectedRetGadget : public NonPacProtectedRetAnnotation {
std::vector<MCInstReference> OverwritingRetRegInst;
- bool operator==(const NonPacProtectedRetGadget &RHS) const {
- return RetInst == RHS.RetInst &&
+ virtual bool operator==(const NonPacProtectedRetGadget &RHS) const {
+ return NonPacProtectedRetAnnotation::operator==(RHS) &&
OverwritingRetRegInst == RHS.OverwritingRetRegInst;
}
NonPacProtectedRetGadget(
MCInstReference RetInst,
const std::vector<MCInstReference> &OverwritingRetRegInst)
- : RetInst(RetInst), OverwritingRetRegInst(OverwritingRetRegInst) {}
+ : NonPacProtectedRetAnnotation(RetInst),
+ OverwritingRetRegInst(OverwritingRetRegInst) {}
+ virtual void generateReport(raw_ostream &OS,
+ const BinaryContext &BC) const override;
+ virtual void print(raw_ostream &OS) const override;
+};
+
+struct NonPacProtectedRetGenDiag : public NonPacProtectedRetAnnotation {
+ GeneralDiagnostic Diag;
+ virtual bool operator==(const NonPacProtectedRetGenDiag &RHS) const {
+ return NonPacProtectedRetAnnotation::operator==(RHS) && Diag == RHS.Diag;
+ }
+ NonPacProtectedRetGenDiag(MCInstReference RetInst, const std::string &Text)
+ : NonPacProtectedRetAnnotation(RetInst), Diag(Text) {}
+ virtual void generateReport(raw_ostream &OS,
+ const BinaryContext &BC) const override;
+ virtual void print(raw_ostream &OS) const override;
};
raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetGadget &NPPRG);
+raw_ostream &operator<<(raw_ostream &OS, const GeneralDiagnostic &Diag);
+raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetGenDiag &Diag);
+raw_ostream &operator<<(raw_ostream &OS, const NonPacProtectedRetAnnotation &A);
+
class PacRetAnalysis;
+struct FunctionAnalysisResult {
+ SmallSet<MCPhysReg, 1> RegistersAffected;
+ std::vector<std::shared_ptr<NonPacProtectedRetAnnotation>> Diagnostics;
+};
+
class NonPacProtectedRetAnalysis : public BinaryFunctionPass {
void runOnFunction(BinaryFunction &Function,
MCPlusBuilder::AllocatorIdTy AllocatorId);
- SmallSet<MCPhysReg, 1>
+ FunctionAnalysisResult
computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF,
MCPlusBuilder::AllocatorIdTy AllocatorId);
- unsigned GadgetAnnotationIndex;
+
+ std::map<const BinaryFunction *, FunctionAnalysisResult> AnalysisResults;
+ std::mutex AnalysisResultsMutex;
public:
explicit NonPacProtectedRetAnalysis() : BinaryFunctionPass(false) {}
diff --git a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
index bb95c9f31f3106..e46469a4fd68c4 100644
--- a/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
+++ b/bolt/lib/Passes/NonPacProtectedRetAnalysis.cpp
@@ -17,6 +17,7 @@
#include "llvm/ADT/SmallSet.h"
#include "llvm/MC/MCInst.h"
#include "llvm/Support/Format.h"
+#include <memory>
#define DEBUG_TYPE "bolt-nonpacprotectedret"
@@ -57,15 +58,37 @@ raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &Ref) {
raw_ostream &operator<<(raw_ostream &OS,
const NonPacProtectedRetGadget &NPPRG) {
- OS << "pac-ret-gadget<";
- OS << "Ret:" << NPPRG.RetInst << ", ";
+ OS << "pac-ret<Ret:" << NPPRG.RetInst << ", ";
+ OS << "gadget<";
OS << "Overwriting:[";
for (auto Ref : NPPRG.OverwritingRetRegInst)
OS << Ref << " ";
OS << "]>";
+ OS << ">";
+ return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS, const GeneralDiagnostic &Diag) {
+ OS << "diag<'" << Diag.Text << "'>";
+ return OS;
+}
+
+raw_ostream &operator<<(raw_ostream &OS,
+ const NonPacProtectedRetGenDiag &Diag) {
+ OS << "pac-ret<Ret:" << Diag.RetInst << ", " << Diag.Diag << ">";
return OS;
}
+raw_ostream &operator<<(raw_ostream &OS,
+ const NonPacProtectedRetAnnotation &Diag) {
+ OS << "pac-ret<Ret:" << Diag.RetInst << ">";
+ return OS;
+}
+
+void NonPacProtectedRetAnnotation::print(raw_ostream &OS) const { OS << *this; }
+void NonPacProtectedRetGadget::print(raw_ostream &OS) const { OS << *this; }
+void NonPacProtectedRetGenDiag::print(raw_ostream &OS) const { OS << *this; }
+
// The security property that is checked is:
// When a register is used as the address to jump to in a return instruction,
// that register must either:
@@ -106,7 +129,7 @@ struct State {
/// track the set of last instructions that wrote to this register.
/// For pac-ret analysis, the expectation is that almost all return
/// instructions only use register `X30`, and therefore, this vector
- /// will have length 1 in the second run.
+ /// will probably have length 1 in the second run.
std::vector<SmallPtrSet<const MCInst *, 4>> LastInstWritingReg;
State() {}
State(uint16_t NumRegs, uint16_t NumRegsToTrack)
@@ -266,12 +289,12 @@ class PacRetAnalysis
}
}
- MCPhysReg AutReg = BC.MIB->getAuthenticatedReg(Point);
- if (AutReg != BC.MIB->getNoRegister()) {
+ ErrorOr<MCPhysReg> AutReg = BC.MIB->getAuthenticatedReg(Point);
+ if (AutReg && *AutReg != BC.MIB->getNoRegister()) {
Next.NonAutClobRegs.reset(
- BC.MIB->getAliases(AutReg, /*OnlySmaller=*/true));
- if (TrackingLastInsts && isTrackingReg(AutReg))
- Next.LastInstWritingReg[reg2StateIdx(AutReg)] = {};
+ BC.MIB->getAliases(*AutReg, /*OnlySmaller=*/true));
+ if (TrackingLastInsts && isTrackingReg(*AutReg))
+ Next.LastInstWritingReg[reg2StateIdx(*AutReg)] = {};
}
LLVM_DEBUG({
@@ -313,7 +336,7 @@ class PacRetAnalysis
}
};
-SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
+FunctionAnalysisResult NonPacProtectedRetAnalysis::computeDfState(
PacRetAnalysis &PRA, BinaryFunction &BF,
MCPlusBuilder::AllocatorIdTy AllocatorId) {
PRA.run();
@@ -321,15 +344,25 @@ SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
dbgs() << " After PacRetAnalysis:\n";
BF.dump();
});
+
+ FunctionAnalysisResult Result;
// Now scan the CFG for non-authenticating return instructions that use an
// overwritten, non-authenticated register as return address.
- SmallSet<MCPhysReg, 1> RetRegsWithGadgets;
BinaryContext &BC = BF.getBinaryContext();
for (BinaryBasicBlock &BB : BF) {
for (int64_t I = BB.size() - 1; I >= 0; --I) {
MCInst &Inst = BB.getInstructionAtIndex(I);
if (BC.MIB->isReturn(Inst)) {
- MCPhysReg RetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+ ErrorOr<MCPhysReg> MaybeRetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+ if (MaybeRetReg.getError()) {
+ Result.Diagnostics.push_back(
+ std::make_shared<NonPacProtectedRetGenDiag>(
+ MCInstInBBReference(&BB, I),
+ "Warning: pac-ret analysis could not analyze this return "
+ "instruction"));
+ continue;
+ }
+ MCPhysReg RetReg = *MaybeRetReg;
LLVM_DEBUG({
dbgs() << " Found RET inst: ";
BC.printInstruction(dbgs(), Inst);
@@ -355,28 +388,17 @@ SmallSet<MCPhysReg, 1> NonPacProtectedRetAnalysis::computeDfState(
});
if (DirtyRawRegs.any()) {
// This return instruction needs to be reported
- // First remove the annotation that the first, fast run of
- // the dataflow analysis may have added.
- if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex))
- BC.MIB->removeAnnotation(Inst, GadgetAnnotationIndex);
- BC.MIB->addAnnotation(
- Inst, GadgetAnnotationIndex,
- NonPacProtectedRetGadget(
+ Result.Diagnostics.push_back(
+ std::make_shared<NonPacProtectedRetGadget>(
MCInstInBBReference(&BB, I),
- PRA.getLastClobberingInsts(Inst, BF, DirtyRawRegs)),
- AllocatorId);
- LLVM_DEBUG({
- dbgs() << " Added gadget info annotation: ";
- BC.printInstruction(dbgs(), Inst, 0, &BF);
- dbgs() << "\n";
- });
+ PRA.getLastClobberingInsts(Inst, BF, DirtyRawRegs)));
for (MCPhysReg RetRegWithGadget : DirtyRawRegs.set_bits())
- RetRegsWithGadgets.insert(RetRegWithGadget);
+ Result.RegistersAffected.insert(RetRegWithGadget);
}
}
}
}
- return RetRegsWithGadgets;
+ return Result;
}
void NonPacProtectedRetAnalysis::runOnFunction(
@@ -389,18 +411,20 @@ void NonPacProtectedRetAnalysis::runOnFunction(
if (BF.hasCFG()) {
PacRetAnalysis PRA(BF, AllocatorId, {});
- SmallSet<MCPhysReg, 1> RetRegsWithGadgets =
- computeDfState(PRA, BF, AllocatorId);
- if (!RetRegsWithGadgets.empty()) {
+ FunctionAnalysisResult FAR = computeDfState(PRA, BF, AllocatorId);
+ if (!FAR.RegistersAffected.empty()) {
// Redo the analysis, but now also track which instructions last wrote
// to any of the registers in RetRegsWithGadgets, so that better
// diagnostics can be produced.
std::vector<MCPhysReg> RegsToTrack;
- for (MCPhysReg R : RetRegsWithGadgets)
+ for (MCPhysReg R : FAR.RegistersAffected)
RegsToTrack.push_back(R);
PacRetAnalysis PRWIA(BF, AllocatorId, RegsToTrack);
- SmallSet<MCPhysReg, 1> RetRegsWithGadgets =
- computeDfState(PRWIA, BF, AllocatorId);
+ FAR = computeDfState(PRWIA, BF, AllocatorId);
+ }
+ {
+ std::lock_guard<std::mutex> Lock(AnalysisResultsMutex);
+ AnalysisResults[&BF] = FAR;
}
}
}
@@ -422,9 +446,9 @@ void printBB(const BinaryContext &BC, const BinaryBasicBlock *BB,
}
}
-void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
- const MCInstReference OverwInst,
- const MCInstReference RetInst) {
+static void reportFoundGadgetInSingleBBSingleOverwInst(
+ raw_ostream &OS, const BinaryContext &BC, const MCInstReference OverwInst,
+ const MCInstReference RetInst) {
BinaryBasicBlock *BB = RetInst.getBasicBlock();
assert(OverwInst.ParentKind == MCInstReference::BasicBlockParent);
assert(RetInst.ParentKind == MCInstReference::BasicBlockParent);
@@ -432,55 +456,64 @@ void reportFoundGadgetInSingleBBSingleOverwInst(const BinaryContext &BC,
if (BB == OverwInstBB.BB) {
// overwriting inst and ret instruction are in the same basic block.
assert(OverwInstBB.BBIndex < RetInst.U.BBRef.BBIndex);
- outs() << " This happens in the following basic block:\n";
+ OS << " This happens in the following basic block:\n";
printBB(BC, BB);
}
}
-void reportFoundGadget(const BinaryContext &BC, const MCInst &Inst,
- unsigned int GadgetAnnotationIndex) {
- auto NPPRG = BC.MIB->getAnnotationAs<NonPacProtectedRetGadget>(
- Inst, GadgetAnnotationIndex);
- MCInstReference RetInst = NPPRG.RetInst;
+void NonPacProtectedRetGadget::generateReport(raw_ostream &OS,
+ const BinaryContext &BC) const {
BinaryFunction *BF = RetInst.getFunction();
BinaryBasicBlock *BB = RetInst.getBasicBlock();
- outs() << "\nGS-PACRET: " << "non-protected ret found in function "
- << BF->getPrintName();
+ OS << "\nGS-PACRET: " << "non-protected ret found in function "
+ << BF->getPrintName();
if (BB)
- outs() << ", basic block " << BB->getName();
- outs() << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
- outs() << " The return instruction is ";
- BC.printInstruction(outs(), RetInst, RetInst.getAddress(), BF);
- outs() << " The " << NPPRG.OverwritingRetRegInst.size()
- << " instructions that write to the return register after any "
- "authentication are:\n";
+ OS << ", basic block " << BB->getName();
+ OS << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
+ OS << " The return instruction is ";
+ BC.printInstruction(OS, RetInst, RetInst.getAddress(), BF);
+ OS << " The " << OverwritingRetRegInst.size()
+ << " instructions that write to the return register after any "
+ "authentication are:\n";
// Sort by address to ensure output is deterministic.
- llvm::sort(NPPRG.OverwritingRetRegInst,
- [](const MCInstReference &A, const MCInstReference &B) {
- return A.getAddress() < B.getAddress();
- });
- for (unsigned I = 0; I < NPPRG.OverwritingRetRegInst.size(); ++I) {
- MCInstReference InstRef = NPPRG.OverwritingRetRegInst[I];
- outs() << " " << (I + 1) << ". ";
- BC.printInstruction(outs(), InstRef, InstRef.getAddress(), BF);
+ std::vector<MCInstReference> ORRI = OverwritingRetRegInst;
+ llvm::sort(ORRI, [](const MCInstReference &A, const MCInstReference &B) {
+ return A.getAddress() < B.getAddress();
+ });
+ for (unsigned I = 0; I < ORRI.size(); ++I) {
+ MCInstReference InstRef = ORRI[I];
+ OS << " " << (I + 1) << ". ";
+ BC.printInstruction(OS, InstRef, InstRef.getAddress(), BF);
};
LLVM_DEBUG({
dbgs() << " .. OverWritingRetRegInst:\n";
- for (MCInstReference Ref : NPPRG.OverwritingRetRegInst) {
+ for (MCInstReference Ref : OverwritingRetRegInst) {
dbgs() << " " << Ref << "\n";
}
});
- if (NPPRG.OverwritingRetRegInst.size() == 1) {
- const MCInstReference OverwInst = NPPRG.OverwritingRetRegInst[0];
+ if (OverwritingRetRegInst.size() == 1) {
+ const MCInstReference OverwInst = OverwritingRetRegInst[0];
assert(OverwInst.ParentKind == MCInstReference::BasicBlockParent);
- reportFoundGadgetInSingleBBSingleOverwInst(BC, OverwInst, RetInst);
+ reportFoundGadgetInSingleBBSingleOverwInst(OS, BC, OverwInst, RetInst);
}
}
-Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
- GadgetAnnotationIndex = BC.MIB->getOrCreateAnnotationIndex("pacret-gadget");
+void NonPacProtectedRetGenDiag::generateReport(raw_ostream &OS,
+ const BinaryContext &BC) const {
+ BinaryFunction *BF = RetInst.getFunction();
+ BinaryBasicBlock *BB = RetInst.getBasicBlock();
+ OS << "\nGS-PACRET: " << "" << Diag.Text;
+ OS << " in function " << BF->getPrintName();
+ if (BB)
+ OS << ", basic block " << BB->getName();
+ OS << ", at address " << llvm::format("%x", RetInst.getAddress()) << "\n";
+ OS << " The return instruction is ";
+ BC.printInstruction(OS, RetInst, RetInst.getAddress(), BF);
+}
+
+Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
ParallelUtilities::WorkFuncWithAllocTy WorkFun =
[&](BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId) {
runOnFunction(BF, AllocatorId);
@@ -495,14 +528,10 @@ Error NonPacProtectedRetAnalysis::runOnFunctions(BinaryContext &BC) {
SkipFunc, "NonPacProtectedRetAnalysis");
for (BinaryFunction *BF : BC.getAllBinaryFunctions())
- if (BF->hasCFG()) {
- for (BinaryBasicBlock &BB : *BF) {
- for (MCInst &Inst : BB) {
- if (BC.MIB->hasAnnotation(Inst, GadgetAnnotationIndex)) {
- reportFoundGadget(BC, Inst, GadgetAnnotationIndex);
- }
- }
- }
+ if (AnalysisResults.count(BF) > 0) {
+ for (const std::shared_ptr<NonPacProtectedRetAnnotation> &A :
+ AnalysisResults[BF].Diagnostics)
+ A->generateReport(outs(), BC);
}
return Error::success();
}
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 01bb53d9610e98..ea37ec140079ae 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -149,21 +149,42 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
return false;
}
- MCPhysReg getAuthenticatedReg(const MCInst &Inst) const override {
+ ErrorOr<MCPhysReg> getAuthenticatedReg(const MCInst &Inst) const override {
switch (Inst.getOpcode()) {
case AArch64::AUTIAZ:
case AArch64::AUTIBZ:
case AArch64::AUTIASP:
case AArch64::AUTIBSP:
+ case AArch64::AUTIASPPCi:
+ case AArch64::AUTIBSPPCi:
+ case AArch64::AUTIASPPCr:
+ case AArch64::AUTIBSPPCr:
case AArch64::RETAA:
case AArch64::RETAB:
+ case AArch64::RETAASPPCi:
+ case AArch64::RETABSPPCi:
+ case AArch64::RETAASPPCr:
+ case AArch64::RETABSPPCr:
return AArch64::LR;
+
case AArch64::AUTIA1716:
case AArch64::AUTIB1716:
+ case AArch64::AUTIA171615:
+ case AArch64::AUTIB171615:
return AArch64::X17;
+
case AArch64::ERETAA:
case AArch64::ERETAB:
- return AArch64::LR;
+ // The ERETA{A,B} instructions use either register ELR_EL1, ELR_EL2 or
+ // ELR_EL3, depending on the current Exception Level at run-time.
+ //
+ // Furthermore, these registers are not modelled by LLVM as a regular
+ // MCPhysReg.... So there is no way to indicate that through the current
+ // API.
+ //
+ // Therefore, return an error to indicate that LLVM/BOLT cannot model
+ // this.
+ return make_error_code(std::errc::result_out_of_range);
case AArch64::AUTIA:
case AArch64::AUTIB:
@@ -175,29 +196,32 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
case AArch64::AUTDZB:
return Inst.getOperand(0).getReg();
+ // FIXME: BL?RA(A|B)Z? and LDRA(A|B) should probably be handled here too.
+
default:
return getNoRegister();
}
}
- bool isAuthenticationOfReg(const MCInst &Inst,
- MCPhysReg AuthenticatedReg) const override {
- if (AuthenticatedReg == getNoRegister())
+ bool isAuthenticationOfReg(const MCInst &Inst, MCPhysReg Reg) const override {
+ if (Reg == getNoRegister())
return false;
- return getAuthenticatedReg(Inst) == AuthenticatedReg;
+ ErrorOr<MCPhysReg> AuthenticatedReg = getAuthenticatedReg(Inst);
+ return AuthenticatedReg.getError() ? false : *AuthenticatedReg == Reg;
}
- MCPhysReg getRegUsedAsRetDest(const MCInst &Inst) const override {
+ ErrorOr<MCPhysReg> getRegUsedAsRetDest(const MCInst &Inst) const override {
assert(isReturn(Inst));
switch (Inst.getOpcode()) {
case AArch64::RET:
return Inst.getOperand(0).getReg();
case AArch64::RETAA:
case AArch64::RETAB:
+ return AArch64::LR;
+ case AArch64::ERET:
case AArch64::ERETAA:
case AArch64::ERETAB:
- case AArch64::ERET:
- return AArch64::LR;
+ return make_error_code(std::errc::result_out_of_range);
default:
llvm_unreachable("Unhandled return instruction");
}
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index 8731bb46362867..ac16346fd815bb 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -623,7 +623,8 @@ f_eretaa:
bl g
add x0, x0, #3
ldp x29, x30, [sp], #16
-// CHECK-NOT: function f_eretaa
+// CHECK-LABEL: GS-PACRET: Warning: pac-ret analysis could not analyze this return instruction in function f_eretaa, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: eretaa
eretaa
.size f_eretaa, .-f_eretaa
@@ -636,7 +637,8 @@ f_eretab:
bl g
add x0, x0, #3
ldp x29, x30, [sp], #16
-// CHECK-NOT: function f_eretab
+// CHECK-LABEL: GS-PACRET: Warning: pac-ret analysis could not analyze this return instruction in function f_eretab, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: eretab
eretab
.size f_eretab, .-f_eretab
@@ -649,18 +651,8 @@ f_eret:
bl g
add x0, x0, #3
ldp x29, x30, [sp], #16
-// CHECK-LABEL: GS-PACRET: non-protected ret found in function f_eret, basic block .LBB{{[0-9]+}}, at address
-// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: eret
-// CHECK-NEXT: The 1 instructions that write to the return register after any authentication are:
-// CHECK-NEXT: 1. {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
-// CHECK-NEXT: This happens in the following basic block:
-// CHECK-NEXT: {{[0-9a-f]+}}: paciasp
-// CHECK-NEXT: {{[0-9a-f]+}}: stp x29, x30, [sp, #-0x10]!
-// CHECK-NEXT: {{[0-9a-f]+}}: mov x29, sp
-// CHECK-NEXT: {{[0-9a-f]+}}: bl g at PLT
-// CHECK-NEXT: {{[0-9a-f]+}}: add x0, x0, #0x3
-// CHECK-NEXT: {{[0-9a-f]+}}: ldp x29, x30, [sp], #0x10
-// CHECK-NEXT: {{[0-9a-f]+}}: eret
+// CHECK-LABEL: GS-PACRET: Warning: pac-ret analysis could not analyze this return instruction in function f_eret, basic block .LBB{{[0-9]+}}, at address
+// CHECK-NEXT: The return instruction is {{[0-9a-f]+}}: eret
eret
.size f_eret, .-f_eret
@@ -678,4 +670,7 @@ f_movx30reg:
ret
.size f_movx30reg, .-f_movx30reg
+// FIXME: add regression tests for the instructions added in v9.5: AUTI{A,B}SPPC{i,r}, RETA{A,B}SPPC{i,r}, AUTI{A,B}171615.
+
+
// TODO: add test to see if registers clobbered by a call are picked up.