[llvm] [BOLT][binary-analysis] Add initial pac-ret gadget scanner (PR #122304)
Kristof Beyls via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 20 07:49:54 PST 2025
================
@@ -9,9 +9,182 @@ analyses implemented in the BOLT libraries.
## Which binary analyses are implemented?
-At the moment, no binary analyses are implemented.
+* [Security scanners](#security-scanners)
+ * [pac-ret analysis](#pac-ret-analysis)
-The goal is to make it easy using a plug-in framework to add your own analyses.
+### Security scanners
+
+For the past 25 years, a large number of exploits have been built and used in
+the wild to undermine computer security. The majority of these exploits abuse
+memory vulnerabilities in programs; see evidence from
+[Microsoft](https://youtu.be/PjbGojjnBZQ?si=oCHCa0SHgaSNr6Gr&t=836),
+[Chromium](https://www.chromium.org/Home/chromium-security/memory-safety/) and
+[Android](https://security.googleblog.com/2021/01/data-driven-security-hardening-in.html).
+
+It is therefore not surprising that a large number of mitigations have been
+added to instruction sets and toolchains to make it harder to build an exploit
+using a memory vulnerability. Examples are: stack canaries, stack clash
+protection, pac-ret, shadow stacks, arm64e, and many more.
+
+These mitigations guarantee a so-called "security property" on the binaries they
+produce. For example, for stack canaries, the security property is roughly that
+a canary is located on the stack between the set of saved registers and the set
+of local variables. For pac-ret, it is roughly that either the return address is
+never stored/retrieved to/from memory; or, there are no writes to the register
+containing the return address between an instruction authenticating it and a
+return instruction using it.
+
+From time to time, however, a bug gets found in the implementation of such
+mitigations in toolchains. Also, hand-written assembly code requires the
+developer to ensure these security properties manually.
+
+In short, it is sometimes found that a few places in the binary code are not
+protected as well as expected given the requested mitigations. Attackers could
+make use of those places (sometimes called gadgets) to circumvent the protection
+that the mitigation should give.
+
+One of the reasons that such gadgets, or holes in the mitigation implementation,
+exist is that typically the amount of testing and verification for these
+security properties is limited to checking results on specific examples.
+
+In comparison, for testing functional correctness, or for testing performance,
+toolchains and software in general typically get tested with large test suites
+and benchmarks. This typically does not get done for testing the security
+properties of binary code.
+
+Unlike functional correctness where compilation errors result in test failures,
+and performance where speed and size differences are measurable, broken security
+properties cannot be easily observed using existing testing and benchmarking
+tools.
+
+The security scanners implemented in `llvm-bolt-binary-analysis` aim to enable
+the testing of security hardening in arbitrary programs and not just specific
+examples.
+
+
+#### pac-ret analysis
+
+`pac-ret` protection is a security hardening scheme implemented in compilers
+such as gcc and clang, using the command line option
+`-mbranch-protection=pac-ret`. This option is enabled by default on most widely
+used Linux distributions.
+
+The hardening scheme mitigates
+[Return-Oriented Programming (ROP)](https://llsoftsec.github.io/llsoftsecbook/#return-oriented-programming)
+attacks by making sure that return addresses are only ever stored to memory with
+a cryptographic hash, called a
+["Pointer Authentication Code" (PAC)](https://llsoftsec.github.io/llsoftsecbook/#pointer-authentication),
+in the upper bits of the pointer. This makes it substantially harder for
+attackers to divert control flow by overwriting a return address with a
+different value.
+
+The hardening scheme relies on compilers producing different code sequences when
+processing return addresses, especially when these are stored to and retrieved
+from memory.
+
+The `pac-ret` binary analysis can be invoked using the command line option
+`--scanners=pac-ret`. It makes `llvm-bolt-binary-analysis` scan through the
+provided binary, checking each function for the following security property:
+
+For each procedure return and exception return instruction, the register
+holding the return destination must satisfy one of the following properties:
+
+1. it is immutable within the function, or
+2. the last write to the register is by an authenticating instruction. This
+ includes combined authentication and return instructions such as `RETAA`.
+
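The property above can be illustrated with a minimal sketch on a toy, straight-line instruction model. This is not how the BOLT scanner is implemented: the `Inst` class, its fields, and `find_unsafe_returns` are hypothetical names introduced only for illustration, and the sketch ignores control flow entirely. The two instruction lists assume the usual unprotected prologue/epilogue and the usual `paciasp`/`autiasp` pair emitted for pac-ret.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Inst:
    """Toy instruction model (hypothetical, for illustration only)."""
    mnemonic: str
    writes: Tuple[str, ...] = ()         # registers written by this instruction
    authenticates: Tuple[str, ...] = ()  # written registers that are authenticated
    returns_via: Optional[str] = None    # register used as return target, if a return
    authenticating_return: bool = False  # combined auth+return, e.g. RETAA


def find_unsafe_returns(insts):
    """Return indices of return instructions that violate the property."""
    last_write_authenticated = {}  # register -> was its last write authenticating?
    unsafe = []
    for index, inst in enumerate(insts):
        for reg in inst.writes:
            last_write_authenticated[reg] = reg in inst.authenticates
        if inst.returns_via is None or inst.authenticating_return:
            continue
        # A register that was never written is treated as immutable (property 1).
        if not last_write_authenticated.get(inst.returns_via, True):
            unsafe.append(index)
    return unsafe


unprotected = [
    Inst("stp", writes=("sp",)),
    Inst("mov", writes=("x29",)),
    Inst("bl", writes=("x30",)),
    Inst("add", writes=("x0",)),
    Inst("ldp", writes=("x29", "x30", "sp")),
    Inst("ret", returns_via="x30"),
]

protected = [
    Inst("paciasp", writes=("x30",)),
    Inst("stp", writes=("sp",)),
    Inst("mov", writes=("x29",)),
    Inst("bl", writes=("x30",)),
    Inst("add", writes=("x0",)),
    Inst("ldp", writes=("x29", "x30", "sp")),
    Inst("autiasp", writes=("x30",), authenticates=("x30",)),
    Inst("ret", returns_via="x30"),
]

print(find_unsafe_returns(unprotected))
print(find_unsafe_returns(protected))
```

In the unprotected list, the last write to `x30` before `ret` is the plain `ldp`, so the return is flagged. In the protected list, `autiasp` is the last write, so nothing is flagged.
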
+##### Example 1
+
+For example, a typical non-pac-ret-protected function looks as follows:
+
+```
+stp x29, x30, [sp, #-0x10]!
+mov x29, sp
+bl g@PLT
+add x0, x0, #0x3
+ldp x29, x30, [sp], #0x10
+ret
----------------
kbeyls wrote:
Should now be fixed in the latest commit
https://github.com/llvm/llvm-project/pull/122304