[llvm] [AArch64][v8.5A] Omit BTI for non-addr-taken static fns on Linux (PR #134669)
Simon Tatham via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 7 08:17:47 PDT 2025
https://github.com/statham-arm created https://github.com/llvm/llvm-project/pull/134669
This is a conditional revert of cca40aa8d8aa732, which made LLVM's branch-target-enforcement mode generate BTI at the start of _every_ function, even in the case where the function has internal linkage and its address is never taken for use in an indirect call.
The rationale was that it might turn out at link time that a direct call to the function spanned a larger distance than the range of a BL instruction (say, if the translation unit generated multiple code sections and the linker put them a very long way apart). Then the linker might insert a long-branch thunk using an indirect call instruction.
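To make the range concern concrete, here is a minimal sketch of the reachability check a linker has to perform before it can keep a direct call. It assumes only the architectural fact that BL encodes a signed 26-bit word offset (so it reaches roughly +/-128 MiB); the names `kBLRange` and `blCanReach` are hypothetical and do not come from LLVM or LLD.

```cpp
#include <cstdint>

// BL carries a signed 26-bit immediate scaled by 4, so the reachable byte
// offsets are [-2^27, 2^27 - 4]. kBLRange is 2^27 bytes, i.e. 128 MiB.
constexpr std::int64_t kBLRange = std::int64_t{1} << 27;

// Hypothetical helper: true if a BL placed at address `from` can branch
// directly to `to`. Both addresses are assumed to be 4-byte aligned.
constexpr bool blCanReach(std::uint64_t from, std::uint64_t to) {
  const std::int64_t delta = static_cast<std::int64_t>(to - from);
  return delta >= -kBLRange && delta < kBLRange;
}
```

When a check like this fails, the linker has to route the call through a thunk, and that is where an indirect branch can appear.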
SYSVABI64 has now clarified that in this situation the static linker may not assume that the target function's entry point is a safe destination for an indirect branch. If it needs to use this strategy, it's responsible for also generating a 'landing pad' near the target function, with a BTI followed by a direct branch, and using that as the target of the long-distance indirect call.
https://github.com/ARM-software/abi-aa/commit/606ce44fe4d3419c15cd9ed598f18fb5d520fcfc
LLD complies with this spec as of commit 098b0d18add97de.
So if we're compiling in a mode that respects SYSVABI64, such as targeting Linux, it's safe to leave out the BTI at the start of a function with internal linkage if we can prove that its address is neither used in an indirect call in _this_ translation unit nor passed out of the object.
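As an illustration of the distinction, the following small translation unit sketches which functions would fall into each category. The function names are hypothetical, and the comments describe what the rule in this patch implies for a Linux target with branch-target enforcement enabled, assuming the direct call to `helper` is not simply inlined away.

```cpp
// Internal linkage and only ever called directly in this translation unit:
// under this patch the entry BTI may be omitted on Linux.
static int helper(int x) { return x + 1; }

// Internal linkage, but its address escapes through a function pointer,
// so it can be called indirectly and keeps its entry BTI.
static int callback(int x) { return x * 2; }

using Fn = int (*)(int);

// External linkage: another object could take this function's address,
// so it keeps its entry BTI as well.
Fn getCallback() { return &callback; }

// Calls helper directly and callback indirectly.
int compute(int x) { return helper(x) + getCallback()(x); }
```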
Therefore, this patch goes back to the behavior before cca40aa8d8aa732, leaving out BTIs in functions that can't be called indirectly, but only if the target triple is Linux. (I wasn't able to find a more precise query for "is this a SYSVABI64-compliant platform?", but Linux certainly is, and this check at least fails in the safe direction - if in doubt, we put in all the BTIs that might be necessary.)
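The condition the patch adds can be read as a small predicate. The sketch below restates it outside LLVM so it can be checked in isolation; `FunctionFacts` and `entryNeedsBTI` are hypothetical names standing in for the `isTargetLinux()`, `hasAddressTaken()` and `hasLocalLinkage()` queries used in the diff.

```cpp
// Hypothetical stand-in for the three facts the pass queries.
struct FunctionFacts {
  bool IsLinuxTarget;    // models AArch64Subtarget::isTargetLinux()
  bool HasAddressTaken;  // models Function::hasAddressTaken()
  bool HasLocalLinkage;  // models Function::hasLocalLinkage()
};

// Returns true if the function's entry block must be treated as a possible
// indirect-call target, i.e. if it needs a "BTI c".
constexpr bool entryNeedsBTI(const FunctionFacts &F) {
  // Without the SYSVABI64 landing-pad guarantee, a linker-inserted thunk
  // might branch indirectly to any function, so stay conservative.
  if (!F.IsLinuxTarget)
    return true;
  // On Linux, only functions that can really be called indirectly need it.
  return F.HasAddressTaken || !F.HasLocalLinkage;
}
```

The only combination that drops the BTI is a Linux target with a local-linkage function whose address is never taken; every other case keeps it, which is the "fails in the safe direction" property described above.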
From f2e71e94b90904b0306eedd884935becbca8b8b6 Mon Sep 17 00:00:00 2001
From: Simon Tatham <simon.tatham at arm.com>
Date: Thu, 3 Apr 2025 15:00:23 +0100
Subject: [PATCH] [AArch64][v8.5A] Omit BTI for non-addr-taken static fns on
Linux
This is a conditional revert of cca40aa8d8aa732, which made LLVM's
branch-target-enforcement mode generate BTI at the start of _every_
function, even in the case where the function has internal linkage and
its address is never taken for use in an indirect call.
The rationale was that it might turn out at link time that a direct
call to the function spanned a larger distance than the range of a BL
instruction (say, if the translation unit generated multiple code
sections and the linker put them a very long way apart). Then the
linker might insert a long-branch thunk using an indirect call
instruction.
SYSVABI64 has now clarified that in this situation the static linker
may not assume that the target function's entry point is a safe
destination for an indirect branch. If it needs to use this strategy,
it's responsible for also generating a 'landing pad' near the target
function, with a BTI followed by a direct branch, and using that as
the target of the long-distance indirect call.
https://github.com/ARM-software/abi-aa/commit/606ce44fe4d3419c15cd9ed598f18fb5d520fcfc
LLD complies with this spec as of commit 098b0d18add97de.
So if we're compiling in a mode that respects SYSVABI64, such as
targeting Linux, it's safe to leave out the BTI at the start of a
function with internal linkage if we can prove that its address is
neither used in an indirect call in _this_ translation unit nor passed
out of the object.
Therefore, this patch goes back to the behavior before
cca40aa8d8aa732, leaving out BTIs in functions that can't be called
indirectly, but only if the target triple is Linux. (I wasn't able to
find a more precise query for "is this a SYSVABI64-compliant
platform?", but Linux certainly is, and this check at least fails in
the safe direction - if in doubt, we put in all the BTIs that might be
necessary.)
---
.../Target/AArch64/AArch64BranchTargets.cpp | 20 +++++++++++++------
.../AArch64/patchable-function-entry-bti.ll | 16 ++++++++++-----
2 files changed, 25 insertions(+), 11 deletions(-)
diff --git a/llvm/lib/Target/AArch64/AArch64BranchTargets.cpp b/llvm/lib/Target/AArch64/AArch64BranchTargets.cpp
index b9feb83339d8d..c60fbb63c73ab 100644
--- a/llvm/lib/Target/AArch64/AArch64BranchTargets.cpp
+++ b/llvm/lib/Target/AArch64/AArch64BranchTargets.cpp
@@ -65,6 +65,7 @@ bool AArch64BranchTargets::runOnMachineFunction(MachineFunction &MF) {
LLVM_DEBUG(
dbgs() << "********** AArch64 Branch Targets **********\n"
<< "********** Function: " << MF.getName() << '\n');
+ const Function &F = MF.getFunction();
// LLVM does not consider basic blocks which are the targets of jump tables
// to be address-taken (the address can't escape anywhere else), but they are
@@ -78,16 +79,23 @@ bool AArch64BranchTargets::runOnMachineFunction(MachineFunction &MF) {
bool HasWinCFI = MF.hasWinCFI();
for (MachineBasicBlock &MBB : MF) {
bool CouldCall = false, CouldJump = false;
- // Even in cases where a function has internal linkage and is only called
- // directly in its translation unit, it can still be called indirectly if
- // the linker decides to add a thunk to it for whatever reason (say, for
- // example, if it is finally placed far from its call site and a BL is not
- // long-range enough). PLT entries and tail-calls use BR, but when they are
+ // If the function is address-taken or externally-visible, it could be
+ // indirectly called. PLT entries and tail-calls use BR, but when they
// are in guarded pages should all use x16 or x17 to hold the called
// address, so we don't need to set CouldJump here. BR instructions in
// non-guarded pages (which might be non-BTI-aware code) are allowed to
// branch to a "BTI c" using any register.
- if (&MBB == &*MF.begin())
+ //
+ // For SysV targets, this is enough, because SYSVABI64 says that if the
+ // static linker later wants to use an indirect branch instruction in a
+ // long-branch thunk, it's also responsible for adding a 'landing pad' with
+ // a BTI, and pointing the indirect branch at that. However, at present
+ // this guarantee only holds for targets complying with SYSVABI64, so for
+ // other targets we must assume that `CouldCall` is _always_ true due to
+ // the risk of long-branch thunks at link time.
+ if (&MBB == &*MF.begin() &&
+ (!MF.getSubtarget<AArch64Subtarget>().isTargetLinux() ||
+ (F.hasAddressTaken() || !F.hasLocalLinkage())))
CouldCall = true;
// If the block itself is address-taken, it could be indirectly branched
diff --git a/llvm/test/CodeGen/AArch64/patchable-function-entry-bti.ll b/llvm/test/CodeGen/AArch64/patchable-function-entry-bti.ll
index 85f5f6fa4674a..6d5dfc9d8fae4 100644
--- a/llvm/test/CodeGen/AArch64/patchable-function-entry-bti.ll
+++ b/llvm/test/CodeGen/AArch64/patchable-function-entry-bti.ll
@@ -1,4 +1,5 @@
-; RUN: llc -mtriple=aarch64 -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s --check-prefixes=CHECK,SYSV
+; RUN: llc -mtriple=aarch64-none-elf -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s --check-prefixes=CHECK,NONSYSV
define void @f0() "patchable-function-entry"="0" "branch-target-enforcement" {
; CHECK-LABEL: f0:
@@ -48,20 +49,25 @@ define void @f2_1() "patchable-function-entry"="1" "patchable-function-prefix"="
}
;; -fpatchable-function-entry=1 -mbranch-protection=bti
-;; We add BTI c even when the function has internal linkage
+;; For SysV compliant targets, we don't add BTI (or create the .Lpatch0 symbol)
+;; because the function has internal linkage and isn't address-taken. For
+;; non-SysV targets, we do add the BTI, because outside SYSVABI64 there's no
+;; spec preventing the static linker from using an indirect call instruction in
+;; a long-branch thunk inserted at link time.
define internal void @f1i(i64 %v) "patchable-function-entry"="1" "branch-target-enforcement" {
; CHECK-LABEL: f1i:
; CHECK-NEXT: .Lfunc_begin3:
; CHECK: // %bb.0:
-; CHECK-NEXT: hint #34
-; CHECK-NEXT: .Lpatch1:
+; NONSYSV-NEXT: hint #34
+; NONSYSV-NEXT: .Lpatch1:
; CHECK-NEXT: nop
;; Other basic blocks have BTI, but they don't affect our decision to not create .Lpatch0
; CHECK: .LBB{{.+}} // %sw.bb1
; CHECK-NEXT: hint #36
; CHECK: .section __patchable_function_entries,"awo",@progbits,f1i{{$}}
; CHECK-NEXT: .p2align 3
-; CHECK-NEXT: .xword .Lpatch1
+; NONSYSV-NEXT: .xword .Lpatch1
+; SYSV-NEXT: .xword .Lfunc_begin3
entry:
switch i64 %v, label %sw.bb0 [
i64 1, label %sw.bb1