[lld] c4d9cd8 - [LLD][ELF][AArch64] Add BTI Aware long branch thunks (#108989)

via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 1 05:12:33 PDT 2024


Author: Peter Smith
Date: 2024-10-01T13:12:29+01:00
New Revision: c4d9cd8b747cb399a61dd987eb95ad518eb15448

URL: https://github.com/llvm/llvm-project/commit/c4d9cd8b747cb399a61dd987eb95ad518eb15448
DIFF: https://github.com/llvm/llvm-project/commit/c4d9cd8b747cb399a61dd987eb95ad518eb15448.diff

LOG: [LLD][ELF][AArch64] Add BTI Aware long branch thunks (#108989)

When Branch Target Identification (BTI) is enabled, all indirect branches
must target a BTI instruction. A long branch thunk is a source of
indirect branches. To date LLD has been assuming that the object
producer is responsible for putting a BTI instruction at all places the
linker might generate an indirect branch to. This is true for clang, but
not for GCC. GCC will elide the BTI instruction when it can prove that
there are no indirect branches from outside the translation unit(s). GNU
ld was fixed to generate a landing pad stub (GNU ld speak for thunk) for
the destination when a long range stub was needed [1].

This means that using GCC compiled objects with LLD may lead to LLD
generating an indirect branch to a location without a BTI. The ABI [2]
has also been clarified to say that it is a static linker's
responsibility to generate a landing pad when the target does not have a
BTI.

This patch implements the same mechanism as GNU ld. When the output ELF
file sets the GNU_PROPERTY_AARCH64_FEATURE_1_BTI property, we check the
destination to see if it has a BTI instruction. If it does not, we
generate a landing pad consisting of:
BTI c
B <destination>
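
As a rough illustration of the destination check described above, the
following standalone sketch tests whether the instruction word at an offset
is one of the BTI-compatible encodings listed in the patch. The names
readLE32, isBtiCompatible and needsLandingPad are hypothetical and are not
LLD's API; the bounds check is also slightly stricter than the one in the
patch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: read a little-endian 32-bit instruction word.
inline uint32_t readLE32(const uint8_t *p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) |
         (uint32_t(p[3]) << 24);
}

// True if instr is accepted as the target of an indirect branch via x16.
// The encodings are the ones listed in the patch.
inline bool isBtiCompatible(uint32_t instr) {
  constexpr std::array<uint32_t, 5> pads = {
      0xd503233f, // PACIASP
      0xd503237f, // PACIBSP
      0xd503245f, // BTI c
      0xd503249f, // BTI j
      0xd50324df, // BTI jc
  };
  return std::find(pads.begin(), pads.end(), instr) != pads.end();
}

// Assumption: an offset that cannot hold a whole instruction is treated as
// "no landing pad needed", mirroring the patch's choice to leave anything
// it cannot inspect to the object producer.
inline bool needsLandingPad(const uint8_t *buf, size_t size, size_t off) {
  if (size < 4 || off > size - 4)
    return false;
  return !isBtiCompatible(readLE32(buf + off));
}
```

In this sketch a destination starting with a plain ret would need a landing
pad, while one starting with bti c or paciasp would not.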

The B <destination> can be elided if the thunk can be placed so that
control flow drops through. For example:
BTI c
<destination>:
This will be common when -ffunction-sections is used.
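
The two landing pad shapes can be sketched as follows. This is a simplified
model, not the code in the patch: it assumes final addresses are already
known, whereas LLD has to decide before addresses are stable. The names
encodeB and makeLandingPad are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kBtiC = 0xd503245f;   // BTI c
constexpr uint32_t kBranch = 0x14000000; // B with a zero imm26 field

// Encode an unconditional B whose target is delta bytes from the branch.
// Assumes delta is a multiple of 4 and within the +/-128MiB range of B.
inline uint32_t encodeB(int64_t delta) {
  return kBranch | (static_cast<uint32_t>(delta / 4) & 0x03ffffff);
}

// Build the instruction words of a landing pad placed at padVA that leads
// to destVA. If the destination immediately follows the BTI c, control
// falls through and the branch is omitted.
inline std::vector<uint32_t> makeLandingPad(uint64_t padVA, uint64_t destVA) {
  std::vector<uint32_t> out{kBtiC};
  const uint64_t branchVA = padVA + 4;
  if (destVA != branchVA)
    out.push_back(encodeB(static_cast<int64_t>(destVA) -
                          static_cast<int64_t>(branchVA)));
  return out;
}
```

With the addresses used in the new test, a pad at 0x18001028 for fn2 at
0x18001038 gets both instructions, while a pad at 0x18001030 for fn1 at
0x18001034 is a single BTI c.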

The landing pad thunks are effectively alternative entry points for the
function. Direct branches are unaffected but any linker generated
indirect branch needs to use the alternative. We place these as close as
possible to the destination section.
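
Because a landing pad is an alternative entry point, one pad per destination
can be shared by every thunk that targets it. A minimal sketch of that reuse,
keyed on section, symbol value and addend in the spirit of the patch, could
look like this. LandingPadCache and its members are hypothetical names, and
std::map stands in for the DenseMap used in LLD.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical record of one landing pad; real LLD stores a Thunk*.
struct LandingPad {
  const void *section;
  uint64_t value;
  int64_t addend;
};

class LandingPadCache {
  using Key = std::tuple<const void *, uint64_t, int64_t>;
  std::map<Key, size_t> index; // destination -> position in pads
  std::vector<LandingPad> pads;

public:
  // Returns the pad index for the destination and whether it was created
  // by this call. Callers only need to place the pad when it is new.
  std::pair<size_t, bool> getOrCreate(const void *section, uint64_t value,
                                      int64_t addend) {
    Key key{section, value, addend};
    auto [it, isNew] = index.try_emplace(key, pads.size());
    if (isNew)
      pads.push_back({section, value, addend});
    return {it->second, isNew};
  }

  size_t size() const { return pads.size(); }
};
```

Two thunks that branch to the same section offset receive the same pad, and
only the first request reports that a new pad has to be placed.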

There is some further optimization possible. Consider the case:
.text
fn1
...
fn2
...

If we need landing pad thunks for both fn1 and fn2 we could order them
so that the thunk for fn1 immediately precedes fn1. This could save a
single branch. However, I didn't think that would be worth the additional
complexity.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671
[2] https://github.com/ARM-software/abi-aa/issues/196

Added: 
    lld/test/ELF/aarch64-thunk-bti.s

Modified: 
    lld/ELF/Arch/AArch64.cpp
    lld/ELF/Relocations.cpp
    lld/ELF/Relocations.h
    lld/ELF/Target.h
    lld/ELF/Thunks.cpp
    lld/ELF/Thunks.h

Removed: 
    


################################################################################
diff  --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index cfea605e2da601..45d429c915a6ee 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -28,6 +28,38 @@ uint64_t elf::getAArch64Page(uint64_t expr) {
   return expr & ~static_cast<uint64_t>(0xFFF);
 }
 
+// A BTI landing pad is a valid target for an indirect branch when the Branch
+// Target Identification has been enabled.  As linker generated branches are
+// via x16 the BTI landing pads are defined as: BTI C, BTI J, BTI JC, PACIASP,
+// PACIBSP.
+bool elf::isAArch64BTILandingPad(Symbol &s, int64_t a) {
+  // PLT entries accessed indirectly have a BTI c.
+  if (s.isInPlt())
+    return true;
+  Defined *d = dyn_cast<Defined>(&s);
+  if (!isa_and_nonnull<InputSection>(d->section))
+    // All places that we cannot disassemble are responsible for making
+    // the target a BTI landing pad.
+    return true;
+  InputSection *isec = cast<InputSection>(d->section);
+  uint64_t off = d->value + a;
+  // Likely user error, but protect ourselves against out of bounds
+  // access.
+  if (off >= isec->getSize())
+    return true;
+  const uint8_t *buf = isec->content().begin();
+  const uint32_t instr = read32le(buf + off);
+  // All BTI instructions are HINT instructions which all have same encoding
+  // apart from bits [11:5]
+  if ((instr & 0xd503201f) == 0xd503201f &&
+      is_contained({/*PACIASP*/ 0xd503233f, /*PACIBSP*/ 0xd503237f,
+                    /*BTI C*/ 0xd503245f, /*BTI J*/ 0xd503249f,
+                    /*BTI JC*/ 0xd50324df},
+                   instr))
+    return true;
+  return false;
+}
+
 namespace {
 class AArch64 : public TargetInfo {
 public:

diff  --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 3d4de56b6dfb35..078166e0d3f037 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -2265,6 +2265,15 @@ std::pair<Thunk *, bool> ThunkCreator::getThunk(InputSection *isec,
   return std::make_pair(t, true);
 }
 
+std::pair<Thunk *, bool> ThunkCreator::getSyntheticLandingPad(Defined &d,
+                                                              int64_t a) {
+  auto [it, isNew] = landingPadsBySectionAndAddend.try_emplace(
+      {{d.section, d.value}, a}, nullptr);
+  if (isNew)
+    it->second = addLandingPadThunk(ctx, d, a);
+  return {it->second, isNew};
+}
+
 // Return true if the relocation target is an in range Thunk.
 // Return false if the relocation is not to a Thunk. If the relocation target
 // was originally to a Thunk, but is no longer in range we revert the
@@ -2348,6 +2357,20 @@ bool ThunkCreator::createThunks(uint32_t pass,
                 ts = getISDThunkSec(os, isec, isd, rel, src);
               ts->addThunk(t);
               thunks[t->getThunkTargetSym()] = t;
+
+              // When indirect branches are restricted, such as AArch64 BTI
+              // Thunks may need to target a linker generated landing pad
+              // instead of the target.
+              if (t->needsSyntheticLandingPad()) {
+                Thunk *lpt;
+                auto &dr = cast<Defined>(t->destination);
+                std::tie(lpt, isNew) = getSyntheticLandingPad(dr, t->addend);
+                if (isNew) {
+                  ts = getISThunkSec(cast<InputSection>(dr.section));
+                  ts->addThunk(lpt);
+                }
+                t->landingPad = lpt->getThunkTargetSym();
+              }
             }
 
             // Redirect relocation to Thunk, we never go via the PLT to a Thunk

diff  --git a/lld/ELF/Relocations.h b/lld/ELF/Relocations.h
index 4d349f68d33ccb..64e67c2c968207 100644
--- a/lld/ELF/Relocations.h
+++ b/lld/ELF/Relocations.h
@@ -17,6 +17,7 @@
 
 namespace lld::elf {
 struct Ctx;
+class Defined;
 class Symbol;
 class InputSection;
 class InputSectionBase;
@@ -175,6 +176,8 @@ class ThunkCreator {
   std::pair<Thunk *, bool> getThunk(InputSection *isec, Relocation &rel,
                                     uint64_t src);
 
+  std::pair<Thunk *, bool> getSyntheticLandingPad(Defined &d, int64_t a);
+
   ThunkSection *addThunkSection(OutputSection *os, InputSectionDescription *,
                                 uint64_t off);
 
@@ -201,9 +204,18 @@ class ThunkCreator {
   // Track InputSections that have an inline ThunkSection placed in front
   // an inline ThunkSection may have control fall through to the section below
   // so we need to make sure that there is only one of them.
-  // The Mips LA25 Thunk is an example of an inline ThunkSection.
+  // The Mips LA25 Thunk is an example of an inline ThunkSection, as is
+  // the AArch64BTILandingPadThunk.
   llvm::DenseMap<InputSection *, ThunkSection *> thunkedSections;
 
+  // Record landing pads, generated for a section + offset destination.
+  // Landing pads are alternative entry points for destinations that need
+  // to be reached via thunks that use indirect branches. A destination
+  // needs at most one landing pad as that can be reused by all callers.
+  llvm::DenseMap<std::pair<std::pair<SectionBase *, uint64_t>, int64_t>,
+                 Thunk *>
+      landingPadsBySectionAndAddend;
+
   // The number of completed passes of createThunks this permits us
   // to do one time initialization on Pass 0 and put a limit on the
   // number of times it can be called to prevent infinite loops.

diff  --git a/lld/ELF/Target.h b/lld/ELF/Target.h
index 16944688f3cee9..f18770dfc424de 100644
--- a/lld/ELF/Target.h
+++ b/lld/ELF/Target.h
@@ -232,6 +232,7 @@ void writePrefixedInstruction(uint8_t *loc, uint64_t insn);
 void addPPC64SaveRestore();
 uint64_t getPPC64TocBase();
 uint64_t getAArch64Page(uint64_t expr);
+bool isAArch64BTILandingPad(Symbol &s, int64_t a);
 template <typename ELFT> void writeARMCmseImportLib();
 uint64_t getLoongArchPageDelta(uint64_t dest, uint64_t pc, RelType type);
 void riscvFinalizeRelax(int passes);

diff  --git a/lld/ELF/Thunks.cpp b/lld/ELF/Thunks.cpp
index dcb60330dbb12c..ef97530679469d 100644
--- a/lld/ELF/Thunks.cpp
+++ b/lld/ELF/Thunks.cpp
@@ -51,13 +51,20 @@ namespace {
 // distance from the thunk to the target is less than 128MB. Long thunks can
 // branch to any virtual address and they are implemented in the derived
 // classes. This class tries to create a short thunk if the target is in range,
-// otherwise it creates a long thunk.
+// otherwise it creates a long thunk. When BTI is enabled indirect branches
+// must land on a BTI instruction. If the destination does not have a BTI
+// instruction mayNeedLandingPad is set to true and Thunk::landingPad points
+// to an alternative entry point with a BTI.
 class AArch64Thunk : public Thunk {
 public:
-  AArch64Thunk(Ctx &ctx, Symbol &dest, int64_t addend)
-      : Thunk(ctx, dest, addend) {}
+  AArch64Thunk(Ctx &ctx, Symbol &dest, int64_t addend, bool mayNeedLandingPad)
+      : Thunk(ctx, dest, addend), mayNeedLandingPad(mayNeedLandingPad) {}
   bool getMayUseShortThunk();
   void writeTo(uint8_t *buf) override;
+  bool needsSyntheticLandingPad() override;
+
+protected:
+  bool mayNeedLandingPad;
 
 private:
   bool mayUseShortThunk = true;
@@ -67,8 +74,9 @@ class AArch64Thunk : public Thunk {
 // AArch64 long range Thunks.
 class AArch64ABSLongThunk final : public AArch64Thunk {
 public:
-  AArch64ABSLongThunk(Ctx &ctx, Symbol &dest, int64_t addend)
-      : AArch64Thunk(ctx, dest, addend) {}
+  AArch64ABSLongThunk(Ctx &ctx, Symbol &dest, int64_t addend,
+                      bool mayNeedLandingPad)
+      : AArch64Thunk(ctx, dest, addend, mayNeedLandingPad) {}
   uint32_t size() override { return getMayUseShortThunk() ? 4 : 16; }
   void addSymbols(ThunkSection &isec) override;
 
@@ -78,8 +86,9 @@ class AArch64ABSLongThunk final : public AArch64Thunk {
 
 class AArch64ADRPThunk final : public AArch64Thunk {
 public:
-  AArch64ADRPThunk(Ctx &ctx, Symbol &dest, int64_t addend)
-      : AArch64Thunk(ctx, dest, addend) {}
+  AArch64ADRPThunk(Ctx &ctx, Symbol &dest, int64_t addend,
+                   bool mayNeedLandingPad)
+      : AArch64Thunk(ctx, dest, addend, mayNeedLandingPad) {}
   uint32_t size() override { return getMayUseShortThunk() ? 4 : 12; }
   void addSymbols(ThunkSection &isec) override;
 
@@ -87,6 +96,26 @@ class AArch64ADRPThunk final : public AArch64Thunk {
   void writeLong(uint8_t *buf) override;
 };
 
+// AArch64 BTI Landing Pad
+// When BTI is enabled indirect branches must land on a BTI
+// compatible instruction. When the destination does not have a
+// BTI compatible instruction a Thunk doing an indirect branch
+// targets a Landing Pad Thunk that direct branches to the target.
+class AArch64BTILandingPadThunk final : public Thunk {
+public:
+  AArch64BTILandingPadThunk(Ctx &ctx, Symbol &dest, int64_t addend)
+      : Thunk(ctx, dest, addend) {}
+
+  uint32_t size() override { return getMayUseShortThunk() ? 4 : 8; }
+  void addSymbols(ThunkSection &isec) override;
+  void writeTo(uint8_t *buf) override;
+
+private:
+  bool getMayUseShortThunk();
+  void writeLong(uint8_t *buf);
+  bool mayUseShortThunk = true;
+};
+
 // Base class for ARM thunks.
 //
 // An ARM thunk may be either short or long. A short thunk is simply a branch
@@ -545,6 +574,12 @@ void AArch64Thunk::writeTo(uint8_t *buf) {
   ctx.target->relocateNoSym(buf, R_AARCH64_CALL26, s - p);
 }
 
+bool AArch64Thunk::needsSyntheticLandingPad() {
+  // Short Thunks use a direct branch, no synthetic landing pad
+  // required.
+  return mayNeedLandingPad && !getMayUseShortThunk();
+}
+
 // AArch64 long range Thunks.
 void AArch64ABSLongThunk::writeLong(uint8_t *buf) {
   const uint8_t data[] = {
@@ -553,7 +588,11 @@ void AArch64ABSLongThunk::writeLong(uint8_t *buf) {
     0x00, 0x00, 0x00, 0x00, // L0: .xword S
     0x00, 0x00, 0x00, 0x00,
   };
-  uint64_t s = getAArch64ThunkDestVA(destination, addend);
+  // If mayNeedLandingPad is true then destination is an
+  // AArch64BTILandingPadThunk that defines landingPad.
+  assert(!mayNeedLandingPad || landingPad != nullptr);
+  uint64_t s = mayNeedLandingPad ? landingPad->getVA(0)
+                                 : getAArch64ThunkDestVA(destination, addend);
   memcpy(buf, data, sizeof(data));
   ctx.target->relocateNoSym(buf + 8, R_AARCH64_ABS64, s);
 }
@@ -577,7 +616,11 @@ void AArch64ADRPThunk::writeLong(uint8_t *buf) {
       0x10, 0x02, 0x00, 0x91, // add  x16, x16, R_AARCH64_ADD_ABS_LO12_NC(Dest)
       0x00, 0x02, 0x1f, 0xd6, // br   x16
   };
-  uint64_t s = getAArch64ThunkDestVA(destination, addend);
+  // if mayNeedLandingPad is true then destination is an
+  // AArch64BTILandingPadThunk that defines landingPad.
+  assert(!mayNeedLandingPad || landingPad != nullptr);
+  uint64_t s = mayNeedLandingPad ? landingPad->getVA(0)
+                                 : getAArch64ThunkDestVA(destination, addend);
   uint64_t p = getThunkTargetSym()->getVA();
   memcpy(buf, data, sizeof(data));
   ctx.target->relocateNoSym(buf, R_AARCH64_ADR_PREL_PG_HI21,
@@ -591,6 +634,47 @@ void AArch64ADRPThunk::addSymbols(ThunkSection &isec) {
   addSymbol("$x", STT_NOTYPE, 0, isec);
 }
 
+void AArch64BTILandingPadThunk::addSymbols(ThunkSection &isec) {
+  addSymbol(saver().save("__AArch64BTIThunk_" + destination.getName()),
+            STT_FUNC, 0, isec);
+  addSymbol("$x", STT_NOTYPE, 0, isec);
+}
+
+void AArch64BTILandingPadThunk::writeTo(uint8_t *buf) {
+  if (!getMayUseShortThunk()) {
+    writeLong(buf);
+    return;
+  }
+  write32(buf, 0xd503245f); // BTI c
+  // Control falls through to target in following section.
+}
+
+bool AArch64BTILandingPadThunk::getMayUseShortThunk() {
+  if (!mayUseShortThunk)
+    return false;
+  // If the target is the following instruction then we can fall
+  // through without the indirect branch.
+  uint64_t s = destination.getVA(addend);
+  uint64_t p = getThunkTargetSym()->getVA();
+  // This function is called before addresses are stable.  We need to
+  // work out the range from the thunk to the next section but the
+  // address of the start of the next section depends on the size of
+  // the thunks in the previous pass.  s - p + offset == 0 represents
+  // the first pass where the Thunk and following section are assigned
+  // the same offset.  s - p <= 4 is the last Thunk in the Thunk
+  // Section.
+  mayUseShortThunk = (s - p + offset == 0 || s - p <= 4);
+  return mayUseShortThunk;
+}
+
+void AArch64BTILandingPadThunk::writeLong(uint8_t *buf) {
+  uint64_t s = destination.getVA(addend);
+  uint64_t p = getThunkTargetSym()->getVA() + 4;
+  write32(buf, 0xd503245f);     // BTI c
+  write32(buf + 4, 0x14000000); // B S
+  ctx.target->relocateNoSym(buf + 4, R_AARCH64_CALL26, s - p);
+}
+
 // ARM Target Thunks
 static uint64_t getARMThunkDestVA(const Symbol &s) {
   uint64_t v = s.isInPlt() ? s.getPltVA() : s.getVA();
@@ -1279,9 +1363,12 @@ static Thunk *addThunkAArch64(Ctx &ctx, RelType type, Symbol &s, int64_t a) {
   if (type != R_AARCH64_CALL26 && type != R_AARCH64_JUMP26 &&
       type != R_AARCH64_PLT32)
     fatal("unrecognized relocation type");
+  bool mayNeedLandingPad =
+      (ctx.arg.andFeatures & GNU_PROPERTY_AARCH64_FEATURE_1_BTI) &&
+      !isAArch64BTILandingPad(s, a);
   if (ctx.arg.picThunk)
-    return make<AArch64ADRPThunk>(ctx, s, a);
-  return make<AArch64ABSLongThunk>(ctx, s, a);
+    return make<AArch64ADRPThunk>(ctx, s, a, mayNeedLandingPad);
+  return make<AArch64ABSLongThunk>(ctx, s, a, mayNeedLandingPad);
 }
 
 // Creates a thunk for long branches or Thumb-ARM interworking.
@@ -1495,3 +1582,12 @@ Thunk *elf::addThunk(Ctx &ctx, const InputSection &isec, Relocation &rel) {
     llvm_unreachable("add Thunk only supported for ARM, AVR, Mips and PowerPC");
   }
 }
+
+Thunk *elf::addLandingPadThunk(Ctx &ctx, Symbol &s, int64_t a) {
+  switch (ctx.arg.emachine) {
+  case EM_AARCH64:
+    return make<AArch64BTILandingPadThunk>(ctx, s, a);
+  default:
+    llvm_unreachable("add landing pad only supported for AArch64");
+  }
+}

diff  --git a/lld/ELF/Thunks.h b/lld/ELF/Thunks.h
index 678bc483986d50..3929aa0aee8114 100644
--- a/lld/ELF/Thunks.h
+++ b/lld/ELF/Thunks.h
@@ -55,11 +55,18 @@ class Thunk {
     return true;
   }
 
+  // Thunks that indirectly branch to targets may need a synthetic landing
+  // pad generated close to the target. For example AArch64 when BTI is
+  // enabled.
+  virtual bool needsSyntheticLandingPad() { return false; }
+
   Defined *getThunkTargetSym() const { return syms[0]; }
 
   Ctx &ctx;
   Symbol &destination;
   int64_t addend;
+  // Alternative target when indirect branch to destination can't be used.
+  Symbol *landingPad = nullptr;
   llvm::SmallVector<Defined *, 3> syms;
   uint64_t offset = 0;
   // The alignment requirement for this Thunk, defaults to the size of the
@@ -71,6 +78,10 @@ class Thunk {
 // ThunkSection.
 Thunk *addThunk(Ctx &, const InputSection &isec, Relocation &rel);
 
+// Create a landing pad Thunk for use when indirect branches from Thunks
+// are restricted.
+Thunk *addLandingPadThunk(Ctx &, Symbol &s, int64_t a);
+
 void writePPC32PltCallStub(Ctx &, uint8_t *buf, uint64_t gotPltVA,
                            const InputFile *file, int64_t addend);
 void writePPC64LoadAndBranch(uint8_t *buf, int64_t offset);

diff  --git a/lld/test/ELF/aarch64-thunk-bti.s b/lld/test/ELF/aarch64-thunk-bti.s
new file mode 100644
index 00000000000000..a16e1569f358e3
--- /dev/null
+++ b/lld/test/ELF/aarch64-thunk-bti.s
@@ -0,0 +1,482 @@
+// REQUIRES: aarch64
+// RUN: rm -rf %t && split-file %s %t && cd %t
+// RUN: llvm-mc -filetype=obj -triple=aarch64 asm -o a.o
+// RUN: ld.lld --threads=1 --shared --script=lds a.o -o out.so --defsym absolute=0xf0000000
+// RUN: llvm-objdump -d --no-show-raw-insn out.so | FileCheck %s
+// RUN: llvm-objdump -d --no-show-raw-insn out.so | FileCheck %s --check-prefix=CHECK-PADS
+// RUN: llvm-mc -filetype=obj -triple=aarch64 shared -o shared.o
+// RUN: ld.lld --shared -o shared.so shared.o
+// RUN: ld.lld shared.so --script=lds a.o -o exe --defsym absolute=0xf0000000
+// RUN: llvm-objdump -d --no-show-raw-insn exe | FileCheck %s --check-prefix=CHECK-EXE
+// RUN: llvm-objdump -d --no-show-raw-insn exe | FileCheck %s --check-prefix=CHECK-PADS
+
+/// Test thunk generation when destination does not have a BTI compatible
+/// landing pad. Linker must generate landing pad sections for thunks that use
+/// indirect branches.
+
+//--- asm
+.section ".note.gnu.property", "a"
+.p2align 3
+.long 4
+.long 0x10
+.long 0x5
+.asciz "GNU"
+
+/// Enable BTI.
+.long 0xc0000000 // GNU_PROPERTY_AARCH64_FEATURE_1_AND.
+.long 4
+.long 1          // GNU_PROPERTY_AARCH64_FEATURE_1_BTI.
+.long 0
+
+
+/// Short thunks are direct branches so we don't need landing pads. Expect
+/// all thunks to branch directly to target.
+.section .text.0, "ax", %progbits
+.balign 0x1000
+.global _start
+.type _start, %function
+_start:
+ bl bti_c_target
+ bl bti_j_target
+ bl bti_jc_target
+ bl paciasp_target
+ bl pacibsp_target
+ bl .text.2 + 0x4 // fn2
+ b  .text.2 + 0x4 // fn2
+ bl fn1
+ b  fn1
+ bl fn3
+ b  fn3
+ bl fn4
+ b  fn4
+ bl via_plt
+/// We cannot add landing pads for absolute symbols.
+ bl absolute
+
+/// padding so that we require thunks that can be placed after this section.
+/// The thunks are close enough to the target to be short.
+ .space 0x1000
+
+// CHECK-PADS-LABEL: <_start>:
+// CHECK-PADS-NEXT: 10001000: bl      0x1000203c
+// CHECK-PADS-NEXT:           bl      0x10002040
+// CHECK-PADS-NEXT:           bl      0x10002044
+// CHECK-PADS-NEXT:           bl      0x10002048
+// CHECK-PADS-NEXT:           bl      0x1000204c
+// CHECK-PADS-NEXT:           bl      0x10002050
+// CHECK-PADS-NEXT:           b       0x10002050
+// CHECK-PADS-NEXT:           bl      0x10002054
+// CHECK-PADS-NEXT:           b       0x10002054
+// CHECK-PADS-NEXT:           bl      0x10002058
+// CHECK-PADS-NEXT:           b       0x10002058
+// CHECK-PADS-NEXT:           bl      0x1000205c
+// CHECK-PADS-NEXT:           b       0x1000205c
+// CHECK-PADS-NEXT:           bl      0x10002060
+// CHECK-PADS-NEXT:           bl      0x10002064
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 1000203c: b       0x18001000 <bti_c_target>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 10002040: b       0x18001008 <bti_j_target>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 10002044: b       0x18001010 <bti_jc_target>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 10002048: b       0x18001018 <paciasp_target>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 1000204c: b       0x18001020 <pacibsp_target>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 10002050: b       0x18001038 <fn2>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 10002054:       b       0x18001034 <fn1>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 10002058:       b       0x18001040 <fn3>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 1000205c:       b       0x18001050 <fn4>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_via_plt>:
+// CHECK-NEXT: 10002060:       b       0x18001080 <via_plt at plt>
+
+// CHECK-LABEL: <__AArch64ADRPThunk_absolute>:
+// CHECK-NEXT: 10002064:       b       0x18001098 <absolute at plt>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 1000203c: b       0x18001000 <bti_c_target>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 10002040: b       0x18001008 <bti_j_target>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 10002044: b       0x18001010 <bti_jc_target>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 10002048: b       0x18001018 <paciasp_target>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 1000204c: b       0x18001020 <pacibsp_target>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 10002050: b       0x18001038 <fn2>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 10002054: b       0x18001034 <fn1>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 10002058: b       0x18001040 <fn3>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 1000205c: b       0x18001050 <fn4>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_via_plt>:
+// CHECK-EXE-NEXT: 10002060: b       0x18001080 <via_plt at plt>
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_absolute>:
+// CHECK-EXE-NEXT: 10002064:   ldr     x16, 0x1000206c <__AArch64AbsLongThunk_absolute+0x8>
+// CHECK-EXE-NEXT:             br      x16
+// CHECK-EXE-NEXT: 00 00 00 f0 .word   0xf0000000
+// CHECK-EXE-NEXT: 00 00 00 00 .word   0x00000000
+
+.section .text.1, "ax", %progbits
+/// These indirect branch targets already have a BTI compatible landing pad,
+/// no alternative entry point required.
+.hidden bti_c_target
+.type bti_c_target, %function
+bti_c_target:
+ bti c
+ ret
+
+.hidden bti_j_target
+.type bti_j_target, %function
+bti_j_target:
+ bti j
+ ret
+
+.hidden bti_jc_target
+.type bti_jc_target, %function
+bti_jc_target:
+ bti jc
+ ret
+
+.hidden paciasp_target
+.type paciasp_target, %function
+paciasp_target:
+ paciasp
+ ret
+
+.hidden pacibsp_target
+.type pacibsp_target, %function
+pacibsp_target:
+ pacibsp
+ ret
+
+// CHECk-PADS-LABEL: <bti_c_target>:
+// CHECK-PADS: 18001000:      bti     c
+// CHECK-PADS-NEXT:           ret
+
+// CHECK-PADS-LABEL: <bti_j_target>:
+// CHECK-PADS-NEXT: 18001008: bti     j
+// CHECK-PADS-NEXT:           ret
+
+// CHECK-PADS-LABEL: <bti_jc_target>:
+// CHECK-PADS-NEXT: 18001010: bti     jc
+// CHECK-PADS-NEXT:           ret
+
+// CHECK-PADS-LABEL: <paciasp_target>:
+// CHECK-PADS-NEXT: 18001018: paciasp
+// CHECK-PADS-NEXT:           ret
+
+// CHECK-PADS-LABEL: <pacibsp_target>:
+// CHECK-PADS-NEXT: 18001020: pacibsp
+// CHECK-PADS-NEXT:           ret
+
+/// These functions do not have BTI compatible landing pads. Expect linker
+/// generated landing pads for indirect branch thunks.
+.section .text.2, "ax", %progbits
+.hidden fn1
+.type fn1, %function
+fn1:
+ ret
+.hidden fn2
+.type fn2, %function
+fn2:
+ ret
+
+// CHECK-PADS-LABEL: <__AArch64BTIThunk_>:
+// CHECK-PADS-NEXT: 18001028: bti     c
+// CHECK-PADS-NEXT:           b       0x18001038 <fn2>
+
+// CHECK-PADS-LABEL: <__AArch64BTIThunk_>:
+// CHECK-PADS-NEXT: 18001030: bti     c
+
+// CHECK-PADS-LABEL: <fn1>:
+// CHECK-PADS-NEXT: 18001034: ret
+
+// CHECK-PADS-LABEL <fn2>:
+// CHECK-PADS:      18001038: ret
+
+/// Section with only one function at offset 0. Landing pad should be able to
+/// fall through.
+.section .text.3, "ax", %progbits
+.hidden fn3
+.type fn3, %function
+fn3:
+ ret
+
+// CHECK-PADS-LABEL: <__AArch64BTIThunk_>:
+// CHECK-PADS-NEXT: 1800103c: bti     c
+
+// CHECK-PADS-LABEL: <fn3>:
+// CHECK-PADS-NEXT: 18001040: ret
+
+/// Section with only one function at offset 0, also with a high alignment
+/// requirement. Check that we don't fall through into alignment padding.
+.section .text.4, "ax", %progbits
+.hidden fn4
+.type fn4, %function
+.balign 16
+fn4:
+ ret
+
+// CHECK-PADS-LABEL: <__AArch64BTIThunk_>:
+// CHECK-PADS:      18001044: bti     c
+// CHECK-PADS-NEXT:           b       0x18001050 <fn4>
+// CHECK-PADS-NEXT:           udf     #0x0
+
+// CHECK-PADS-LABEL: <fn4>:
+// CHECK-PADS-NEXT: 18001050: ret
+
+.section .long_calls, "ax", %progbits
+.global long_calls
+.type long_calls, %function
+long_calls:
+/// Expect thunk to target as targets have BTI or implicit BTI.
+ bl bti_c_target
+ bl bti_j_target
+ bl bti_jc_target
+ bl paciasp_target
+ bl pacibsp_target
+/// Expect thunk to target a linker generated entry point with BTI landing pad.
+/// Two calls to make sure only one landing pad is created.
+ bl .text.2 + 0x4 // fn2
+ b  .text.2 + 0x4 // fn2
+/// fn2 before fn1 so that landing pad for fn1 can fall through.
+ bl fn1
+ b  fn1
+ bl fn3
+ b  fn3
+ bl fn4
+ b  fn4
+/// PLT entries reachable via Thunks have a BTI c at the start of each entry
+/// so no additional landing pad required.
+ bl via_plt
+/// We cannot add landing pads for absolute symbols.
+ bl absolute
+
+/// PLT entries have BTI at start.
+// CHECK-LABEL: <via_plt at plt>:
+// CHECK-NEXT:           bti     c
+// CHECK-NEXT:           adrp    x16, 0x30000000
+// CHECK-NEXT:           ldr     x17, [x16, #0x198]
+// CHECK-NEXT:           add     x16, x16, #0x198
+// CHECK-NEXT:           br      x17
+// CHECK-NEXT:           nop
+
+// CHECK: <absolute at plt>:
+// CHECK-NEXT:           bti     c
+// CHECK-NEXT:           adrp    x16, 0x30000000
+// CHECK-NEXT:           ldr     x17, [x16, #0x1a0]
+// CHECK-NEXT:           add     x16, x16, #0x1a0
+// CHECK-NEXT:           br      x17
+// CHECK-NEXT:           nop
+
+// CHECK-EXE-LABEL: <via_plt at plt>:
+// CHECK-EXE-NEXT: 18001080: bti     c
+// CHECK-EXE-NEXT:           adrp    x16, 0x30000000
+// CHECK-EXE-NEXT:           ldr     x17, [x16, #0x1e8]
+// CHECK-EXE-NEXT:           add     x16, x16, #0x1e8
+// CHECK-EXE-NEXT:           br      x17
+// CHECK-EXE-NEXT:           nop
+
+// CHECK-LABEL: <long_calls>:
+// CHECK-NEXT: 30000000: bl      0x3000003c <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x30000048 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x30000054 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x30000060 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x3000006c <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x30000078 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           b       0x30000078 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x30000084 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           b       0x30000084 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x30000090 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           b       0x30000090 <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x3000009c <__AArch64ADRPThunk_>
+// CHECK-NEXT:           b       0x3000009c <__AArch64ADRPThunk_>
+// CHECK-NEXT:           bl      0x300000a8 <__AArch64ADRPThunk_via_plt>
+// CHECK-NEXT:           bl      0x300000b4 <__AArch64ADRPThunk_absolute>
+
+/// bti_c_target.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT: 3000003c: adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x0
+// CHECK-NEXT:           br      x16
+/// bti_j_target.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x8
+// CHECK-NEXT:           br      x16
+/// bti_jc_target.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x10
+// CHECK-NEXT:           br      x16
+/// paciasp_target.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x18
+// CHECK-NEXT:           br      x16
+/// pacibsp_target.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x20
+// CHECK-NEXT:           br      x16
+/// Landing pad for fn2.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x28
+// CHECK-NEXT:           br      x16
+/// Landing pad for fn1.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x30
+// CHECK-NEXT:           br      x16
+/// Landing pad for fn3.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x3c
+// CHECK-NEXT:           br      x16
+/// Landing pad for fn4.
+// CHECK-LABEL: <__AArch64ADRPThunk_>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x44
+// CHECK-NEXT:           br      x16
+
+// CHECK-LABEL: <__AArch64ADRPThunk_via_plt>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x80
+// CHECK-NEXT:           br      x16
+
+// CHECK-LABEL: <__AArch64ADRPThunk_absolute>:
+// CHECK-NEXT:           adrp    x16, 0x18001000 <bti_c_target>
+// CHECK-NEXT:           add     x16, x16, #0x98
+// CHECK-NEXT:           br      x16
+
+// CHECK-EXE-LABEL: <long_calls>:
+// CHECK-EXE-NEXT: 30000000: bl      0x3000003c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x3000004c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x3000005c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x3000006c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x3000007c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x3000008c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           b       0x3000008c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x3000009c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           b       0x3000009c <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x300000ac <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           b       0x300000ac <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x300000bc <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           b       0x300000bc <__AArch64AbsLongThunk_>
+// CHECK-EXE-NEXT:           bl      0x300000cc <__AArch64AbsLongThunk_via_plt>
+// CHECK-EXE-NEXT:           bl      0x300000dc <__AArch64AbsLongThunk_absolute>
+
+// CHECK-EXE-LABEL: 000000003000003c <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 3000003c: ldr     x16, 0x30000044 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     00 10 00 18   .word   0x18001000
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 3000004c: ldr     x16, 0x30000054 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     08 10 00 18   .word   0x18001008
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 3000005c: ldr     x16, 0x30000064 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     10 10 00 18   .word   0x18001010
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 3000006c: ldr     x16, 0x30000074 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     18 10 00 18   .word   0x18001018
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 3000007c: ldr     x16, 0x30000084 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     20 10 00 18   .word   0x18001020
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 3000008c: ldr     x16, 0x30000094 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     28 10 00 18   .word   0x18001028
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 3000009c: ldr     x16, 0x300000a4 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     30 10 00 18   .word   0x18001030
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 300000ac: ldr     x16, 0x300000b4 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     3c 10 00 18   .word   0x1800103c
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_>:
+// CHECK-EXE-NEXT: 300000bc: ldr     x16, 0x300000c4 <__AArch64AbsLongThunk_+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     44 10 00 18   .word   0x18001044
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_via_plt>:
+// CHECK-EXE-NEXT: 300000cc: ldr     x16, 0x300000d4 <__AArch64AbsLongThunk_via_plt+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     80 10 00 18   .word   0x18001080
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+// CHECK-EXE-LABEL: <__AArch64AbsLongThunk_absolute>:
+// CHECK-EXE-NEXT: 300000dc: ldr     x16, 0x300000e4 <__AArch64AbsLongThunk_absolute+0x8>
+// CHECK-EXE-NEXT:           br      x16
+// CHECK-EXE-NEXT:     00 00 00 f0   .word   0xf0000000
+// CHECK-EXE-NEXT:     00 00 00 00   .word   0x00000000
+
+//--- lds
+PHDRS {
+  low PT_LOAD FLAGS(0x1 | 0x4);
+  mid PT_LOAD FLAGS(0x1 | 0x4);
+  high PT_LOAD FLAGS(0x1 | 0x4);
+}
+SECTIONS {
+  .rodata 0x10000000 : { *(.note.gnu.property) } :low
+  .text_low : { *(.text.0) } :low
+  .text 0x18001000 : { *(.text.*) } :mid
+  .plt : { *(.plt) } :mid
+  .text_high 0x30000000 : { *(.long_calls) } :high
+}
+
+//--- shared
+.text
+.global via_plt
+.type via_plt, %function
+via_plt:
+ ret


        

