[llvm] [BOLT][AArch64] Keep relocations for linker-relaxed instructions. NFCI (PR #129980)

Maksim Panchenko via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 5 19:32:58 PST 2025


https://github.com/maksfb created https://github.com/llvm/llvm-project/pull/129980

We used to filter out relocations corresponding to NOP+ADR instruction pairs that were a result of linker "relaxation" optimization. However, these relocations will be useful for reversing the linker optimization. Keep the relocations and ignore them while symbolizing ADR instruction operands.

>From 97d74f56ea5dd37998eea54427af628e72c3012e Mon Sep 17 00:00:00 2001
From: Maksim Panchenko <maks at fb.com>
Date: Wed, 5 Mar 2025 17:40:29 -0800
Subject: [PATCH] [BOLT][AArch64] Keep relocations for linker-relaxed
 instructions. NFCI

We used to filter out relocations corresponding to NOP+ADR instruction
pairs that were a result of linker "relaxation" optimization. However,
these relocations will be useful for reversing the linker optimization.
Keep the relocations and ignore them while symbolizing ADR instruction
operands.
---
 bolt/lib/Core/Relocation.cpp                  | 23 -------------------
 .../Target/AArch64/AArch64MCSymbolizer.cpp    | 19 +++++++++++----
 2 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/bolt/lib/Core/Relocation.cpp b/bolt/lib/Core/Relocation.cpp
index e9a9741bc3716..523ab8480cc90 100644
--- a/bolt/lib/Core/Relocation.cpp
+++ b/bolt/lib/Core/Relocation.cpp
@@ -271,22 +271,11 @@ static bool skipRelocationProcessAArch64(uint64_t &Type, uint64_t Contents) {
     return (Contents & 0xfc000000) == 0x14000000;
   };
 
-  auto IsAdr = [](uint64_t Contents) -> bool {
-    // The bits 31-24 are 0b0xx10000
-    return (Contents & 0x9f000000) == 0x10000000;
-  };
-
   auto IsAddImm = [](uint64_t Contents) -> bool {
     // The bits 30-23 are 0b00100010
     return (Contents & 0x7F800000) == 0x11000000;
   };
 
-  auto IsNop = [](uint64_t Contents) -> bool { return Contents == 0xd503201f; };
-
-  // The linker might eliminate the instruction and replace it with NOP, ignore
-  if (IsNop(Contents))
-    return true;
-
   // The linker might relax ADRP+LDR instruction sequence for loading symbol
   // address from GOT table to ADRP+ADD sequence that would point to the
   // binary-local symbol. Change relocation type in order to process it right.
@@ -332,18 +321,6 @@ static bool skipRelocationProcessAArch64(uint64_t &Type, uint64_t Contents) {
   }
   }
 
-  // The linker might relax ADRP+ADD or ADRP+LDR sequences to the ADR+NOP
-  switch (Type) {
-  default:
-    break;
-  case ELF::R_AARCH64_ADR_PREL_PG_HI21:
-  case ELF::R_AARCH64_ADD_ABS_LO12_NC:
-  case ELF::R_AARCH64_ADR_GOT_PAGE:
-  case ELF::R_AARCH64_LD64_GOT_LO12_NC:
-    if (IsAdr(Contents))
-      return true;
-  }
-
   return false;
 }
 
diff --git a/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp b/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
index 7145e77a1edbb..d08bca6e0fc3e 100644
--- a/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
@@ -31,10 +31,6 @@ bool AArch64MCSymbolizer::tryAddingSymbolicOperand(
   if (BC.MIB->isBranch(Inst) || BC.MIB->isCall(Inst))
     return false;
 
-  // TODO: add handling for linker "relaxation". At the moment, relocations
-  // corresponding to "relaxed" instructions are excluded from BinaryFunction
-  // relocation list.
-
   const uint64_t InstOffset = InstAddress - Function.getAddress();
   const Relocation *Relocation = Function.getRelocationAt(InstOffset);
 
@@ -49,6 +45,21 @@ bool AArch64MCSymbolizer::tryAddingSymbolicOperand(
         BC.MIB->getTargetExprFor(Inst, Expr, *Ctx, RelType)));
   };
 
+  // The linker can convert ADRP+ADD and ADRP+LDR instruction sequences into
+  // NOP+ADR. After the conversion, the linker might keep the relocations and
+  // if we try to symbolize ADR's operand using outdated relocations, we might
+  // get unexpected results. Hence, we check for the conversion/relaxation, and
+  // ignore the relocation. The symbolization is done based on the PC-relative
+  // value of the operand instead.
+  if (Relocation && BC.MIB->isADR(Inst)) {
+    if (Relocation->Type == ELF::R_AARCH64_ADD_ABS_LO12_NC ||
+        Relocation->Type == ELF::R_AARCH64_LD64_GOT_LO12_NC) {
+      LLVM_DEBUG(dbgs() << "BOLT-DEBUG: ignoring relocation at 0x"
+                        << Twine::utohexstr(InstAddress) << '\n');
+      Relocation = nullptr;
+    }
+  }
+
   if (Relocation) {
     addOperand(Relocation->Symbol, Relocation->Addend, Relocation->Type);
     return true;



More information about the llvm-commits mailing list