[llvm] [BOLT][AArch64] Keep relocations for linker-relaxed instructions. NFCI (PR #129980)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 5 19:33:18 PST 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-bolt
Author: Maksim Panchenko (maksfb)
<details>
<summary>Changes</summary>
We used to filter out relocations corresponding to NOP+ADR instruction pairs that were a result of linker "relaxation" optimization. However, these relocations will be useful for reversing the linker optimization. Keep the relocations and ignore them while symbolizing ADR instruction operands.
---
Full diff: https://github.com/llvm/llvm-project/pull/129980.diff
2 Files Affected:
- (modified) bolt/lib/Core/Relocation.cpp (-23)
- (modified) bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp (+15-4)
``````````diff
diff --git a/bolt/lib/Core/Relocation.cpp b/bolt/lib/Core/Relocation.cpp
index e9a9741bc3716..523ab8480cc90 100644
--- a/bolt/lib/Core/Relocation.cpp
+++ b/bolt/lib/Core/Relocation.cpp
@@ -271,22 +271,11 @@ static bool skipRelocationProcessAArch64(uint64_t &Type, uint64_t Contents) {
return (Contents & 0xfc000000) == 0x14000000;
};
- auto IsAdr = [](uint64_t Contents) -> bool {
- // The bits 31-24 are 0b0xx10000
- return (Contents & 0x9f000000) == 0x10000000;
- };
-
auto IsAddImm = [](uint64_t Contents) -> bool {
// The bits 30-23 are 0b00100010
return (Contents & 0x7F800000) == 0x11000000;
};
- auto IsNop = [](uint64_t Contents) -> bool { return Contents == 0xd503201f; };
-
- // The linker might eliminate the instruction and replace it with NOP, ignore
- if (IsNop(Contents))
- return true;
-
// The linker might relax ADRP+LDR instruction sequence for loading symbol
// address from GOT table to ADRP+ADD sequence that would point to the
// binary-local symbol. Change relocation type in order to process it right.
@@ -332,18 +321,6 @@ static bool skipRelocationProcessAArch64(uint64_t &Type, uint64_t Contents) {
}
}
- // The linker might relax ADRP+ADD or ADRP+LDR sequences to the ADR+NOP
- switch (Type) {
- default:
- break;
- case ELF::R_AARCH64_ADR_PREL_PG_HI21:
- case ELF::R_AARCH64_ADD_ABS_LO12_NC:
- case ELF::R_AARCH64_ADR_GOT_PAGE:
- case ELF::R_AARCH64_LD64_GOT_LO12_NC:
- if (IsAdr(Contents))
- return true;
- }
-
return false;
}
diff --git a/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp b/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
index 7145e77a1edbb..d08bca6e0fc3e 100644
--- a/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
@@ -31,10 +31,6 @@ bool AArch64MCSymbolizer::tryAddingSymbolicOperand(
if (BC.MIB->isBranch(Inst) || BC.MIB->isCall(Inst))
return false;
- // TODO: add handling for linker "relaxation". At the moment, relocations
- // corresponding to "relaxed" instructions are excluded from BinaryFunction
- // relocation list.
-
const uint64_t InstOffset = InstAddress - Function.getAddress();
const Relocation *Relocation = Function.getRelocationAt(InstOffset);
@@ -49,6 +45,21 @@ bool AArch64MCSymbolizer::tryAddingSymbolicOperand(
BC.MIB->getTargetExprFor(Inst, Expr, *Ctx, RelType)));
};
+ // The linker can convert ADRP+ADD and ADRP+LDR instruction sequences into
+ // NOP+ADR. After the conversion, the linker might keep the relocations and
+ // if we try to symbolize ADR's operand using outdated relocations, we might
+ // get unexpected results. Hence, we check for the conversion/relaxation, and
+ // ignore the relocation. The symbolization is done based on the PC-relative
+ // value of the operand instead.
+ if (Relocation && BC.MIB->isADR(Inst)) {
+ if (Relocation->Type == ELF::R_AARCH64_ADD_ABS_LO12_NC ||
+ Relocation->Type == ELF::R_AARCH64_LD64_GOT_LO12_NC) {
+ LLVM_DEBUG(dbgs() << "BOLT-DEBUG: ignoring relocation at 0x"
+ << Twine::utohexstr(InstAddress) << '\n');
+ Relocation = nullptr;
+ }
+ }
+
if (Relocation) {
addOperand(Relocation->Symbol, Relocation->Addend, Relocation->Type);
return true;
``````````
</details>
https://github.com/llvm/llvm-project/pull/129980
More information about the llvm-commits
mailing list