[lld] [lld][LoongArch] GOT indirection to PC relative optimization (PR #123743)
Zhaoxin Yang via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 28 00:56:41 PDT 2025
================
@@ -1152,6 +1154,57 @@ void LoongArch::tlsdescToLe(uint8_t *loc, const Relocation &rel,
}
}
+// Try GOT indirection to PC relative optimization when relaxation is enabled.
+// From:
+// * pcalau12i $a0, %got_pc_hi20(sym_got)
+// * ld.w/d $a0, $a0, %got_pc_lo12(sym_got)
+// To:
+// * pcalau12i $a0, %pc_hi20(sym)
+// * addi.w/d $a0, $a0, %pc_lo12(sym)
+//
+// Note: Although the optimization has been performed, the GOT entries still
+// exist, similarly to AArch64. Eliminating the entries will increase code
+// complexity.
+bool LoongArch::tryGotToPCRel(uint8_t *loc, const Relocation &rHi20,
+                              const Relocation &rLo12, uint64_t secAddr) const {
+  if (!rHi20.sym->isDefined() || rHi20.sym->isPreemptible ||
+      rHi20.sym->isGnuIFunc() ||
+      (ctx.arg.isPic && !cast<Defined>(*rHi20.sym).section))
+    return false;
+
+  Symbol &sym = *rHi20.sym;
+  uint64_t symLocal = sym.getVA(ctx) + rHi20.addend;
+  // Check if the address difference is within +/-2GB range.
+  // For simplicity, the range mentioned here is an approximate estimate and is
+  // not fully equivalent to the entire region that PC-relative addressing can
+  // cover.
+  int64_t pageOffset =
+      getLoongArchPage(symLocal) - getLoongArchPage(secAddr + rHi20.offset);
+  if (!isInt<20>(pageOffset >> 12))
+    return false;
----------------
ylzsx wrote:
Assuming there is a GOT access from the start of the `.text` section to the start of the `.rodata` section, the result from `getLoongArchPageDelta` differs between the following two linker scripts.
```
uint64_t elf::getLoongArchPageDelta(uint64_t dest, uint64_t pc, RelType type) {
  uint64_t pcalau12i_pc = pc;
  uint64_t result = getLoongArchPage(dest) - getLoongArchPage(pcalau12i_pc);
  if (dest & 0x800)
    result += 0x1000 - 0x1'0000'0000;
  if (result & 0x8000'0000)
    result += 0x1'0000'0000;
  return result;
}
```
**Case 1:**
```
SECTIONS {
.rodata 0x1800: { *(.rodata) }
.text 0x2800: { *(.text) }
}
```
uint64_t result = getLoongArchPage(dest) - getLoongArchPage(pcalau12i_pc) = 0xffff'ffff'ffff'f000;
dest & 0x800 != 0, so result = result + 0x1000 - 0x1'0000'0000 = 0xffff'ffff'0000'0000;
result & 0x8000'0000 == 0, so result = 0xffff'ffff'0000'0000;
**Case 2:**
```
SECTIONS {
.rodata 0x1800: { *(.rodata) }
.text 0x2800: { *(.text) }
}
```
uint64_t result = getLoongArchPage(dest) - getLoongArchPage(pcalau12i_pc) = 0xffff'ffff'ffff'f000;
dest & 0x800 == 0, so result = 0xffff'ffff'ffff'f000;
result & 0x8000'0000 != 0, so result = result + 0x1'0000'0000 = 0x0000'0000'ffff'f000;
In fact, the distance is the same in both cases, and the optimization is possible in both. The raw return value of `getLoongArchPageDelta` is therefore not a plain signed distance, and it is unclear how it could be used directly for this range check.
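To make the comparison concrete, here is a minimal standalone sketch of the arithmetic. It uses simplified stand-ins for the lld helpers, not the real code. The first layout is Case 1 (`.rodata` at 0x1800, `.text` at 0x2800). The second layout (`.rodata` at 0x1000, `.text` at 0x2000) is an assumed example, chosen only so that `dest & 0x800` is zero while the page distance stays the same; it is not taken from this thread. The names `pageDelta`, `pageDistanceFitsHi20`, `materialize` and `checkCases` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for the lld helpers quoted above (not the real code).
constexpr uint64_t getLoongArchPage(uint64_t p) { return p & ~uint64_t(0xfff); }

// Same arithmetic as the simplified getLoongArchPageDelta quoted above.
constexpr uint64_t pageDelta(uint64_t dest, uint64_t pc) {
  uint64_t result = getLoongArchPage(dest) - getLoongArchPage(pc);
  if (dest & 0x800)
    result += 0x1000 - 0x1'0000'0000;
  if (result & 0x8000'0000)
    result += 0x1'0000'0000;
  return result;
}

// The check used in the patch: signed page distance must fit in 20 bits.
constexpr bool pageDistanceFitsHi20(uint64_t dest, uint64_t pc) {
  int64_t pageOffset =
      static_cast<int64_t>(getLoongArchPage(dest) - getLoongArchPage(pc));
  int64_t pages = pageOffset / 4096; // exact, both operands are page aligned
  return pages >= -(int64_t(1) << 19) && pages < (int64_t(1) << 19);
}

constexpr uint64_t signExtend(uint64_t v, unsigned bits) {
  uint64_t m = uint64_t(1) << (bits - 1);
  return (v ^ m) - m;
}

// Address produced by pcalau12i + addi.d when hi20 is taken from bits
// [31:12] of pageDelta and lo12 from the low 12 bits of dest.
constexpr uint64_t materialize(uint64_t dest, uint64_t pc) {
  uint64_t hi20 = (pageDelta(dest, pc) >> 12) & 0xfffff;
  uint64_t lo12 = dest & 0xfff;
  return getLoongArchPage(pc) + (signExtend(hi20, 20) << 12) +
         signExtend(lo12, 12);
}

inline void checkCases() {
  // Case 1 from the comment: .rodata at 0x1800, .text at 0x2800.
  assert(pageDelta(0x1800, 0x2800) == 0xffff'ffff'0000'0000ULL);
  // Assumed second layout: .rodata at 0x1000, .text at 0x2000.
  assert(pageDelta(0x1000, 0x2000) == 0x0000'0000'ffff'f000ULL);
  // The 64-bit values differ, yet both layouts pass the signed page check
  // and both hi20/lo12 pairs reconstruct the destination address.
  assert(pageDistanceFitsHi20(0x1800, 0x2800));
  assert(pageDistanceFitsHi20(0x1000, 0x2000));
  assert(materialize(0x1800, 0x2800) == 0x1800);
  assert(materialize(0x1000, 0x2000) == 0x1000);
}
```

Under these assumptions, only bits [31:12] of the result matter for the `pcalau12i` immediate. The upper 32 bits carry adjustments that the plain `pcalau12i` + `addi` pair never consumes. That appears to be why the two return values look so different even though both layouts are equally reachable.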
https://github.com/llvm/llvm-project/pull/123743