[lld] [lld][ELF] Add range extension thunks for x86-64 (PR #180266)

Peter Smith via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 9 03:34:52 PST 2026


================
@@ -1396,6 +1406,50 @@ void RetpolineZNow::writePlt(uint8_t *buf, const Symbol &sym,
   write32le(buf + 8, ctx.in.plt->getVA() - pltEntryAddr - 12);
 }
 
+// For x86-64, thunks are needed when the displacement between the branch
+// instruction and its target exceeds the 32-bit signed range (2GiB).
+// This can happen in very large binaries where .text exceeds 2GiB.
+bool X86_64::needsThunk(RelExpr expr, RelType type, const InputFile *file,
+                        uint64_t branchAddr, const Symbol &s,
+                        int64_t a) const {
+  // Only branch relocations need thunks.
+  // R_X86_64_PLT32 is used for call/jmp instructions and always needs thunks.
+  // R_X86_64_PC32 is more general and can be used for both branches and data
+  // accesses (lea, mov). We only create thunks for function symbols.
+  if (type != R_X86_64_PLT32 && (type != R_X86_64_PC32 || !s.isFunc()))
+    return false;
+
+  // If the target requires a PLT entry, check if we can reach the PLT
+  if (s.isInPlt(ctx)) {
+    uint64_t dst = s.getPltVA(ctx) + a;
+    return !inBranchRange(type, branchAddr, dst);
+  }
+
+  // For direct calls/jumps, check if we can reach the destination
+  uint64_t dst = s.getVA(ctx, a);
+  return !inBranchRange(type, branchAddr, dst);
+}
+
+// Check if a branch from src to dst is within the 32-bit signed range.
+bool X86_64::inBranchRange(RelType type, uint64_t src, uint64_t dst) const {
+  // x86-64 RIP-relative branches use a 32-bit signed displacement.
+  // The displacement is relative to the address after the instruction,
+  // which is typically 4-5 bytes after the relocation location.
+  // We use a conservative range check here.
+  int64_t offset = dst - src;
+  return llvm::isInt<32>(offset);
+}
+
+// Return the spacing for thunk sections. We want thunks to be placed
+// at intervals such that all branches can reach either the target or
+// a thunk. With a 2GiB range, we place thunks every ~1GiB to allow
+// branches to reach in either direction.
+uint32_t X86_64::getThunkSectionSpacing() const {
----------------
smithp35 wrote:

Some thoughts on the thunk spacing. Mostly summarised in https://github.com/llvm/llvm-project/blob/main/lld/ELF/Arch/ARM.cpp#L432

For the current implementation, we have approximately evenly spaced pools, so for a +-Range-byte branch range the worst case is a branch at the start of the first section, as that can only reach +Range bytes forwards. For AArch64 we have an easy choice: +Range (minus a contingency for added thunks and other addressDependentContent). Arm is a bit more complicated, as there are two ranges (unconditional and conditional); since the former massively dominates the latter, we use the longer unconditional range, leaving the conditional branches to their own individual pools.

As an aside:

There could be scope for an alternative implementation with an uneven range split between pools. That special-cases the first (and possibly last) pools to be within +Range bytes, but extends the intermediate ones to 2 * Range bytes apart. In that case branches in the first half of an interval go to the previous pool and branches in the second half go to the next pool.

The downside of that would be a more complex implementation, and more branches to thunks would be at their extreme range, which increases the chance that they drift out of range (due to other thunks and addressDependentContent) and need additional ThunkSections to be created. It is probably not worth it unless minimising the number of pools is a goal.

Arm's proprietary linker started off without pools: it placed each thunk as its own section at the point where it could be reused by as many callers as possible. This meant fewer thunks were needed, but it had the downside of scattering the thunks all over the program, with small changes in inputs leading to lots of thunk-related changes in the output. Certain users doing binary-difference compression for firmware over-the-air updates hated this, as it ruined their compression. That prompted the addition of a pool-based system.


https://github.com/llvm/llvm-project/pull/180266


More information about the llvm-commits mailing list