[PATCH] D157020: [WIP][lld/ELF] Don't relax R_X86_64_(REX_)GOTPCRELX when offset is too far

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 3 13:04:38 PDT 2023


MaskRay added a comment.

In D157020#4558525 <https://reviews.llvm.org/D157020#4558525>, @aeubanks wrote:

> In D157020#4558475 <https://reviews.llvm.org/D157020#4558475>, @MaskRay wrote:
>
>> This should state the motivation, which is likely to alleviate relocation overflow pressure.
>>
>>> On x86-64, linkers optimize some GOT-indirect instructions (R_X86_64_REX_GOTPCRELX; e.g. movq var@GOTPCREL(%rip), %rax) to PC-relative instructions. The distance between a code section and .got is usually smaller than the distance between a code section and .data/.bss. ld.lld's one-pass relocation scanning scheme has a limitation: if it decides to suppress a GOT entry and it turns out that optimizing the instruction will lead to relocation overflow, the decision cannot be reverted. It should be easy to work around the issue with -Wl,--no-relax.
>
> I wouldn't say this is to "alleviate relocation overflow pressure", I'd say this is straight up just a bug fix. The linker shouldn't be doing this optimization if it breaks linking right?

I think it's unfair to call it a linker bug.  Having R_X86_64_REX_GOTPCRELX and R_X86_64_GOTPCRELX out-of-range issues implies that the program no longer fits into the small code model, and we are in the realm of doing extra stuff to try to make the program work, with certain constraints in the toolchain pieces.
A GOTPCRELX range check that decides whether the linker performs the optimization is an add-on, not a requirement.
With lld's one-pass main relocation scanning architecture, it's costly to perform the additional check, so we can say we made a deliberate choice not to support this case in the presence of `--relax`.
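
To make the check under discussion concrete, here is a minimal sketch of the range test that would have to gate the optimization. It is not lld's code: `pcRelDisplacement` and `fitsInRel32` are hypothetical names, and it assumes the relaxed instruction is resolved like a PC-relative 32-bit relocation, i.e. `S + A - P` must fit in a signed 32-bit field. The arithmetic is cheap; the expensive part is that final addresses are not known yet when the one-pass scan decides whether to create the GOT entry.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only, not lld's API.
// symAddr: final address of the symbol (S)
// addend:  relocation addend (A), typically -4 for GOTPCRELX
// place:   address of the relocated field (P)
constexpr int64_t pcRelDisplacement(uint64_t symAddr, int64_t addend,
                                    uint64_t place) {
  // Compute S + A - P with wrapping unsigned arithmetic, then reinterpret
  // the result as a signed displacement.
  return static_cast<int64_t>(symAddr + static_cast<uint64_t>(addend) - place);
}

// True if the relaxed, PC-relative form can encode the displacement,
// i.e. it lies in [-2^31, 2^31 - 1].
constexpr bool fitsInRel32(uint64_t symAddr, int64_t addend, uint64_t place) {
  return pcRelDisplacement(symAddr, addend, place) >= INT32_MIN &&
         pcRelDisplacement(symAddr, addend, place) <= INT32_MAX;
}
```

With such a check, a symbol 4 GiB away from the instruction would keep its GOT-indirect form, while a nearby one would still be relaxed.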

>> However, I'm not sure adding another relocation scanning pass for this purpose is a good idea. It's quite a bit of code targeted at very specific workloads where `-Wl,--no-relax` can be used instead. Our relocation scanning is more complex than mold's and makes us slower. I'm fairly concerned about more relocation passes.
>
> Perhaps we can avoid the extra relocation scan in cases where we detect that the max offset is under 2^31?
> I'm pretty sure `--no-relax` will be unacceptable performance-wise for accessing extern globals.

I doubt that the percentage is going to be large.
But really, global variable accesses should not be a bottleneck in well-written applications, especially in oversized server applications.

>> (Another minor thing is that I think our relocation scanning really needs an overhaul to improve performance. This tricky case would add another item to account for when we do the refactoring.)




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157020/new/

https://reviews.llvm.org/D157020


