[lld] [lld][ELF] Merge equivalent symbols found during ICF (PR #139493)
Peter Smith via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 11 09:07:11 PDT 2025
smithp35 wrote:
Before I start, please tell me if these suggestions aren't helping. I've managed to cross at least one person's red-line with most of them.
Arm does have a team that works on Chromium performance. If you know who they are, it would be useful to contact them to see if they can prioritise or lobby for someone to work on the outliner. If you don't, I can refer this to them, although it would likely carry more weight if you went to them directly.
If the outliner were fixed, IIUC that would fix the current situation but we'd be left with legacy objects or some highly non-idiomatic assembly code. In that case we could try and detect the problematic case. I thought that this might be just as complex as the code here, but I think there may be a simple enough way to do it in a separate pass before ICF [1]. For the problem to occur we need a pair of GOT generating relocations split across two separate sections. For this to be legal AArch64 code there would have to be a branch instruction with a `R_AARCH64_JUMP26` relocation to a symbol in another section, in between the ADRP (with GOT generating relocation) and the ADD/LDR (with GOT generating relocation).
A linear [2] scan through the relocations could detect problematic sections, and either refuse to do ICF or exclude the sections from ICF. Something like:
```
if ADRP GOT-generating reloc to local symbol
    state_adrp_local++;
if ADD or LDR GOT-generating reloc to local symbol
    state_adrp_local--;
if unconditional branch to another section S and state_adrp_local != 0
    section S cannot be merged by ICF.
```
I don't think that we would need to track which symbol, only that there is at least one branch relocation to another section while there is still an unbalanced pair of ADRP and LDR or ADD [3].
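To make the scan concrete, here is a minimal Python sketch of the counter-based check described above. It is an illustration only, not lld code: the `Kind` enum, the `Reloc` record, and `sections_unsafe_for_icf` are hypothetical names, and the relocation types mentioned in the comments are just examples of what each kind might map to. It sorts by offset first because, per [2], relocations are not guaranteed to be in `r_offset` order, so it is only linear when the input is already sorted.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Kind(Enum):
    # Hypothetical classification of relocation types for this sketch.
    ADRP_GOT = auto()  # e.g. R_AARCH64_ADR_GOT_PAGE
    LO12_GOT = auto()  # e.g. R_AARCH64_LD64_GOT_LO12_NC
    BRANCH = auto()    # e.g. R_AARCH64_JUMP26
    OTHER = auto()


@dataclass
class Reloc:
    offset: int            # r_offset within the section being scanned
    kind: Kind
    target_section: str    # section that defines the target symbol
    target_is_local: bool = True


def sections_unsafe_for_icf(section_name, relocs):
    """Return sections branched to while a GOT pair is still unbalanced."""
    unsafe = set()
    state_adrp_local = 0
    for r in sorted(relocs, key=lambda r: r.offset):
        if r.kind is Kind.ADRP_GOT and r.target_is_local:
            state_adrp_local += 1
        elif r.kind is Kind.LO12_GOT and r.target_is_local:
            state_adrp_local -= 1
        elif (r.kind is Kind.BRANCH
              and r.target_section != section_name
              and state_adrp_local != 0):
            unsafe.add(r.target_section)
    return unsafe


# ADRP with no matching LDR before a branch into another section.
split_pair = [
    Reloc(0, Kind.ADRP_GOT, ".data.foo"),
    Reloc(4, Kind.BRANCH, ".text.outlined"),
]
print(sections_unsafe_for_icf(".text.f", split_pair))  # {'.text.outlined'}
```

Because only a count is kept, an ADRP for one symbol followed by an ADD/LDR for a different symbol still balances to zero, which is the case [3] says could slip through.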
[1] The pass could be optional, perhaps with an option to skip it if we already know our inputs are clear.
[2] Strictly speaking relocations don't have to be ordered in r_offset order, but this may be true for clang using the outliner.
[3] In theory something like the following could slip through, but doing the second half of the calculation first is highly unidiomatic.
```
ADRP x0, :got: foo
ADD x1, x1, :got_lo12: bar
B <other section>
```
https://github.com/llvm/llvm-project/pull/139493