[lld] ELF: Add branch-to-branch optimization. (PR #138366)
Peter Collingbourne via llvm-commits
llvm-commits at lists.llvm.org
Tue May 6 10:49:02 PDT 2025
pcc wrote:
> The BOLT folk have been asking whether the range-extension thunks can output relocations so that they can recreate the control flow more easily. It looks like this will handle emit-relocs naturally; it could be worth a test case to check.
It doesn't look like that works, because InputSection::copyRelocations reads the original relocation section rather than the relocations array, so the rewritten branch targets would not show up in the emitted relocations.
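To illustrate the point, here is a toy model (not lld code). `ToySection`, `raw_relocs`, `relocs`, `scan` and `copy_relocations` are hypothetical names that stand in for the original relocation section, the in-memory relocations array, and the emit-relocs copy step. It assumes the optimization only rewrites the in-memory array.

```python
from dataclasses import dataclass, field


@dataclass
class Reloc:
    offset: int
    target: str


@dataclass
class ToySection:
    # Stand-in for the relocation section as read from the input file.
    raw_relocs: tuple
    # Stand-in for the in-memory relocations array built by the scan.
    relocs: list = field(default_factory=list)

    def scan(self):
        # Build the in-memory array as a copy of the original relocations.
        self.relocs = [Reloc(r.offset, r.target) for r in self.raw_relocs]

    def copy_relocations(self):
        # Assumption of this sketch: the emit-relocs path reads only the
        # original relocation section, never the in-memory array.
        return [(r.offset, r.target) for r in self.raw_relocs]


sec = ToySection(raw_relocs=(Reloc(0, "f2"),))
sec.scan()
# The branch-to-branch rewrite touches only the in-memory array.
sec.relocs[0].target = "f2.cfi"
emitted = sec.copy_relocations()
```

In this model `emitted` still names `f2` even though the array now points at `f2.cfi`, which is the mismatch described above.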
> This looks like it runs before range extension thunks, which likely simplifies the logic as we don't need to check for branch range. Due to the way thunks are currently placed, it could be possible that we end up removing a branch to a branch, only for the thunk generation code to place a range extension thunk such that a long branch is generated.
Correct. I placed it before range extension thunks deliberately, as it means that in some cases we would be able to avoid the range extension thunk entirely. I had the CFI jump table case in mind, where typically all the jump tables will appear near the end of the program (link position of the combined full LTO object file). For example:
```
f1:
b f2
f2.cfi:
ret
[....]
f2:
b f2.cfi
```
Here the branch from f1 to f2 may need a range extension thunk, whereas a branch directly to f2.cfi would not. In the scenario you describe, the result is neutral (or possibly positive) in terms of the number of branches executed, but possibly negative for code size. That being said, this optimization did not change the number of thunks created in lld-speed-test/firefox-arm64, but that doesn't use CFI. (The optimization didn't fire at all in lld-speed-test/chrome because of BTI instructions, but that should be fixable in a followup.)
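A minimal sketch of why running before thunk creation can help, using the f1/f2/f2.cfi example above. This is a toy model, not the lld implementation: the addresses are made up, the helper names are hypothetical, and following a chain of branches with a cycle guard is an assumption of the sketch. The only fixed fact is the AArch64 `b` reach of roughly +/-128 MiB.

```python
# AArch64 B/BL encode a 26-bit word offset, i.e. a reach of +/-128 MiB.
BRANCH_RANGE = 128 * 1024 * 1024

# Hypothetical layout: the CFI jump table entry f2 sits far from f1,
# while the real function body f2.cfi sits close to it.
addr = {"f1": 0x0, "f2.cfi": 0x1000, "f2": 0x0900_0000}

# First instruction of each symbol, as (mnemonic, optional branch target).
first_insn = {
    "f1": ("b", "f2"),
    "f2.cfi": ("ret", None),
    "f2": ("b", "f2.cfi"),
}


def resolve_branch_to_branch(target, first_insn, max_hops=8):
    """Return the final destination if `target` starts with an unconditional branch."""
    seen = {target}
    for _ in range(max_hops):
        mnemonic, dest = first_insn.get(target, (None, None))
        if mnemonic != "b" or dest in seen:
            break  # not a branch, or we would loop forever
        seen.add(dest)
        target = dest
    return target


def needs_thunk(src, dst):
    """True if a direct branch from address src to address dst is out of range."""
    offset = dst - src
    return not (-BRANCH_RANGE <= offset < BRANCH_RANGE)


new_target = resolve_branch_to_branch("f2", first_insn)
thunk_before = needs_thunk(addr["f1"], addr["f2"])
thunk_after = needs_thunk(addr["f1"], addr[new_target])
```

With this layout the original branch from f1 is out of range, while the redirected one is not, so no range extension thunk would be needed for it.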
> Alternatively, running it earlier than garbage collection would in theory permit some sections (a branch in its own section) to be removed, but that's likely to be a negligible saving.
Correct. Also, if it ran before ICF that could expose more ICF opportunities but that's likely to be minor as well.
I considered placing it before ICF and GC, but we don't really have any infrastructure for rewriting relocation targets until reaching the relocation scan in the writer, and if we moved that earlier it would mean doing extra work (creating relocation lists for ICF'd and GC'd sections) for only a minor benefit.
https://github.com/llvm/llvm-project/pull/138366