[llvm-branch-commits] [lld] ELF: CFI jump table relaxation. (PR #147424)
Peter Collingbourne via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Sat May 2 20:19:23 PDT 2026
================
@@ -312,6 +313,196 @@ bool X86_64::deleteFallThruJmpInsn(InputSection &is,
   return true;
 }
+void X86_64::relaxCFIJumpTables() const {
+  // Relax CFI jump tables.
+  // - Split jump table into pieces and place target functions inside the jump
+  //   table if small enough.
+  // - Move jump table before last called function and delete last branch
+  //   instruction.
+  DenseMap<InputSection *, SmallVector<InputSection *, 0>> sectionReplacements;
+  SmallVector<InputSection *, 0> storage;
+  for (OutputSection *osec : ctx.outputSections) {
+    if (!(osec->flags & SHF_EXECINSTR))
+      continue;
+    for (InputSection *sec : getInputSections(*osec, storage)) {
+      if (sec->type != SHT_LLVM_CFI_JUMP_TABLE || sec->entsize == 0 ||
+          sec->size % sec->entsize != 0)
+        continue;
+
+      // We're going to replace the jump table with this list of sections. This
+      // list will be made up of slices of the original section and function
+      // bodies that were moved into the jump table.
+      SmallVector<InputSection *, 0> replacements;
+
+      // r is the only relocation in a jump table entry. Figure out whether it
+      // is a branch pointing to the start of a statically known section that
+      // hasn't already been moved while processing a different jump table
+      // section, and if so return it.
+      auto getMovableSection = [&](Relocation &r) -> InputSection * {
+        if (r.type != R_X86_64_PC32 && r.type != R_X86_64_PLT32)
+          return nullptr;
+        auto *sym = dyn_cast_or_null<Defined>(r.sym);
+        if (!sym || sym->isPreemptible || sym->isGnuIFunc() ||
+            sym->value + r.addend != -4ull)
+          return nullptr;
+        auto *target = dyn_cast_or_null<InputSection>(sym->section);
+        if (!target || sectionReplacements.count(target))
+          return nullptr;
+        return target;
+      };
+
+      // Figure out the movable section for the last entry. We do this first
+      // because the last entry controls which output section the jump table is
+      // placed into, which affects move eligibility for other sections.
+      auto *lastSec = [&]() -> InputSection * {
----------------
pcc wrote:
This optimization is less about binary size than about performance: removing the indirections added by jump tables reduces the miss rate (as noted in the commit message) and the working set size. That said, we could perhaps use the number of indirections removed as a rough proxy for the performance gains. IIRC this accounted for a fairly substantial percentage of the gains, because most jump tables only have a few functions and most of those functions don't fit in a jump table entry. The performance gains may also improve in the future if we teach the compiler to use profile data to move the most frequently called function to the end of the jump table.
It's also worth noting that even small performance gains are valuable for the jump table: its overhead is fixed and can't directly be scaled down by reducing the number of call sites that include CFI checks, because even an indirect call without a CFI check has to go through the jump table.
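As a rough illustration of the "indirections removed" proxy, here is a minimal sketch of how such a count could be computed over a toy model of jump tables. The `Entry` and `JumpTable` types and the `countIndirectionsRemoved` function are hypothetical and are not part of lld; the sketch assumes that a movable target no larger than the entry size is placed inside the table, and that a movable last target is placed right after the table so that the final branch can be deleted.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: these types are hypothetical, not lld data structures.
struct Entry {
  uint64_t targetSize; // size in bytes of the function the entry branches to
  bool movable;        // whether the target section is eligible to be moved
};

struct JumpTable {
  uint64_t entsize;           // size in bytes of one jump table entry
  std::vector<Entry> entries; // entries in table order
};

// Counts the entries whose branch would go away under the two assumed rules:
// a movable target that fits in an entry is placed inside the table, and a
// movable last target follows the table so the last branch can be deleted.
uint64_t countIndirectionsRemoved(const std::vector<JumpTable> &tables) {
  uint64_t removed = 0;
  for (const JumpTable &jt : tables) {
    for (size_t i = 0; i < jt.entries.size(); ++i) {
      const Entry &e = jt.entries[i];
      if (!e.movable)
        continue;
      bool fitsInEntry = e.targetSize <= jt.entsize;
      bool isLast = i + 1 == jt.entries.size();
      if (fitsInEntry || isLast)
        ++removed;
    }
  }
  return removed;
}
```

Dividing this count by the total number of entries would give the fraction of jump table indirections eliminated, which is only a proxy and says nothing about how often each entry is actually called.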
https://github.com/llvm/llvm-project/pull/147424