[llvm-branch-commits] [lld] ELF: Add branch-to-branch optimization. (PR #138366)

Thu May 22 21:31:51 PDT 2025

================
@@ -975,6 +977,62 @@ void AArch64::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const {
   }
 }
 
+static std::optional<uint64_t> getControlTransferAddend(InputSection &is,
+                                                        Relocation &r) {
+  // Identify a control transfer relocation for the branch-to-branch
+  // optimization. A "control transfer relocation" means a B or BL
+  // target but it also includes relative vtable relocations for example.
+  //
+  // We require the relocation type to be JUMP26, CALL26 or PLT32. With a
+  // relocation type of PLT32 the value may be assumed to be used for branching
+  // directly to the symbol and the addend is only used to produce the relocated
+  // value (hence the effective addend is always 0). This is because if a PLT is
+  // needed the addend will be added to the address of the PLT, and it doesn't
+  // make sense to branch into the middle of a PLT. For example, relative vtable
+  // relocations use PLT32 and 0 or a positive value as the addend but still are
+  // used to branch to the symbol.
+  //
+  // With JUMP26 or CALL26 the only reasonable interpretation of a non-zero
+  // addend is that we are branching to symbol+addend so that becomes the
+  // effective addend.
+  if (r.type == R_AARCH64_PLT32)
+    return 0;
+  if (r.type == R_AARCH64_JUMP26 || r.type == R_AARCH64_CALL26)
+    return r.addend;
+  return std::nullopt;
+}
+
+static std::pair<Relocation *, uint64_t> getBranchInfo(InputSection &is,
+                                                       uint64_t offset) {
+  auto *i = std::lower_bound(
+      is.relocations.begin(), is.relocations.end(), offset,
+      [](Relocation &r, uint64_t offset) { return r.offset < offset; });
+  if (i != is.relocations.end() && i->offset == offset &&
+      i->type == R_AARCH64_JUMP26) {
+    return {i, i->addend};
+  }
----------------
pcc wrote:

Regarding BTI instructions, that should work, but let's do that in a followup.

In principle, a hot patch could overwrite an initial B instruction as well, so users who need hot patch compatibility would have to disable this optimization entirely by passing `--no-branch-to-branch`. Since hot patching is uncommon, I think we probably shouldn't accommodate it by default. We generally expect the program not to write to read-only sections (e.g. ICF and string tail merging will merge read-only sections even though a write that bypasses page protections would then affect all merged sections), and this optimization is consistent with that. I checked the linker flags used by the Linux kernel (which I know hot patches itself at startup) and it doesn't pass a `-O` flag, so it won't be broken by this change.

While thinking about hot patching I realized that we should have a check that the target section is not writable, so I added that.
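
For illustration only, the kind of check described above might look roughly like the sketch below. This is not the code from the PR: `SectionSketch`, `mayLookThroughBranch` and `checkExamples` are hypothetical names, and the section is modeled by its flags alone. The only fact taken from the ELF specification is the flag values (`SHF_WRITE` is 0x1, `SHF_ALLOC` is 0x2, `SHF_EXECINSTR` is 0x4).

```cpp
#include <cassert>
#include <cstdint>

// ELF section flag values as defined by the ELF specification.
constexpr uint64_t kShfWrite = 0x1;     // SHF_WRITE
constexpr uint64_t kShfAlloc = 0x2;     // SHF_ALLOC
constexpr uint64_t kShfExecInstr = 0x4; // SHF_EXECINSTR

// Hypothetical stand-in for a linker's section type; only the flags are
// modeled here. This is an assumption, not lld's InputSection.
struct SectionSketch {
  uint64_t flags;
};

// Returns true if the branch at the start of `target` may be looked through.
// A writable section could legitimately be modified at run time, so the
// instruction seen at link time is not trustworthy and the section is skipped.
constexpr bool mayLookThroughBranch(const SectionSketch &target) {
  return (target.flags & kShfWrite) == 0;
}

// Small usage example: ordinary text is eligible, writable code is not.
inline void checkExamples() {
  SectionSketch text{kShfAlloc | kShfExecInstr};
  SectionSketch writableCode{kShfAlloc | kShfExecInstr | kShfWrite};
  assert(mayLookThroughBranch(text));
  assert(!mayLookThroughBranch(writableCode));
}
```

Keying the decision on the section flags keeps the rule in line with the existing expectation that read-only sections are not written to, while leaving sections that are explicitly marked writable alone.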

https://github.com/llvm/llvm-project/pull/138366