[lld] [lld-macho] Handle InputSection targets in branch range extension logic (PR #126347)
via llvm-commits
llvm-commits at lists.llvm.org
Sat Feb 8 18:59:28 PST 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-lld-macho
Author: None (alx32)
<details>
<summary>Changes</summary>
# Branch Extension Support for InputSection-type Relocations
## Problem
The branch extension algorithm was originally designed to handle branch relocations that point directly to symbols. However, when linking Rust code, we encountered edge cases where branch target relocations point to `InputSections` instead of symbols. This caused issues because:
1. The branch extension algorithm couldn't process these `InputSection`-based relocations
2. No thunks were being created for these cases
3. This could lead to linker errors due to branches being out of range
## Solution
This patch adds support for `InputSection`-based relocations by:
1. Detecting when a relocation targets an `InputSection` instead of a Symbol
2. Finding the first symbol at offset 0 in the targeted `InputSection`
3. Using this symbol as the branch target for thunk generation
## Implementation Details
- Added a new helper function `getReferentSymbol()` to get the branch target symbol, regardless of whether a relocation points to a `Symbol` or to an `InputSection`
- If no symbol exists at offset 0 in the `InputSection`, a warning is emitted about possible branch-range errors and the relocation is skipped for thunk purposes
- Modified the thunk generation logic (`TextOutputSection::needsThunks()` and `TextOutputSection::finalize()`) to handle both `Symbol`- and `InputSection`-based relocations
This change adds branch extension support for the edge-case Rust-generated code while keeping the existing behavior for symbol-based relocations.
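The lookup described above can be illustrated outside of lld with a minimal, self-contained sketch. The `Symbol`, `InputSection`, and `Reloc` structs and the `resolveBranchTarget()` name below are simplified, hypothetical stand-ins (the real lld classes carry far more state, and the real helper is `getReferentSymbol()` in the diff); `std::variant` is used here only as an assumption-free substitute for the referent union, and the warning is left to the caller.

```cpp
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-ins for lld's types; not the real classes.
struct Symbol {
  std::string name;
  uint64_t value; // offset of the symbol within its section
};

struct InputSection {
  std::string name;
  std::vector<Symbol *> symbols; // symbols defined in this section
};

// A relocation refers to either a symbol or a whole input section.
struct Reloc {
  std::variant<Symbol *, InputSection *> referent;
};

// Returns the symbol a branch relocation should be treated as targeting,
// or nullptr when a section referent has no symbol at offset 0.
Symbol *resolveBranchTarget(const Reloc &r) {
  // Symbol referent: use it directly.
  if (auto *sym = std::get_if<Symbol *>(&r.referent))
    return *sym;

  // Section referent: pick the first symbol located at offset 0.
  InputSection *isec = std::get<InputSection *>(r.referent);
  for (Symbol *sym : isec->symbols)
    if (sym->value == 0)
      return sym;

  // No suitable symbol; the caller is expected to warn and skip.
  return nullptr;
}
```

A caller in this sketch would treat a `nullptr` result the same way the patch does: skip the relocation rather than attempt to create a thunk for it.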
## Testing
This was tested against the known problematic scenario, but there does not appear to be a way to generate a branch relocation pointing to an `InputSection` from within a test, so no regression test is included.
---
Full diff: https://github.com/llvm/llvm-project/pull/126347.diff
1 Files Affected:
- (modified) lld/MachO/ConcatOutputSection.cpp (+32-2)
``````````diff
diff --git a/lld/MachO/ConcatOutputSection.cpp b/lld/MachO/ConcatOutputSection.cpp
index d64816ec695ad52..375f81d632f2389 100644
--- a/lld/MachO/ConcatOutputSection.cpp
+++ b/lld/MachO/ConcatOutputSection.cpp
@@ -116,6 +116,30 @@ void ConcatOutputSection::addInput(ConcatInputSection *input) {
DenseMap<Symbol *, ThunkInfo> lld::macho::thunkMap;
+// Returns the target Symbol that a relocation refers to.
+// A Reloc can refer to either a Symbol directly, or to an InputSection.
+// For InputSection referents, we return the first Symbol at offset 0.
+// This conversion is necessary because the thunk generation algorithm
+// can only handle Symbols as branch targets, not InputSections.
+static Symbol *getReferentSymbol(const Reloc &r) {
+ if (auto *sym = r.referent.dyn_cast<Symbol *>()) {
+ return sym;
+ } else if (auto *isec = r.referent.dyn_cast<InputSection *>()) {
+ // Use the first symbol at offset 0 in the InputSection
+ for (Defined *sym : isec->symbols) {
+ if (sym->value == 0) {
+ return sym;
+ }
+ }
+ // Handle absence of suitable symbol
+ warn("Branch-range extension: No symbol at offset 0 in InputSection '" +
+ toString(isec) + "', possible branch out of range errors may occur.");
+ return nullptr;
+ } else {
+ llvm_unreachable("Unexpected referent type");
+ }
+}
+
// Determine whether we need thunks, which depends on the target arch -- RISC
// (i.e., ARM) generally does because it has limited-range branch/call
// instructions, whereas CISC (i.e., x86) generally doesn't. RISC only needs
@@ -145,7 +169,10 @@ bool TextOutputSection::needsThunks() const {
for (Reloc &r : isec->relocs) {
if (!target->hasAttr(r.type, RelocAttrBits::BRANCH))
continue;
- auto *sym = cast<Symbol *>(r.referent);
+ // Get the Symbol that the relocation targets.
+ Symbol *sym = getReferentSymbol(r);
+ if (!sym)
+ continue;
// Pre-populate the thunkMap and memoize call site counts for every
// InputSection and ThunkInfo. We do this for the benefit of
// estimateStubsInRangeVA().
@@ -325,7 +352,10 @@ void TextOutputSection::finalize() {
backwardBranchRange < callVA ? callVA - backwardBranchRange : 0;
uint64_t highVA = callVA + forwardBranchRange;
// Calculate our call referent address
- auto *funcSym = cast<Symbol *>(r.referent);
+ Symbol *funcSym = getReferentSymbol(r);
+ if (!funcSym)
+ continue;
+
ThunkInfo &thunkInfo = thunkMap[funcSym];
// The referent is not reachable, so we need to use a thunk ...
if (funcSym->isInStubs() && callVA >= stubsInRangeVA) {
``````````
</details>
https://github.com/llvm/llvm-project/pull/126347