[lld] [lld-macho] Fix branch extension thunk estimation logic (PR #120529)

Tue Jan 7 12:56:55 PST 2025

================
@@ -184,15 +184,45 @@ uint64_t TextOutputSection::estimateStubsInRangeVA(size_t callIdx) const {
     InputSection *isec = inputs[i];
     isecEnd = alignToPowerOf2(isecEnd, isec->align) + isec->getSize();
   }
+
+  // Tally up any thunks that have already been placed that have VA higher than
+  // inputs[callIdx]. First, find the index of the first thunk that is beyond
+  // the current inputs[callIdx].
+  auto itPostcallIdxThunks =
+      llvm::partition_point(thunks, [isecVA](const ConcatInputSection *t) {
+        return t->getVA() <= isecVA;
+      });
+  uint64_t existingForwardThunks = thunks.end() - itPostcallIdxThunks;
+
----------------
drodriguez wrote:

IIUC, the `thunks` used here are derived from the `thunkMap` (see [1]), and `thunkMap` is iterated above (lines 171-179). One should be able to figure out this value during the previous iteration and avoid the `partition_point` call.

Something like:

```
// Tally the functions which still have call sites remaining to process,
// which yields the maximum number of thunks we might yet place.
size_t maxPotentialThunks = 0;
// Tally up any thunks that have already been placed that have VA higher than
// inputs[callIdx].
size_t existingForwardThunks = 0;
for (auto &tp : thunkMap) {
  ThunkInfo &ti = tp.second;
  // This overcounts: Only sections that are in forward jump range from the
  // currently-active section get finalized, and all input sections are
  // finalized when estimateStubsInRangeVA() is called. So only backward
  // jumps will need thunks, but we count all jumps.
  if (ti.callSitesUsed < ti.callSiteCount)
    maxPotentialThunks += 1;
  if (ti.isec->getVA() <= isecVA) // it might be `>` in the conditional, not sure
    existingForwardThunks += 1
}
```

Because the dependency on `isecVA`, the code needs to move under the calculation of `isecVA` (more or less where the new code is added in this PR).

(It also looks like we might be double counting in `maxPotentialThunks` and `existingForwardThunks`, but there might be some not-easy-to-see dependency in how `thunkMap` is used and when `estimateStubsInRangeVA` is called that avoids that situation).

[1]: https://github.com/llvm/llvm-project/blob/ea14bdb0356cdda727ac032470f6a0a2102d1281/lld/MachO/ConcatOutputSection.cpp#L356

https://github.com/llvm/llvm-project/pull/120529