[lld] [lld-macho] Fix branch extension thunk estimation logic (PR #120529)

Tue Jan 7 17:27:45 PST 2025

================
@@ -184,15 +184,45 @@ uint64_t TextOutputSection::estimateStubsInRangeVA(size_t callIdx) const {
     InputSection *isec = inputs[i];
     isecEnd = alignToPowerOf2(isecEnd, isec->align) + isec->getSize();
   }
+
+  // Tally up any thunks that have already been placed that have VA higher than
+  // inputs[callIdx]. First, find the index of the first thunk that is beyond
+  // the current inputs[callIdx].
+  auto itPostcallIdxThunks =
+      llvm::partition_point(thunks, [isecVA](const ConcatInputSection *t) {
+        return t->getVA() <= isecVA;
+      });
+  uint64_t existingForwardThunks = thunks.end() - itPostcallIdxThunks;
+
----------------
drodriguez wrote:

If I understand your answer correctly, the only case in which the previous code undercounted is when `if (ti.callSitesUsed < ti.callSiteCount)` is not taken. Adding the right `else if` there (either `ti.isec->getVA() <= isecVA` or `ti.isec->getVA() > isecVA`; I am not sure which one) should add to the estimate only those thunks that were missed before. Yes, the new code will do more checks than the `partition_point` version (especially because for the last input section I would expect many thunks to match `ti.callSitesUsed == ti.callSiteCount`), but at the same time it will reduce the overcounting, which should improve the size of the final binary (by not unnecessarily creating more stubs because `stubsInRangeVA` is too high).
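
To make the difference concrete, here is a rough standalone sketch of the two ways of counting, not the actual lld code. `ThunkSketch` and both function names are hypothetical, the thunk list is assumed to be sorted by address, and I picked `va > isecVA` for the `else if` only because it mirrors the predicate in the diff above; the real direction still needs to be confirmed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-thunk bookkeeping; the field names mirror
// the ones mentioned in this thread, but this is not lld's ThunkInfo.
struct ThunkSketch {
  uint64_t va;            // address of an already placed thunk
  unsigned callSitesUsed; // call sites already routed through this thunk
  unsigned callSiteCount; // total call sites that may need this thunk
};

// Approach A: count every thunk that still has unused call sites, then add
// every placed thunk located after isecVA. A thunk that satisfies both
// conditions is counted twice.
uint64_t estimateWithPartitionPoint(const std::vector<ThunkSketch> &thunks,
                                    uint64_t isecVA) {
  // partition_point needs the range partitioned by the predicate, which holds
  // when the thunks are sorted by address (an assumption of this sketch).
  assert(std::is_sorted(thunks.begin(), thunks.end(),
                        [](const ThunkSketch &a, const ThunkSketch &b) {
                          return a.va < b.va;
                        }));
  auto pending = std::count_if(
      thunks.begin(), thunks.end(), [](const ThunkSketch &t) {
        return t.callSitesUsed < t.callSiteCount;
      });
  auto firstForward = std::partition_point(
      thunks.begin(), thunks.end(),
      [isecVA](const ThunkSketch &t) { return t.va <= isecVA; });
  return static_cast<uint64_t>(pending) +
         static_cast<uint64_t>(thunks.end() - firstForward);
}

// Approach B: the suggested else-if. Each thunk is counted at most once.
uint64_t estimateWithElseIf(const std::vector<ThunkSketch> &thunks,
                            uint64_t isecVA) {
  uint64_t count = 0;
  for (const ThunkSketch &t : thunks) {
    if (t.callSitesUsed < t.callSiteCount)
      ++count; // may still be needed by a remaining call site
    else if (t.va > isecVA)
      ++count; // fully used, but already placed ahead of the current section
  }
  return count;
}
```

With thunks at addresses 100, 200, 300 and 400, where the ones at 100 and 300 still have unused call sites, and `isecVA = 250`, approach A counts the thunk at 300 twice while approach B counts it once.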

https://github.com/llvm/llvm-project/pull/120529