[lld] [lld-macho] Fix branch extension thunk estimation logic (PR #120529)
Daniel RodrÃguez Troitiño via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 7 15:15:23 PST 2025
================
@@ -184,15 +184,45 @@ uint64_t TextOutputSection::estimateStubsInRangeVA(size_t callIdx) const {
InputSection *isec = inputs[i];
isecEnd = alignToPowerOf2(isecEnd, isec->align) + isec->getSize();
}
+
+ // Tally up any thunks that have already been placed that have VA higher than
+ // inputs[callIdx]. First, find the index of the first thunk that is beyond
+ // the current inputs[callIdx].
+ auto itPostcallIdxThunks =
+ llvm::partition_point(thunks, [isecVA](const ConcatInputSection *t) {
+ return t->getVA() <= isecVA;
+ });
+ uint64_t existingForwardThunks = thunks.end() - itPostcallIdxThunks;
+
----------------
drodriguez wrote:
I did not have an intuition for the total number of entries in `thunkMap`. If we are talking about 100k or those numbers, doing only ~16 comparisons is probably going to always be quicker.
My comment about double counting is because of the name `maxPotentialThunks` and the way `thunkMap` seems to be copied progressively into `thunks`, but never cleaned up. I don't pretend to understand the flow of the code (it is difficult to figure out what's going on and how things will happen during runtime), but the name `maxPotentialThunks` might not be the best if it was undercounting (originally). IIUC the code calls `estimateStubsInRangeVA` only once during `finalize` (which I think only happens once). At the point `estimateStubsInRangeVA` is called, part of `thunkMap` has already been pushed into `thunks`. The more we overcount, the higher `stubsInRangeVA` will be, and it might be that we are creating thunks for the end of the text section when a simple jump will had be enough. I don't know how bad is that size increase in the final binary when we are overcounting. It seems to me that we were undercounting when `ti.callSitesUsed == ti.callSiteCount` (I imagine `callSitesUsed` cannot be higher than `callSiteCount`). It might be possible to have a better estimation during the `thunksMap` processing, for example changing the `if (ti.isec->getVA() <= isecVA)` I was proposing above to `else if (ti.isec->getVA() <= isecVA)` (which I think was when the original code was undercounting).
Again, it is complicated code, I probably don't understand it correctly, and there's a couple of things that might make overcounting not as important of not undercounting. Feel free to ignore my feedback.
https://github.com/llvm/llvm-project/pull/120529
More information about the llvm-commits
mailing list