[PATCH] D108930: [lld/mac] Leave more room for thunks in thunk placement code

Nico Weber via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 30 11:39:25 PDT 2021


thakis created this revision.
thakis added a reviewer: lld-macho.
Herald added a reviewer: gkm.
Herald added a project: lld-macho.
thakis requested review of this revision.

Fixes PR51578 in practice.

Currently there's only enough room for a single thunk, which for real-life code
isn't enough. The error case only happens when there are many branch statements
very close to each other (0 or 1 instructions apart), with the function at the
finalization barrier small.

There's a FIXME on what to do if we hit this case, but that suggestion sounds
complicated to me (see end of PR51578 comment 5 for why).

Instead, just leave more room for thunks. Chromium's unit_tests links fine with
room for 3 thunks. Leave room for 100, which should fix this for most cases in
practice.

There's little cost for leaving lots of room: This slop value only determines
when we finalize sections, and we insert thunks for forward jumps into
unfinalized sections. So leaving room means we'll need a few more thunks, but
the thunk jump range is 128 MiB while a single thunk is just 12 bytes.

For Chromium's unit_tests:
With a slop of   3: thunk calls = 355418, thunks = 10903
With a slop of 100: thunk calls = 355426, thunks = 10904

Chances are 100 is enough for all use cases we'll hit in practice, but even
bumping it to 1000 would probably be fine.


https://reviews.llvm.org/D108930

Files:
  lld/MachO/ConcatOutputSection.cpp


Index: lld/MachO/ConcatOutputSection.cpp
===================================================================
--- lld/MachO/ConcatOutputSection.cpp
+++ lld/MachO/ConcatOutputSection.cpp
@@ -222,6 +222,11 @@
   size_t thunkCallCount = 0;
   size_t thunkCount = 0;
 
+  // Walk all sections in order. Finalize all sections that are less than
+  // forwardBranchRange in front of it.
+  // isecVA is the address of the current section.
+  // isecAddr is the start address of the first non-finalized section.
+
   // inputs[finalIdx] is for finalization (address-assignment)
   size_t finalIdx = 0;
   // Kick-off by ensuring that the first input section has an address
@@ -232,12 +237,22 @@
     ConcatInputSection *isec = inputs[callIdx];
     assert(isec->isFinal);
     uint64_t isecVA = isec->getVA();
-    // Assign addresses up-to the forward branch-range limit
+
+    // Assign addresses up-to the forward branch-range limit.
+    // Every call instruction needs a small number of bytes (on Arm64: 4),
+    // and each inserted thunk needs a slighly larger number of bytes
+    // (on Arm64: 12). If a section starts with a branch instruction and
+    // contains several branch instructions in succession, then the distance
+    // from the current position to the position where the thunks are inserted
+    // grows. So leave room for a bunch of thunks.
+    unsigned slop = 100 * thunkSize;
     while (finalIdx < endIdx && isecAddr + inputs[finalIdx]->getSize() <
-                                    isecVA + forwardBranchRange - thunkSize)
+                                    isecVA + forwardBranchRange - slop)
       finalizeOne(inputs[finalIdx++]);
+
     if (isec->callSiteCount == 0)
       continue;
+
     if (finalIdx == endIdx && stubsInRangeVA == TargetInfo::outOfRangeVA) {
       // When we have finalized all input sections, __stubs (destined
       // to follow __text) comes within range of forward branches and
@@ -293,13 +308,10 @@
       }
       // ... otherwise, create a new thunk.
       if (isecAddr > highVA) {
-        // When there is small-to-no margin between highVA and
-        // isecAddr and the distance between subsequent call sites is
-        // smaller than thunkSize, then a new thunk can go out of
-        // range.  Fix by unfinalizing inputs[finalIdx] to reduce the
-        // distance between callVA and highVA, then shift some thunks
-        // to occupy address-space formerly occupied by the
-        // unfinalized inputs[finalIdx].
+        // There were too many consecutive branch instructions for `slop`
+        // above. If you hit this: For the current algorithm, just bumping up
+        // slop above and trying again is probably simplest. (See also PR51578
+        // comment 5).
         fatal(Twine(__FUNCTION__) + ": FIXME: thunk range overrun");
       }
       thunkInfo.isec =


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D108930.369507.patch
Type: text/x-patch
Size: 2854 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210830/dcebf456/attachment.bin>


More information about the llvm-commits mailing list