[PATCH] D146425: [SystemZ] Enable AtomicExpandPass for i128

Ulrich Weigand via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 31 09:29:26 PDT 2023


uweigand added a comment.

> In theory, of course it would have been very nice if the register allocator would not run out of registers (still does without this), and also if it could split ("uncoalesce") as needed to minimize spilling. It does unfortunately not, so there remains the option to try to avoid spilling in shouldCoalesce(). Should we keep it simple here, or should we experiment a little and try to at least not increase spilling compared to main? Not sure if there is a natural heuristic that includes the CDSG loop without explicitly checking for that case...

While the raw spill counts give some indication, of course, it would be good to also look at actual performance impact of the change.   A few inline comments about the heuristics ...

Also, shouldn't there be some generic heuristics of whether or not coalescing is worthwhile?   It seems to me this would always be a tradeoff for any register class, not just for the 128-bit pairs?



================
Comment at: llvm/lib/Target/SystemZ/SystemZRegisterInfo.cpp:397
   BitVector PhysClobbered(getNumRegs());
-  MEE++;
-  for (; MII != MEE; ++MII) {
-    for (const MachineOperand &MO : MII->operands())
-      if (MO.isReg() && MO.getReg().isPhysical()) {
-        for (MCSuperRegIterator SI(MO.getReg(), this, true/*IncludeSelf*/);
-             SI.isValid(); ++SI)
-          if (NewRC->contains(*SI)) {
-            PhysClobbered.set(*SI);
-            break;
-          }
-      }
-  }
-
-  // Demand an arbitrary margin of free regs.
-  unsigned const DemandedFreeGR128 = 3;
-  if (PhysClobbered.count() > (NewRC->getNumRegs() - DemandedFreeGR128))
-    return false;
+  unsigned const AllowedClobbers = NewRC->getNumRegs() - 3;
+  unsigned const SearchLim = 50;
----------------
Does it help tweaking this heuristic a bit?  What if we use 4 or 2 instead of 3?


================
Comment at: llvm/lib/Target/SystemZ/SystemZRegisterInfo.cpp:398
+  unsigned const AllowedClobbers = NewRC->getNumRegs() - 3;
+  unsigned const SearchLim = 50;
+  unsigned Count = 0;
----------------
This introduces yet another weird heuristics.   Is this even necessary at all?  What are the compile-time impacts of just not doing this check?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146425/new/

https://reviews.llvm.org/D146425



More information about the llvm-commits mailing list