[PATCH] D15302: [Greedy regalloc] Replace analyzeSiblingValues with something new [Part1]

Quentin Colombet via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 15 17:37:58 PDT 2016


qcolombet requested changes to this revision.
qcolombet added a comment.
This revision now requires changes to proceed.

Hi Wei,

The benchmarks are still running, but so far, the numbers look good.

Anyhow, I finally made time for the review.

Generally speaking this looks almost good to me. The quadratic behavior of the first loop in runHoistAllSpills scares me and we need to look for a better alternative.
Moreover we need better comments for the APIs.
Finally, the tests change with more moves are worrisome. Could you explain why this happens and how we will fix that?
It seems to me we are choosing an insertion point for the store that happens too late.

I have highlighted all my concerns with the inline comments.

Cheers,
-Quentin


================
Comment at: include/llvm/CodeGen/LiveRangeEdit.h:123
@@ -119,2 +122,3 @@
   LiveRangeEdit(LiveInterval *parent, SmallVectorImpl<unsigned> &newRegs,
+                SmallPtrSet<MachineInstr *, 32> *deadRemats,
                 MachineFunction &MF, LiveIntervals &lis, VirtRegMap *vrm,
----------------
Put that as at the end of the list with nullptr as default parameter.

================
Comment at: include/llvm/CodeGen/LiveRangeEdit.h:221
@@ +220,3 @@
+    // For regallocs other than Greedy, DeadRemats is nullptr for now.
+    if (DeadRemats)
+      DeadRemats->insert(inst);
----------------
Maybe just say that DeadRemats is an optional field.
Mentioning Greedy here does not bring any value IMO.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:74
@@ +73,3 @@
+  // have the same stackslot and have equal values defined by Original VNI.
+  // These spills are mergable and are hoist candiates.
+  typedef DenseMap<std::pair<int, VNInfo *>, SmallPtrSet<MachineInstr *, 16>>
----------------
mergeable

================
Comment at: lib/CodeGen/InlineSpiller.cpp:79
@@ +78,3 @@
+
+  /// Virt2SibingsMap - This is the map from original register to a set
+  /// containing all its siblings. To hoist a spill to another BB, we need
----------------
Do not repeat the name of the field in the comment.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:81
@@ +80,3 @@
+  /// containing all its siblings. To hoist a spill to another BB, we need
+  /// to find out a live sibling there and use it as the RHS of the new spill.
+  DenseMap<unsigned, DenseSet<unsigned>> Virt2SiblingsMap;
----------------
 […] as the source (instead of RHS) of the new ..

================
Comment at: lib/CodeGen/InlineSpiller.cpp:82
@@ +81,3 @@
+  /// to find out a live sibling there and use it as the RHS of the new spill.
+  DenseMap<unsigned, DenseSet<unsigned>> Virt2SiblingsMap;
+
----------------
How big are the sets?
I would expect very few siblings on average and was wondering if a SmallSetVector or SmallSet would be more appropriate.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:85
@@ +84,3 @@
+  bool isSpillCandBB(unsigned OrigReg, VNInfo *OrigVNI, MachineBasicBlock *BB,
+                     unsigned &LiveReg);
+  void getVisitOrders(
----------------
Please use reference for values that cannot be nullptr. I.e., OrigVNI and BB.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:90
@@ +89,3 @@
+      SmallVectorImpl<MachineInstr *> &SpillsToRm,
+      DenseMap<MachineDomTreeNode *, unsigned> &SpillsToKept,
+      DenseMap<MachineDomTreeNode *, MachineInstr *> &SpillBBToSpill);
----------------
SpillsToKeep

================
Comment at: lib/CodeGen/InlineSpiller.cpp:213
@@ +212,3 @@
+  HoistSpiller *HSpiller = new HoistSpiller(pass, mf, vrm);
+  (dyn_cast<InlineSpiller>(spiller))->setHSpiller(HSpiller);
+}
----------------
Hid the instantiation of the hoist spiller helper in the inliner spiller.
The positive side effect is that we won’t leak the memory!

================
Comment at: lib/CodeGen/InlineSpiller.cpp:223
@@ +222,3 @@
+  assert(NewVRegs.size() == 0 &&
+         "No new vregs should be generated in hoistAllSpills");
+}
----------------
Hid the call to the hoist spiller helper in the inliner spiller.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:998
@@ +997,3 @@
+// .i.e, there should be a living sibling of OrigReg at the insert point.
+bool HoistSpiller::isSpillCandBB(unsigned OrigReg, VNInfo *OrigVNI,
+                                 MachineBasicBlock *BB, unsigned &LiveReg) {
----------------
.i.e => i.e.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:998
@@ +997,3 @@
+// .i.e, there should be a living sibling of OrigReg at the insert point.
+bool HoistSpiller::isSpillCandBB(unsigned OrigReg, VNInfo *OrigVNI,
+                                 MachineBasicBlock *BB, unsigned &LiveReg) {
----------------
qcolombet wrote:
> .i.e => i.e.
Use doxygen style comment.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1010
@@ +1009,3 @@
+
+  for (auto const ent : Siblings) {
+    LiveInterval &LI = LIS.getInterval(ent);
----------------
DenseSet does not guarantee that the iteration order is stable from one run to another, does it?

Although we should have several siblings live at the same time, this is theoretically possible.
In other words, we should use a container that has a deterministic iteration order for reproducibility. See the earlier suggestion I made.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1024
@@ +1023,3 @@
+/// same time.
+void HoistSpiller::getVisitOrders(
+    MachineBasicBlock *Root, SmallPtrSet<MachineInstr *, 16> &Spills,
----------------
This method would benefit a more verbose comment.
Maybe something along the line:
Starting from \p Root find a top-down traversal order of the dominator tree to visit all basic blocks containing the elements of \p Spills.
Redundent spills will be found and put into \p SpillsToRm at the same time.
\p SpillBBToSpill will populated as part of the process and maps a basic block to the first store occurring in the basic block.

\post SpillToRm.union(Spills at post) == Spills at pre

What is the usage of SpillsToKept? In particular the unsigned part?

Should we consider to moving some of the arguments to field of the current hoistspill instance?

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1024
@@ +1023,3 @@
+/// same time.
+void HoistSpiller::getVisitOrders(
+    MachineBasicBlock *Root, SmallPtrSet<MachineInstr *, 16> &Spills,
----------------
qcolombet wrote:
> This method would benefit a more verbose comment.
> Maybe something along the line:
> Starting from \p Root find a top-down traversal order of the dominator tree to visit all basic blocks containing the elements of \p Spills.
> Redundent spills will be found and put into \p SpillsToRm at the same time.
> \p SpillBBToSpill will populated as part of the process and maps a basic block to the first store occurring in the basic block.
> 
> \post SpillToRm.union(Spills at post) == Spills at pre
> 
> What is the usage of SpillsToKept? In particular the unsigned part?
> 
> Should we consider to moving some of the arguments to field of the current hoistspill instance?
This method does a bunch of things!
Although I understand we want to share the logic that does the traversal and such, I found that it makes the code harder to read.
I’d say as it is now and with more comment like I suggested, this is fine, but in general I rather have a better separation of concerns then try to optimize if it turns out to be problematic.
I am guessing you already went through that process, we are just lacking the history :).

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1043
@@ +1042,3 @@
+      SpillsToRm.push_back(SpillToRm);
+      SpillBBToSpill[MDT.DT->getNode(Block)] = SpillToKeep;
+    } else {
----------------
Any chance this could be updated when we insert the spill?
Like I said, it just feels like getVisitOrders does too many things.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1051
@@ +1050,3 @@
+
+  SmallPtrSet<MachineDomTreeNode *, 8> WorkSet;
+  SmallPtrSet<MachineDomTreeNode *, 8> NodesOnPath;
----------------
Please document what is WorkSet supposed to contain.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1053
@@ +1052,3 @@
+  SmallPtrSet<MachineDomTreeNode *, 8> NodesOnPath;
+  MachineDomTreeNode *RootIDomNode = MDT[Root]->getIDom();
+  // For every node on the dominator tree with spill, walk upside on the
----------------
I think we should describe what is the expected root, because it seems strange to me that we don’t just take the node for the Root.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1065
@@ +1064,3 @@
+    while (Node != RootIDomNode) {
+      if (Node != MDT[Block] && SpillBBToSpill[Node]) {
+        SpillToRm = SpillBBToSpill[MDT[Block]];
----------------
More comments, e.g.,
Node dominates Block and already store the value.
This store is redundant.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1078
@@ +1077,3 @@
+    } else {
+      SpillsToKept[MDT[Block]] = 0;
+      WorkSet.insert(NodesOnPath.begin(), NodesOnPath.end());
----------------
Ok, found the meaning of the unsigned elsewhere…
A comment here as well would be great.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1098
@@ +1097,3 @@
+  } while (idx != Orders.size());
+
+  DEBUG(dbgs() << "Orders size is " << Orders.size() << "\n");
----------------
Assert Orders.size == WorkSet.size?

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1120
@@ +1119,3 @@
+  // during hoisting. If the spill to be inserted is an original spill
+  // (not a hoisted one), the value of the map entry is 0. If the spill
+  // is a hoisted spill, the value of the map entry is the VReg to be used
----------------
We do not insert the original store, it is already there, right?

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1135
@@ +1134,3 @@
+      SpillsInSubTree;
+  // Iterate Orders set in reverse order, which will be a bottom-top order
+  // in the dominator tree. Once we visit a dom tree node, we know its
----------------
I believe we usually use bottom-up instead of bottom-top.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1137
@@ +1136,3 @@
+  // in the dominator tree. Once we visit a dom tree node, we know its
+  // children has already been visited and the spill locations in the
+  // subtrees of all the children have been determined.
----------------
have

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1174
@@ +1173,3 @@
+    for (const auto SpillBB : SpillsInSubTree[*RIt])
+      SpillCost += MBFI.getBlockFreq(SpillBB->getBlock());
+
----------------
If the subtrees get big, we will end up recomputing this cost a bunch of time.
Could it be something we keep alongside the subtree?

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1181
@@ +1180,3 @@
+                                       : BranchProbability(1, 1);
+    if (SpillCost > MBFI.getBlockFreq(Block) * MarginProb) {
+      // Hoist: Move spills to current Block.
----------------
We could add a mode for the hoist spiller, where code size is the priority. I.e., always hoist when SpillsInSubTree.size > 1
A follow-up patch is fine.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1208
@@ +1207,3 @@
+  }
+  // For spills in SpillsToKept with LiveReg set (.i.e, not original spill),
+  // save them to SpillsToIns.
----------------
typo.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1210
@@ +1209,3 @@
+  // save them to SpillsToIns.
+  for (const auto ent : SpillsToKept) {
+    if (ent.second)
----------------
Variable must start with a capital letter.
Also why use ent for the name?

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1218
@@ +1217,3 @@
+/// to a less hot spot.
+void HoistSpiller::hoistAllSpills(LiveRangeEdit &Edit) {
+  // Save the mapping between stackslot and its original reg.
----------------
Explain the general algorithm here.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1234
@@ +1233,3 @@
+      Virt2SiblingsMap[Original].insert(Reg);
+  }
+
----------------
This loop scares me.
Any chance this information can be built as we insert spill.

================
Comment at: lib/CodeGen/InlineSpiller.cpp:1242
@@ +1241,3 @@
+    SmallPtrSet<MachineInstr *, 16> &EqValSpills = ent.second;
+    if (!ent.second.size())
+      continue;
----------------
empty

================
Comment at: lib/CodeGen/RegAllocGreedy.cpp:401
@@ -399,2 +400,3 @@
   void tryHintsRecoloring();
+  void postOptimization();
 
----------------
I would have put this into the base class.

================
Comment at: lib/CodeGen/RegAllocGreedy.cpp:2571
@@ +2570,3 @@
+  eliminateDeadRemats();
+  startHoistSpiller(*MF, *VRM, *LIS, &spiller());
+}
----------------
spiller().postOptimization()

================
Comment at: lib/CodeGen/RegAllocGreedy.cpp:2597
@@ -2589,2 +2596,3 @@
   SpillerInstance.reset(createInlineSpiller(*this, *MF, *VRM));
+  createHoistSpiller(*this, *MF, *VRM, &spiller());
   Loops = &getAnalysis<MachineLoopInfo>();
----------------
Should be created within the inline spiller. See my comment on createInlineSpiller.

================
Comment at: lib/CodeGen/RegAllocPBQP.cpp:130
@@ +129,3 @@
+  /// always available for the remat of all the siblings of the original reg.
+  SmallPtrSet<MachineInstr *, 32> DeadRemats;
+
----------------
This should be a separate patch.

================
Comment at: lib/CodeGen/RegAllocPBQP.cpp:156
@@ -149,1 +155,3 @@
+  /// Remove dead defs because of rematerialization.
+  void eliminateDeadRemats(LiveIntervals &LIS);
 };
----------------
Ditto.

================
Comment at: lib/CodeGen/RegAllocPBQP.cpp:731
@@ +730,3 @@
+  DeadRemats.clear();
+}
+
----------------
We shouldn’t have to touch this.

================
Comment at: lib/CodeGen/RegisterCoalescer.cpp:463
@@ -464,1 +462,3 @@
+  LiveRangeEdit(nullptr, NewRegs, nullptr, *MF, *LIS, nullptr, this)
+      .eliminateDeadDefs(DeadDefs);
 }
----------------
We shouldn’t have to touch this.

================
Comment at: lib/CodeGen/Spiller.h:38
@@ -37,3 +37,3 @@
                                MachineFunction &mf,
                                VirtRegMap &vrm);
 
----------------
add a bool here that default to false for using a post optimization.

================
Comment at: lib/CodeGen/Spiller.h:46
@@ -40,1 +45,3 @@
+  void startHoistSpiller(MachineFunction &mf, VirtRegMap &vrm,
+                         LiveIntervals &lis, Spiller *);
 }
----------------
Get rid of those.

================
Comment at: lib/CodeGen/SplitKit.cpp:727
@@ +726,3 @@
+/// be hoisted because of too much cost.
+void SplitEditor::removeRedundentCopies(DenseSet<unsigned> &NotToHoistSet,
+                                        SmallVectorImpl<VNInfo *> &BackCopies) {
----------------
Please don’t repeat the comment from the declaration.

================
Comment at: lib/CodeGen/SplitKit.cpp:742
@@ +741,3 @@
+
+  // For VNI aggregation of each ParentVNI, collect dominated, .i.e,
+  // redundent VNIs to BackCopies.
----------------
i.e.

================
Comment at: lib/CodeGen/SplitKit.cpp:749
@@ +748,3 @@
+    for (auto Ent1 : EqualVNs[ParentVNI->id]) {
+      for (auto Ent2 : EqualVNs[ParentVNI->id]) {
+        if (Ent1 == Ent2 || DominatedVNIs.count(Ent1) ||
----------------
We should we start iterating with the next iterator instead of starting over.
The next call to count should early continue the loop but still!

================
Comment at: lib/CodeGen/SplitKit.cpp:1134
@@ -1053,3 +1133,3 @@
   case SM_Speed:
-    llvm_unreachable("Spill mode 'speed' not implemented yet");
+    hoistCopies();
   }
----------------
Add a comment saying that hoistCopies will behave differently between size and speed, otherwise it feels like those modes are the same.

================
Comment at: lib/CodeGen/SplitKit.h:333
@@ -331,4 +332,3 @@
 
-  /// hoistCopiesForSize - Hoist back-copies to the complement interval in a
-  /// way that minimizes code size. This implements the SM_Size spill mode.
-  void hoistCopiesForSize();
+  /// removeRedundentCopies - Remove redundent back-copies if it has been
+  /// decided those back-copies will not be hoisted.
----------------
Don’t repeat the name of the method.

================
Comment at: lib/CodeGen/SplitKit.h:335
@@ +334,3 @@
+  /// decided those back-copies will not be hoisted.
+  void removeRedundentCopies(DenseSet<unsigned> &NotToHoistSet,
+                             SmallVectorImpl<VNInfo *> &BackCopies);
----------------
Given how this is used, the actual name of this method should be computeRedundentBackCopies.

================
Comment at: lib/CodeGen/SplitKit.h:338
@@ +337,3 @@
+
+  /// hoistCopies - Hoist back-copies to the complement interval.
+  void hoistCopies();
----------------
Ditto.

================
Comment at: test/CodeGen/X86/vselect-minmax.ll:4898
@@ -4898,1 +4897,3 @@
+; SSE2-NEXT:    movdqa %xmm11, %xmm8
+; SSE2-NEXT:    movdqa %xmm11, -{{[0-9]+}}(%rsp) # 16-byte Spill
 ; SSE2-NEXT:    pxor %xmm10, %xmm8
----------------
Why is this happening?

================
Comment at: test/CodeGen/X86/vselect-minmax.ll:7620
@@ -7618,1 +7619,3 @@
+; SSE2-NEXT:    movdqa %xmm11, %xmm8
+; SSE2-NEXT:    movdqa %xmm11, -{{[0-9]+}}(%rsp) # 16-byte Spill
 ; SSE2-NEXT:    pxor %xmm10, %xmm8
----------------
Ditto.


Repository:
  rL LLVM

http://reviews.llvm.org/D15302





More information about the llvm-commits mailing list