[PATCH] D64523: [SLPVectorizer] Fix getSpillCost() calculation

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 10 11:30:52 PDT 2019


nikic created this revision.
nikic added reviewers: RKSimon, jmolloy.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.

Original motivation here is a performance problem encountered with the following IR, for which getSpillCost() takes an unreasonable amount of time: https://gist.github.com/nikic/a5bbc62632d68f76dd8fa9a4e81d9594

However, it turns out that the source of the perf problem is a correctness problem as well: The current implementation works by going through the tree nodes, remembering the previous node we looked at and inspecting all instructions between the previous and current node. Of course, because this is a tree and not just a chain, the "previous" node in the tree vector may occur after the current one. In this case the implementation ends up scanning through the whole block (even going through the effort of wrapping around from the start of the block to the end and continuing from there). If this happens often enough, this is not just very slow, but also results in gigantic spill costs (I saw costs >500000 for the sample code).

I'm changing the implementation to compute the cost in a single pass from the root node to the start of the basic block, keeping track of which values are currently live along the way. This resolves both the performance and the correctness issues.

The downside here is that I'm stopping at the edge of the basic block. I'm not sure how I would extend this to work across multiple basic blocks and not sure if that would even be worthwhile in the context of what the SLP vectorizer does.

(Side note: Intrinsic calls probably shouldn't count towards the spill cost. In this example the "calls" are all fshr/fshl, which are certainly not going to cause any spilling...)


Repository:
  rL LLVM

https://reviews.llvm.org/D64523

Files:
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp


Index: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
===================================================================
--- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -3323,65 +3323,43 @@
   // live. When we see a call instruction that is not part of our tree,
   // query TTI to see if there is a cost to keeping values live over it
   // (for example, if spills and fills are required).
-  unsigned BundleWidth = VectorizableTree.front()->Scalars.size();
+  TreeEntry *Root = &*VectorizableTree.front();
+  unsigned BundleWidth = Root->Scalars.size();
   int Cost = 0;
 
-  SmallPtrSet<Instruction*, 4> LiveValues;
-  Instruction *PrevInst = nullptr;
+  Instruction *RootInst = cast<Instruction>(Root->Scalars[0]);
+  BasicBlock *BB = RootInst->getParent();
 
-  for (const auto &TEPtr : VectorizableTree) {
-    Instruction *Inst = dyn_cast<Instruction>(TEPtr->Scalars[0]);
-    if (!Inst)
-      continue;
+  SmallPtrSet<Instruction*, 8> LiveValues;
+  LiveValues.insert(RootInst);
 
-    if (!PrevInst) {
-      PrevInst = Inst;
-      continue;
-    }
+  // We only check the basic block that contains the root instruction.
+  BasicBlock::reverse_iterator It = RootInst->getIterator().getReverse();
+  for (; It != BB->rend() && !LiveValues.empty(); ++It) {
+    Instruction *I = &*It;
 
-    // Update LiveValues.
-    LiveValues.erase(PrevInst);
-    for (auto &J : PrevInst->operands()) {
-      if (isa<Instruction>(&*J) && getTreeEntry(&*J))
-        LiveValues.insert(cast<Instruction>(&*J));
+    if (LiveValues.erase(I)) {
+      for (Value *Op : I->operands())
+        if (isa<Instruction>(Op) && getTreeEntry(Op))
+          LiveValues.insert(cast<Instruction>(Op));
+      continue;
     }
 
-    LLVM_DEBUG({
-      dbgs() << "SLP: #LV: " << LiveValues.size();
-      for (auto *X : LiveValues)
-        dbgs() << " " << X->getName();
-      dbgs() << ", Looking at ";
-      Inst->dump();
-    });
-
-    // Now find the sequence of instructions between PrevInst and Inst.
-    unsigned NumCalls = 0;
-    BasicBlock::reverse_iterator InstIt = ++Inst->getIterator().getReverse(),
-                                 PrevInstIt =
-                                     PrevInst->getIterator().getReverse();
-    while (InstIt != PrevInstIt) {
-      if (PrevInstIt == PrevInst->getParent()->rend()) {
-        PrevInstIt = Inst->getParent()->rbegin();
-        continue;
-      }
-
-      // Debug informations don't impact spill cost.
-      if ((isa<CallInst>(&*PrevInstIt) &&
-           !isa<DbgInfoIntrinsic>(&*PrevInstIt)) &&
-          &*PrevInstIt != PrevInst)
-        NumCalls++;
-
-      ++PrevInstIt;
-    }
+    // Debug informations doesn't impact spill cost.
+    if (isa<CallInst>(I) && !isa<DbgInfoIntrinsic>(I)) {
+      LLVM_DEBUG({
+        dbgs() << "SLP: #LV: " << LiveValues.size();
+        for (auto *X : LiveValues)
+          dbgs() << " " << X->getName();
+        dbgs() << ", Looking at ";
+        I->dump();
+      });
 
-    if (NumCalls) {
       SmallVector<Type*, 4> V;
       for (auto *II : LiveValues)
         V.push_back(VectorType::get(II->getType(), BundleWidth));
-      Cost += NumCalls * TTI->getCostOfKeepingLiveOverCall(V);
+      Cost += TTI->getCostOfKeepingLiveOverCall(V);
     }
-
-    PrevInst = Inst;
   }
 
   return Cost;


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D64523.209026.patch
Type: text/x-patch
Size: 3358 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190710/f8e865a4/attachment.bin>


More information about the llvm-commits mailing list