[PATCH] D98714: [SLP] Add insertelement instructions to vectorizable tree

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 14 13:01:02 PDT 2021


ABataev added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:2840-2845
+      int MinIndex = INT_MAX;
+      for (auto *V : VL)
+        MinIndex = std::min(
+            MinIndex,
+            int(cast<ConstantInt>(cast<InsertElementInst>(V)->getOperand(2))
+                    ->getZExtValue()));
----------------
Outline this code into a separate helper function


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:2846-2853
+      bool IsConsecutive = true;
+      for (int I = 0, E = VL.size(); I < E; I++) {
+        int Index =
+            cast<ConstantInt>(cast<InsertElementInst>(VL[I])->getOperand(2))
+                ->getZExtValue() -
+            MinIndex;
+        IsConsecutive &= (I == Index);
----------------
Beter to turn this into lambda


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3505
     ScalarTy = CI->getOperand(0)->getType();
+  else if (InsertElementInst *IE = dyn_cast<InsertElementInst>(VL[0]))
+    ScalarTy = IE->getOperand(1)->getType();
----------------
`auto *IE`


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3639
+      InstructionCost Cost = 0;
+      auto *SrcVecTy = cast<FixedVectorType>(VL[0]->getType());
+
----------------
`VL[0]`->`VL0`?


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3650-3656
+      // TODO: fix getShuffleCost() for SK_InsertSubvector
+      // and uncomment this to accurately tune costs. For typical cases
+      // this should be zero when subvector is whole register.
+      // if (SrcVecTy->getNumElements() != VL.size()) {
+      //   Cost += TTI->getShuffleCost(TargetTransformInfo::SK_InsertSubvector,
+      //                               SrcVecTy, None, MinIndex, VecTy);
+      // }
----------------
I think this should be uncommented, otherwise, we are too optimistic about the cost.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3975
+  // Vectorizing inserts of gathered values is redundant.
+  if (isa<InsertElementInst>(VectorizableTree[0]->Scalars[0]) &&
+      VectorizableTree[1]->State == TreeEntry::NeedToGather)
----------------
I think better to check that all elements are `InsertElementInst` and move this check closer  to the end of the function


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4204
+    if (auto *Insert = dyn_cast<InsertElementInst>(EU.Scalar)) {
+      auto &Scalars = getTreeEntry(EU.Scalar)->Scalars;
+      SmallVector<int> Mask(Scalars.size(), -1);
----------------
Better to create a set here


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4205
+      auto &Scalars = getTreeEntry(EU.Scalar)->Scalars;
+      SmallVector<int> Mask(Scalars.size(), -1);
+      int MinIndex = INT_MAX;
----------------
Use `UndefMaskElem` instead of `-1`


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4227-4228
+        auto *VecTy = cast<FixedVectorType>(EU.Scalar->getType());
+        ExtractCost += TTI->getShuffleCost(
+            TargetTransformInfo::SK_PermuteSingleSrc, VecTy, Mask);
+      }
----------------
Check for the kind of permutation in `Mask`, it may be an Identity of a Reverse kind.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4575
     ScalarTy = Store->getValueOperand()->getType();
+  else if (InsertElementInst *IE = dyn_cast<InsertElementInst>(VL0))
+    ScalarTy = IE->getOperand(1)->getType();
----------------
`auto *IE`


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4636-4678
+    case Instruction::InsertElement: {
+      Builder.SetInsertPoint(VL0);
+      Value *V = vectorizeTree(E->getOperand(0));
+
+      const int NumElts =
+          cast<FixedVectorType>(VL0->getType())->getNumElements();
+      const int NumScalars = E->Scalars.size();
----------------
Can we leave it to InstCombiner and return the original insrtelement instruction as a result?


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:5163
+      // Generate extract shuffle when Scalar is not the last insert
+      SmallVector<int> Mask(E->Scalars.size(), -1);
+      int MinIndex = INT_MAX;
----------------
`UndefMaskElem`


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:5189-5190
+
+      LLVM_DEBUG(dbgs() << "SLP: Replaced:" << *User << ".\n");
+    } else if (auto *VecI = dyn_cast<Instruction>(Vec)) {
       if (PHINode *PH = dyn_cast<PHINode>(User)) {
----------------
```
  continue;
}
if (...) {
```


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:6525-6527
+      ArrayRef<Value *> Ops = InsertUses.empty()
+                                  ? VL.slice(I, OpsWidth)
+                                  : InsertUses.slice(I, OpsWidth);
----------------
Why not just pass `InsertUses` as ёМДё here and drop `InsertUses` parameter completely?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98714/new/

https://reviews.llvm.org/D98714



More information about the llvm-commits mailing list