[PATCH] D98714: [SLP] Add insertelement instructions to vectorizable tree

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 10 05:52:40 PDT 2021


ABataev added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4215-4217
+    // No extract cost for vector "scalar"
+    if (isa<FixedVectorType>(EU.Scalar->getType()))
+      continue;
----------------
I don't think this is correct. I think you need to use code from my patch D101555 for better cost estimation here.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:5102
     Value *Lane = Builder.getInt32(ExternalUse.Lane);
+    auto extractAndExtendIfNeeded = [&](Value *Vec) {
+      if (Scalar->getType() != Vec->getType()) {
----------------
`ExtractAndExtendIfNeeded`


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:2883
+
+      if (IsConsecutive) {
+        TreeEntry *TE = newTreeEntry(VL, Bundle /*vectorized*/, S, UserTreeIdx);
----------------
anton-afanasyev wrote:
> ABataev wrote:
> > If not consecutive, we can include the cost of single permute and still vectorize it, no?
> Inserts here are coming from `findBuildAggregate()`, being already sorted by index and therefore consecutive, the only exclusion is that they can have gaps, so we check it here. I don't think we should process this rare case within this patch.
Can we check for the gaps during the analysis of the build vector sequence instead?


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3666-3667
+      APInt DemandedElts = APInt::getNullValue(NumElts);
+      for (auto *V : VL)
+        DemandedElts.setBit(*getInsertIndex(V, 0));
+
----------------
anton-afanasyev wrote:
> ABataev wrote:
> > Hm, I think you need to normalize insert indices here, the `MinIndex` (offset) might be non `0` but should be 0. Plus, I think, need to add (subtract from vectcost) a cost of subvector insertion from Index `MinIndex` (if `MinIndex != 0`)
> Why do we need to normalize insert indices? `DemandElts` is passed to `getScalarizationOverhead()` which just summarizes cost of all eliminated insertelements (with non-normalized operand indices).
Ah, you're getting it for SrcVecTy, got it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98714/new/

https://reviews.llvm.org/D98714



More information about the llvm-commits mailing list