[PATCH] D98714: [SLP] Add insertelement instructions to vectorizable tree

Thu May 6 09:25:06 PDT 2021

ABataev added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4670-4671
+        // Create shuffle to resize vector
+        for (unsigned I = 0; I < NumElts; I++)
+          Mask[I] = (I < NumScalars) ? I : UndefMaskElem;
+        V = Builder.CreateShuffleVector(V, UndefValue::get(V->getType()), Mask);
----------------
anton-afanasyev wrote:
> ABataev wrote:
> > ```
> > SmallVector<int, 16> Mask(NumElts, UndefMaskElem);
> > std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0); 
> > ```
> > 
> > Also, I think just `std::iota(Mask.begin(), Mask.end(), 0);` should work too.
> Thanks, changed to `std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);`
> 
> Yes, `std::iota(Mask.begin(), Mask.end(), 0);` gives the same result, but wouldn't it be lowered to redundant code?
`std::iota(Mask.begin(), Mask.end(), 0);` produces the same code as `std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);`, but only if `Mask.size() <= NumScalars * 2`. The two shuffle operands together provide only `2 * NumScalars` lanes, so larger mask indices are out of range and the compiler may crash in some cases. So it is better to keep `std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);`.
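
To illustrate the range constraint, here is a minimal standalone sketch that uses plain `std::vector` instead of LLVM types. The helper names `buildResizeMask` and `isValidMask` are hypothetical and not part of the patch, and `UndefMaskElem` is redefined locally as `-1` to mirror the LLVM sentinel.

```cpp
#include <cassert>
#include <iterator>
#include <numeric>
#include <vector>

// Assumption: local stand-in for LLVM's UndefMaskElem sentinel.
constexpr int UndefMaskElem = -1;

// Hypothetical helper: lanes [0, NumScalars) select the matching source
// lane, all remaining lanes stay undefined.
std::vector<int> buildResizeMask(unsigned NumElts, unsigned NumScalars) {
  assert(NumScalars <= NumElts && "can only widen the vector");
  std::vector<int> Mask(NumElts, UndefMaskElem);
  std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);
  return Mask;
}

// Hypothetical check: a shuffle of two NumScalars-wide operands may only
// reference lanes in [0, 2 * NumScalars), or be undefined.
bool isValidMask(const std::vector<int> &Mask, unsigned NumScalars) {
  for (int M : Mask) {
    if (M == UndefMaskElem)
      continue;
    if (M < 0 || static_cast<unsigned>(M) >= 2 * NumScalars)
      return false;
  }
  return true;
}
```

With `NumElts = 8` and `NumScalars = 2`, the partial `iota` mask is always in range, while a full `iota` over all 8 lanes references lanes 4..7, which do not exist in either operand.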

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4672-4680
+        V = Builder.CreateShuffleVector(V, UndefValue::get(V->getType()), Mask);
+
+        for (unsigned I = 0; I < NumElts; I++)
+          Mask[I] =
+              (I < MinIndex || I >= MaxIndex) ? I : NumElts - MinIndex + I;
+
+        V = Builder.CreateShuffleVector(
----------------
anton-afanasyev wrote:
> anton-afanasyev wrote:
> > ABataev wrote:
> > > ABataev wrote:
> > > > Can we use `ShuffleBuilder` here?
> > > Did you include these costs in the cost model?
> > They are included in `getScalarizationOverhead()`.
> Yes, I thought about `ShuffleBuilder`, but here we just need to create two shuffles of a special kind for vector resizing. That would require extending `ShuffleInstructionBuilder`, and I don't think it's worth it.
I rather doubt it. First, you subtract the scalarization overhead cost from the vector cost, but here you also need to add the costs of the subvector insert and of the permutation of 2 vectors.
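
For reference, a minimal scalar model of what the second mask does. This is a sketch under assumptions: `buildInsertMask` and `applyShuffle` are hypothetical helpers, and since the operands of the final `CreateShuffleVector` call are cut off in the quoted diff, the operand order (`A` is the destination vector, `B` is the resized vector) is assumed rather than taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same formula as in the quoted diff: lanes outside [MinIndex, MaxIndex)
// keep lane I of the first operand; lanes inside read lane (I - MinIndex)
// of the second operand, encoded as NumElts + (I - MinIndex).
std::vector<int> buildInsertMask(unsigned NumElts, unsigned MinIndex,
                                 unsigned MaxIndex) {
  std::vector<int> Mask(NumElts);
  for (unsigned I = 0; I < NumElts; I++)
    Mask[I] = (I < MinIndex || I >= MaxIndex) ? I : NumElts - MinIndex + I;
  return Mask;
}

// Hypothetical scalar model of a two-source shuffle: indices below A.size()
// read from A, the rest read from B. Assumes all mask entries are in range.
std::vector<int> applyShuffle(const std::vector<int> &A,
                              const std::vector<int> &B,
                              const std::vector<int> &Mask) {
  std::vector<int> Result;
  Result.reserve(Mask.size());
  for (int M : Mask) {
    std::size_t Idx = static_cast<std::size_t>(M);
    Result.push_back(Idx < A.size() ? A[Idx] : B[Idx - A.size()]);
  }
  return Result;
}
```

With `NumElts = 8`, `MinIndex = 2`, `MaxIndex = 4`, the mask is `{0, 1, 8, 9, 4, 5, 6, 7}`: a subvector insert expressed as a permutation of two vectors, which are the two operations whose costs are in question here.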

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98714/new/

https://reviews.llvm.org/D98714