[PATCH] D98714: [SLP] Add insertelement instructions to vectorizable tree
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu May 6 09:25:06 PDT 2021
ABataev added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4670-4671
+  // Create shuffle to resize vector
+  for (unsigned I = 0; I < NumElts; I++)
+    Mask[I] = (I < NumScalars) ? I : UndefMaskElem;
+  V = Builder.CreateShuffleVector(V, UndefValue::get(V->getType()), Mask);
----------------
anton-afanasyev wrote:
> ABataev wrote:
> > ```
> > SmallVector<int, 16> Mask(NumElts, UndefMaskElem);
> > std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);
> > ```
> >
> > Also, I think just `std::iota(Mask.begin(), Mask.end(), 0);` should work too.
> Thanks, changed to `std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);`
>
> Yes, `std::iota(Mask.begin(), Mask.end(), 0);` gives the same result, but wouldn't it lead to redundant code being lowered?
`std::iota(Mask.begin(), Mask.end(), 0);` will produce the same code as `std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);`, but only if `Mask.size() <= NumScalars * 2`, because every mask element must index into one of the two `NumScalars`-wide input vectors. Otherwise the compiler may crash in some cases. So it is better to keep `std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);`.
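
To illustrate the point about mask validity, here is a minimal standalone sketch that does not use LLVM at all. The names `kUndefMaskElem`, `makePartialMask`, `makeFullMask` and `isValidShuffleMask` are hypothetical, and `kUndefMaskElem` is assumed to be -1 as a stand-in for `UndefMaskElem`. It builds both masks and checks them against the rule that a two-operand shuffle may only reference lanes of its two inputs.

```cpp
#include <cassert>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical stand-in for LLVM's UndefMaskElem (assumed to be -1 here).
constexpr int kUndefMaskElem = -1;

// First NumScalars lanes are 0..NumScalars-1, the remaining lanes are undef.
std::vector<int> makePartialMask(unsigned NumElts, unsigned NumScalars) {
  assert(NumScalars <= NumElts && "cannot select more lanes than the mask has");
  std::vector<int> Mask(NumElts, kUndefMaskElem);
  std::iota(Mask.begin(), std::next(Mask.begin(), NumScalars), 0);
  return Mask;
}

// Every lane I is set to I.
std::vector<int> makeFullMask(unsigned NumElts) {
  std::vector<int> Mask(NumElts);
  std::iota(Mask.begin(), Mask.end(), 0);
  return Mask;
}

// A shuffle of two inputs with NumInputElts lanes each accepts undef lanes
// or indices in the range [0, 2 * NumInputElts).
bool isValidShuffleMask(const std::vector<int> &Mask, unsigned NumInputElts) {
  for (int Idx : Mask) {
    if (Idx == kUndefMaskElem)
      continue;
    if (Idx < 0 || static_cast<unsigned>(Idx) >= 2 * NumInputElts)
      return false;
  }
  return true;
}
```

With `NumElts = 8` and `NumScalars = 2`, the partial mask stays valid, while the full mask references lanes 4..7, which neither 2-wide input has.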
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4672-4680
+  V = Builder.CreateShuffleVector(V, UndefValue::get(V->getType()), Mask);
+
+  for (unsigned I = 0; I < NumElts; I++)
+    Mask[I] =
+        (I < MinIndex || I >= MaxIndex) ? I : NumElts - MinIndex + I;
+
+  V = Builder.CreateShuffleVector(
----------------
anton-afanasyev wrote:
> anton-afanasyev wrote:
> > ABataev wrote:
> > > ABataev wrote:
> > > > Can we use `ShuffleBuilder` here?
> > > Did you include these costs in the cost model?
> > They are included in `getScalarizationOverhead()`.
> Yes, I thought about `ShuffleBuilder`, but here we just need to create two shuffles of a special kind for vector resizing. That would require extending `ShuffleInstructionBuilder`, and I don't think it's worth it.
I rather doubt it. First, you subtract the scalarization overhead cost from the vector cost, but here you need to add the costs of a subvector insert and a permutation of 2 vectors.
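
For reference, the second mask in the quoted diff can be reproduced outside LLVM with the small sketch below. The function name `makeInsertSubvectorMask` is hypothetical, and the sketch assumes both shuffle operands have `NumElts` lanes, so index `NumElts + K` means lane `K` of the second operand. It only shows which lanes come from which operand, i.e. why this is a two-source permutation. It does not model any cost.

```cpp
#include <cassert>
#include <vector>

// Mirrors the loop from the quoted diff. Lanes outside [MinIndex, MaxIndex)
// keep lane I of the first operand. Lanes inside take lane (I - MinIndex) of
// the second operand, which is numbered NumElts + (I - MinIndex).
std::vector<int> makeInsertSubvectorMask(unsigned NumElts, unsigned MinIndex,
                                         unsigned MaxIndex) {
  assert(MinIndex <= MaxIndex && MaxIndex <= NumElts && "bad insert range");
  std::vector<int> Mask(NumElts);
  for (unsigned I = 0; I < NumElts; ++I)
    Mask[I] = static_cast<int>((I < MinIndex || I >= MaxIndex)
                                   ? I
                                   : NumElts - MinIndex + I);
  return Mask;
}
```

For example, `makeInsertSubvectorMask(8, 2, 6)` yields `{0, 1, 8, 9, 10, 11, 6, 7}`: lanes 2..5 are filled from lanes 0..3 of the second operand, and the rest pass through from the first.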
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D98714/new/
https://reviews.llvm.org/D98714