[PATCH] D98714: [SLP] Add insertelement instructions to vectorizable tree

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 7 04:09:59 PDT 2021


ABataev added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:4672-4680
+        V = Builder.CreateShuffleVector(V, UndefValue::get(V->getType()), Mask);
+
+        for (unsigned I = 0; I < NumElts; I++)
+          Mask[I] =
+              (I < MinIndex || I >= MaxIndex) ? I : NumElts - MinIndex + I;
+
+        V = Builder.CreateShuffleVector(
----------------
anton-afanasyev wrote:
> ABataev wrote:
> > anton-afanasyev wrote:
> > > ABataev wrote:
> > > > anton-afanasyev wrote:
> > > > > anton-afanasyev wrote:
> > > > > > ABataev wrote:
> > > > > > > ABataev wrote:
> > > > > > > > Can we use `ShuffleBuilder` here?
> > > > > > > Did you include these costs in the cost model?
> > > > > > They are included in `getScalarizationOverhead()`.
> > > > > Yes, I thought about `ShuffleBuilder`, but here we just need to create two shuffles of a special kind for vector resizing. Using it would require extending `ShuffleInstructionBuilder`, and I don't think that's worth it.
> > > > I rather doubt it. First, you subtract the scalarization overhead cost from the vector cost, but here you need to add the costs of the subvector insert and the permutation of 2 vectors.
> > > 
> > > These subvector inserts are actually lowered to no-ops, so they cost nothing. We need this code when processing a big vector of scalars part by part, but every chunk fits a whole vector register (condition `bits >= MinVecRegSize`), so no actual insertion takes place. The result, consisting of several vector registers, is then returned.
> > This might not be true for some targets/code patterns. Plus, the second pattern is a permutation/combination of 2 vectors; its cost is at least 1.
> 
> Both of these shuffles together make up a single "insert" operation; we need the first one to widen the source vector, since shufflevector requires both operands to have the same size.
> 
> We can add the cost of these shuffles, but this prevents several vectorizations that were performed before, since this cost is redundant (maybe `TTI->getShuffleCost()` should be tuned for `TargetTransformInfo::SK_InsertVector`?)
Yes, we need to check that the vectorized code is really profitable and, if so, tune the cost model.
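
For readers following the thread, the two shuffle masks under discussion can be sketched in standalone C++ (no LLVM headers). This is only an illustration under stated assumptions: `buildWidenMask` and `buildInsertMask` are hypothetical helper names that do not appear in the patch, the widening mask is a guess at what the first shuffle needs (the quoted diff does not show it), and the second shuffle is presumed to take the destination vector as its first operand and the widened subvector as its second, since the call is truncated in the quoted hunk. Only the formula inside `buildInsertMask` is taken from the diff.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not in the patch): a mask that widens a vector of
// SubElts lanes to NumElts lanes. Lanes beyond SubElts are set to -1, the
// value LLVM shuffle masks use for an undefined lane.
std::vector<int> buildWidenMask(unsigned SubElts, unsigned NumElts) {
  assert(SubElts <= NumElts && "subvector must fit into the wide vector");
  std::vector<int> Mask(NumElts, -1);
  for (unsigned I = 0; I < SubElts; ++I)
    Mask[I] = static_cast<int>(I);
  return Mask;
}

// Hypothetical helper: the mask for the second shuffle, using the formula
// from the quoted diff. Indices below NumElts select from the first operand,
// indices from NumElts upward select from the second operand.
std::vector<int> buildInsertMask(unsigned MinIndex, unsigned MaxIndex,
                                 unsigned NumElts) {
  assert(MinIndex <= MaxIndex && MaxIndex <= NumElts && "bad insert range");
  std::vector<int> Mask(NumElts);
  for (unsigned I = 0; I < NumElts; ++I)
    Mask[I] = static_cast<int>(
        (I < MinIndex || I >= MaxIndex) ? I : NumElts - MinIndex + I);
  return Mask;
}
```

With `NumElts = 8`, `MinIndex = 2` and `MaxIndex = 6`, `buildInsertMask` yields `{0, 1, 8, 9, 10, 11, 6, 7}`: lanes 2 through 5 are taken from lanes 0 through 3 of the second operand, and all other lanes are kept from the first. Whether a target lowers this to a no-op or to a real two-source permute is the cost-model question raised above.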


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98714/new/

https://reviews.llvm.org/D98714


