[PATCH] D76956: [TTI][SLP] Add TTI interface to estimate cost of chain of vector inserts/extracts.

Thu Apr 9 09:43:45 PDT 2020

ABataev added inline comments.

================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/resched.ll:22-29
+; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <8 x i32> undef, i32 [[CONV31_I]], i32 0
+; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <8 x i32> [[TMP1]], i32 [[CONV31_I]], i32 1
+; CHECK-NEXT:    [[TMP3:%.*]] = insertelement <8 x i32> [[TMP2]], i32 [[CONV31_I]], i32 2
+; CHECK-NEXT:    [[TMP4:%.*]] = insertelement <8 x i32> [[TMP3]], i32 [[CONV31_I]], i32 3
+; CHECK-NEXT:    [[TMP5:%.*]] = insertelement <8 x i32> [[TMP4]], i32 [[CONV31_I]], i32 4
+; CHECK-NEXT:    [[TMP6:%.*]] = insertelement <8 x i32> [[TMP5]], i32 [[CONV31_I]], i32 5
+; CHECK-NEXT:    [[TMP7:%.*]] = insertelement <8 x i32> [[TMP6]], i32 [[CONV31_I]], i32 6
----------------
spatel wrote:
> vdmitrie wrote:
> > spatel wrote:
> > > Did SLP fail to recognize that this is a splat shuffle? I would have expected it to produce splat IR:
> > > (shuffle (insert X, 0), zeroinitializer)
> > > ...instead of a chain of inserts.
> > No. It did not fail to recognize a splat. As I see from code single element is not shuffled deliberately:
> > 
> > Value *BoUpSLP::vectorizeTree(ArrayRef<Value *> VL) {
> > ...
> >     // Do not shuffle single element or if number of unique values is not power
> >     // of 2.
> >     if (UniqueValues.size() == VL.size() || UniqueValues.size() <= 1 ||
> >         !llvm::isPowerOf2_32(UniqueValues.size()))
> >       ReuseShuffleIndicies.clear();
> > ...
> > 
> I tried to step through the debug spew from SLP, but I can't tell what is happening on this example. I only see a call to getGatherCost() at current line 3318 of SLPVectorizer.cpp, so I thought that is the point where we check for a splat. 
> 
> I don't understand this part of the model/usage very well so others should review this patch, but there's still a concern here: ideally, SLP should not be producing this chain of inserts if it's a splat op. I don't know if that changes how we view the diff for the cost model.
I think, SLP recognizes splat here, but models it through gather, and relies on the InstCombiner to transform it into a real shuffle.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76956/new/

https://reviews.llvm.org/D76956