[PATCH] D30200: [SLP] Fix for PR31880: shuffle and vectorize repeated scalar ops on extracted elements

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jun 25 02:43:24 PDT 2017


RKSimon added a reviewer: RKSimon.
RKSimon added a comment.

A couple of comments, but otherwise I think this looks almost ready to go.



================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:241
+  // have blending of 2 vectors.
+  return {!UsedElements1.any(), Vec2
+                                    ? TargetTransformInfo::SK_PermuteTwoSrc
----------------
mkuper wrote:
> ABataev wrote:
> > mkuper wrote:
> > > mkuper wrote:
> > > > I think if the order of extracts and inserts just happens to be the same, we'll overestimate the cost. But that's fine for now. Could you check, and if we indeed overestimate, add a FIXME?
> > > When can !UsedElements1.any() be true?
> > Yes, we're overestimated. Reworked it a bit. If `UsedElements1.any()` is `false`, it is `SK_Alternate` (i.e. blending, because we're not crossing lanes in `extractelement` instructions for different vectors), otherwise it is still estimated as `SK_PermuteTwoSrc` or `SK_PermuteSingleSrc`.
> Oh, I'm sorry, I misread &= as |=.
> 
> This still looks wrong, though. The way I understand it, right now SK_Alternate (which usually, for x86, maps to a blend - but the two are not equivalent) has very specific requirements for the mask. See isAlternateVectorMask() in CostModel.cpp. I don't think your code matches that.
> 
> Adding Elena  who should probably know better than I do.
@mkuper is right - SK_Alternate is not a simple blend, it should only match shuffles which alternate between sequential elements from the 2 vectors (e.g. 0, 5, 2, 7). I've no idea why we went for that instead of a general SK_Blend but that's where we are...


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1765
+                    cast<ExtractElementInst>(VL[I])->getIndexOperand())
+                    ->getZExtValue());
+          }
----------------
Any way that this can be tied up? Maybe pull out cast<ExtractElementInst>(VL[I])->getIndexOperand())

I hate seeing -> on new lines....


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1789
           // The same, if have only one user, it will be vectorized for sure.
-          if (E->hasOneUse() ||
-              std::all_of(E->user_begin(), E->user_end(), [this](User *U) {
-                return ScalarToTreeEntry.count(U) > 0;
-              }))
+          if (areAllUsersVectorized(E))
             // Take credit for instruction that will become dead.
----------------
I think this move to BoUpSLP::areAllUsersVectorized can be done as a NFC pre-commit.


https://reviews.llvm.org/D30200





More information about the llvm-commits mailing list