[PATCH] D30200: [SLP] Fix for PR31880: shuffle and vectorize repeated scalar ops on extracted elements

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 27 08:36:19 PDT 2017

ABataev marked an inline comment as done.
ABataev added inline comments.

Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:241
+  // have blending of 2 vectors.
+  return {!UsedElements1.any(), Vec2
+                                    ? TargetTransformInfo::SK_PermuteTwoSrc
RKSimon wrote:
> mkuper wrote:
> > ABataev wrote:
> > > mkuper wrote:
> > > > mkuper wrote:
> > > > > I think if the order of extracts and inserts just happens to be the same, we'll overestimate the cost. But that's fine for now. Could you check, and if we indeed overestimate, add a FIXME?
> > > > When can !UsedElements1.any() be true?
> > > Yes, we're overestimated. Reworked it a bit. If `UsedElements1.any()` is `false`, it is `SK_Alternate` (i.e. blending, because we're not crossing lanes in `extractelement` instructions for different vectors), otherwise it is still estimated as `SK_PermuteTwoSrc` or `SK_PermuteSingleSrc`.
> > Oh, I'm sorry, I misread &= as |=.
> > 
> > This still looks wrong, though. The way I understand it, right now SK_Alternate (which usually, for x86, maps to a blend - but the two are not equivalent) has very specific requirements for the mask. See isAlternateVectorMask() in CostModel.cpp. I don't think your code matches that.
> > 
> > Adding Elena  who should probably know better than I do.
> @mkuper is right - SK_Alternate is not a simple blend, it should only match shuffles which alternate between sequential elements from the 2 vectors (e.g. 0, 5, 2, 7). I've no idea why we went for that instead of a general SK_Blend but that's where we are...
Reworked this part

Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1789
           // The same, if have only one user, it will be vectorized for sure.
-          if (E->hasOneUse() ||
-              std::all_of(E->user_begin(), E->user_end(), [this](User *U) {
-                return ScalarToTreeEntry.count(U) > 0;
-              }))
+          if (areAllUsersVectorized(E))
             // Take credit for instruction that will become dead.
RKSimon wrote:
> I think this move to BoUpSLP::areAllUsersVectorized can be done as a NFC pre-commit.
Ok, will do


More information about the llvm-commits mailing list