[PATCH] D79783: [LoopVectorize] Fallback to a scalar epilogue when TP fails

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun May 24 04:16:46 PDT 2020


Ayal added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5027
+                         "hint/switch and using a scalar epilogue instead.\n");
+    ScalarEpilogueStatus = CM_ScalarEpilogueAllowed;
+    return MaxVF;
----------------
Pierre-vh wrote:
> Pierre-vh wrote:
> > Ayal wrote:
> > > Ayal wrote:
> > > > Pierre-vh wrote:
> > > > > Ayal wrote:
> > > > > > Why change ScalarEpilogueStatus?
> > > > > (Sorry, forgot to post this comment)
> > > > > It seems to be needed, my tests don't pass if unless I change it (they have conflicting ASM for both run lines, and the content is different)
> > > > Why change ScalarEpilogueStatus?
> > > Add `-check-prefix` to the FileCheck of one of the runs?
> > > https://llvm.org/docs/CommandGuide/FileCheck.html#cmdoption-filecheck-check-prefix
> > This is what's added when I don't change `ScalarEpilogueStatus`, and it's set to `ScalarEpilogueNotNeededUsePredicate`:
> > 
> > ```
> > ; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> undef, i32 [[OFFSET_IDX]], i32 0
> > ; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> undef, <4 x i32> zeroinitializer
> > ; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i32> [[BROADCAST_SPLAT]], <i32 0, i32 -1, i32 -2, i32 -3>
> > ```
> > 
> > It's because of this in `InnerLoopVectorizer::widenIntOrFpInduction` in LoopVectorize.cpp:1920
> > ```
> >   // All IV users are scalar instructions, so only emit a scalar IV, not a
> >   // vectorised IV. Except when we tail-fold, then the splat IV feeds the
> >   // predicate used by the masked loads/stores.
> >   if (!Cost->isScalarEpilogueAllowed()) // ScalarEpilogueStatus == CM_ScalarEpilogueAllowed
> >     CreateSplatIV(ScalarIV, Step);
> > ```
> > I am guessing that this should be using `Cost->foldTailByMasking()` instead, which returns the `FoldTailByMasking` boolean instead of looking at `ScalarEpilogueStatus`.
> > 
> After looking a bit more into it:
>   - Changing `!Cost->isScalarEpilogueAllowed()` to `Cost->foldTailByMasking()` does not solve the issue entirely, it causes multiple test failures.
>   - The `INDUCTION` variable is unused, so this bit of code definitely shouldn't be there.
> Solution is either to change the `ScalarEpilogueStatus`, or find out why `Cost->foldTailByMasking()` doesn't work as expected.
> The tests that fail are:
> ```
> Failing Tests (3):
>   LLVM :: Transforms/LoopVectorize/X86/constant-fold.ll
>   LLVM :: Transforms/LoopVectorize/X86/vect.omp.force.small-tc.ll
>   LLVM :: Transforms/LoopVectorize/pr44488-predication.ll
> ```
> 
> Which all seem to expect this "INDUCTION" variable, one way or another. However, interestingly, the tests don't use the `INDUCTION` instruction at all, so maybe it's a bug? (Everytime there's a failure in one of those test, it's in a place where those 3 lines are expected, but not used at all)
> 
So now we know why ScalarEpilogueStatus needed to change...

@SjoerdMeijer tried to remove that presumably redundant call to CreateSplatIV() in D78911, recommitting it later under ScalarEpilogueStatus in rG9529597cf4562c64034943dacc29a4ff4fe93d86. We're still trying to figure out how to remove it: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200518/784362.html

> I am guessing that this should be using Cost->foldTailByMasking() instead, which returns the FoldTailByMasking boolean instead of looking at ScalarEpilogueStatus.

guess so too, along with needed test fixings. Sjoerd, what do you think?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79783/new/

https://reviews.llvm.org/D79783





More information about the llvm-commits mailing list