[PATCH] D79783: [LoopVectorize] Fallback to a scalar epilogue when TP fails

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 21 01:36:18 PDT 2020


Pierre-vh marked an inline comment as done.
Pierre-vh added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5027
+                         "hint/switch and using a scalar epilogue instead.\n");
+    ScalarEpilogueStatus = CM_ScalarEpilogueAllowed;
+    return MaxVF;
----------------
Pierre-vh wrote:
> Ayal wrote:
> > Ayal wrote:
> > > Pierre-vh wrote:
> > > > Ayal wrote:
> > > > > Why change ScalarEpilogueStatus?
> > > > (Sorry, forgot to post this comment)
> > > > It seems to be needed, my tests don't pass if unless I change it (they have conflicting ASM for both run lines, and the content is different)
> > > Why change ScalarEpilogueStatus?
> > Add `-check-prefix` to the FileCheck of one of the runs?
> > https://llvm.org/docs/CommandGuide/FileCheck.html#cmdoption-filecheck-check-prefix
> This is what's added when I don't change `ScalarEpilogueStatus`, and it's set to `ScalarEpilogueNotNeededUsePredicate`:
> 
> ```
> ; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> undef, i32 [[OFFSET_IDX]], i32 0
> ; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> undef, <4 x i32> zeroinitializer
> ; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i32> [[BROADCAST_SPLAT]], <i32 0, i32 -1, i32 -2, i32 -3>
> ```
> 
> It's because of this in `InnerLoopVectorizer::widenIntOrFpInduction` in LoopVectorize.cpp:1920
> ```
>   // All IV users are scalar instructions, so only emit a scalar IV, not a
>   // vectorised IV. Except when we tail-fold, then the splat IV feeds the
>   // predicate used by the masked loads/stores.
>   if (!Cost->isScalarEpilogueAllowed()) // ScalarEpilogueStatus == CM_ScalarEpilogueAllowed
>     CreateSplatIV(ScalarIV, Step);
> ```
> I am guessing that this should be using `Cost->foldTailByMasking()` instead, which returns the `FoldTailByMasking` boolean instead of looking at `ScalarEpilogueStatus`.
> 
After looking a bit more into it:
  - Changing `!Cost->isScalarEpilogueAllowed()` to `Cost->foldTailByMasking()` does not solve the issue entirely, it causes multiple test failures.
  - The `INDUCTION` variable is unused, so this bit of code definitely shouldn't be there.
Solution is either to change the `ScalarEpilogueStatus`, or find out why `Cost->foldTailByMasking()` doesn't work as expected.
The tests that fail are:
```
Failing Tests (3):
  LLVM :: Transforms/LoopVectorize/X86/constant-fold.ll
  LLVM :: Transforms/LoopVectorize/X86/vect.omp.force.small-tc.ll
  LLVM :: Transforms/LoopVectorize/pr44488-predication.ll
```

Which all seem to expect this "INDUCTION" variable, one way or another. However, interestingly, the tests don't use the `INDUCTION` instruction at all, so maybe it's a bug? (Everytime there's a failure in one of those test, it's in a place where those 3 lines are expected, but not used at all)



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79783/new/

https://reviews.llvm.org/D79783





More information about the llvm-commits mailing list