[PATCH] D101916: [LoopVectorize] Fix crash for predicated instructions with scalable VF

Tue May 18 01:50:47 PDT 2021

sdesmalen added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5605

+  if (loopHasScalarWithPredication(MaxScalableVF,isScalarEpilogueAllowed())) {
+    reportVectorizationInfo(
----------------
CarolineConcatto wrote:
> sdesmalen wrote:
> > Should this be `!isScalarEpilogueAllowed()`? i.e. if a scalar epilogue loop is not allowed, then we know predication is required.
> Hey Sander,
> I may be wrong, but if the loop allows a scalar epilogue then it should check whether the instruction has a `!mayDivideByZero(*I)`. If it does not allow a scalar epilogue then we do not need to check the instruction, because the loop will not scalarize the epilogue.
> 
> 
> ```
>     // If predication is used for this block and the operation would otherwise
>     // be guarded, then this requires scalarizing.
>     if (blockNeedsPredication(I->getParent()) || VectorizeWithPredication)
>       return !mayDivideByZero(*I);
> ```
If the loop requires no predication, then a divide by zero in the instruction is a problem in the user's code, and the user should have avoided that case. If the loop requires predication (either because the user added some `if (condition)` around the divide, or because the compiler has chosen to fold the tail loop into the vector body and uses predication to enable/disable the lanes), then the LV must guarantee that the program's behaviour does not change after vectorization. The LV cannot handle this case yet, so it has to fall back on scalarization.

So, if the LV decided to use predication where no predication was needed before and the instruction may divide by zero, then it requires scalarization in order not to change the behaviour of the original program.

The question to ask is "does the loop have instructions that require predication, given that we need predication to handle the tail loop". The "do we need predication to handle the tail loop" part is `true` only when the scalar epilogue is not allowed, because then the tail loop is folded into the main vector body.
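
To make that condition concrete, here is a minimal sketch of the decision as I read it. The names `LoopSketch`, `foldTailByMasking` and `requiresScalarization` are made up for illustration and are not the LoopVectorize interface. It also assumes, as a simplification, that the tail is folded exactly when no scalar epilogue is allowed.

```cpp
// Hypothetical model of the loop properties discussed above; these names
// do not exist in LoopVectorize.cpp.
struct LoopSketch {
  bool BlockNeedsPredication; // e.g. the divide sits under `if (condition)`
  bool ScalarEpilogueAllowed; // false => the tail is folded into the vector body
};

// Simplifying assumption: the tail loop is folded (and therefore needs
// predication) exactly when a scalar epilogue is not allowed.
bool foldTailByMasking(const LoopSketch &L) {
  return !L.ScalarEpilogueAllowed;
}

// A possibly-trapping divide must be scalarized only when it would execute
// under a predicate, whether written by the user or introduced by tail folding.
bool requiresScalarization(const LoopSketch &L, bool MayDivideByZero) {
  bool Predicated = L.BlockNeedsPredication || foldTailByMasking(L);
  return Predicated && MayDivideByZero;
}
```

With this sketch, an unpredicated loop that keeps its scalar epilogue never asks for scalarization, while the same loop with the epilogue disallowed does whenever the divide may trap.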

For example, 10 iterations with VF=4, without predication (and with a scalar tail):

    1st vector iteration handles 0..3
    2nd vector iteration handles 4..7
    scalar tail loop handles 8..9

Alternatively, 10 iterations with VF=4 and predication:

    1st vector iteration handles 0..3, with predicate <true, true, true, true>
    2nd vector iteration handles 4..7, with predicate <true, true, true, true>
    3rd vector iteration handles 8, 9, xx, xx with predicate <true, true, false, false>
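
The predicates in the folded-tail example can be reproduced with a small sketch. `laneMask` and `foldedIterations` are hypothetical helpers written only for this illustration, using a fixed VF as in the example rather than a scalable one.

```cpp
#include <cstddef>
#include <vector>

// Predicate for the vector iteration whose first lane handles scalar
// iteration `Start`: a lane is active while Start + Lane is still inside
// the trip count. (Hypothetical helper, not part of LLVM.)
std::vector<bool> laneMask(std::size_t Start, std::size_t TripCount,
                           std::size_t VF) {
  std::vector<bool> Mask(VF);
  for (std::size_t Lane = 0; Lane < VF; ++Lane)
    Mask[Lane] = Start + Lane < TripCount;
  return Mask;
}

// Number of vector iterations when the tail is folded into the vector
// body: TripCount / VF, rounded up.
std::size_t foldedIterations(std::size_t TripCount, std::size_t VF) {
  return (TripCount + VF - 1) / VF;
}
```

For a trip count of 10 and VF=4, `foldedIterations(10, 4)` gives 3 and `laneMask(8, 10, 4)` gives the `<true, true, false, false>` predicate of the third iteration. A divide in one of the two disabled lanes is the case that must not trap.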

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101916/new/

https://reviews.llvm.org/D101916