[PATCH] D146199: [LoopVectorize] Don't tail-fold for scalable VFs when there is no scalar tail

Wed Mar 29 05:56:42 PDT 2023

ABataev added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5173
+
+  if (MaxPowerOf2RuntimeVF) {
+    assert((UserVF.isNonZero() || isPowerOf2_32(*MaxPowerOf2RuntimeVF)) &&
----------------
david-arm wrote:
> @ABataev @sdesmalen I am happy to add an extra check here such as
> 
>   if (MaxPowerOf2RuntimeVF && *MaxPowerOf2RuntimeVF > 0) {
> 
> but I think there may be no way to add a test to defend the non-zero behaviour. You'd need to have one of these scenarios:
> 
> 1. computeFeasibleMaxVF returns 0 for both fixed-width and scalable VFs. Not sure how this could ever happen?
> 2. computeFeasibleMaxVF returns 0 for fixed-width and non-zero for scalable VFs, but there is no vscale max or vscale is not a power of 2. This may be possible in theory, but difficult to find an actual test that exhibits this behaviour.
> 
> Regardless, even if I can't write a test for this I am happy to add a check!
I think it can be reproduced if:
1. Legal->getMaxSafeVectorWidthInBits() / WidestType == 0 (i.e. there is a dependency in the loop, and WidestType is wider than the maximum safe vector width that dependency allows).
2. UserVF is specified as a fixed-width VF greater than 0.
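
To illustrate condition 1 and the guard proposed above, here is a minimal standalone sketch. It is not the LoopVectorize code: maxSafeElements and hasNoScalarTail are hypothetical helpers, the trip-count remainder test is a simplified assumption about how the VF is used, and WidestTypeBits is assumed to be non-zero.

```cpp
#include <optional>

// Hypothetical stand-in for Legal->getMaxSafeVectorWidthInBits() / WidestType.
// Unsigned integer division truncates, so the result is 0 whenever the widest
// type is wider than the maximum safe vector width.
// Assumes WidestTypeBits != 0.
unsigned maxSafeElements(unsigned MaxSafeVectorWidthInBits,
                         unsigned WidestTypeBits) {
  return MaxSafeVectorWidthInBits / WidestTypeBits;
}

// Hypothetical, simplified model of the guarded check: only use the VF when
// it is both present and non-zero, so a zero VF never reaches the remainder.
bool hasNoScalarTail(std::optional<unsigned> MaxPowerOf2RuntimeVF,
                     unsigned TripCount) {
  if (MaxPowerOf2RuntimeVF && *MaxPowerOf2RuntimeVF > 0)
    return TripCount % *MaxPowerOf2RuntimeVF == 0;
  return false;
}
```

With these definitions, maxSafeElements(32, 64) yields 0, and hasNoScalarTail(std::optional<unsigned>(0), 16) returns false instead of taking a remainder by zero.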

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146199/new/

https://reviews.llvm.org/D146199