[PATCH] D146199: [LoopVectorize] Don't tail-fold for scalable VFs when there is no scalar tail

Thu Mar 23 09:27:16 PDT 2023

sdesmalen added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5155
+  // we choose.
+  std::optional<unsigned> MaxRuntimeVF = std::nullopt;
+  if (MaxFactors.FixedVF.isVector())
----------------
sdesmalen wrote:
> You can write this a bit more compactly using std::max:
> 
>   std::optional<unsigned> MaxPowerOf2RuntimeVF =
>       MaxFactors.FixedVF.getFixedValue();
>   if (MaxFactors.ScalableVF) {
>     std::optional<unsigned> MaxVScale = getMaxVScale(*TheFunction, TTI);
>     if (MaxVScale && TTI.isVScaleKnownToBeAPowerOfTwo()) {
>       MaxPowerOf2RuntimeVF = std::max<unsigned>(
>           *MaxPowerOf2RuntimeVF,
>           *MaxVScale * MaxFactors.ScalableVF.getKnownMinValue());
>     } else
>       MaxPowerOf2RuntimeVF = std::nullopt; // Stick with tail-folding for now.
>   }
> 
Sorry for changing my mind on this, but I wonder if `MaxPowerOf2RuntimeVF` is more appropriate name, since the value is `std::nullopt` if the VF is not a power of 2.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5155-5172
+  std::optional<unsigned> MaxRuntimeVF = std::nullopt;
+  if (MaxFactors.FixedVF.isVector())
+    MaxRuntimeVF = MaxFactors.FixedVF.getFixedValue();
+  if (MaxFactors.ScalableVF) {
+    // If vscale is a power of 2 then we only need to look at the maximum
+    // possible value of vscale and determine the maximum possible scalable
+    // VF. Otherwise for arbitrary values of vscale we'd have to ask if for
----------------
You can write this a bit more compactly using std::max:

  std::optional<unsigned> MaxPowerOf2RuntimeVF =
      MaxFactors.FixedVF.getFixedValue();
  if (MaxFactors.ScalableVF) {
    std::optional<unsigned> MaxVScale = getMaxVScale(*TheFunction, TTI);
    if (MaxVScale && TTI.isVScaleKnownToBeAPowerOfTwo()) {
      MaxPowerOf2RuntimeVF = std::max<unsigned>(
          *MaxPowerOf2RuntimeVF,
          *MaxVScale * MaxFactors.ScalableVF.getKnownMinValue());
    } else
      MaxPowerOf2RuntimeVF = std::nullopt; // Stick with tail-folding for now.
  }

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5159-5163
+    // If vscale is a power of 2 then we only need to look at the maximum
+    // possible value of vscale and determine the maximum possible scalable
+    // VF. Otherwise for arbitrary values of vscale we'd have to ask if for
+    // every possible value of vscale the trip count is a perfect multiple
+    // of the VF.
----------------
I find this comment more confusing than enlightening. I also think that the issue is more that we decide early in the process whether or not to use tail folding, before we get to decide on a VF. We could make a more fine-grained choice if we would make those choices together. But when I read this comment, I'm not really thinking about that.

In any case, I'm happy for the comment to be removed since the code itself is clear enough.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146199/new/

https://reviews.llvm.org/D146199