[PATCH] D146199: [LoopVectorize] Don't tail-fold for scalable VFs when there is no scalar tail

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 23 09:27:16 PDT 2023


sdesmalen added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5155
+  // we choose.
+  std::optional<unsigned> MaxRuntimeVF = std::nullopt;
+  if (MaxFactors.FixedVF.isVector())
----------------
sdesmalen wrote:
> You can write this a bit more compactly using std::max:
> 
>   std::optional<unsigned> MaxPowerOf2RuntimeVF =
>       MaxFactors.FixedVF.getFixedValue();
>   if (MaxFactors.ScalableVF) {
>     std::optional<unsigned> MaxVScale = getMaxVScale(*TheFunction, TTI);
>     if (MaxVScale && TTI.isVScaleKnownToBeAPowerOfTwo()) {
>       MaxPowerOf2RuntimeVF = std::max<unsigned>(
>           *MaxPowerOf2RuntimeVF,
>           *MaxVScale * MaxFactors.ScalableVF.getKnownMinValue());
>     } else
>       MaxPowerOf2RuntimeVF = std::nullopt; // Stick with tail-folding for now.
>   }
> 
Sorry for changing my mind on this, but I wonder if `MaxPowerOf2RuntimeVF` is more appropriate name, since the value is `std::nullopt` if the VF is not a power of 2.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5155-5172
+  std::optional<unsigned> MaxRuntimeVF = std::nullopt;
+  if (MaxFactors.FixedVF.isVector())
+    MaxRuntimeVF = MaxFactors.FixedVF.getFixedValue();
+  if (MaxFactors.ScalableVF) {
+    // If vscale is a power of 2 then we only need to look at the maximum
+    // possible value of vscale and determine the maximum possible scalable
+    // VF. Otherwise for arbitrary values of vscale we'd have to ask if for
----------------
You can write this a bit more compactly using std::max:

  std::optional<unsigned> MaxPowerOf2RuntimeVF =
      MaxFactors.FixedVF.getFixedValue();
  if (MaxFactors.ScalableVF) {
    std::optional<unsigned> MaxVScale = getMaxVScale(*TheFunction, TTI);
    if (MaxVScale && TTI.isVScaleKnownToBeAPowerOfTwo()) {
      MaxPowerOf2RuntimeVF = std::max<unsigned>(
          *MaxPowerOf2RuntimeVF,
          *MaxVScale * MaxFactors.ScalableVF.getKnownMinValue());
    } else
      MaxPowerOf2RuntimeVF = std::nullopt; // Stick with tail-folding for now.
  }



================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5159-5163
+    // If vscale is a power of 2 then we only need to look at the maximum
+    // possible value of vscale and determine the maximum possible scalable
+    // VF. Otherwise for arbitrary values of vscale we'd have to ask if for
+    // every possible value of vscale the trip count is a perfect multiple
+    // of the VF.
----------------
I find this comment more confusing than enlightening. I also think that the issue is more that we decide early in the process whether or not to use tail folding, before we get to decide on a VF. We could make a more fine-grained choice if we would make those choices together. But when I read this comment, I'm not really thinking about that.

In any case, I'm happy for the comment to be removed since the code itself is clear enough.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146199/new/

https://reviews.llvm.org/D146199



More information about the llvm-commits mailing list