[PATCH] D96021: [LoopVectorize] NFC: Move UserVF feasibility checks to separate function.

Mon Feb 22 08:47:32 PST 2021

sdesmalen added a comment.

Hi @fhahn, thanks for looking into these two patches.

Today the code in computeFeasibleMaxVF is roughly structured as:

  if (Need to care about UserVF) {
    // Validate, clamp or fall back to fixed-width or scalars.
    // Meanwhile, giving various UserVF-specific remarks/debug output.
  } else
    // Determine MaxFeasibleVF some other way.

Because the function has gotten so big and have practically non-overlapping code paths, I don't see any downside in splitting this code into separate functions to handle their respective cases.

My main reason for doing these two NFC patches first is that the code-paths for fixed- and scalable max VFs would be identical, i.e.  no need to maintain separate variables for Max(Scalable|Fixed)VF, or have lots of conditional code based on whether the UserVF or MaxVF is scalable. With these NFC tweaks, scalable vectors just fit in naturally without a lot of specific 'scalable' changes (see D96023 <https://reviews.llvm.org/D96023> which extends the function to work on scalable vectors, by calling a different interface to determine the WidestRegister. Not really any other scalable-specific changes are needed).

Patch D96022 <https://reviews.llvm.org/D96022> moves the clamping operation to a separate function, so that it works for both scalable and fixed-width vectors (based on MaxVscale), making it one of the few only places that need to be very scalable-specific. The interface is generic though, it only asks to clamp "some VF" to some "num elements". computeFeasibleUserVF then only emits different remarks/debug-output based on the clamped value.
After this bit of cleanup, the code-paths for both computeMaxUserVF and computeFeasibleMaxVF do something very similar to what you suggested in D96997 <https://reviews.llvm.org/D96997>:

- Start of with a large vector size (based on register width)
- Clamp the vector size to max number of elements <legal max or tripcount>
- Pick MaxVF.

I find the code in D96997 <https://reviews.llvm.org/D96997> quite difficult to follow. Not necessarily because of the changes you made, I already find the code difficult to follow for the fixed-width case alone, but adding the scalable case just adds to the complication. Things like:

  if (some condition specific to scalable VFs) {
    :
    ScalableMaxVF = ElementCount::getScalable(0)
    UserVF = ElementCount::getScalable(0)
  }

  :

  if (some other condition specific to scalable VFs) {
    :
    ScalableMaxVF = ElementCount::getScalable(0)
  }

make me weary for example. I think having the code split up makes the code a lot easier to follow than having the combination {UserVF, auto-select} x {Fixed, Scalable} in one function. After D96021 <https://reviews.llvm.org/D96021> and D96022 <https://reviews.llvm.org/D96022>, practically all code-paths are the same for both the fixed- and scalable case.

In D96021#2572946 <https://reviews.llvm.org/D96021#2572946>, @fhahn wrote:

> I have not looked at the other patches in this stack in detail, so I might be missing something that makes things more complicated. But to pick a scalable VF, we'd just have to split out the logic to compute the `MaxVectorSize` (something like `computeMaxVFBasedOnRegisters`) and then call it for both scalable & fixed width vectors and limit by the computed maximum VFs. I think that would be similar to what's done in D96023 <https://reviews.llvm.org/D96023>, but overall I think it might be simpler than the `clampFeasibleMaxVF` approach, because less parameters need to be passed around and we also avoid re-computing the max VFs to clamp with.

I'd be happy to see if more things can be simplified once things work for scalable vectors, but I don't really see many things that are recomputed after D96023 <https://reviews.llvm.org/D96023>?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96021/new/

https://reviews.llvm.org/D96021