[PATCH] D96021: [LoopVectorize] NFC: Move UserVF feasibility checks to separate function.
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 22 08:47:32 PST 2021
sdesmalen added a comment.
Hi @fhahn, thanks for looking into these two patches.
Today the code in computeFeasibleMaxVF is roughly structured as:
if (Need to care about UserVF) {
// Validate, clamp or fall back to fixed-width or scalars.
// Meanwhile, giving various UserVF-specific remarks/debug output.
} else
// Determine MaxFeasibleVF some other way.
Because the function has gotten so big and have practically non-overlapping code paths, I don't see any downside in splitting this code into separate functions to handle their respective cases.
My main reason for doing these two NFC patches first is that the code-paths for fixed- and scalable max VFs would be identical, i.e. no need to maintain separate variables for Max(Scalable|Fixed)VF, or have lots of conditional code based on whether the UserVF or MaxVF is scalable. With these NFC tweaks, scalable vectors just fit in naturally without a lot of specific 'scalable' changes (see D96023 <https://reviews.llvm.org/D96023> which extends the function to work on scalable vectors, by calling a different interface to determine the WidestRegister. Not really any other scalable-specific changes are needed).
Patch D96022 <https://reviews.llvm.org/D96022> moves the clamping operation to a separate function, so that it works for both scalable and fixed-width vectors (based on MaxVscale), making it one of the few only places that need to be very scalable-specific. The interface is generic though, it only asks to clamp "some VF" to some "num elements". computeFeasibleUserVF then only emits different remarks/debug-output based on the clamped value.
After this bit of cleanup, the code-paths for both computeMaxUserVF and computeFeasibleMaxVF do something very similar to what you suggested in D96997 <https://reviews.llvm.org/D96997>:
- Start of with a large vector size (based on register width)
- Clamp the vector size to max number of elements <legal max or tripcount>
- Pick MaxVF.
I find the code in D96997 <https://reviews.llvm.org/D96997> quite difficult to follow. Not necessarily because of the changes you made, I already find the code difficult to follow for the fixed-width case alone, but adding the scalable case just adds to the complication. Things like:
if (some condition specific to scalable VFs) {
:
ScalableMaxVF = ElementCount::getScalable(0)
UserVF = ElementCount::getScalable(0)
}
:
if (some other condition specific to scalable VFs) {
:
ScalableMaxVF = ElementCount::getScalable(0)
}
make me weary for example. I think having the code split up makes the code a lot easier to follow than having the combination {UserVF, auto-select} x {Fixed, Scalable} in one function. After D96021 <https://reviews.llvm.org/D96021> and D96022 <https://reviews.llvm.org/D96022>, practically all code-paths are the same for both the fixed- and scalable case.
In D96021#2572946 <https://reviews.llvm.org/D96021#2572946>, @fhahn wrote:
> I have not looked at the other patches in this stack in detail, so I might be missing something that makes things more complicated. But to pick a scalable VF, we'd just have to split out the logic to compute the `MaxVectorSize` (something like `computeMaxVFBasedOnRegisters`) and then call it for both scalable & fixed width vectors and limit by the computed maximum VFs. I think that would be similar to what's done in D96023 <https://reviews.llvm.org/D96023>, but overall I think it might be simpler than the `clampFeasibleMaxVF` approach, because less parameters need to be passed around and we also avoid re-computing the max VFs to clamp with.
I'd be happy to see if more things can be simplified once things work for scalable vectors, but I don't really see many things that are recomputed after D96023 <https://reviews.llvm.org/D96023>?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D96021/new/
https://reviews.llvm.org/D96021
More information about the llvm-commits
mailing list