[PATCH] D37425: LoopVectorize: MaxVF should not be larger than the loop trip count
Zvi Rackover via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 4 16:08:47 PDT 2017
zvi added inline comments.
================
Comment at: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp:6165
   if (!OptForSize) // Remaining checks deal with scalar loop when OptForSize.
     return computeFeasibleMaxVF(OptForSize);
----------------
Ayal wrote:
> Would be good to restrict MaxVF to TC even if we're not OptForSize, although it's probably less likely to have a known TC that is larger than Tiny but smaller than current MaxVF.
Thanks for pointing that out! Created [[ https://bugs.llvm.org/show_bug.cgi?id=34468 | pr34468 ]] showing a missed opportunity for TinyTripCount < ConstTripCount < MaxVF.
================
Comment at: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp:6243
+  } else if (ConstTripCount && ConstTripCount < MaxVectorSize &&
+             isPowerOf2_32(ConstTripCount))
+    MaxVectorSize = ConstTripCount;
----------------
Ayal wrote:
> Better set MaxVectorSize to PowerOf2Floor(ConstTripCount), to also handle the case where ConstTripCount is not a power of 2.
>
> Also, while we're at it: `computeFeasibleMaxVF()` is currently not required/documented to return a power of 2. We should either check that MaxVF is a power of 2 when checking `if (TC % MaxVF != 0)`, and then can simply set MaxVectorSize to ConstTripCount here. Otherwise we should document/assert that MaxVF is a power of 2.
> Better set MaxVectorSize to PowerOf2Floor(ConstTripCount), to also handle the case where ConstTripCount is not a power of 2.
What would be the benefit of this suggestion, given that we currently bail out when a tail loop is needed in the short-trip-count case? Should we tie this to the work on vectorizing short trip counts and allowing runtime checks/tail loops?
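To make the question concrete, here is a minimal standalone sketch of the suggested clamp. The names `powerOf2Floor` and `clampMaxVF` are hypothetical stand-ins written for illustration; this is not the code in LoopVectorize.cpp and it does not use the LLVM helpers.

```cpp
#include <cstdint>

// Hypothetical stand-in for a power-of-2 floor helper:
// returns the largest power of 2 that is <= X, or 0 when X == 0.
static uint32_t powerOf2Floor(uint32_t X) {
  uint32_t P = 0;
  // Bit becomes 0 after shifting past the top bit, which ends the loop.
  for (uint32_t Bit = 1; Bit != 0 && Bit <= X; Bit <<= 1)
    P = Bit;
  return P;
}

// Hypothetical helper mirroring the clamp under discussion: when a known,
// non-zero trip count is smaller than the current maximum, restrict the
// maximum to the power-of-2 floor of the trip count.
static uint32_t clampMaxVF(uint32_t MaxVectorSize, uint32_t ConstTripCount) {
  if (ConstTripCount && ConstTripCount < MaxVectorSize)
    return powerOf2Floor(ConstTripCount);
  return MaxVectorSize;
}
```

Under these assumptions, a trip count of 6 with a maximum of 16 clamps to 4. Since 6 is not a multiple of 4, two iterations would remain for a tail loop, which is the case where we currently bail out.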
> Also, while we're at it: `computeFeasibleMaxVF()` is currently not required/documented to return a power of 2. We should either check that MaxVF is a power of 2 when checking `if (TC % MaxVF != 0)`, and then can simply set MaxVectorSize to ConstTripCount here. Otherwise we should document/assert that MaxVF is a power of 2.
It did not occur to me that MaxVF may not be a power of two. Can we label this as follow-up work?
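For that follow-up, a rough sketch of the "document/assert" option might look like the following. `isPowerOf2` and `needsScalarTail` are hypothetical names used only for illustration, not existing functions in the vectorizer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a power-of-2 predicate on 32-bit values.
static bool isPowerOf2(uint32_t X) { return X != 0 && (X & (X - 1)) == 0; }

// Hypothetical helper: returns true when the trip count TC is not a
// multiple of MaxVF, i.e. a scalar tail loop would be required.
// The assertion records the assumption that MaxVF is a power of 2.
static bool needsScalarTail(uint32_t TC, uint32_t MaxVF) {
  assert(isPowerOf2(MaxVF) && "MaxVF is expected to be a power of 2");
  return TC % MaxVF != 0;
}
```

The reason the assumption matters: if MaxVF is a power of 2 and divides TC, then every smaller power-of-2 VF also divides TC, so checking the remainder against MaxVF alone is enough.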
Repository:
rL LLVM
https://reviews.llvm.org/D37425