[PATCH] D142894: [LoopVectorize] Use overflow-check analysis to improve tail-folding.
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 14 09:29:58 PST 2023
david-arm added a comment.
Nice! I had a few minor comments, except for a possible issue with the use of `getSmallBestKnownTC`.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1560
+ bool isIndvarOverflowCheckKnownFalse(
+ ElementCount VF, std::optional<unsigned> UF = std::nullopt) const;
----------------
Can you add some `///` comments here please?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3040
+ bool IVUpdateCannotOverflow = Cost->isIndvarOverflowCheckKnownFalse(VF, UF);
+ TailFoldingStyle Style = Cost->getTailFoldingStyle(IVUpdateCannotOverflow);
if (Style == TailFoldingStyle::None)
----------------
This is just a thought, but I think you can probably simplify this to just:
TailFoldingStyle Style = Cost->getTailFoldingStyle();
if (Style == TailFoldingStyle::None)
CheckMinIters =
Builder.CreateICmp(P, Count, CreateStep(), "min.iters.check");
else if (VF.isScalable() &&
Style != TailFoldingStyle::DataAndControlFlowWithoutRuntimeCheck &&
!Cost->isIndvarOverflowCheckKnownFalse(VF, UF)) {
...
That way you only need to call `isIndvarOverflowCheckKnownFalse` just before you're about to actually create the checks. What do you think?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3497
+ // vector loop induction variable.
+ if (std::optional<unsigned> TC = getSmallBestKnownTC(*PSE.getSE(), TheLoop)) {
+ unsigned MaxVFElts = VF.getKnownMinValue();
----------------
I wonder if this is safe because `getSmallBestKnownTC` also returns a value from profiling for the expected trip count, which is more like a hint rather than a definite?
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-overflow-checks.ll:45
+;
+entry:
+ %cmp6.not = icmp eq i32 %N, 0
----------------
nit: Maybe the entry blocks in both tests can be removed to simplify the IR and CHECK lines? I think the only thing that matters here is the `zext` in the `for.body.preheader`, right?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D142894/new/
https://reviews.llvm.org/D142894
More information about the llvm-commits
mailing list