[PATCH] D79175: [ARM][MVE] Tail-Predication: use @llvm.get.active.lane.mask to get the BTC
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 10 12:13:36 PDT 2020
efriedma added inline comments.
================
Comment at: llvm/lib/Target/ARM/MVETailPredication.cpp:456
+ auto *ElementCount = SE->getAddExpr(BTC, One);
+ // Tmp = ElementCount + (VW-1)
+ auto *Tmp = SE->getAddExpr(ElementCount,
----------------
SjoerdMeijer wrote:
> efriedma wrote:
> > Can `ElementCount + (VW-1)` overflow? Do we need to check for that?
> We are not generating code for `ElementCount + (VW-1)`, so that one is fine. We do want to know about overflow for `Ceil`, so will add a check for that.
Not sure I understand; even if we aren't generating code for `ElementCount + (VW-1)`, we're using it as input to the safety check. Does the math there still work correctly if it overflows?
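
To make the concern concrete, here is a minimal sketch of the arithmetic, assuming the element count is an i32 and that Ceil is computed as (ElementCount + (VW-1)) / VW with wrapping unsigned arithmetic. The helper names ceilDivWrapping and ceilDivWide are hypothetical and are not part of the patch; they only model the expression in plain integers.

```cpp
#include <cstdint>

// Hypothetical model of the expression under review, evaluated in wrapping
// 32-bit unsigned arithmetic (assumption: the element count is an i32).
constexpr uint32_t ceilDivWrapping(uint32_t ElementCount, uint32_t VW) {
  return (ElementCount + (VW - 1)) / VW;
}

// Reference value: the same formula in 64 bits, where the sum cannot wrap
// for any 32-bit inputs.
constexpr uint64_t ceilDivWide(uint64_t ElementCount, uint64_t VW) {
  return (ElementCount + (VW - 1)) / VW;
}

// Small element count: both computations agree.
static_assert(ceilDivWrapping(32002, 4) == 8001, "no wrap, exact ceil");
static_assert(ceilDivWide(32002, 4) == 8001, "reference agrees");

// Element count near UINT32_MAX: the sum wraps to 1, so the 32-bit result
// is 0 while the true ceiling is 2^30.
static_assert(ceilDivWrapping(UINT32_MAX - 1, 4) == 0, "sum wrapped");
static_assert(ceilDivWide(UINT32_MAX - 1, 4) == 1073741824, "true ceil");
```

If the safety check consumes the wrapped value, it would be reasoning about 0 iterations rather than 2^30, which is why the overflow matters even when no code is emitted for the add.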
================
Comment at: llvm/lib/Target/ARM/MVETailPredication.cpp:462
+
+ if (!llvm::isKnownNonNegativeInLoop(Ceil, L, *SE)) {
+ LLVM_DEBUG(dbgs() << "ARM TP: overflow detected in: "; Ceil->dump());
----------------
Ceil is the result of a UDiv; it trivially can't be negative.
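
As a rough illustration, assuming 32-bit values and a vector width VW of at least 2: the quotient of an unsigned division by VW always has its top bit clear, so a non-negativity test on it cannot fail and therefore cannot detect overflow. The helper signBitSet is hypothetical and only models the check in plain integers.

```cpp
#include <cstdint>

// Hypothetical helper: true if X would be negative when reinterpreted as a
// signed 32-bit value.
constexpr bool signBitSet(uint32_t X) { return (X >> 31) != 0; }

// Even the largest possible dividend gives a quotient with a clear sign bit
// once it is divided by 2 or more.
static_assert(!signBitSet(UINT32_MAX / 2), "max quotient for VW = 2");
static_assert(!signBitSet(UINT32_MAX / 4), "max quotient for VW = 4");
static_assert(!signBitSet(UINT32_MAX / 16), "max quotient for VW = 16");

// A wrapped numerator simply yields a small quotient, which also looks
// non-negative: (UINT32_MAX - 1) + 3 wraps to 1, and 1 / 4 == 0.
static_assert(!signBitSet(((UINT32_MAX - 1) + 3u) / 4u), "wrap is invisible");
```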
================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-const.ll:253
+ ; %1 = icmp ult <4 x i32> %induction, <i32 32002, i32 32002, i32 32002, i32 32002>
+ %1 = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %index, i32 32002)
+
----------------
I don't understand how this loop is supposed to work. %index is zero in the first iteration, and UINT_MAX-3 in the second iteration.
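
For reference, a small model of the mask at those two index values. It assumes lane i is active when (index + i) is unsigned-less-than the second operand, matching the `icmp ult` in the test's comment; the exact intrinsic semantics in this revision may differ. The names laneActive and activeLanes4 are hypothetical.

```cpp
#include <cstdint>

// Hypothetical model of one lane of the mask: lane I is active when
// Base + I <u N. The sum is formed in 64 bits so it cannot wrap.
constexpr bool laneActive(uint32_t Base, uint32_t I, uint32_t N) {
  return static_cast<uint64_t>(Base) + I < N;
}

// Number of active lanes in a 4-lane mask starting at Base.
constexpr unsigned activeLanes4(uint32_t Base, uint32_t N) {
  return laneActive(Base, 0, N) + laneActive(Base, 1, N) +
         laneActive(Base, 2, N) + laneActive(Base, 3, N);
}

// First iteration: %index == 0, all four lanes are active.
static_assert(activeLanes4(0, 32002) == 4, "first iteration");

// Second iteration as described: %index == UINT_MAX - 3, no lane is active.
static_assert(activeLanes4(UINT32_MAX - 3, 32002) == 0, "second iteration");

// What a final partial iteration would look like with an increasing index.
static_assert(activeLanes4(32000, 32002) == 2, "tail of 2 elements");
```

Under this model the second iteration processes no elements at all, which is consistent with the question of how the loop is meant to cover 32002 elements.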
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D79175/new/
https://reviews.llvm.org/D79175