[PATCH] D79175: [ARM][MVE] Tail-Predication: use @llvm.get.active.lane.mask to get the BTC

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 10 12:13:36 PDT 2020


efriedma added inline comments.


================
Comment at: llvm/lib/Target/ARM/MVETailPredication.cpp:456
+  auto *ElementCount = SE->getAddExpr(BTC, One);
+  // Tmp = ElementCount + (VW-1)
+  auto *Tmp = SE->getAddExpr(ElementCount,
----------------
SjoerdMeijer wrote:
> efriedma wrote:
> > Can `ElementCount + (VW-1)` overflow?  Do we need to check for that?
> We are not generating code for `ElementCount + (VW-1)`, so that one is fine.  We do want to know about overflow for `Ceil`, so will add a check for that.
I'm not sure I understand: even if we aren't generating code for it, we're using it as input to the safety check.  Does the math there still work correctly if the addition overflows?
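
To make the concern concrete, here is a minimal sketch, assuming the expression is evaluated in 32-bit unsigned arithmetic and that `VW` is 4. The helpers `ceilDiv32` and `ceilDiv64` are hypothetical names for illustration and are not part of the patch; they only model the arithmetic, not what ScalarEvolution does with the expression.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): ceil(ElementCount / VW) computed
// as (ElementCount + (VW - 1)) / VW in 32 bits. The unsigned addition wraps
// modulo 2^32 when ElementCount is close to UINT32_MAX.
constexpr uint32_t ceilDiv32(uint32_t ElementCount, uint32_t VW) {
  return (ElementCount + (VW - 1)) / VW;
}

// Hypothetical reference: the same formula widened to 64 bits, which cannot
// wrap for 32-bit inputs.
constexpr uint64_t ceilDiv64(uint32_t ElementCount, uint32_t VW) {
  return (uint64_t(ElementCount) + (VW - 1)) / VW;
}

// For small inputs both versions agree.
static_assert(ceilDiv32(10, 4) == 3, "ceil(10 / 4) is 3");
static_assert(ceilDiv64(10, 4) == 3, "ceil(10 / 4) is 3");

// Near the top of the range, 0xFFFFFFFE + 3 wraps to 1, so the 32-bit
// quotient is 0 while the widened result is 0x40000000.
static_assert(ceilDiv32(UINT32_MAX - 1, 4) == 0, "32-bit sum wrapped");
static_assert(ceilDiv64(UINT32_MAX - 1, 4) == 0x40000000ULL, "no wrap");
```

If the safety check consumes the wrapped value, it sees a plausible small number rather than anything that looks like an overflow.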


================
Comment at: llvm/lib/Target/ARM/MVETailPredication.cpp:462
+
+  if (!llvm::isKnownNonNegativeInLoop(Ceil, L, *SE)) {
+    LLVM_DEBUG(dbgs() << "ARM TP: overflow detected in: "; Ceil->dump());
----------------
Ceil is the result of a UDiv; it trivially can't be negative.
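
A small sketch of why, assuming a 32-bit value, a divisor `VW` of at least 2, and that "non-negative" means the sign bit is clear. `isSignBitSet` is a hypothetical helper for illustration, not an LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if X would be negative when reinterpreted as a
// signed 32-bit value.
constexpr bool isSignBitSet(uint32_t X) { return (X >> 31) != 0; }

// An unsigned division by 2 or more always clears the top bit, even for the
// largest possible numerator.
static_assert(!isSignBitSet(UINT32_MAX / 2), "0x7FFFFFFF");
static_assert(!isSignBitSet(UINT32_MAX / 4), "0x3FFFFFFF");
static_assert(!isSignBitSet(UINT32_MAX / 16), "0x0FFFFFFF");

// A numerator that has already wrapped around also divides to a
// non-negative value, so a sign check on the quotient cannot flag the wrap.
static_assert(!isSignBitSet((uint32_t(UINT32_MAX - 1) + 3u) / 4u),
              "wrapped numerator still gives a non-negative quotient");
```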


================
Comment at: llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-const.ll:253
+  ; %1 = icmp ult <4 x i32> %induction, <i32 32002, i32 32002, i32 32002, i32 32002>
+  %1 = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %index, i32 32002)
+
----------------
I don't understand how this loop is supposed to work.  %index is zero in the first iteration, and UINT_MAX-3 in the second iteration.
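
For reference, a scalar sketch of the mask for those two `%index` values. It assumes lane `I` is active when `Index + I` is unsigned-less-than the second operand, modeled on the commented-out `icmp ult` in the test; the intrinsic's actual semantics (in particular around overflow) may differ. `activeLaneMask` is a hypothetical name for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of a 4-lane active lane mask: lane I is active
// when (Index + I) <u N. The addition is done in 32 bits and may wrap.
std::array<bool, 4> activeLaneMask(uint32_t Index, uint32_t N) {
  std::array<bool, 4> Mask{};
  for (uint32_t I = 0; I < 4; ++I)
    Mask[I] = (Index + I) < N;
  return Mask;
}
```

Under that model, `activeLaneMask(0, 32002)` has all four lanes active, while `activeLaneMask(UINT32_MAX - 3, 32002)` has none: the lane indices are `UINT32_MAX-3` through `UINT32_MAX`, all far above 32002.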


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79175/new/

https://reviews.llvm.org/D79175