[PATCH] D79175: [ARM][MVE] Tail-Predication: use @llvm.get.active.lane.mask to get the BTC

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 28 13:13:47 PDT 2020


efriedma added inline comments.


================
Comment at: llvm/lib/Target/ARM/MVETailPredication.cpp:348
+  Value *NumElements = Builder.CreateAdd(ScalarBTC,
+      ConstantInt::get(ScalarBTC->getType(), 1), "num.elements");
+
----------------
samparker wrote:
> efriedma wrote:
> > This addition can overflow.
> And that's not okay, right? Trip count will always be BTC + 1 and we don't handle uncountable loops.
The masks are wrong if it overflows.


================
Comment at: llvm/lib/Target/ARM/MVETailPredication.cpp:382
   // TODO: This add likely already exists in the loop.
   Value *Remaining = Builder.CreateSub(Processed, Factor);
   Processed->addIncoming(Remaining, L->getLoopLatch());
----------------
samparker wrote:
> efriedma wrote:
> > This subtraction can also overflow.
> But that's okay, right? This predication is only really useful when wrapping happens and the intrinsic reflects that overflow can/will happen.
The problem here is that the original get_active_lane_mask can do something like this:

Iteration 1: get_active_lane_mask(0, 5) -> all-true
Iteration 2: get_active_lane_mask(4, 5) -> <true, false, false, false>
Iteration 3: get_active_lane_mask(8, 5) -> all-false

In the rewritten code, you end up with this:

Iteration 1: vctp(5) -> all-true
Iteration 2: vctp(1)- > <true, false, false, false>
Iteration 3: vctp(-3)- > all-true

There are a couple ways you could deal with this:

1. Use saturating subtraction
2. Prove the trip count is small enough that this never happens, i.e. `ceil(ElementCount / VectorWidth) >= TripCount`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79175/new/

https://reviews.llvm.org/D79175





More information about the llvm-commits mailing list