[PATCH] D79175: [ARM][MVE] Tail-Predication: use @llvm.get.active.lane.mask to get the BTC

Sjoerd Meijer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 10 13:56:02 PDT 2020


SjoerdMeijer marked an inline comment as done.
SjoerdMeijer added inline comments.


================
Comment at: llvm/lib/Target/ARM/MVETailPredication.cpp:456
+  auto *ElementCount = SE->getAddExpr(BTC, One);
+  // Tmp = ElementCount + (VW-1)
+  auto *Tmp = SE->getAddExpr(ElementCount,
----------------
efriedma wrote:
> SjoerdMeijer wrote:
> > efriedma wrote:
> > > SjoerdMeijer wrote:
> > > > efriedma wrote:
> > > > > Can `ElementCount + (VW-1)` overflow?  Do we need to check for that?
> > > > We are not generating code for `ElementCount + (VW-1) `, so that one is fine.  We do want to know about overflow for `Ceil`, so will add a check for that.
> > > Not sure I understand; even if we aren't generating code, we're using it as input to the safety check.  Does the math there work correctly even if it overflows?
> > The Ceil expression doesn't have the non-wrapping flags. Therefore, my understanding is, that this 
> > ceiling calculation is done in a higher bit range and so no information is lost.
> > Therefore, my understanding is, that this ceiling calculation is done in a higher bit range
> 
> SCEV math is modular math; it happens in the width of SCEV::getType().  (So Add, Mul, and AddRec can overflow.)  If you want wider math, you need to explicitly zero-extend.
Ahhhh, thanks for explaining.
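
To make the point about modular math concrete for myself, here is a minimal standalone sketch in plain C++ (not SCEV API code). It assumes a 32-bit type for `ElementCount`; the helper names `addNarrow` and `addWide` are hypothetical and only for illustration. The narrow add wraps modulo 2^32, while the explicitly zero-extended add keeps the full value:

```cpp
#include <cstdint>

// Hypothetical helper: computes ElementCount + (VW - 1) in the 32-bit type.
// Unsigned arithmetic wraps modulo 2^32, like SCEV math in a 32-bit type.
constexpr std::uint32_t addNarrow(std::uint32_t ElementCount, std::uint32_t VW) {
  return ElementCount + (VW - 1);
}

// Hypothetical helper: zero-extends ElementCount to 64 bits before adding,
// so the sum cannot wrap for any 32-bit inputs.
constexpr std::uint64_t addWide(std::uint32_t ElementCount, std::uint32_t VW) {
  return static_cast<std::uint64_t>(ElementCount) + (VW - 1);
}
```

For small inputs the two agree; near the top of the 32-bit range only the widened version preserves the result.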

This is a real puzzle. I think I am going to solve this differently then, because I am afraid we won't be able to put any meaningful bound on `ElementCount + (VW-1)` (I have seen this already, but will double-check). Instead, I plan to use the trip count (TC), which usually looks like this:

   (1 + ((-4 + (4 * ((3 + %N) /u 4))<nuw>) /u 4))<nuw><nsw>

For this expression we are able to find useful value ranges, like this:

   TC: [1,1073741825)

Because TC is computed from %N, and %N is also used in `ElementCount + (VW-1)`, I think we are okay if:

     upperbound(TC) <= UINT_MAX - VectorWidth
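
As a rough sketch of that comparison only (plain C++, not the actual pass code), assuming a 32-bit `UINT_MAX` and reading the printed range as half-open, so the largest possible TC is the upper end minus one. The helper name `upperBoundIsSafe` is hypothetical, and whether this condition is sufficient for the transformation is exactly what still needs double-checking:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: ExclusiveUpper is the exclusive end of the TC range,
// e.g. 1073741825 for [1,1073741825). Returns true when the largest possible
// TC satisfies: upperbound(TC) <= UINT_MAX - VectorWidth (32-bit UINT_MAX).
constexpr bool upperBoundIsSafe(std::uint64_t ExclusiveUpper,
                                std::uint32_t VectorWidth) {
  // An exclusive end of 0 would describe no usable maximum, so reject it.
  return ExclusiveUpper != 0 &&
         (ExclusiveUpper - 1) <=
             (std::numeric_limits<std::uint32_t>::max() - VectorWidth);
}
```

With the range shown above and a vector width of 4, `upperBoundIsSafe(1073741825, 4)` holds, whereas a range covering the full 32-bit space would be rejected.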


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79175/new/

https://reviews.llvm.org/D79175




