[PATCH] D79175: [ARM][MVE] Tail-Predication: use @llvm.get.active.lane.mask to get the BTC
Sjoerd Meijer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 10 13:56:02 PDT 2020
SjoerdMeijer marked an inline comment as done.
SjoerdMeijer added inline comments.
================
Comment at: llvm/lib/Target/ARM/MVETailPredication.cpp:456
+ auto *ElementCount = SE->getAddExpr(BTC, One);
+ // Tmp = ElementCount + (VW-1)
+ auto *Tmp = SE->getAddExpr(ElementCount,
----------------
efriedma wrote:
> SjoerdMeijer wrote:
> > efriedma wrote:
> > > SjoerdMeijer wrote:
> > > > efriedma wrote:
> > > > > Can `ElementCount + (VW-1)` overflow? Do we need to check for that?
> > > > We are not generating code for `ElementCount + (VW-1) `, so that one is fine. We do want to know about overflow for `Ceil`, so will add a check for that.
> > > Not sure I understand; even if we aren't generating code, we're using it as input to the safety check. Does the math there work correctly even if it overflows?
> > The Ceil expression doesn't have the non-wrapping flags. Therefore, my understanding is, that this
> > ceiling calculation is done in a higher bit range and so no information is lost.
> > Therefore, my understanding is, that this ceiling calculation is done in a higher bit range
>
> SCEV math is modular math; it happens in the width of SCEV::getType(). (So Add, Mul, and AddRec can overflow.) If you want wider math, you need to explicitly zero-extend.
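To make the wrap-around concrete, here is a minimal standalone sketch (plain C++ integers, not SCEV itself) of `ElementCount + (VW-1)` evaluated in 32 bits versus after an explicit zero-extension to 64 bits. The function names `ceilDivNarrow` and `ceilDivWide` are made up for this illustration, and the 32-bit width is an assumption about the type of the expression.

```cpp
#include <cassert>
#include <cstdint>

// Ceil(ElementCount / VW) computed entirely in 32 bits, mimicking modular
// math in an i32-typed expression: the addition wraps modulo 2^32.
// (Hypothetical helper, for illustration only.)
uint32_t ceilDivNarrow(uint32_t elementCount, uint32_t vw) {
  assert(vw != 0 && "vector width must be non-zero");
  uint32_t tmp = elementCount + (vw - 1); // may wrap around
  return tmp / vw;
}

// The same computation after zero-extending the operands to 64 bits,
// so the intermediate sum cannot wrap for 32-bit inputs.
// (Hypothetical helper, for illustration only.)
uint64_t ceilDivWide(uint32_t elementCount, uint32_t vw) {
  assert(vw != 0 && "vector width must be non-zero");
  uint64_t tmp = uint64_t{elementCount} + (uint64_t{vw} - 1);
  return tmp / vw;
}
```

For small inputs the two agree, but with `elementCount = 0xFFFFFFFE` and `vw = 4` the narrow sum wraps to 1, so the narrow version yields 0 while the wide version yields 1073741824.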
Ahhhh, thanks for explaining.
This is a real puzzle. I think I am going to solve this differently then, because I am afraid we won't be able to put any meaningful bound on `ElementCount + (VW-1)` (I have seen this already, but will double-check). I plan to use the TripCount (TC) for this instead, which usually looks like this:
(1 + ((-4 + (4 * ((3 + %N) /u 4))<nuw>) /u 4))<nuw><nsw>
For this expression we are able to find useful value ranges, like this:
TC: [1,1073741825)
Because TC is computed from %N, and %N is also used in `ElementCount + (VW-1)`, I think we are okay if:
upperbound(TC) <= UINT_MAX - VectorWidth
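As a rough sketch of that comparison on plain integers: the helper name `tripCountBoundOk` is hypothetical, UINT_MAX is assumed to be the 32-bit maximum, and the range is assumed to be half-open as printed above, so the largest TC value is the exclusive end minus one. In the pass itself the range would presumably come from ScalarEvolution's range analysis rather than being passed in by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper, for illustration only.
// tcUpperExclusive is the exclusive end of the TC range, e.g. 1073741825
// for the range [1,1073741825) shown above.
bool tripCountBoundOk(uint64_t tcUpperExclusive, uint32_t vectorWidth) {
  assert(tcUpperExclusive > 0 && "range must be non-empty");
  // Assumption: UINT_MAX refers to the 32-bit unsigned maximum.
  const uint64_t uintMax = std::numeric_limits<uint32_t>::max();
  const uint64_t tcMax = tcUpperExclusive - 1; // largest value TC can take
  // The proposed condition: upperbound(TC) <= UINT_MAX - VectorWidth.
  return tcMax <= uintMax - vectorWidth;
}
```

With the range above and a vector width of 4, `tripCountBoundOk(1073741825, 4)` holds, since 1073741824 is well below 4294967291. A range that reaches all the way up to the 32-bit maximum would fail the check.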
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D79175/new/
https://reviews.llvm.org/D79175