[PATCH] D17201: [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV

Tue Mar 29 05:31:31 PDT 2016

sbaranga added inline comments.

================
Comment at: lib/Analysis/ScalarEvolutionExpander.cpp:2007
@@ -2006,2 +2006,3 @@
 
-  const SCEV *ExitCount = SE.getBackedgeTakenCount(AR->getLoop());
+  SCEVUnionPredicate Pred;
+  const SCEV *ExitCount =
----------------
sanjoy wrote:
> sbaranga wrote:
> > sanjoy wrote:
> > > This part is a significant semantic change that I unfortunately hadn't
> > > (but really should have) noticed in earlier revisions.  While I don't
> > > yet have a specific example given the current state of the code, I
> > > think it is possible to end up with a circular logic fallacy here.
> > > 
> > > Earlier, the invariant was: given a no-overflow predicate, we would
> > > write a loop entry predicate `EP` (a function of the backedge count)
> > > that, when `true`, would imply no overflow.  Formally, `EP => NoOverflow`.
> > > However, now we're doing something different.  Now we
> > > have: `EP => (Backedge taken count is BE => NoOverflow)` (this part is
> > > the same as earlier), but `Backedge taken count is BE` is not
> > > axiomatically true -- it is true under the `NoOverflow`
> > > condition.  In other words, the set of logical equations we have are
> > > 
> > > ```
> > > EP => (Backedge taken count is BE => NoOverflow)
> > > NoOverflow => Backedge taken count is BE
> > > ```
> > > 
> > > Now we check `EP` at runtime, so it is known to be `true`.  Given
> > > that, there are two solutions to the above: {`Backedge taken count is
> > > BE` is `true`, `NoOverflow` is `true`}; or {`Backedge taken count is
> > > BE` is `false`, `NoOverflow` is `false`}.  The latter solution is
> > > problematic.
> > > 
> > > One problematic case that won't happen today, but illustrates what I'm
> > > talking about:
> > > 
> > > ```
> > >   for (u32 i = 0; ; i++)
> > >     store volatile (GEP a, i), 0
> > > ```
> > > 
> > > The above loop can run forever (assuming `u32` 's overflow is well
> > > defined), but let's say we decide to predicate the increment to an
> > > NUSW increment.  Given that, we know the loop won't run more than `-1`
> > > times (since otherwise we will have a side effect that uses poison).
> > > Howver, the predicate that you'll compute in SCEVExpander in such a
> > > case is `-1 == -1`, which is trivially true, and now the loop is
> > > miscompiled (instead of running forever, it'll just run `-1` times).
> > > 
> > > Now, I'll note that I've so far not have been able to come with a
> > > problematic case that will break in the current version of the patch,
> > > so it is possible that there is some deeper invariant here that is
> > > obvious.
> > Thanks! That is a very good point and we should be really careful here. I think this change is still correct:
> > 
> > Let's say that we return a symbolic answer X, and the correct answer is Y. The problem is when X != Y and we pass the predicate check.
> > 
> > Because the answer is wrong, some overflow must happen. Since we are passing the NoOverflow check, we need to have X < Y (otherwise we wouldn't be passing the check).
> > 
> > Note that for the first X iterations our predicate holds. I think what makes this correct is the fact that we're using the predicates in HowManyGreaterThans/HowManyLessThans, which means that we need to stop at iteration X.
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> So, to rephrase what I understood, by "with the predicate (IsNoWrap
> AR) the trip count of the loop is N", PSE really means "if the add
> recurrence AR does not overflow in the first N iterations, then the
> loop's count is N".  In particular, for the loop
> 
> ```
> for (unsigned i; ; i++)
>   side_effect_use(i);
> ```
> 
> predicating `i++` as NUSW does *not* let you conclude that the trip
> count of the loop is `UINT_MAX`, since just because `i++` does not
> overflow the in first `UINT_MAX` iterations does not guarantee that
> the loop will exit or have UB in the ``UINT_MAX+1` th iteration.
> 
> This is a subtle invariant, so at the very least this needs to be
> documented explicitly; and the places where we add predicates
> `HowManyLessThans` etc. need to be audited carefully (it sounds like
> you already have done most of that?).
> 
> Finally, if possible, it will be great if we can structure the code in
> a way that will make (accidentally) breaking this invariant difficult;
> though I don't have anything concrete in mind right now.
Yes, this is a very good description of the logic!

Indeed, this is subtle so it needs documenting. I did audit the places where we've added the predicates, but I'd like to have another look.


http://reviews.llvm.org/D17201