[PATCH] D17201: [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV

Mon Mar 28 17:42:50 PDT 2016

sanjoy added inline comments.

================
Comment at: lib/Analysis/ScalarEvolutionExpander.cpp:2007
@@ -2006,2 +2006,3 @@
 
-  const SCEV *ExitCount = SE.getBackedgeTakenCount(AR->getLoop());
+  SCEVUnionPredicate Pred;
+  const SCEV *ExitCount =
----------------
sbaranga wrote:
> sanjoy wrote:
> > This part is a significant semantic change that I unfortunately hadn't
> > (but really should have) noticed in earlier revisions.  While I don't
> > yet have a specific example given the current state of the code, I
> > think it is possible to end up with a circular logic fallacy here.
> > 
> > Earlier, the invariant was: given a no-overflow predicate, we would
> > write a loop entry predicate `EP` (a function of the backedge count)
> > that, when `true`, would imply no overflow.  Formally, `EP => NoOverflow`.
> > However, now we're doing something different.  Now we
> > have: `EP => (Backedge taken count is BE => NoOverflow)` (this part is
> > the same as earlier), but `Backedge taken count is BE` is not
> > axiomatically true -- it is true under the `NoOverflow`
> > condition.  In other words, the set of logical equations we have are
> > 
> > ```
> > EP => (Backedge taken count is BE => NoOverflow)
> > NoOverflow => Backedge taken count is BE
> > ```
> > 
> > Now we check `EP` at runtime, so it is known to be `true`.  Given
> > that, there are two solutions to the above: {`Backedge taken count is
> > BE` is `true`, `NoOverflow` is `true`}; or {`Backedge taken count is
> > BE` is `false`, `NoOverflow` is `false`}.  The latter solution is
> > problematic.
> > 
> > One problematic case that won't happen today, but illustrates what I'm
> > talking about:
> > 
> > ```
> >   for (u32 i = 0; ; i++)
> >     store volatile (GEP a, i), 0
> > ```
> > 
> > The above loop can run forever (assuming `u32` 's overflow is well
> > defined), but let's say we decide to predicate the increment to an
> > NUSW increment.  Given that, we know the loop won't run more than `-1`
> > times (since otherwise we will have a side effect that uses poison).
> > Howver, the predicate that you'll compute in SCEVExpander in such a
> > case is `-1 == -1`, which is trivially true, and now the loop is
> > miscompiled (instead of running forever, it'll just run `-1` times).
> > 
> > Now, I'll note that I've so far not have been able to come with a
> > problematic case that will break in the current version of the patch,
> > so it is possible that there is some deeper invariant here that is
> > obvious.
> Thanks! That is a very good point and we should be really careful here. I think this change is still correct:
> 
> Let's say that we return a symbolic answer X, and the correct answer is Y. The problem is when X != Y and we pass the predicate check.
> 
> Because the answer is wrong, some overflow must happen. Since we are passing the NoOverflow check, we need to have X < Y (otherwise we wouldn't be passing the check).
> 
> Note that for the first X iterations our predicate holds. I think what makes this correct is the fact that we're using the predicates in HowManyGreaterThans/HowManyLessThans, which means that we need to stop at iteration X.
> 
> 
> 
> 
> 
> 
> 
So, to rephrase what I understood, by "with the predicate (IsNoWrap
AR) the trip count of the loop is N", PSE really means "if the add
recurrence AR does not overflow in the first N iterations, then the
loop's count is N".  In particular, for the loop

```
for (unsigned i; ; i++)
  side_effect_use(i);
```

predicating `i++` as NUSW does *not* let you conclude that the trip
count of the loop is `UINT_MAX`, since just because `i++` does not
overflow the in first `UINT_MAX` iterations does not guarantee that
the loop will exit or have UB in the ``UINT_MAX+1` th iteration.

This is a subtle invariant, so at the very least this needs to be
documented explicitly; and the places where we add predicates
`HowManyLessThans` etc. need to be audited carefully (it sounds like
you already have done most of that?).

Finally, if possible, it will be great if we can structure the code in
a way that will make (accidentally) breaking this invariant difficult;
though I don't have anything concrete in mind right now.


http://reviews.llvm.org/D17201