[PATCH] D17201: [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV

Sun Mar 27 15:43:35 PDT 2016

sanjoy requested changes to this revision.
sanjoy added a comment.
This revision now requires changes to proceed.

Mostly nits, except that there may be a correctness issue in `SCEVExpander::generateOverflowCheck`.


================
Comment at: include/llvm/Analysis/ScalarEvolution.h:670
@@ +669,3 @@
+      EdgeInfo(BasicBlock *Block, const SCEV *Taken, SCEVUnionPredicate &P) :
+          ExitBlock(Block), Taken(Taken), Pred(std::move(P)) {}
+
----------------
SGTM, lets go with what we have now.  I still think the data layout here can be simplified, but we (I'm happy to take care of that) can do that later once this change is in.

================
Comment at: include/llvm/Analysis/ScalarEvolution.h:880
@@ -727,2 +879,3 @@
     /// execute if its exit condition were a conditional branch of ExitCond,
-    /// TBB, and FBB.
+    /// TBB, and FBB. If AllowPredicates is set, this call try to use a minimal
+    /// set of SCEV predicates in order to return an exact answer.
----------------
Nit (here and below) "this call will try to"

================
Comment at: lib/Analysis/ScalarEvolution.cpp:5481
@@ +5480,3 @@
+  for (unsigned i = 1, PredPos = 1; i < NumExits; ++i) {
+    auto &Exits = ExitNotTaken.ExtraInfo.getPointer()->Exits;
+    ExitNotTakenExtras *Ptr = nullptr;
----------------
Can you move this `auto &Exits` out of the loop?

================
Comment at: lib/Analysis/ScalarEvolution.cpp:5486
@@ +5485,3 @@
+
+    Exits.emplace_back(ExitNotTakenInfo(ExitCounts[i].ExitBlock,
+                                        ExitCounts[i].Taken, Ptr));
----------------
Can this just be `Exits.emplace_back(ExitCounts[i].ExitBlock, ExitCounts[i].taken, Ptr)`?

================
Comment at: lib/Analysis/ScalarEvolution.cpp:5489
@@ +5488,3 @@
+    if (Ptr)
+      Exits.back().ExtraInfo.getPointer()->Pred =
+          std::move(ExitCounts[i].Pred);
----------------
Why not move this to the place statement where you assign to `Ptr`:

```
if (!ExitCounts[i].Pred.isAlwaysTrue()) {
  Ptr = &ENT[PredPos++];
  Ptr->Pred = ...
}
```

================
Comment at: lib/Analysis/ScalarEvolutionExpander.cpp:2007
@@ -2006,2 +2006,3 @@
 
-  const SCEV *ExitCount = SE.getBackedgeTakenCount(AR->getLoop());
+  SCEVUnionPredicate Pred;
+  const SCEV *ExitCount =
----------------
This part is a significant semantic change that I unfortunately hadn't
(but really should have) noticed in earlier revisions.  While I don't
yet have a specific example given the current state of the code, I
think it is possible to end up with a circular logic fallacy here.

Earlier, the invariant was: given a no-overflow predicate, we would
write a loop entry predicate `EP` (a function of the backedge count)
that, when `true`, would imply no overflow.  Formally, `EP => NoOverflow`.
However, now we're doing something different.  Now we
have: `EP => (Backedge taken count is BE => NoOverflow)` (this part is
the same as earlier), but `Backedge taken count is BE` is not
axiomatically true -- it is true under the `NoOverflow`
condition.  In other words, the set of logical equations we have are

```
EP => (Backedge taken count is BE => NoOverflow)
NoOverflow => Backedge taken count is BE
```

Now we check `EP` at runtime, so it is known to be `true`.  Given
that, there are two solutions to the above: {`Backedge taken count is
BE` is `true`, `NoOverflow` is `true`}; or {`Backedge taken count is
BE` is `false`, `NoOverflow` is `false`}.  The latter solution is
problematic.

One problematic case that won't happen today, but illustrates what I'm
talking about:

```
  for (u32 i = 0; ; i++)
    store volatile (GEP a, i), 0
```

The above loop can run forever (assuming `u32` 's overflow is well
defined), but let's say we decide to predicate the increment to an
NUSW increment.  Given that, we know the loop won't run more than `-1`
times (since otherwise we will have a side effect that uses poison).
Howver, the predicate that you'll compute in SCEVExpander in such a
case is `-1 == -1`, which is trivially true, and now the loop is
miscompiled (instead of running forever, it'll just run `-1` times).

Now, I'll note that I've so far not have been able to come with a
problematic case that will break in the current version of the patch,
so it is possible that there is some deeper invariant here that is
obvious.


http://reviews.llvm.org/D17201