[PATCH] D102234: [SimpleLoopBoundSplit] Split Bound of Loop which has conditional branch with IV

Tue May 18 16:42:23 PDT 2021

reames added a comment.

I want to make a very high level suggestion on this.  This isn't really about the code per se, but about the approach to writing the code.

I'd start with a really trivial transform for the loop:
for (i = 0; i < N; i++) { body }

Build a mechanism to produce a form which looks like so:
for (i = 0; i < N; i++) { body }
if (i != N) {
  for (; i < N; i++) { body }
}
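To make the target shape concrete, here is a minimal source-level sketch in plain C++ (not LLVM IR, and not the pass itself). The function names and the trace-recording "body" are hypothetical stand-ins chosen for illustration; the only point is that the pre-loop always exits with i == N, so the guarded post-loop never runs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Original loop: for (i = 0; i < N; i++) { body }
// The "body" just records the iteration number (an illustrative stand-in).
std::vector<uint64_t> runOriginal(uint64_t N) {
  std::vector<uint64_t> Trace;
  for (uint64_t i = 0; i < N; i++)
    Trace.push_back(i);
  return Trace;
}

// Dead loop form: pre-loop, loop guard, then a post-loop that is dead
// because the pre-loop only exits once i == N.
// PostIters (optional) receives the number of post-loop iterations executed.
std::vector<uint64_t> runDeadLoopForm(uint64_t N,
                                      uint64_t *PostIters = nullptr) {
  std::vector<uint64_t> Trace;
  uint64_t Post = 0;
  uint64_t i = 0;
  for (; i < N; i++)      // pre-loop: still covers the whole iteration space
    Trace.push_back(i);
  if (i != N) {           // loop guard for the post-loop
    for (; i < N; i++) {  // post-loop: an unmodified copy of the original
      Trace.push_back(i);
      ++Post;
    }
  }
  if (PostIters)
    *PostIters = Post;
  return Trace;
}
```

Both functions produce the same trace for any N, and the post-loop iteration count is always zero at this stage.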

This should (rightfully) look fairly odd as the second loop is dead.  However, once we have that, iteration splitting becomes much more straightforward.  A few observations:

- The "if (i != N)" is a loop guard and can be identified by getLoopGuardBranch.
- N is the exit value of the i addrec, and can be gotten from SCEV for any arbitrary AddRec for a known exit count.

Once we have this form, we can restrict the iteration space of the pre-loop without modifying the post-loop at all.  (Provided we haven't run any optimization in between.  A slightly safer form would be to have the guard condition be an unknown value to prevent accidental optimization.)

The next core primitive is a routine which uses an Exact Exit Count (as defined in SCEV today) to *reduce* the number of iterations in a loop.  A key thing to note is that mutating existing IR is an optimization, but the routine is always allowed to introduce a new IV and clamp if needed.  That helps a lot in making the code robust.  Being able to use SCEVAddRecExpr::evaluateAtIteration also helps to simplify things a lot.
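As a rough illustration of the "introduce a new IV and clamp" idea, the sketch below models an affine recurrence {Start,+,Step} with a toy struct. AffineAddRec and runClamped are hypothetical names invented for this example; the toy evaluateAtIteration only mirrors the idea of the SCEV routine for the affine case, and it assumes a positive step, a signed `i < N` exit test, and no overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy stand-in for an affine add recurrence {Start,+,Step} (hypothetical).
struct AffineAddRec {
  int64_t Start;
  int64_t Step;
  // Value of the recurrence on 0-based iteration K, ignoring wrapping.
  int64_t evaluateAtIteration(uint64_t K) const {
    return Start + Step * static_cast<int64_t>(K);
  }
};

// Runs at most Limit iterations of `for (i = Start; i < N; i += Step)`.
// Rather than rewriting the existing exit test, it drives the loop with a
// fresh counter C clamped to min(original trip count, Limit).
// Returns the value of i on exit; Executed receives the iteration count.
int64_t runClamped(const AffineAddRec &IV, int64_t N, uint64_t Limit,
                   uint64_t &Executed) {
  assert(IV.Step > 0 && "sketch only handles positive steps");
  uint64_t TripCount =
      IV.Start >= N
          ? 0
          : static_cast<uint64_t>((N - IV.Start + IV.Step - 1) / IV.Step);
  uint64_t Clamped = std::min(TripCount, Limit);
  Executed = 0;
  for (uint64_t C = 0; C != Clamped; ++C) {
    int64_t I = IV.evaluateAtIteration(C); // original IV rebuilt from C
    assert(I < N && "clamped loop stays inside the original space");
    (void)I;
    ++Executed;
  }
  // The exit value is the recurrence evaluated at the clamped count.
  return IV.evaluateAtIteration(Clamped);
}
```

Because the clamp is a minimum with the original trip count, the routine can only reduce the number of iterations, never extend it.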

The final primitive is to generalize the existing exit count logic to work for when an arbitrary monotonic condition toggles.  (There's a bunch of ways of computing an "exit" count for the branch of interest.  This is merely one.)

Once we have those primitives, we'd:

1. Determine if splitting is worthwhile.  Pick a set of branches to eliminate.  (Must be able to compute "exit" counts.)
2. Produce dead loop form.
3. For each branch we want to remove from the pre-loop, compute an "exit" count.
4. Then constrain the pre-loop by the umin of all the desired exit counts.
5. Simplify branches out of pre-loop.  Leave all generality of control flow in preloop.
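The steps above can be sketched at the source level as follows. This is only an illustration under simplifying assumptions: the IV is {0,+,1}, both branch conditions have the form `i < bound`, and the "exit" counts are written down by hand rather than computed by SCEV. The names runUnsplit and runSplit are hypothetical, and the post-loop is kept as an unmodified copy of the original body.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Original loop with two IV-dependent branches in the body.
std::vector<int> runUnsplit(uint64_t N, uint64_t A, uint64_t B) {
  std::vector<int> Out;
  for (uint64_t i = 0; i < N; i++) {
    int V = 0;
    if (i < A) V += 1;
    if (i < B) V += 2;
    Out.push_back(V);
  }
  return Out;
}

// Split form. PreIters (optional) receives the pre-loop iteration count.
std::vector<int> runSplit(uint64_t N, uint64_t A, uint64_t B,
                          uint64_t *PreIters = nullptr) {
  // Step 3: for i = {0,+,1}, `i < A` first becomes false at iteration A,
  // and `i < B` at iteration B (hand-computed "exit" counts).
  // Step 4: clamp the pre-loop to the umin of those counts and the
  // original trip count N.
  uint64_t Limit = std::min({N, A, B});
  std::vector<int> Out;
  uint64_t i = 0;
  // Step 5: inside the pre-loop both conditions are known true, so the
  // branches fold away.
  for (; i < Limit; i++)
    Out.push_back(1 + 2);
  if (PreIters)
    *PreIters = i;
  if (i != N) {           // loop guard from the dead loop form
    for (; i < N; i++) {  // post-loop: unmodified copy of the original body
      int V = 0;
      if (i < A) V += 1;
      if (i < B) V += 2;
      Out.push_back(V);
    }
  }
  return Out;
}
```

With N = 10, A = 4, B = 7 the pre-loop covers iterations 0..3 and the post-loop handles the rest with both branches still present.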

Note carefully what this approach *doesn't* do.  It doesn't require the pass (as opposed to SCEV) to reason about conditions, signed vs unsigned, overflow, or whether two IVs are congruent.  It heavily reuses the existing logic in SCEV which (mostly) gets all those cases right already.

This works for any condition which is "monotonic"  (i.e. transitions from true to false (or vice versa) at most once in the original iteration space).  It does not work for branch conditions which can transition two or more times without a bit more generalization.
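For intuition about what "monotonic" means here, the brute-force check below counts how often a condition changes value over the iteration space 0..N-1. It is only an illustrative oracle with hypothetical names (countTransitions, isMonotonicOver); a real implementation would reason symbolically through SCEV rather than enumerate iterations.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Counts how many times Cond changes value across iterations 0..N-1.
uint64_t countTransitions(uint64_t N,
                          const std::function<bool(uint64_t)> &Cond) {
  if (N == 0)
    return 0;
  uint64_t Transitions = 0;
  bool Prev = Cond(0);
  for (uint64_t i = 1; i < N; i++) {
    bool Cur = Cond(i);
    if (Cur != Prev)
      ++Transitions;
    Prev = Cur;
  }
  return Transitions;
}

// A condition is "monotonic" in the sense used above if it toggles at most
// once in the original iteration space.
bool isMonotonicOver(uint64_t N, const std::function<bool(uint64_t)> &Cond) {
  return countTransitions(N, Cond) <= 1;
}
```

A condition such as `i < 4` toggles once over ten iterations and qualifies, whereas `i % 2 == 0` toggles on every iteration and does not.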

Where this approach fails a bit is in handling multiple branches to be pruned.  As described above, it runs the preloop until *any* of the conditions are hit, and then the post-loop for the remainder.  That may or may not have been what was desired.

I think this approach can be used to formulate IRCE, but it requires a bit of care for the case where you don't know whether an index is in bounds on the first iteration.

For a profitability check, I'd start specifically with the case where the condition precisely splits a loop into two halves, e.g. for ... { if (C) { body1 } else { body2 } }.  This is the easiest to believe is generally profitable, and we can generalize the heuristic selection later.

https://reviews.llvm.org/D102234