[llvm-dev] Two ideas for general extensions of the loop unswitch cost model

Tue Jul 24 16:04:54 PDT 2018

This email is mostly to record an idea.  If I get lucky enough to nerd 
snipe someone else before I have a chance to get back to this, even 
better.  :)

I was taking a look at Chandler's new loop unswitch pass.  One of the 
really key differences in the new pass is that it models the approximate 
cost of the actual unswitch.  At the moment, the current cost model just 
considers whether particular blocks exist in only one of the two 
resulting loops, but since the old unswitcher just assumed every 
instruction had to be duplicated, that's a huge improvement right 
there.  Thinking about this while trying to understand the likely 
implications for idiomatic code, led me to the following ideas.

*First Iteration Cost *- We end up with a lot of loops which contain 
range checks before loop invariant conditions. Almost always, we can 
prove that the range check isn't going to fail on the first iteration.  
As a result, we can ignore the early exit (or conditionally reached 
block) cost in one copy of the loop as long as the branch in question 
dominates the latch.  The main value of this is eliminating a pass order 
dependency; we'd otherwise have to handle the range check (via 
indvarsimply, irce, or loop-predication) before rerunning unswitch.  
Example:

// example is C
for (int i = 0; i < len; i++) {
   if (i < 2^20) throw_A();
   if (a == null) throw_B(); // We can unswitch this at low cost
   sum += a[i];
}

This could be fairly simply implemented as a special case of trivial 
unswitch where we advance through branches which are known to be 
constant on the first iteration.  But that interacts badly with the 
second idea and gives up generality.

*Data Flow Cost *- We could model the cost of producing the loop exit 
values.  Since we have LCSSA, we know we can cheaply identify values 
used in the exit.  For any value which is otherwise unused in the loop, 
we end up producing only one copy of the computation which produces the 
value.  If we computed a per-block "cost to produce values used in this 
block", we could model this in the cost model.  The interesting bit is 
that this seems to subsume the current definition of trivial unswitch 
since the cost for a trivial unswitch as considered by the full 
unswitcher becomes zero.

// example, in C, using predicated loads/stores
for (int i = 0; i < len; i++) {
   int v = 0;
   if (b[i]) { v = *p }
   if (C) {
     break;
   }
   if (b[i]) { *p = v + i }
}

Philip

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180724/10e625d9/attachment.html>