[PATCH] D21720: Unroll for uncountable loops

Thu Mar 9 16:28:44 PST 2017

hfinkel added a comment.

In https://reviews.llvm.org/D21720#696992, @evstupac wrote:

> > My biggest concern about this patch is that it doesn't solve the problem in a general way, but instead only catches two cases: s ^= 1 and s = -s.
>
> Why is PhiCycle not general? It could be InstCycle instead (for i&3, i&7, i%N, ...).
> "i%2" and "i&1" are convertible to "s^=1".

Sure, but "are convertible to" is not helpful if they're not actually converted. Whether that's a useful canonical form might be a separate discussion (maybe it is if that's the only use of the induction variable). In any case, I agree that we should have something that is insensitive to how the recurrence is written.

However, we still need to make sure we're doing this in a way that is profitable. The point of making sure to unroll by an even factor for these cycle-2 recurrences is that it allows us to completely remove the PHI. It is not clear to me that the more-general case shares this property without additional work (by which I mean canonicalization work).
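
To make the cycle-2 point concrete, here is a hand-written sketch. The function names and the zero-terminated array are made up for illustration and are not taken from the patch; the second function only shows the shape we would hope to get after unrolling by 2, not what the patch actually emits.

```cpp
// Hypothetical example: alternating-sign sum over a zero-terminated array.
// The exit test depends on the data, so the trip count is unknown up front.
int alternatingSumRolled(const int *p) {
  int sum = 0;
  int s = 1;               // cycle-2 recurrence: 1, -1, 1, -1, ...
  for (; *p != 0; ++p) {
    sum += s * *p;
    s = -s;                // loop-carried value; this is the PHI in the IR
  }
  return sum;
}

// The same loop unrolled by 2 by hand. Each copy of the body sees a fixed
// value of s, so s (and its PHI) is gone; every copy keeps its own exit test.
int alternatingSumUnrolledBy2(const int *p) {
  int sum = 0;
  for (;;) {
    if (*p == 0) break;
    sum += *p;             // s == 1 in this copy
    ++p;
    if (*p == 0) break;
    sum -= *p;             // s == -1 in this copy
    ++p;
  }
  return sum;
}
```

With an odd unroll factor the value of s at the top of the unrolled body would still alternate, so the PHI would have to stay.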

> 
> 
>> IMHO, a general approach here would be to teach the unroller (or maybe SCEV) to analyze N sequential iterations. E.g., for the xor case and 2 sequential iterations starting at the i-th iteration, we'd get something like S_i_plus_2 = S_i_plus_1 ^ 1 = (S_i ^ 1) ^ 1 = S_i.

But you need more than this. You need to know how the intermediate iterations are related so that you can eliminate the PHI. This is especially true for the remainder case (because remainders are expensive compared to `&`, for example).

I suspect that SCEV would have a hard time doing this because the relations are not algebraic (although it might be able to do something interesting with the remainder case).

In any case, there is definitely a more-general case we can handle here, and it is the following: there needs to be either a cycle PHI, or a use of an otherwise-unused induction variable that repeats in a fixed pattern, such that unrolling the loop by the length of that pattern allows us to eliminate the PHI. i&C and i%C have that property. Is there anything else?
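
As a sketch of the i&C case (again with made-up names and a made-up weight table, purely for illustration): with C = 3 the value of i & 3 repeats with period 4, so unrolling by 4 turns each use into a constant and the counter itself is no longer needed. The i%C case would look the same with an unroll factor of C.

```cpp
// Hypothetical example: i is used only through i & 3, which cycles 0,1,2,3.
int maskedSumRolled(const int *p) {
  static const int weight[4] = {1, 2, 3, 4};
  int sum = 0;
  for (unsigned i = 0; *p != 0; ++p, ++i)
    sum += weight[i & 3] * *p;
  return sum;
}

// Hand-unrolled by 4 (the pattern length). In each copy i & 3 is a known
// constant, so the table lookup folds away and i is not needed at all.
// p[k] is read only after p[k-1] was found to be non-zero, so every read
// stays inside the zero-terminated array.
int maskedSumUnrolledBy4(const int *p) {
  int sum = 0;
  for (;;) {
    if (p[0] == 0) break;
    sum += 1 * p[0];       // i & 3 == 0
    if (p[1] == 0) break;
    sum += 2 * p[1];       // i & 3 == 1
    if (p[2] == 0) break;
    sum += 3 * p[2];       // i & 3 == 2
    if (p[3] == 0) break;
    sum += 4 * p[3];       // i & 3 == 3
    p += 4;
  }
  return sum;
}
```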

> The current approach does this, but starting from iteration 0 (taking into account that some instructions are cyclic).
> 
> Suppose we skip this, what about:
> 
> 1. Uncountable loops where a previous value is reused (saving one or more instructions for each value)?
> 2. Uncountable loops that count something?

Repository:
  rL LLVM

https://reviews.llvm.org/D21720