[PATCH] D21720: Unroll for uncountable loops

Thu Mar 9 13:38:23 PST 2017

mzolotukhin added a comment.

>   full unroll consider "s^=1;" is constant "1" at all iterations >0. However it is 0 on even and 1 on odd.

If that's the case, I'd be happy to review a separate patch that just fixes that.

> I don't think that "easily be handled by using unroll pragma" is the right standard here. The same argument can be made for anything. We should only really require a pragma when we cannot design a reasonable heuristic (at this stage in the pipeline). This patch does not seem that large. I can review next week.

My biggest concern about this patch is that it doesn't solve the problem in a general way, but instead only catches two cases: `s ^= 1` and `s = -s`. While it looks kind of generic, I can see no way how it can be extended in future. E.g. it doesn't seem to be able to support another way to write the same idiom: `s = i%2` or `s = i&1`, and if we'd like to support it, we'll need yet another approach.

Your point about using pragma is generally valid, but here is my view on it. While in general we do need to try to cover as many cases without pragmas as possible, we need to provide a general implementation that potentially can help other cases as well. Patching the compiler here and there just to cover cases one by one would lead to unmaintainable and inefficient code in the end. Also, unlike many other loops in a loop with `s ^=1` the user would more likely correctly guess a favorable unroll factor (2), just by the nature of the computation. In more complicated loops a help from the compiler is needed much more, because it might be hard to guess the best unroll factor.

IMHO, a general approach here would be to teach unroller (or maybe SCEV) to analyze N sequential iterations. E.g. for the xor case and for 2 sequential iterations starting at the i-th iteration, we'd get something like  `S_i_plus_2 = S_i_plus_1 ^ 1 = (S_i ^ 1) ^ 1 = S_i`. Honestly, I don't know how hard it would be to implement this, and maybe the efforts and complexity won't be worth the gain, but that's what I'd call a general approach, rather than targeting a single case.

Michael

Repository:
  rL LLVM

https://reviews.llvm.org/D21720