[PATCH] D75013: [LoopTerminology] Rotated Loops

Mon Mar 9 18:54:13 PDT 2020

baziotis marked 4 inline comments as done.
baziotis added a comment.

In D75013#1913621 <https://reviews.llvm.org/D75013#1913621>, @Meinersbur wrote:

> This is somewhat more verbose than I'd expect for a terminology document (more like a tutorial). I also don't see why not: LGTM.

It seemed to me as well, but it looked useful. I'm glad you like it.

> You have commit right now, correct? Before committing, please fix the remaining typos/nits.

Yes and yes.

================
Comment at: llvm/docs/LoopTerminology.rst:277
+
+.. image:: ./loop-terminology-rotated-loop.png
+  :width: 400 px
----------------
Meinersbur wrote:
> The backedge "T -> body" looks displaced. Not sure what to do about it.
Believe me, I tried.. :p
I thought about creating the image from view-cfg, erase the arrows, then create again the arrows.
But then the image is not reproducible. I don't know how much we care about that.

================
Comment at: llvm/docs/LoopTerminology.rst:286
+* It is obvious that the %body and %latch don't need to be separate.
+  loop-rotate will usually merge them.
+* The compiler in this case can't deduce that the loop will
----------------
Meinersbur wrote:
> This is something typically done by simplify-cfg, I don't think loop-rotate itself merges BBs.
Yes, me too. LoopRotate has no such thing.

================
Comment at: llvm/docs/LoopTerminology.rst:357-358
+However, now, in the case that n <= 0, in the initial form,
+the loop body would never execute, and so, the load would
+never execute.  This is a problem both for performance and
+semantic reasons. First of all, we have a load that would 
----------------
Meinersbur wrote:
> [nit] The performance reason is really minor. When optimistically assuming that in most cases, a loop will "loop" lots of times, the occasional single execution of a load whose result is not used is minor. The additional control-flow introduced by the guard might cause more slowdown (not necessarily because of the addition branch instruction, branch predictors are really good these days). I'd just document the semantic reason.
> 
> It would be interesting to explore whether the guard still makes sense if `p` has the attribute `dereferenceable` (i.e. we know that the load will not cause an error)
> I'd just document the semantic reason.
Ok, yes.

> It would be interesting to explore whether the guard still makes sense if p has the attribute dereferenceable (i.e. we know that the load will not cause an error)

I haven't thought about that, thanks. Not relevant to this doc I guess but otherwise an interesting thing I could experiment with.

================
Comment at: llvm/docs/LoopTerminology.rst:392
+    } while (i < n);
+  }
----------------
I should make clear here that such movement of code does not happen from `-loop-rotate` (but e.g. LICM). Although we should be careful on the wording because LoopRotate //does// [[ https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp#L397 | move some code  ]]. As matter of fact though, I don't understand this: https://godbolt.org/z/hgYhHX
`%invcond` doesn't write to / read from memory, has invariant operands, is not a terminator and is not debuginfo intrinsic.
So, I don't get why it isn't moved outside, I'll take a look tomorrow.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75013/new/

https://reviews.llvm.org/D75013