[PATCH] D75013: [LoopTerminology] Rotated Loops

Mon Mar 9 17:50:03 PDT 2020

Meinersbur accepted this revision.
Meinersbur added a comment.
This revision is now accepted and ready to land.

This is somewhat more verbose than I'd expect for a terminology document (more like a tutorial). I also don't see why not: LGTM.

You have commit right now, correct? Before committing, please fix the remaining typos/nits.

================
Comment at: llvm/docs/LoopTerminology.rst:221
+at the LLVM IR level. We follow with the previous examples
+in LLVM IR while also providing a graphical reprensentation
+of the control-flow graphs (CFG). You can get the same graphical
----------------
[typo] repre**n**sentation

================
Comment at: llvm/docs/LoopTerminology.rst:277
+
+.. image:: ./loop-terminology-rotated-loop.png
+  :width: 400 px
----------------
The backedge "T -> body" looks displaced. Not sure what to do about it.

================
Comment at: llvm/docs/LoopTerminology.rst:286
+* It is obvious that the %body and %latch don't need to be separate.
+  loop-rotate will usually merge them.
+* The compiler in this case can't deduce that the loop will
----------------
This is something typically done by simplify-cfg, I don't think loop-rotate itself merges BBs.

================
Comment at: llvm/docs/LoopTerminology.rst:325
+because loop-rotate triggers the
+:ref:`-loop-simplify <loop-terminology-loop-simplify>` pass to run.
+In this case, it inserted the %loop.preheader basic block so
----------------
[nit] please make clear that -loop-simplify runs //before// -loop-rotate.

================
Comment at: llvm/docs/LoopTerminology.rst:357-358
+However, now, in the case that n <= 0, in the initial form,
+the loop body would never execute, and so, the load would
+never execute.  This is a problem both for performance and
+semantic reasons. First of all, we have a load that would 
----------------
[nit] The performance reason is really minor. When optimistically assuming that in most cases, a loop will "loop" lots of times, the occasional single execution of a load whose result is not used is minor. The additional control-flow introduced by the guard might cause more slowdown (not necessarily because of the addition branch instruction, branch predictors are really good these days). I'd just document the semantic reason.

It would be interesting to explore whether the guard still makes sense if `p` has the attribute `dereferenceable` (i.e. we know that the load will not cause an error)

================
Comment at: llvm/docs/LoopTerminology.rst:382
+do an unconditional first execution and insert the loop condition
+in the end. That effectively means transforming the loop into a do-while loop:
+
----------------
[grammar] "That" -> "This"

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75013/new/

https://reviews.llvm.org/D75013