[PATCH] D75013: [LoopTerminology] Rotated Loops

Wed Mar 11 13:35:28 PDT 2020

baziotis marked an inline comment as done.
baziotis added inline comments.

================
Comment at: llvm/docs/LoopTerminology.rst:357-358
+However, now, in the case that n <= 0, in the initial form,
+the loop body would never execute, and so, the load would
+never execute.  This is a problem both for performance and
+semantic reasons. First of all, we have a load that would 
----------------
baziotis wrote:
> Meinersbur wrote:
> > [nit] The performance reason is really minor. When optimistically assuming that in most cases, a loop will "loop" lots of times, the occasional single execution of a load whose result is not used is minor. The additional control-flow introduced by the guard might cause more slowdown (not necessarily because of the addition branch instruction, branch predictors are really good these days). I'd just document the semantic reason.
> > 
> > It would be interesting to explore whether the guard still makes sense if `p` has the attribute `dereferenceable` (i.e. we know that the load will not cause an error)
> > I'd just document the semantic reason.
> Ok, yes.
> 
> > It would be interesting to explore whether the guard still makes sense if p has the attribute dereferenceable (i.e. we know that the load will not cause an error)
> 
> I haven't thought about that, thanks. Not relevant to this doc I guess but otherwise an interesting thing I could experiment with.
I tried to implement that today but it seems it's already done. First of all we can see it by adding the attribute. But we can also see that by following: https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Scalar/LICM.cpp#L806
Here, simplistically, the first 2 parts of the condition check for "invariance" (actually probably having the first is too conservative but anyway, off-topic) and the third for "semantics preservation". Without `dereferenceable` the first is satisfied while the second not. If we follow it we end up in here: 
https://github.com/llvm/llvm-project/blob/master/llvm/lib/Analysis/ValueTracking.cpp#L4229
With `dereferenceable` is still satisfied because we don't suppress speculation and is anyway aligned.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75013/new/

https://reviews.llvm.org/D75013