[PATCH] D102615: [LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration (try 3)

Wed Jun 16 14:24:29 PDT 2021

nikic added a comment.

In D102615#2818742 <https://reviews.llvm.org/D102615#2818742>, @mkazantsev wrote:

> As for SCEV impact, some points.
>
> 1. Yes, I agree that we should try to reduce impact on compile time when we can;
> 2. We already have passes (IV simplify, IRCE, LSR...) where we already use SCEV. I'm absoultely certain that in reality, most things they do they could handle with InstSimplify-like instructions. But for some reason, we don't mind against the toll that SCEV takes there (and it should be HUGE in IV simplify) just because we started tracking compile time *after* these things were already there.
> 3. It's pretty clear to me that the problem is not in loop deletion that is slow with SCEV. The problem is that the SCEV is slow. And it's much slower than 0.2% that we see here. Holding away future optimizations just because SCEV is, and always has been, slow, doesn't seem right to me.

I think we need to keep two things separate here: Passes like IndVars, IRCE and LSR perform actual loop reasoning. The entire purpose of SCEV is to reason about add recurrences, and that's exactly what these passes need. Using anything but SCEV wouldn't really make sense in this context. InstSimplify knows nothing (or at least very little) about recurrences.

The case here is different: This optimization only has a nominal relation to loops. There are no add recurrences involved, there is no "scalar evolution" going on -- rather, SCEV is being used as a generic symbolic reasoning framework, which is not a use-case it is intended or optimized for.

You are right that SCEV has serious compile-time issues in general, and "make SCEV faster" is certainly preferred over "don't use SCEV". However, in this particular instance I would argue that even if SCEV were blazing fast, it still wouldn't be the right tool for the job. Of course, there is the little caveat that we don't have any other symbolic reasoning framework in LLVM that has the same level of power, so I can certainly appreciate why it is being used here. I just want to highlight that this is not a case where SCEV is the obviously correct choice, as it is for passes like IndVars or LSR.

In D102615#2818794 <https://reviews.llvm.org/D102615#2818794>, @mkazantsev wrote:

> P.S. and as for SCCP - that's a good point I didn't think of. Need to make some experiments to see if it helps.
> UPD: it helps in some of cases that are interesting to us. In some of them it doesn't, but not clear whether it's a conceptual limitation or can be fixed easily.

Worth mentioning that currently SCCP doesn't use PredicateInfo, which means it doesn't perform folding based on dominating branches. This is currently only enabled in the IPSCCP pass.

In D102615#2820879 <https://reviews.llvm.org/D102615#2820879>, @reames wrote:

> I want to chime in on the philosophy bits raised in the last few comments.
>
> @nikic I understand and generally agree with your point about not implementing adhoc optimizations for particular patterns which happen to break particular workloads.  However, I don't think it really applies here.
>
> Leaving aside your point about whether this can be done without SCEV (a reasonable question I'll touch on later), I see the class of loops which provably don't execute their backedge based on symbolic reasoning of the first iteration as being an interesting class.  In fact, SCEV *already* does a variant of this in the exit computation logic when the exit condition is not analyzable.  If anything, from a philosophy perspective, I'd prefer that this patch didn't exist because SCEV did this internally.  However, that doesn't appear to be pragmatic from a compile time perspective.
>
> I also want to call out a methodology issue.  If we use the test-suite as evidence of (lack of) profit, and the test suite contains only historical C and C++ code, we will never implement any transform which either a) shows up for other languages, or b) is idiomatic in newer code (e.g. new language versions for C/C++).  I'm not saying you're wrong to collect this data, but we need to be careful about how we interpret it.
>
> From a purely pragmatic perspective, I don't want to get in the habit of breaking forward progress for perfection.  @nikic, I see you as coming dangerously close to that here.  I do think that if Max can show a savings elsewhere, we should allow this patch.

Yeah, I agree that using test-suite for evaluation is dangerous in that it is a very biased collection of code. I wouldn't be surprised if most of the optimizations I implemented have zero hits on test-suite, because they are targeted at a different frontend :) What test-suite mainly tells us is that there exists some large category of code for which an optimization is //not// relevant. It doesn't tell use whether there exists some other large category where it //is//.

I do like your framing in terms of this being an "interesting class" of loops, and agree that they are worth optimizing -- the question is mostly just how hard we should try :)

In any case, I don't want to block this patch. If you think this the right way to approach the problem, I'm fine with it going in as-is.

================
Comment at: llvm/lib/Transforms/Scalar/LoopDeletion.cpp:146
+static const SCEV *
+getSCEVOnFirstIteration(Value *V, Loop *L, ScalarEvolution &SE,
+                        DenseMap<Value *, const SCEV *> &FirstIterSCEV) {
----------------
Legit clang-tidy warning: If you have a long chain of adds, this is going to blow the stack. Should probably use worklist?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102615/new/

https://reviews.llvm.org/D102615