[llvm-dev] How to best deal with undesirable Induction Variable Simplification?

Thu Aug 8 16:21:50 PDT 2019

On Thu, Aug 8, 2019 at 12:37, Danila Malyutin via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> Hello,
> Recently I’ve come across two instances where Induction Variable Simplification led to noticeable performance regressions.
>
> In one case, the removal of an extra IV led to the inability to reschedule instructions in a tight loop to reduce stalls. In that case, there were enough registers to spare, so using an extra register for the extra induction variable was preferable since it reduced dependencies in the loop.

Since r139579, IndVarSimplify (the pass) should not normalize
induction variables without a reason anymore (a reason would be that
the loop can be deleted). Could you file a bug report, attach a
minimal .ll file and mention what output you would expect?

> Due to unswitching there were several such loops, each with a different number of p+=n ops, so when the IndVars pass rewrote all exit values, it added a lot of slightly different offsets to the main loop header that couldn’t fit in the available registers, which led to unnecessary spills/reloads.

Since after unswitching only one of the resulting loops is executed,
the register usage should be the maximum of those loops, which ideally
is at most the register usage of the pre-unswitched loop. In your
case, p could be in the same register in all unswitched loops.
However, other optimizations might increase register pressure again
and the register allocation is not optimal in all cases.
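To make the discussion concrete, here is a minimal C sketch of the kind of
loop being described. It is a hand-written approximation, not compiler
output: the function names, the `twice` flag and the `end` out-parameter
are hypothetical, and the IR that LLVM actually produces depends on the
version and pass pipeline. The first function is the source form; the
second mimics unswitching on the loop-invariant condition plus rewriting
the exit value of p into a closed-form offset.

```c
#include <stddef.h>

/* Source form: `twice` is loop-invariant, so the loop is a candidate for
   unswitching. The final value of p is used after the loop. */
long sum_original(const int *p, size_t n, size_t trips, int twice,
                  const int **end) {
    long sum = 0;
    for (size_t i = 0; i < trips; ++i) {
        sum += *p;
        p += n;
        if (twice) {
            sum += *p;
            p += n;
        }
    }
    *end = p;
    return sum;
}

/* Approximation after unswitching and exit-value rewriting: each variant
   of the loop gets its own closed-form offset for the final p, computed
   outside the loop instead of reusing the last in-loop value. */
long sum_transformed(const int *p, size_t n, size_t trips, int twice,
                     const int **end) {
    long sum = 0;
    const int *exit_p;
    if (twice) {
        exit_p = p + 2 * n * trips;   /* two p += n per iteration */
        for (size_t i = 0; i < trips; ++i)
            sum += p[2 * i * n] + p[(2 * i + 1) * n];
    } else {
        exit_p = p + n * trips;       /* one p += n per iteration */
        for (size_t i = 0; i < trips; ++i)
            sum += p[i * n];
    }
    *end = exit_p;
    return sum;
}
```

Only one of the two branches runs for a given call, which is why the
register usage should be bounded by the more demanding branch rather than
by their sum.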

Again, could you file a bug report, include a minimal reproducer and
what output you expect?

> I am wondering what is the usual strategy for dealing with such “pessimizations”? Is it possible to somehow modify the IndVarSimplify pass to take those issues into account (for example, tell it that adding an offset computation + gep is potentially more expensive than simply reusing the last var from the loop) or should it be recovered in some later pass? If so, is there an easy way to revert IV elimination? Has anyone dealt with similar issues before?

Ideally, we prefer such pessimizations not to occur in the first
place, which is what r139579 addressed. However, the transformation
might also be an IR normalization that enables other transformations.
In that case, another pass down the pipeline would transform the
normalized form to an optimized one. For instance, LoopSimplify
inserts a loop preheader that SimplifyCFG would remove again. What is
considered normalization depends on the case. If
you can show that a change generally improves performance (not just
for your code) and has at most minor regressions, then any approach is
worth considering.
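As an illustration of the normalize-then-optimize idea, the following C
sketch shows the same reduction in two shapes. It is written by hand and
the function names are hypothetical; it only approximates what passes
such as IndVarSimplify and, later, a strength-reduction pass might do, and
is not actual compiler output.

```c
#include <stddef.h>

/* Normalized shape: a single induction variable counting from 0 in steps
   of 1, with the address derived from it. This form is easy to analyze. */
long sum_canonical(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Shape a later pass might choose for the target: a pointer induction
   variable that avoids recomputing the address from i each iteration. */
long sum_pointer_iv(const int *a, size_t n) {
    long s = 0;
    for (const int *p = a, *e = a + n; p != e; ++p)
        s += *p;
    return s;
}
```

Both functions compute the same result; which shape is cheaper depends on
the target and on the register pressure in the surrounding code.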

Michael