[llvm-dev] How to best deal with undesirable Induction Variable Simplification?

Fri Aug 9 16:00:23 PDT 2019

On 8/8/19 10:36 AM, Danila Malyutin via llvm-dev wrote:
>
> Hello,
> Recently I’ve come across two instances where Induction Variable
> Simplification led to noticeable performance regressions.
>
> In one case, the removal of an extra IV led to the inability to
> reschedule instructions in a tight loop to reduce stalls. In that
> case, there were enough registers to spare, so using an extra register
> for an extra induction variable was preferable since it reduced
> dependencies in the loop.
>
This one I'd phrase as a deficiency in the backend.  Arguably it belongs
in LSR, but in general our transforms that rewrite code to reduce
scheduling pressure have room for improvement.  I ran across a case of
this with an add reduction recently as well.

Removing a redundant IV is clearly the "right answer" in terms of
producing simpler, easier to optimize IR. 
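
To make the trade-off concrete, here is a source-level analogy in C.  It
is only a sketch: the real pass works on LLVM IR, and the function names
(sum_two_ivs, sum_one_iv, check_iv_forms) and the stride of 2 are made up
for illustration, not taken from the original report.

```c
#include <assert.h>
#include <stddef.h>

/* Before: i and off are both induction variables; off is always 2 * i. */
long sum_two_ivs(const long *a, size_t n) {
    long sum = 0;
    size_t off = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += a[off];
        off += 2;   /* redundant IV, advances in lockstep with i */
    }
    return sum;
}

/* After: off is rewritten in terms of i, leaving a single IV. */
long sum_one_iv(const long *a, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += a[2 * i];
    return sum;
}

/* Sanity check that the two forms agree on a small input. */
void check_iv_forms(void) {
    const long a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    assert(sum_two_ivs(a, 4) == sum_one_iv(a, 4));
}
```

The second form is the simpler one to analyze, but the address now
depends on i, whereas the first form keeps two independent update chains.
That extra freedom is presumably what the scheduler lost in the reported
case.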

> In the second case, there was a big nested loop made even bigger after
> unswitching. However, the inner loop body was rather simple, of the form:
>
> loop {
>   p+=n;
>   …
>   p+=n;
>   …
> }
> use p.
>
> Due to unswitching there were several such loops, each with a
> different number of p+=n ops, so when the IndVars pass rewrote all
> exit values, it added a lot of slightly different offsets to the main
> loop header that couldn’t fit in the available registers, which led to
> unnecessary spills/reloads.
>
I have to ask a further question here.  Why are the spill/fills
problematic?  If they happened *outside* said loops - as you'd expect
from the example - at worst there is a code size impact.  Is there
something more going on?  (i.e. are the loops super short running or
something?)
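
For reference, the exit-value rewrite under discussion can be sketched in
C roughly as follows.  This is an illustration under assumptions, not the
reporter's code: the function names, the two p+=n steps per iteration,
and the explicit trip count parameter are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Before: p is advanced twice per iteration and used after the loop. */
const char *advance_in_loop(const char *p, ptrdiff_t n, size_t trips) {
    for (size_t i = 0; i < trips; ++i) {
        p += n;   /* ... other work elided ... */
        p += n;
    }
    return p;     /* "use p" */
}

/* After: the exit value is computed in closed form outside the loop,
 * so the loop no longer has to carry p for the later use. */
const char *advance_closed_form(const char *p, ptrdiff_t n, size_t trips) {
    return p + 2 * n * (ptrdiff_t)trips;
}

/* Sanity check that both forms produce the same final pointer. */
void check_exit_value(void) {
    char buf[64] = {0};
    assert(advance_in_loop(buf, 3, 5) == advance_closed_form(buf, 3, 5));
}
```

Each unswitched copy of the loop with a different number of p+=n steps
gets its own multiplier (2 here), which is where the "slightly different
offsets" come from.  Note that the closed-form computation sits outside
the loop body.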
>
>
> I am wondering what the usual strategy is for dealing with such
> “pessimizations”. Is it possible to somehow modify the IndVarSimplify
> pass to take those issues into account (for example, tell it that
> adding an offset computation + gep is potentially more expensive than
> simply reusing the last variable from the loop), or should it be
> recovered in some later pass? If so, is there an easy way to revert IV
> elimination? Has anyone dealt with similar issues before?
>
My answer: IndVars did the right thing in both of these cases.  The IR
is definitely much cleaner, easier to optimize by other transforms,
etc.  Unfortunately, it's not uncommon for a good transform to produce
output which reveals other deficiencies in the optimizer/backend.  We
can and should fix those where we find them.

(There's honest disagreement about the philosophy here JFYI.)

>
> --
> Danila