<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 8/8/19 10:36 AM, Danila Malyutin via
llvm-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:MN2PR12MB38407C2BDCF43DF8674F232FB8D70@MN2PR12MB3840.namprd12.prod.outlook.com">
<div class="WordSection1">
<p class="MsoNormal">Hello,<br>
Recently I’ve come across two instances where Induction
Variable Simplification led to noticeable performance
regressions.</p>
<p class="MsoNormal">In one case, the removal of an extra IV made
it impossible to reschedule instructions in a tight loop to
reduce stalls. In that case, there were enough registers to
spare, so using an extra register for an extra induction variable
was preferable since it reduced dependencies in the loop.<br>
</p>
</div>
</blockquote>
<p>I'd describe this one as a deficiency in the backend. Arguably
it belongs in LSR, but in general our transforms that rewrite code
to reduce scheduling pressure have room for improvement. I ran
across a case of this with an add reduction recently as well.</p>
<p>Removing a redundant IV is clearly the "right answer" in terms of
producing simpler, easier to optimize IR. <br>
</p>
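<p>To make the trade-off concrete, here is a minimal source-level
sketch in C. It is only an analogy for what happens on the IR: the
function names, the array <code>a</code>, and the stride of 4 are all
made up for illustration and are not taken from the original report.
The first function carries a second induction variable
<code>off</code> that always equals <code>4 * i</code>; the second is
the simplified form with a single IV, where the same value is derived
from <code>i</code> and so adds a dependency on it.</p>
<pre>
```c
/* Hypothetical illustration: names and the stride of 4 are assumptions. */

/* Two induction variables: i counts iterations and off advances by 4
   each iteration. off is redundant because it always equals 4 * i. */
long sum_two_ivs(const long *a, long n) {
    long sum = 0;
    long off = 0;
    for (long i = 0; i != n; ++i) {
        sum += a[i] + off;
        off += 4; /* independent update chain, costs one extra register */
    }
    return sum;
}

/* Single induction variable: the offset is recomputed from i, which
   removes a loop-carried value but makes the add depend on i. */
long sum_one_iv(const long *a, long n) {
    long sum = 0;
    for (long i = 0; i != n; ++i) {
        sum += a[i] + 4 * i;
    }
    return sum;
}
```
</pre>
<p>Both functions compute the same result. Which one schedules better
is a target-specific question, which is why I'd rather see it decided
late in the pipeline.</p>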
<blockquote type="cite"
cite="mid:MN2PR12MB38407C2BDCF43DF8674F232FB8D70@MN2PR12MB3840.namprd12.prod.outlook.com">
<div class="WordSection1">
<p class="MsoNormal">
In the second case, there was a big nested loop made even
bigger after unswitching. However, the inner loop body was
rather simple, of the form:<o:p></o:p></p>
<p class="MsoNormal">loop {<o:p></o:p></p>
<p class="MsoNormal"> p+=n;<o:p></o:p></p>
<p class="MsoNormal">…<o:p></o:p></p>
<p class="MsoNormal"> p+=n;<o:p></o:p></p>
<p class="MsoNormal">…<o:p></o:p></p>
<p class="MsoNormal">}<br>
use p.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Due to unswitching there were several such
loops, each with a different number of p+=n ops, so when the
IndVars pass rewrote all exit values, it added a lot of
slightly different offsets to the main loop header. These
couldn’t all fit in the available registers, which led to
unnecessary spills/reloads.<br>
</p>
</div>
</blockquote>
I have to ask a further question here. Why are the spills/fills
problematic? If they happen *outside* said loops, as you'd
expect from the example, then at worst there is a code size impact.
Is there something more going on? (e.g. are the loops very short
running?)<br>
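<p>For reference, the exit-value rewrite being discussed can be
sketched at the source level as follows. This is a simplified model
under stated assumptions, not the actual code from the report: the
pointer is modeled as a <code>long</code>, the loop body has exactly
two <code>p += n</code> steps, the trip count <code>trips</code> is
known before the loop, and both function names are hypothetical.</p>
<pre>
```c
/* Hypothetical illustration: names and the two-increment body are assumptions. */

/* Original form: the value used after the loop is the loop-carried p. */
long exit_value_from_loop(long p, long n, long trips) {
    for (long i = 0; i != trips; ++i) {
        p += n;
        /* ... other work ... */
        p += n;
        /* ... other work ... */
    }
    return p; /* use p */
}

/* After exit-value rewriting: the final p is computed in closed form
   from values available before the loop. Each unswitched copy of the
   loop with a different number of increments gets a different
   multiplier, hence the many slightly different offsets. */
long exit_value_rewritten(long p, long n, long trips) {
    return p + 2 * n * trips;
}
```
</pre>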
<blockquote type="cite"
cite="mid:MN2PR12MB38407C2BDCF43DF8674F232FB8D70@MN2PR12MB3840.namprd12.prod.outlook.com">
<div class="WordSection1">
<p class="MsoNormal">
<br>
I am wondering what the usual strategy is for dealing with
such “pessimizations”. Is it possible to somehow modify the
IndVarSimplify pass to take those issues into account (for
example, tell it that adding an offset computation + gep is
potentially more expensive than simply reusing the last variable
from the loop), or should it be recovered in some later pass? If so,
is there an easy way to revert IV elimination? Has anyone
dealt with similar issues before?</p>
</div>
</blockquote>
<p>My answer: IndVars did the right thing in both of these cases.
The IR is definitely much cleaner and easier for other transforms
to optimize. Unfortunately, it's not uncommon for a good
transform to produce output which reveals other deficiencies in
the optimizer/backend. We can and should fix those where we find
them. <br>
</p>
<p>(JFYI, there is honest disagreement about the philosophy here.)<br>
</p>
<blockquote type="cite"
cite="mid:MN2PR12MB38407C2BDCF43DF8674F232FB8D70@MN2PR12MB3840.namprd12.prod.outlook.com">
<div class="WordSection1">
<p class="MsoNormal">--<o:p></o:p></p>
<p class="MsoNormal">Danila<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<pre class="moz-quote-pre" wrap="">_______________________________________________
LLVM Developers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
</blockquote>
</body>
</html>