<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 8/9/19 8:27 AM, Danila Malyutin via
      llvm-dev wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:MN2PR12MB3840A07C07A3EB4A7A32C88BB8D60@MN2PR12MB3840.namprd12.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <meta name="Generator" content="Microsoft Word 15 (filtered
        medium)">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
p.xmsonormal, li.xmsonormal, div.xmsonormal
        {mso-style-name:x_msonormal;
        margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.EmailStyle21
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:56.7pt 42.5pt 56.7pt 85.05pt;}
div.WordSection1
        {page:WordSection1;}
--></style>
      <div class="WordSection1">
        <p class="MsoNormal">Hi Hal,<o:p></o:p></p>
        <p class="MsoNormal"><o:p> </o:p></p>
        <p class="MsoNormal">I see. So LSR could theoretically
          counteract undesirable IndVars transformations, but that’s not
          implemented at the moment?<br>
          <br>
          I think I’ve managed to come up with a small reproducer that
          also exhibits a similar problem on x86. Here it is:
          <a href="https://godbolt.org/z/_wxzut" moz-do-not-send="true">https://godbolt.org/z/_wxzut</a><o:p></o:p></p>
        <p class="MsoNormal"><o:p> </o:p></p>
        <p class="MsoNormal">As you can see, when rewriteLoopExitValues
          is not disabled, Clang generates worse code due to additional
          spills, because IndVars rewrites all exit values of ‘a’ to
          recompute its value instead of reusing the value from the
          loop body. This requires extra registers for the new “a after
          the loop” value (since it’s not simply reused) and also for
          the new “offset”, which leads to the extra spills since
          they are all live across the big loop body. When exit values
          are not rewritten, ‘a’ stays in its `r15d` register at no
          extra cost.</p>
      </div>
    </blockquote>
    <p>This touches on a point I've thought about some, but haven't
      tried to implement.</p>
    <p>I think there might be room for a late pass which undoes the exit
      value rewriting.  As an analogy, we have MachineLICM which
      sometimes undoes the transforms performed by LICM, but we still
      want the IR form to hoist aggressively for ease of optimization
      and analysis.  </p>
    <p>Maybe this should be part of LSR, or maybe a separate pass.  I
      haven't thought about that part extensively.</p>
    <p>It's worth noting that the SCEV for the exit value of the
      in-loop value and the SCEV for the rewritten exit value should be
      identical, so recognizing candidates for undoing the rewrite is
      quite straightforward.  The profitability reasoning might be more
      involved, but the legality part should essentially be handled by
      SCEV, and should be able to reuse exactly the same code as
      rewriteLoopExitValues (RLEV). 
      <br>
    </p>
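    <p>To make the trade-off concrete, here is a minimal C sketch of the
      two forms. It is not LLVM code: the function names, the stand-in
      'body' helper, and the fixed "two updates per iteration" shape are
      all assumptions made for illustration, loosely following the
      p+=n loop from the original report.</p>
    <pre>
```c
/* Hypothetical stand-in for a large loop body that uses the running value. */
static unsigned body(unsigned acc, unsigned p) { return acc * 31u + p; }

/* Form 1: the use after the loop simply reads the value the loop
 * already computed, so nothing extra has to stay live across the loop. */
unsigned reuse_loop_value(unsigned p, unsigned n, unsigned trips,
                          unsigned *acc) {
    unsigned i;
    for (i = 0; i != trips; ++i) {
        p += n;
        *acc = body(*acc, p);
        p += n;
        *acc = body(*acc, p);
    }
    return p; /* exit value taken straight from the loop */
}

/* Form 2: roughly what exit value rewriting produces (an assumption,
 * not actual compiler output). The loop still needs the running value
 * for its body, but the exit value is recomputed in closed form from
 * the start value, the stride and the trip count. */
unsigned recompute_exit_value(unsigned p, unsigned n, unsigned trips,
                              unsigned *acc) {
    unsigned start = p; /* must stay live until after the loop */
    unsigned i;
    for (i = 0; i != trips; ++i) {
        p += n;
        *acc = body(*acc, p);
        p += n;
        *acc = body(*acc, p);
    }
    return start + 2u * n * trips; /* start plus the rewritten "offset" */
}
```
    </pre>
    <p>Both functions return the same value for every input (unsigned
      arithmetic wraps the same way in each), which mirrors the two
      SCEVs being identical. The only difference is what has to stay
      live across the loop: the second form keeps the start value (or a
      precomputed offset) around in addition to the running value, and
      that is where the extra register pressure would come from.</p>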
    <blockquote type="cite"
cite="mid:MN2PR12MB3840A07C07A3EB4A7A32C88BB8D60@MN2PR12MB3840.namprd12.prod.outlook.com">
      <div class="WordSection1">
        <p class="MsoNormal"><o:p></o:p></p>
        <p class="MsoNormal"><o:p> </o:p></p>
        <div>
          <p class="MsoNormal">--<o:p></o:p></p>
          <p class="MsoNormal">Danila<o:p></o:p></p>
        </div>
        <p class="MsoNormal"><o:p> </o:p></p>
        <div>
          <div style="border:none;border-top:solid #E1E1E1
            1.0pt;padding:3.0pt 0in 0in 0in">
            <p class="MsoNormal"><b>From:</b> Finkel, Hal J.
              [<a class="moz-txt-link-freetext" href="mailto:hfinkel@anl.gov">mailto:hfinkel@anl.gov</a>] <br>
              <b>Sent:</b> Thursday, August 8, 2019 21:24<br>
              <b>To:</b> Danila Malyutin
              <a class="moz-txt-link-rfc2396E" href="mailto:Danila.Malyutin@synopsys.com">&lt;Danila.Malyutin@synopsys.com&gt;</a><br>
              <b>Subject:</b> Re: [llvm-dev] How to best deal with
              undesirable Induction Variable Simplification?<o:p></o:p></p>
          </div>
        </div>
        <p class="MsoNormal"><o:p> </o:p></p>
        <div id="divtagdefaultwrapper">
          <p><span style="font-size:12.0pt;color:black">Hi, Danila,<o:p></o:p></span></p>
          <p><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
          <p><span style="font-size:12.0pt;color:black">Regarding the
              first case, this is certainly a problem that has come up
              before. As I recall, and I believe this is still
              true, LoopStrengthReduce, where we reason about induction
              variables while accounting for register pressure, won't
              currently add new PHIs. People have talked about extending
              LSR to consider adding new PHIs in the past.<o:p></o:p></span></p>
          <p><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
          <p><span style="font-size:12.0pt;color:black">Regarding the
              second case, could you post a more-detailed description? I
              don't quite understand the issue.<o:p></o:p></span></p>
          <p><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
          <p><span style="font-size:12.0pt;color:black"> -Hal<o:p></o:p></span></p>
          <p><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
          <div id="Signature">
            <div>
              <div>
                <p class="MsoNormal"><span
                    style="font-size:10.0pt;color:black">Hal Finkel<br>
                    Lead, Compiler Technology and Programming Languages<br>
                    Leadership Computing Facility<br>
                    Argonne National Laboratory<o:p></o:p></span></p>
              </div>
            </div>
          </div>
          <p class="MsoNormal" style="margin-bottom:12.0pt"><span
              style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
          <div>
            <div class="MsoNormal" style="text-align:center"
              align="center"><span style="font-size:12.0pt;color:black">
                <hr width="98%" size="2" align="center">
              </span></div>
            <div id="divRplyFwdMsg">
              <p class="MsoNormal"><b><span style="color:black">From:</span></b><span
                  style="color:black"> llvm-dev &lt;<a
                    href="mailto:llvm-dev-bounces@lists.llvm.org"
                    moz-do-not-send="true">llvm-dev-bounces@lists.llvm.org</a>&gt;
                  on behalf of Danila Malyutin via llvm-dev &lt;<a
                    href="mailto:llvm-dev@lists.llvm.org"
                    moz-do-not-send="true">llvm-dev@lists.llvm.org</a>&gt;<br>
                  <b>Sent:</b> Thursday, August 8, 2019 12:36 PM<br>
                  <b>To:</b> <a href="mailto:llvm-dev@lists.llvm.org"
                    moz-do-not-send="true">llvm-dev@lists.llvm.org</a>
                  &lt;<a href="mailto:llvm-dev@lists.llvm.org"
                    moz-do-not-send="true">llvm-dev@lists.llvm.org</a>&gt;<br>
                  <b>Subject:</b> [llvm-dev] How to best deal with
                  undesirable Induction Variable Simplification?</span><span
                  style="font-size:12.0pt;color:black">
                  <o:p></o:p></span></p>
              <div>
                <p class="MsoNormal"><span
                    style="font-size:12.0pt;color:black"> <o:p></o:p></span></p>
              </div>
            </div>
            <div>
              <div>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">Hello,<br>
                    Recently I’ve come across two instances where
                    Induction Variable Simplification led to noticeable
                    performance regressions.<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">In one case,
                    the removal of an extra IV led to the inability to
                    reschedule instructions in a tight loop to reduce
                    stalls. In that case, there were enough registers to
                    spare, so using an extra register for an extra
                    induction variable was preferable since it reduced
                    dependencies in the loop.<br>
                    In the second case, there was a big nested loop made
                    even bigger after unswitching. However, the inner
                    loop body was rather simple, of the form:<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">loop {<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">  p+=n;<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">…<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">  p+=n;<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">…<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">}<br>
                    use p.<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black"> <o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">Due to
                    unswitching, there were several such loops, each with
                    a different number of p+=n ops, so when the
                    IndVars pass rewrote all exit values, it added a lot
                    of slightly different offsets to the main loop
                    header that couldn’t fit in the available registers,
                    which led to unnecessary spills/reloads.<br>
                    <br>
                    I am wondering what the usual strategy is for
                    dealing with such “pessimizations”? Is it possible
                    to somehow modify the IndVarSimplify pass to take
                    those issues into account (for example, tell it that
                    adding offset computation + gep is potentially more
                    expensive than simply reusing the last value from the
                    loop), or should it be recovered in some later pass?
                    If so, is there an easy way to revert IV
                    elimination? Has anyone dealt with similar issues
                    before?<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black"> <o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">--<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black">Danila<o:p></o:p></span></p>
                <p class="xmsonormal"><span
                    style="font-size:12.0pt;color:black"> <o:p></o:p></span></p>
              </div>
            </div>
          </div>
        </div>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <pre class="moz-quote-pre" wrap="">_______________________________________________
LLVM Developers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
    </blockquote>
  </body>
</html>