<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 14, 2017 at 11:40 AM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span class="">
<p><br>
</p>
<br>
<div class="m_-868374222628263648moz-cite-prefix">On 03/14/2017 11:58 AM, Michael
Kuperstein wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="auto">I'm still not sure about this, for a few
reasons:
<div dir="auto"><br>
</div>
<div dir="auto">1) I'd like to try to treat epilogue loops the
same way regardless of whether the main loop was vectorized
by hand or automatically. So if someone hand-wrote an
avx-512 16-wide loop, with alias checks, and we decide it's
profitable to vectorize the epilogue loop by 4 and re-use
the checks, it ought to be done the same way. I realize this
may be a pipe-dream, though.</div>
</div>
</div>
</blockquote>
<br></span>
I agree that sounds ideal. Identifying the effective vectorization
factor of the hand-vectorized loop seems fragile. However, if
someone is hand vectorizing then it seems like a small price to add
a pragma to the scalar loop restricting the vectorization factor
(and/or specifying that it is safe to execute). As a result, I'm not
sure how much effort we should make here.<span class=""><br>
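<p>As a rough illustration of the pragma suggestion above, the sketch below caps the vectorization width of a scalar remainder loop that follows a hand-vectorized main loop. <code>#pragma clang loop vectorize_width(4)</code> is an existing Clang loop hint; the function name <code>add_tail</code> and the <code>start</code> parameter are made up for this example. The pragma only restricts the width; by itself it does not assert that the loop is safe to vectorize.</p>
<pre>
```c
/* Hypothetical remainder loop left over after a hand-written 16-wide main
   loop. `start` is the first index the main loop did not process. */
void add_tail(char *a, const char *b, const char *c, int start, int len) {
  /* Clang loop hint: ask the loop vectorizer to use a width of 4 here. */
#pragma clang loop vectorize_width(4)
  for (int i = start; i < len; i++)
    a[i] = (char)(b[i] + c[i]);
}
```
</pre>
<p>Compilers that do not know this pragma typically ignore it (possibly with a warning), so the loop stays correct either way.</p>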
<br></span></div></blockquote><div><br></div><div>I semi-agree with you, which is why I don't think it's a blocker. :-)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><span class="">
<blockquote type="cite">
<div dir="ltr">
<div dir="auto">
<div dir="auto"><br>
</div>
<div dir="auto">2) I'm still somewhat worried about "tiny
loops". As I wrote before, we explicitly refuse to vectorize
loops we know have a trip-count less than 16, because our
profitability heuristic for such loops is probably bad. IIUC
the only reason we don't bail due to the threshold is
because we use the same loop for "failed min iters check"
and "failed alias check". So, because it's reachable through
the alias-check path, the max trip count isn't actually
known, even though the typical trip count is probably
small. </div>
<div dir="auto">It's true that you currently don't try to
vectorize the epilogue if the original VF is below 16, but
this is a somewhat different condition. <br>
</div>
</div>
</div>
</blockquote>
<br></span>
I think that the reason we have that limit, however, is that we
don't model the costs of the checks. Not that the cost model is
otherwise too inaccurate for low-trip-count loops. If we modeled the
costs of the checks, then I don't think this would be a problem.<span class=""><br>
<br></span></div></blockquote><div><br></div><div>I don't think it's just the alias checks. There's also the min-iteration check, broadcasts that get hoisted out of the loop, etc.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><span class="">
<blockquote type="cite">
<div dir="ltr">
<div dir="auto">
<div dir="auto"><br>
</div>
<div dir="auto">3) Technically speaking, constructing a new
InnerLoopVectorizer to vectorize this one loop sounds weird.
We already have a worklist in the vectorizer that's
currently running.</div>
</div>
</div>
</blockquote>
<br></span>
I agree, although we do want to reuse the cost and legality analysis
(which I think is a worthwhile engineering decision because that
analysis involves SCEV, AA, and TTI, all of which can get
expensive). If we can do that and also just add the new loop to the
work queue, that certainly might be cleaner.<span class="HOEnZb"><font color="#888888"><br>
<br>
-Hal</font></span><div><div class="h5"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="auto">
<div dir="auto"><br>
</div>
<div dir="auto">I don't think (1) is a blocker, and (3) should
be easy to fix, but I'm not sure whether the way this is
going to handle (2) is sufficient. If I'm the only one that
this bothers, I won't stand in the way, but I'd like to at
least make sure we've fully considered this.</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mar 14, 2017 06:00, "Nema,
Ashutosh" <<a href="mailto:Ashutosh.Nema@amd.com" target="_blank">Ashutosh.Nema@amd.com</a>>
wrote:<br type="attribution">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div link="blue" vlink="purple" lang="EN-US">
<div class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712WordSection1">
<p class="MsoNormal">Summarizing the discussion on the
implementation approaches.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">We discussed two approaches.
The first runs ‘InnerLoopVectorizer’ again on the
epilog loop immediately after vectorizing the
original loop, within the same vectorization pass.
The second re-runs the vectorization
pass and limits the vectorization factor of the epilog
loop through metadata.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal"><Approach-2></p>
<p class="MsoNormal">Challenges with re-running the
vectorizer pass:</p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph"><span>1)<span style="font:7.0pt "Times New Roman"">
</span></span>Reusing alias check result: </p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph">When
the vectorizer pass runs again, it sees the epilog loop
as a new loop and may generate a new alias check; the cost of this
new alias check may cancel out the gains of epilog
vectorization.</p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph">We
should reuse the already computed alias check result
instead of recomputing it.</p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph"><span>2)<span style="font:7.0pt "Times New Roman"">
</span></span>Rerun the vectorizer and hoist the
new alias check:</p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph">It’s
not possible to hoist the new alias check, because it is not fully
redundant (it is not dominated by other checks) and it does not
execute on all paths.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal"><img id="m_-868374222628263648m_-8565914691181255471m_2339803851957541712Picture_x0020_1" src="cid:part2.4C3D0A89.86BB91D8@anl.gov" height="156" width="567"></p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">NOTE: We cannot move the alias
check earlier, as it is expensive compared to the other checks.</p>
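<p class="MsoNormal">To make the domination problem above concrete, here is a toy C model of the control flow when the vectorizer is simply run a second time. It is only a sketch under simplifying assumptions: <code>rerun_layout</code>, <code>alias_check</code> and the counter are hypothetical names, the loops themselves are elided, and this is not the IR that either approach emits.</p>
<pre>
```c
/* Counts executions of the (expensive) runtime alias check. */
static int alias_checks = 0;

/* Stand-in for a runtime alias check; the caller supplies the answer. */
static int alias_check(int ptrs_disjoint) {
  alias_checks++;
  return ptrs_disjoint;
}

/* Returns which loop handles the tail: 4 = epilog vector loop, 1 = scalar. */
int rerun_layout(int len, int ptrs_disjoint) {
  int done = 0;
  /* First vectorizer run: min-iteration check, then alias check. */
  if (len >= 16 && alias_check(ptrs_disjoint))
    done = len - len % 16;            /* main vector loop, VF = 16 */
  /* The remainder is reached whether or not the first alias check ran,
     so the second run's check is not dominated by the first one. */
  if (len - done >= 4 && alias_check(ptrs_disjoint))
    return 4;                         /* epilog vector loop, VF = 4 */
  return 1;                           /* scalar loop */
}
```
</pre>
<p class="MsoNormal">In this model a call with <code>len = 23</code> runs the check twice, while a call with <code>len = 8</code> skips the first check entirely, so the second check is the only one on that path and cannot be removed as redundant.</p>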
<p class="MsoNormal"> </p>
<p class="MsoNormal"><Approach-1></p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph"><span>1)<span style="font:7.0pt "Times New Roman"">
</span></span>Current patch depends on the
existing functionality of LoopVectorizer: it uses
‘InnerLoopVectorizer’ again to vectorize the epilog
loop. Because this happens in the same vectorization pass,
we have the flexibility to reuse the already computed alias
check result and to limit the vectorization factor for
the epilog loop. </p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph"><span>2)<span style="font:7.0pt "Times New Roman"">
</span></span>It does not generate the blocks for
the new block layout explicitly; rather, it depends on
‘InnerLoopVectorizer::createEmptyLoop’ to
generate the new block layout. The new block layout gets
generated automatically by calling
‘InnerLoopVectorizer::vectorize’ again.</p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph"><span>3)<span style="font:7.0pt "Times New Roman"">
</span></span>Block layout description with epilog
loop vectorization is available at</p>
<p class="m_-868374222628263648m_-8565914691181255471m_2339803851957541712MsoListParagraph"><a href="https://reviews.llvm.org/file/data/fxg5vx3capyj257rrn5j/PHID-FILE-x6thnbf6ub55ep5yhalu/LayoutDescription.png" target="_blank">https://reviews.llvm.org/file/<wbr>data/fxg5vx3capyj257rrn5j/PHID<wbr>-FILE-x6thnbf6ub55ep5yhalu/Lay<wbr>outDescription.png</a></p>
<p class="MsoNormal"> </p>
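<p class="MsoNormal">The layout described for Approach-1 can be pictured with a scalar C model like the one below. This is a hedged sketch, not the code the patch generates: <code>foo_layout</code>, <code>add_chunk</code>, <code>no_overlap</code> and the counter are hypothetical names, the widths 16 and 4 are taken from the examples in this thread, and the minimum-iteration checks are folded into one condition for brevity.</p>
<pre>
```c
/* Counts how many times the runtime overlap check executes. */
static int overlap_checks = 0;

/* Simplified stand-in for the vectorizer's runtime alias check.
   Assumes both pointers point into the same allocation, so that
   comparing them is well defined in C. */
static int no_overlap(const char *dst, const char *src, int len) {
  overlap_checks++;
  return dst + len <= src || src + len <= dst;
}

/* Processes `width` consecutive elements; models one vector iteration. */
static void add_chunk(char *a, const char *b, const char *c, int i, int width) {
  for (int k = 0; k < width; k++)
    a[i + k] = (char)(b[i + k] + c[i + k]);
}

void foo_layout(char *a, const char *b, const char *c, int len) {
  int i = 0;
  /* The alias check runs once; its result guards both vector loops. */
  int safe = len >= 4 && no_overlap(a, b, len) && no_overlap(a, c, len);
  if (safe) {
    for (; i + 16 <= len; i += 16)  /* main vector loop, VF = 16 */
      add_chunk(a, b, c, i, 16);
    for (; i + 4 <= len; i += 4)    /* epilog vector loop, VF = 4 */
      add_chunk(a, b, c, i, 4);
  }
  for (; i < len; i++)              /* scalar remainder */
    a[i] = (char)(b[i] + c[i]);
}
```
</pre>
<p class="MsoNormal">With <code>len = 23</code> the model runs one 16-wide iteration, one 4-wide iteration and three scalar iterations, and each pointer pair is checked only once.</p>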
<p class="MsoNormal">Approach-1 looks feasible; please
comment if you have any objections.</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal">Regards,</p>
<p class="MsoNormal">Ashutosh</p>
<p class="MsoNormal"> </p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<div>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-left:.5in"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">
Nema, Ashutosh
<br>
<b>Sent:</b> Wednesday, March 1, 2017 10:42 AM<br>
<b>To:</b> 'Daniel Berlin' <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>><br>
<b>Cc:</b> <a href="mailto:anemet@apple.com" target="_blank">anemet@apple.com</a>; Hal
Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>;
Zaks, Ayal <<a href="mailto:ayal.zaks@intel.com" target="_blank">ayal.zaks@intel.com</a>>;
Renato Golin <<a href="mailto:renato.golin@linaro.org" target="_blank">renato.golin@linaro.org</a>>;
<a href="mailto:mkuper@google.com" target="_blank">mkuper@google.com</a>; Mehdi
Amini <<a href="mailto:mehdi.amini@apple.com" target="_blank">mehdi.amini@apple.com</a>>;
llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> RE: [llvm-dev] [Proposal][RFC]
Epilog loop vectorization</span></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:.5in"> </p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Sorry
I misunderstood; gvn/newgvn/gvnhoist cannot help
here, as these checks are not dominated by other checks
and do not execute on all paths.</span></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Regards,</span></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Ashutosh</span></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<p class="MsoNormal" style="margin-left:1.0in"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">
Daniel Berlin [<a href="mailto:dberlin@dberlin.org" target="_blank">mailto:dberlin@dberlin.org</a>]
<br>
<b>Sent:</b> Tuesday, February 28, 2017 6:58 PM<br>
<b>To:</b> Nema, Ashutosh <<a href="mailto:Ashutosh.Nema@amd.com" target="_blank">Ashutosh.Nema@amd.com</a>><br>
<b>Cc:</b> <a href="mailto:anemet@apple.com" target="_blank">anemet@apple.com</a>;
Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>;
Zaks, Ayal <<a href="mailto:ayal.zaks@intel.com" target="_blank">ayal.zaks@intel.com</a>>;
Renato Golin <<a href="mailto:renato.golin@linaro.org" target="_blank">renato.golin@linaro.org</a>>;
<a href="mailto:mkuper@google.com" target="_blank">mkuper@google.com</a>;
Mehdi Amini <<a href="mailto:mehdi.amini@apple.com" target="_blank">mehdi.amini@apple.com</a>>;
llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev] [Proposal][RFC]
Epilog loop vectorization</span></p>
<p class="MsoNormal" style="margin-left:1.0in"> </p>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Hoisting
or removing?<br>
Neither pass does hoisting, you'd need gvnhoist
for that.</p>
<div>
<p class="MsoNormal" style="margin-left:1.0in"> </p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Even
then:<br>
Staring at your example, none of the checks are
fully redundant (IE they are not dominated by
other checks) or execute on all paths.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"> </p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Thus,
hoisting them would be purely speculative code
motion, which none of our passes do.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"> </p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">If
you would like these sets of checks to be
removed, you would need to place them in a place
that they execute unconditionally.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"> </p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Otherwise,
this is not a standard code hoisting/removal
transform.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"> </p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">The
only redundancy i can see here at all is the
repeated getelementptr computation.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">If
you move it to the preheader, it will be
eliminated.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Otherwise,
none of the checks are redundant.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"><br>
What would you hope to happen in this case?</p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"> </p>
<div>
<p class="MsoNormal" style="margin-left:1.0in">On
Tue, Feb 28, 2017 at 5:09 AM, Nema, Ashutosh
<<a href="mailto:Ashutosh.Nema@amd.com" target="_blank">Ashutosh.Nema@amd.com</a>>
wrote:</p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">I
have tried running both gvn and newgvn,
but neither helped in hoisting the
alias checks:</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Please
check, maybe I have missed something.</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><TestCase></span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">void
foo (char *A, char *B, char *C, int len)
{</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">
int i = 0;</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">
for (i=0 ; i< len; i++)</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">
A[i] = B[i] + C[i];</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">}</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><Command></span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">
$ opt -O3 -gvn test.ll -o test.opt.ll</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">
$ opt -O3 -newgvn test.ll -o test.opt.ll</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">“test.ll”
is attached. It has already been vectorized
using the approach of running the vectorizer twice,
with the remainder loop annotated with
metadata to limit the vectorization
factor of the epilog vector loop.</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Regards,</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Ashutosh</span></p>
<p class="MsoNormal" style="margin-left:1.0in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"> </span></p>
<div>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-left:1.5in">
<b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">
<a href="mailto:anemet@apple.com" target="_blank">anemet@apple.com</a>
[mailto:<a href="mailto:anemet@apple.com" target="_blank">anemet@apple.com</a>]
<br>
<b>Sent:</b> Tuesday, February 28,
2017 1:33 AM<br>
<b>To:</b> Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>><br>
<b>Cc:</b> Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>>;
Nema, Ashutosh <<a href="mailto:Ashutosh.Nema@amd.com" target="_blank">Ashutosh.Nema@amd.com</a>>;
Zaks, Ayal <<a href="mailto:ayal.zaks@intel.com" target="_blank">ayal.zaks@intel.com</a>>;
Renato Golin <<a href="mailto:renato.golin@linaro.org" target="_blank">renato.golin@linaro.org</a>>;
<a href="mailto:mkuper@google.com" target="_blank">mkuper@google.com</a>;
Mehdi Amini <<a href="mailto:mehdi.amini@apple.com" target="_blank">mehdi.amini@apple.com</a>>;
llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev]
[Proposal][RFC] Epilog loop
vectorization</span></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="margin-left:1.5in">
On Feb 27, 2017, at 12:01 PM, Hal
Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>
wrote:</p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
On 02/27/2017 01:47 PM,
Daniel Berlin wrote:</p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
On Mon, Feb 27, 2017
at 11:29 AM, Adam
Nemet <<a href="mailto:anemet@apple.com" target="_blank">anemet@apple.com</a>>
wrote:</p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<div>
<div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="margin-left:1.5in">
On Feb 27,
2017, at 10:11
AM, Hal Finkel
<<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>
wrote:</p>
</div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
On 02/27/2017
11:47 AM, Adam
Nemet wrote:</p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="margin-left:1.5in">
On Feb 27,
2017, at 9:39
AM, Daniel
Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>>
wrote:</p>
</div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
On Mon, Feb
27, 2017 at
9:29 AM, Adam
Nemet <<a href="mailto:anemet@apple.com" target="_blank">anemet@apple.com</a>>
wrote:</p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="margin-left:1.5in">
On Feb 27,
2017, at 7:27
AM, Hal Finkel
<<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>
wrote:</p>
</div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in;background:white">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif"><br>
On 02/27/2017
06:29 AM,
Nema, Ashutosh
wrote:</span></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt;font-variant-caps:normal;text-align:start;word-spacing:0px">
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in;background:white">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif">Thanks
for looking
into this.</span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in;background:white">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif"> </span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in;background:white">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif">1)
Issues with
rerunning the
vectorizer:</span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in;background:white">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif">Vectorizer
might generate
redundant
alias checks
while
vectorizing
the epilog loop.</span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in;background:white">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif">Redundant
alias checks
are expensive;
we would like to
reuse the
results of
already
computed alias
checks.</span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in;background:white">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif">With
metadata we
can limit the
width of the
epilog loop,
but I am not sure
about reusing
the alias check
result.</span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in;background:white">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif">Any
thoughts on
rerunning the
vectorizer
while reusing
the alias
check result?</span></p>
</div>
</div>
</blockquote>
<p class="MsoNormal" style="margin-left:1.5in">
<span style="font-size:7.5pt;font-family:"Helvetica",sans-serif"><br>
<span style="background:white">One
way of looking
at this is:
Reusing the
alias-check
result is
really just a
conditional
propagation
problem; if we
don't already
have an
optimization
that can
combine these
after the
fact, then we
should.</span></span></p>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
+Danny</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
Isn’t Extended
SSA supposed
to help with
this?</p>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
Yes, it
already solves
this with no
issue. GVN
probably does
too.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
even if you
have</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
if (a == b)</p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
if (a == c)</p>
</div>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
if (a == d)</p>
</div>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
if (a == e)</p>
</div>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
if (a == g)</p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
and we can
prove a ... g
equivalent,
newgvn will
eliminate them
all and set
all the
branches true.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
If you need a
simpler clean
up pass, we
could run it
on sub-graphs.</p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
Yes we
probably don’t
want to run a
full GVN after
the
“loop-scheduling”
passes.</p>
</div>
</div>
</blockquote>
<p class="MsoNormal" style="margin-left:1.5in">
<br>
FWIW, we
could, just
without the
memory-dependence
analysis
enabled (i.e.
set the
NoLoads
constructor
parameter to
true). GVN is
pretty fast in
that mode.</p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
OK. Another
data point is
that I’ve seen
cases in the
past where the
alias checks
required for
the loop
passes could
enable GVN to
remove
redundant
loads/stores.
Currently we
can only pick
these up with
LTO when GVN
is rerun.</p>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
This is just GVN
brokenness, newgvn
should not have this
problem.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
If it does, i'd love
to see it.</p>
</div>
</div>
</div>
</div>
</blockquote>
<p class="MsoNormal" style="margin-left:1.5in">
<br>
I thought that the problem is
that we just don't run GVN
after that point in the
pipeline.</p>
</div>
</div>
</div>
</div>
</blockquote>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
Yeah, that is the problem but I
think Danny misunderstood what I
was trying to say.</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
This was a datapoint to possibly
rerun GVN with memory-awareness.</p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt;margin-left:1.5in">
</p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt;margin-left:1.5in">
<br>
-Hal</p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
(I'm working on the
last few parts of
turning it on by
default, but it
requires a new
getModRefInfo
interface to be able
to get the last few
testcases)</p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
</div>
</div>
</div>
</blockquote>
<p class="MsoNormal" style="margin-bottom:12.0pt;margin-left:1.5in">
</p>
<pre style="margin-left:1.5in">-- </pre>
<pre style="margin-left:1.5in">Hal Finkel</pre>
<pre style="margin-left:1.5in">Lead, Compiler Technology and Programming Languages</pre>
<pre style="margin-left:1.5in">Leadership Computing Facility</pre>
<pre style="margin-left:1.5in">Argonne National Laboratory</pre>
</div>
</div>
</blockquote>
</div>
</div>
</div>
<p class="MsoNormal" style="margin-left:1.5in">
</p>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal" style="margin-left:1.0in"> </p>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<br>
<pre class="m_-868374222628263648moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</div></div></div>
</blockquote></div><br></div></div>