<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p><br>
</p>
<div class="moz-cite-prefix">On 03/14/2017 08:00 AM, Nema, Ashutosh
wrote:<br>
</div>
<blockquote
cite="mid:CY4PR12MB17993294734263A4A7EB2C2AFB240@CY4PR12MB1799.namprd12.prod.outlook.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered
medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]-->
<style><!--
/* Font Definitions */
@font-face
{font-family:Helvetica;
panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:#1F497D;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:395780924;
mso-list-type:hybrid;
mso-list-template-ids:1076887890 67698705 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
{mso-level-text:"%1\)";
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1
{mso-list-id:711031633;
mso-list-type:hybrid;
mso-list-template-ids:-1063088596 67698705 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l1:level1
{mso-level-text:"%1\)";
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
<div class="WordSection1">
<p class="MsoNormal">Summarizing the discussion on the
implementation approaches.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We discussed two approaches. The first
runs ‘InnerLoopVectorizer’ again on the epilog loop
immediately after vectorizing the original loop, within the
same vectorization pass. The second re-runs the
vectorization pass and limits the vectorization factor of the epilog
loop through metadata.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">&lt;Approach-2&gt;<o:p></o:p></p>
<p class="MsoNormal">Challenges with re-running the vectorizer
pass:<o:p></o:p></p>
<p class="MsoListParagraph"
style="text-indent:-.25in;mso-list:l0 level1 lfo2"><!--[if !supportLists]--><span
style="mso-list:Ignore">1)<span style="font:7.0pt
"Times New Roman"">
</span></span><!--[endif]-->Reusing alias check result: <o:p></o:p></p>
<p class="MsoListParagraph">When the vectorizer pass runs again, it
treats the epilog loop as a new loop and may generate a new alias
check; the cost of this extra check may outweigh the gains of epilog
vectorization.<o:p></o:p></p>
<p class="MsoListParagraph">We should reuse the already computed
alias check result instead of recomputing it.<o:p></o:p></p>
<p class="MsoListParagraph"
style="text-indent:-.25in;mso-list:l0 level1 lfo2"><!--[if !supportLists]--><span
style="mso-list:Ignore">2)<span style="font:7.0pt
"Times New Roman"">
</span></span><!--[endif]-->Rerun the vectorizer and hoist
the new alias check:<o:p></o:p></p>
<p class="MsoListParagraph">It is not possible to hoist the alias
checks because they are not fully redundant (not dominated by other
checks) and do not execute on all paths.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><img id="Picture_x0020_1"
src="cid:part1.9399286D.63210CB7@anl.gov" height="156"
width="567"><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">NOTE: We cannot move the alias check earlier, as it is
expensive compared to the other checks.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
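<p class="MsoNormal">To make the two challenges above concrete, here is a minimal C sketch of the control flow that results when the vectorizer runs twice. It is only an illustration, not output from the patch: the vectorization factors (MAIN_VF, EPILOG_VF), the helpers no_alias and chunk, and the counter alias_checks_run are all hypothetical names introduced here.</p>

```c
#include <assert.h>
#include <stdint.h>

enum { MAIN_VF = 16, EPILOG_VF = 4 }; /* hypothetical vectorization factors */

static int alias_checks_run = 0; /* counts runtime alias-check evaluations */

/* Returns 1 when the n-byte range at a overlaps neither b's nor c's range. */
static int no_alias(const char *a, const char *b, const char *c, int n) {
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b, pc = (uintptr_t)c;
    uintptr_t len = (uintptr_t)n;
    alias_checks_run++;
    return (pa + len <= pb || pb + len <= pa) &&
           (pa + len <= pc || pc + len <= pa);
}

/* Stand-in for one vector iteration over vf consecutive elements. */
static void chunk(char *A, const char *B, const char *C, int i, int vf) {
    for (int k = 0; k < vf; k++)
        A[i + k] = (char)(B[i + k] + C[i + k]);
}

void foo_vectorized_twice(char *A, char *B, char *C, int len) {
    assert(len >= 0);
    int i = 0;
    /* Checks emitted by the first vectorizer run. */
    if (len >= MAIN_VF && no_alias(A, B, C, len))
        for (; i + MAIN_VF <= len; i += MAIN_VF)
            chunk(A, B, C, i, MAIN_VF);
    /* Checks emitted by the second run, which sees the remainder as a new loop. */
    if (len - i >= EPILOG_VF && no_alias(A + i, B + i, C + i, len - i))
        for (; i + EPILOG_VF <= len; i += EPILOG_VF)
            chunk(A, B, C, i, EPILOG_VF);
    for (; i < len; i++) /* scalar remainder */
        A[i] = (char)(B[i] + C[i]);
}
```

<p class="MsoNormal">With len = 40 both alias checks run; with len = 8 the first one is skipped, so the second check is not dominated by the first and cannot simply be dropped or hoisted.</p>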
<p class="MsoNormal">&lt;Approach-1&gt;<o:p></o:p></p>
<p class="MsoListParagraph"
style="text-indent:-.25in;mso-list:l1 level1 lfo1"><!--[if !supportLists]--><span
style="mso-list:Ignore">1)<span style="font:7.0pt
"Times New Roman"">
</span></span><!--[endif]-->The current patch builds on the
existing functionality of LoopVectorizer: it uses
‘InnerLoopVectorizer’ again to vectorize the epilog loop. Because
this happens in the same vectorization pass, we have the flexibility
to reuse the already computed alias check result and to limit the
vectorization factor for the epilog loop. <o:p></o:p></p>
<p class="MsoListParagraph"
style="text-indent:-.25in;mso-list:l1 level1 lfo1"><!--[if !supportLists]--><span
style="mso-list:Ignore">2)<span style="font:7.0pt
"Times New Roman"">
</span></span><!--[endif]-->It does not generate the blocks
for the new block layout explicitly; rather, it depends on
‘InnerLoopVectorizer::createEmptyLoop’ to generate the new block
layout. The new block layout is generated automatically by
calling ‘InnerLoopVectorizer::vectorize’ again.<o:p></o:p></p>
<p class="MsoListParagraph"
style="text-indent:-.25in;mso-list:l1 level1 lfo1"><!--[if !supportLists]--><span
style="mso-list:Ignore">3)<span style="font:7.0pt
"Times New Roman"">
</span></span><!--[endif]-->A description of the block layout with
epilog loop vectorization is available at:<o:p></o:p></p>
<p class="MsoListParagraph"><a moz-do-not-send="true"
href="https://reviews.llvm.org/file/data/fxg5vx3capyj257rrn5j/PHID-FILE-x6thnbf6ub55ep5yhalu/LayoutDescription.png">https://reviews.llvm.org/file/data/fxg5vx3capyj257rrn5j/PHID-FILE-x6thnbf6ub55ep5yhalu/LayoutDescription.png</a><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
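<p class="MsoNormal">As a rough illustration of the idea behind Approach-1, the sketch below evaluates the alias check once and lets its result guard both the main vector loop and the epilog vector loop. This is an assumption-laden C model, not the layout produced by the patch: MAIN_VF, EPILOG_VF, no_alias, and chunk are hypothetical names, and placing the cheap trip-count check before the alias check is a choice made here.</p>

```c
#include <assert.h>
#include <stdint.h>

enum { MAIN_VF = 16, EPILOG_VF = 4 }; /* hypothetical vectorization factors */

/* Returns 1 when the n-byte range at a overlaps neither b's nor c's range. */
static int no_alias(const char *a, const char *b, const char *c, int n) {
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b, pc = (uintptr_t)c;
    uintptr_t len = (uintptr_t)n;
    return (pa + len <= pb || pb + len <= pa) &&
           (pa + len <= pc || pc + len <= pa);
}

/* Stand-in for one vector iteration over vf consecutive elements. */
static void chunk(char *A, const char *B, const char *C, int i, int vf) {
    for (int k = 0; k < vf; k++)
        A[i + k] = (char)(B[i + k] + C[i + k]);
}

void foo_epilog_vectorized(char *A, char *B, char *C, int len) {
    assert(len >= 0);
    int i = 0;
    /* Cheap trip-count check first, then one alias check shared by both loops. */
    if (len >= EPILOG_VF && no_alias(A, B, C, len)) {
        for (; i + MAIN_VF <= len; i += MAIN_VF)     /* main vector loop */
            chunk(A, B, C, i, MAIN_VF);
        for (; i + EPILOG_VF <= len; i += EPILOG_VF) /* epilog vector loop */
            chunk(A, B, C, i, EPILOG_VF);
    }
    for (; i < len; i++)                             /* scalar loop */
        A[i] = (char)(B[i] + C[i]);
}
```

<p class="MsoNormal">When the alias check fails, control falls straight through to the scalar loop without any further check.</p>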
<p class="MsoNormal">Approach-1 looks feasible; please comment
if there are any objections.</p>
</div>
</blockquote>
<br>
I think this is reasonable. One thing: In the proposed block
layout, if the alias check fails, we jump to the "Min Iter Check
2". From there we re-check the alias-check result (which will be
false again), and then jump to the scalar loop. This is one more
branch than necessary in the case where the alias check fails. If
the alias check fails, we should jump directly to the scalar loop.<br>
<br>
Thanks again,<br>
Hal<br>
<br>
<blockquote
cite="mid:CY4PR12MB17993294734263A4A7EB2C2AFB240@CY4PR12MB1799.namprd12.prod.outlook.com"
type="cite">
<div class="WordSection1">
<p class="MsoNormal"><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Regards,<o:p></o:p></p>
<p class="MsoNormal">Ashutosh<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-left:.5in"><b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">
Nema, Ashutosh
<br>
<b>Sent:</b> Wednesday, March 1, 2017 10:42 AM<br>
<b>To:</b> 'Daniel Berlin' <a class="moz-txt-link-rfc2396E" href="mailto:dberlin@dberlin.org"><dberlin@dberlin.org></a><br>
<b>Cc:</b> <a class="moz-txt-link-abbreviated" href="mailto:anemet@apple.com">anemet@apple.com</a>; Hal Finkel
<a class="moz-txt-link-rfc2396E" href="mailto:hfinkel@anl.gov"><hfinkel@anl.gov></a>; Zaks, Ayal
<a class="moz-txt-link-rfc2396E" href="mailto:ayal.zaks@intel.com"><ayal.zaks@intel.com></a>; Renato Golin
<a class="moz-txt-link-rfc2396E" href="mailto:renato.golin@linaro.org"><renato.golin@linaro.org></a>; <a class="moz-txt-link-abbreviated" href="mailto:mkuper@google.com">mkuper@google.com</a>;
Mehdi Amini <a class="moz-txt-link-rfc2396E" href="mailto:mehdi.amini@apple.com"><mehdi.amini@apple.com></a>; llvm-dev
<a class="moz-txt-link-rfc2396E" href="mailto:llvm-dev@lists.llvm.org"><llvm-dev@lists.llvm.org></a><br>
<b>Subject:</b> RE: [llvm-dev] [Proposal][RFC] Epilog
loop vectorization<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Sorry
I misunderstood; gvn/newgvn/gvnhoist cannot help here, as
these checks are neither dominated by other checks nor executed on all paths.<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Regards,<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Ashutosh<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal" style="margin-left:1.0in"><b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">
Daniel Berlin [<a moz-do-not-send="true"
href="mailto:dberlin@dberlin.org">mailto:dberlin@dberlin.org</a>]
<br>
<b>Sent:</b> Tuesday, February 28, 2017 6:58 PM<br>
<b>To:</b> Nema, Ashutosh <<a moz-do-not-send="true"
href="mailto:Ashutosh.Nema@amd.com">Ashutosh.Nema@amd.com</a>><br>
<b>Cc:</b> <a moz-do-not-send="true"
href="mailto:anemet@apple.com">anemet@apple.com</a>; Hal
Finkel <<a moz-do-not-send="true"
href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>>;
Zaks, Ayal <<a moz-do-not-send="true"
href="mailto:ayal.zaks@intel.com">ayal.zaks@intel.com</a>>;
Renato Golin <<a moz-do-not-send="true"
href="mailto:renato.golin@linaro.org">renato.golin@linaro.org</a>>;
<a moz-do-not-send="true" href="mailto:mkuper@google.com">mkuper@google.com</a>;
Mehdi Amini <<a moz-do-not-send="true"
href="mailto:mehdi.amini@apple.com">mehdi.amini@apple.com</a>>;
llvm-dev <<a moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev] [Proposal][RFC] Epilog loop
vectorization<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Hoisting or
removing?<br>
Neither pass does hoisting, you'd need gvnhoist for that.<o:p></o:p></p>
<div>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Even then:<br>
Staring at your example, none of the checks are fully
redundant (IE they are not dominated by other checks) or
execute on all paths.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Thus,
hoisting them would be purely speculative code motion,
which none of our passes do.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">If you would
like these sets of checks to be removed, you would need to
place them in a place that they execute unconditionally.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Otherwise,
this is not a standard code hoisting/removal transform.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">The only
redundancy i can see here at all is the repeated
getelementptr computation.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">If you move
it to the preheader, it will be eliminated.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in">Otherwise,
none of the checks are redundant.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"><br>
What would you hope to happen in this case?<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:1.0in">On Tue, Feb
28, 2017 at 5:09 AM, Nema, Ashutosh <<a
moz-do-not-send="true"
href="mailto:Ashutosh.Nema@amd.com" target="_blank">Ashutosh.Nema@amd.com</a>>
wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC
1.0pt;padding:0in 0in 0in
6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">I
have tried running both gvn and newgvn, but neither
helped in hoisting the alias checks:</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Please
check, maybe I have missed something.</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">&lt;TestCase&gt;</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">void
foo (char *A, char *B, char *C, int len) {</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">
int i = 0;</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">
for (i = 0; i &lt; len; i++)</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">
A[i] = B[i] + C[i];</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">}</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">&lt;Command&gt;</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">
$ opt -O3 -gvn test.ll -o test.opt.ll</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">
$ opt -O3 -newgvn test.ll -o test.opt.ll</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">“test.ll”
is attached. It was already vectorized using the
approach of running the vectorizer twice, annotating the
remainder loop with metadata to limit the
vectorization factor for the epilog vector loop.</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Regards,</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Ashutosh</span><o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.0in"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"><b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">
<a moz-do-not-send="true"
href="mailto:anemet@apple.com"
target="_blank">anemet@apple.com</a>
[mailto:<a moz-do-not-send="true"
href="mailto:anemet@apple.com"
target="_blank">anemet@apple.com</a>]
<br>
<b>Sent:</b> Tuesday, February 28, 2017 1:33
AM<br>
<b>To:</b> Hal Finkel <<a
moz-do-not-send="true"
href="mailto:hfinkel@anl.gov"
target="_blank">hfinkel@anl.gov</a>><br>
<b>Cc:</b> Daniel Berlin <<a
moz-do-not-send="true"
href="mailto:dberlin@dberlin.org"
target="_blank">dberlin@dberlin.org</a>>;
Nema, Ashutosh <<a moz-do-not-send="true"
href="mailto:Ashutosh.Nema@amd.com"
target="_blank">Ashutosh.Nema@amd.com</a>>;
Zaks, Ayal <<a moz-do-not-send="true"
href="mailto:ayal.zaks@intel.com"
target="_blank">ayal.zaks@intel.com</a>>;
Renato Golin <<a moz-do-not-send="true"
href="mailto:renato.golin@linaro.org"
target="_blank">renato.golin@linaro.org</a>>;
<a moz-do-not-send="true"
href="mailto:mkuper@google.com"
target="_blank">mkuper@google.com</a>; Mehdi
Amini <<a moz-do-not-send="true"
href="mailto:mehdi.amini@apple.com"
target="_blank">mehdi.amini@apple.com</a>>;
llvm-dev <<a moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org"
target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> Re: [llvm-dev] [Proposal][RFC]
Epilog loop vectorization</span><o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
<div>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">On
Feb 27, 2017, at 12:01 PM, Hal Finkel <<a
moz-do-not-send="true"
href="mailto:hfinkel@anl.gov"
target="_blank">hfinkel@anl.gov</a>>
wrote:<o:p></o:p></p>
</div>
<div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">On
02/27/2017 01:47 PM, Daniel Berlin
wrote:<o:p></o:p></p>
</div>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">On
Mon, Feb 27, 2017 at 11:29 AM,
Adam Nemet <<a
moz-do-not-send="true"
href="mailto:anemet@apple.com"
target="_blank">anemet@apple.com</a>>
wrote:<o:p></o:p></p>
<blockquote
style="border:none;border-left:solid
#CCCCCC 1.0pt;padding:0in 0in
0in
6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
<div>
<div>
<div>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
On Feb 27, 2017,
at 10:11 AM, Hal
Finkel <<a
moz-do-not-send="true"
href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>
wrote:<o:p></o:p></p>
</div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
<div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
On 02/27/2017
11:47 AM, Adam
Nemet wrote:<o:p></o:p></p>
</div>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
<div>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
On Feb 27,
2017, at 9:39
AM, Daniel
Berlin <<a
moz-do-not-send="true" href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>>
wrote:<o:p></o:p></p>
</div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
<div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
On Mon, Feb
27, 2017 at
9:29 AM, Adam
Nemet <<a
moz-do-not-send="true"
href="mailto:anemet@apple.com" target="_blank">anemet@apple.com</a>>
wrote:<o:p></o:p></p>
<blockquote
style="border:none;border-left:solid
#CCCCCC
1.0pt;padding:0in
0in 0in
6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
<div>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
On Feb 27,
2017, at 7:27
AM, Hal Finkel
<<a
moz-do-not-send="true"
href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>
wrote:<o:p></o:p></p>
</div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
<div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in;background:white">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif"><br>
On 02/27/2017
06:29 AM,
Nema, Ashutosh
wrote:</span><o:p></o:p></p>
</div>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt;font-variant-caps:normal;text-align:start;word-spacing:0px">
<div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in;background:white">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif">Thanks
for looking
into this.</span><o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in;background:white">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif"> </span><o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in;background:white">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif">1)
Issues with re
running
vectorizer:</span><o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in;background:white">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif">Vectorizer
might generate
redundant
alias checks
while
vectorizing
epilog loop.</span><o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in;background:white">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif">Redundant
alias checks
are expensive;
we would like to
reuse the
results of
the already
computed alias
checks.</span><o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in;background:white">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif">With
metadata we
can limit the
width of the
epilog loop,
but I am not sure
about reusing
the alias check
result.</span><o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in;background:white">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif">Any
thoughts on
rerunning the
vectorizer
while reusing
the alias
check result?</span><o:p></o:p></p>
</div>
</div>
</blockquote>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<span
style="font-size:7.5pt;font-family:"Helvetica",sans-serif"><br>
<span
style="background:white">One
way of looking
at this is:
Reusing the
alias-check
result is
really just a
conditional
propagation
problem; if we
don't already
have an
optimization
that can
combine these
after the
fact, then we
should.</span></span><o:p></o:p></p>
</div>
</blockquote>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
+Danny<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
Isn’t Extended
SSA supposed
to help with
this?<o:p></o:p></p>
</div>
</div>
</div>
</blockquote>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
Yes, it will
solve this
with no issue
already. GVN
probably does
already too.<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
Even if you
have<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
if (a == b)<o:p></o:p></p>
</div>
<div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
if (a == c)<o:p></o:p></p>
</div>
</div>
<div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
if (a == d)<o:p></o:p></p>
</div>
</div>
<div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
if (a == e)<o:p></o:p></p>
</div>
</div>
<div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
if (a == g)<o:p></o:p></p>
</div>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
and we can
prove a ... g
equivalent,
newgvn will
eliminate them
all and set
all the
branches true.<o:p></o:p></p>
</div>
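<p class="MsoNormal">A small C illustration of the pattern described above, assuming c, d, and e are provably equal to b. The function names are made up for this example; folded_checks shows by hand what is left once the dominated, equivalent comparisons are treated as true, and is not actual newgvn output.</p>

```c
/* Every inner comparison repeats a == b, because c, d, and e are copies of b. */
int nested_checks(int a, int b) {
    int c = b, d = b, e = b;
    if (a == b)
        if (a == c)
            if (a == d)
                if (a == e)
                    return 1;
    return 0;
}

/* Hand-written equivalent: only the first, dominating comparison remains. */
int folded_checks(int a, int b) {
    return a == b ? 1 : 0;
}
```

<p class="MsoNormal">Both functions return the same value for every pair of inputs, which is what makes folding the inner branches to true safe in this situation.</p>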
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
If you need a
simpler clean
up pass, we
could run it
on sub-graphs.<o:p></o:p></p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<o:p></o:p></p>
</div>
<div>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
Yes, we
probably don’t
want to run a
full GVN after
the
“loop-scheduling”
passes.<o:p></o:p></p>
</div>
</div>
</blockquote>
<p
class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">
<br>
FWIW, we could,
just without the
memory-dependence analysis enabled (i.e. set the NoLoads constructor
parameter to
true). GVN is
pretty fast in
that mode.<o:p></o:p></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">OK.
Another data point is
that I’ve seen cases in
the past where the alias
checks required for the
loop passes could enable
GVN to remove redundant
loads/stores. Currently
we can only pick these
up with LTO when GVN is
rerun.<o:p></o:p></p>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">This
is just GVN brokenness; NewGVN
should not have this problem.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">If
it does, I'd love to see it.<o:p></o:p></p>
</div>
</div>
</div>
</div>
</blockquote>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"><br>
I thought that the problem is that we
just don't run GVN after that point in
the pipeline.<o:p></o:p></p>
</div>
</div>
</div>
</div>
</blockquote>
<div>
<div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">Yeah,
that is the problem, but I think Danny
misunderstood what I was trying to say.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">This
was a data point in favor of possibly rerunning GVN with
memory awareness.<o:p></o:p></p>
</div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;margin-bottom:12.0pt;margin-left:1.5in">
<o:p> </o:p></p>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;margin-bottom:12.0pt;margin-left:1.5in">
<br>
-Hal<o:p></o:p></p>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in">(I'm
working on the last few parts
of turning it on by default,
but it requires a new
getModRefInfo interface to be
able to get the last few
testcases)<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
</div>
</div>
</div>
</div>
</blockquote>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;margin-bottom:12.0pt;margin-left:1.5in">
<o:p> </o:p></p>
<pre style="margin-left:1.5in">-- <o:p></o:p></pre>
<pre style="margin-left:1.5in">Hal Finkel<o:p></o:p></pre>
<pre style="margin-left:1.5in">Lead, Compiler Technology and Programming Languages<o:p></o:p></pre>
<pre style="margin-left:1.5in">Leadership Computing Facility<o:p></o:p></pre>
<pre style="margin-left:1.5in">Argonne National Laboratory<o:p></o:p></pre>
</div>
</div>
</blockquote>
</div>
</div>
</div>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:1.5in"> <o:p></o:p></p>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal" style="margin-left:1.0in"><o:p> </o:p></p>
</div>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</body>
</html>