<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: arial,helvetica,sans-serif; font-size: 10pt; color: #000000'><br><br><hr id="zwchr"><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px; color: rgb(0, 0, 0); font-weight: normal; font-style: normal; text-decoration: none; font-family: Helvetica,Arial,sans-serif; font-size: 12pt;"><b>From: </b>"Xinliang David Li" <davidxl@google.com><br><b>To: </b>"Dehao Chen" <danielcdh@gmail.com><br><b>Cc: </b>reviews+D19950+public+38ba22078c2035b8@reviews.llvm.org, "David Majnemer" <david.majnemer@gmail.com>, "Hal Finkel" <hfinkel@anl.gov>, "Junbum Lim" <junbuml@codeaurora.org>, mcrosier@codeaurora.org, "llvm-commits" <llvm-commits@lists.llvm.org>, "amara emerson" <amara.emerson@arm.com><br><b>Sent: </b>Tuesday, May 10, 2016 3:15:24 PM<br><b>Subject: </b>Re: [PATCH] D19950: Use frequency info to guide Loop Invariant Code Motion.<br><br><div dir="ltr"><br><div class="gmail_extra"><br><div id="DWT13835" class="gmail_quote">On Tue, May 10, 2016 at 1:03 PM, Dehao Chen <span dir="ltr"><<a href="mailto:danielcdh@gmail.com" target="_blank">danielcdh@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span id="DWT13834" class="">On Tue, May 10, 2016 at 11:48 AM, Xinliang David Li <span dir="ltr"><<a href="mailto:davidxl@google.com" target="_blank">davidxl@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span id="DWT13833">On Tue, May 10, 2016 at 11:01 AM, Dehao Chen <span dir="ltr"><<a href="mailto:danielcdh@gmail.com" target="_blank">danielcdh@gmail.com</a>></span> 
wrote:<br><blockquote id="DWT13832" class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">danielcdh added a comment.<br>
<span><br>
In <a href="http://reviews.llvm.org/D19950#425287" rel="noreferrer" target="_blank">http://reviews.llvm.org/D19950#425287</a>, @hfinkel wrote:<br>
<br>
> In <a href="http://reviews.llvm.org/D19950#425286" rel="noreferrer" target="_blank">http://reviews.llvm.org/D19950#425286</a>, @hfinkel wrote:<br>
><br>
> > In <a href="http://reviews.llvm.org/D19950#425285" rel="noreferrer" target="_blank">http://reviews.llvm.org/D19950#425285</a>, @davidxl wrote:<br>
> ><br>
> > > Static prediction has been conservative in estimating loop trip count -- it produces something like 30ish iterations. If a very hot loop has a big if-then-else (or switch), it is very likely to mark many BBs as colder than the loop header. Turning this on for static prediction really depends on the false rate. It seems to me this can go wrong pretty easily for very hot loops (which are also the most important thing to optimize for).<br>
> ><br>
> ><br>
> > This is a good point. There's no universal conservative choice (assuming a small trip count is conservative in some cases, and assuming a large trip count is conservative in other cases).<br>
><br>
><br>
> Would it be better (and practical) if there were some way for the BFI client to specify which kind of 'conservative' is desired?<br>
><br>
</span><span>> Also, why are we doing this instead of sinking later (in CGP or similar)? LICM can expose optimization opportunities, and hoisted code is also a pattern the user might write manually. Sinking later seems more robust.<br>
<br>
<br>
</span>I looked at the CGP pass, and it looks like it handles sinking case by case (e.g. there are separate routines to handle sinking of loads, GEPs, etc.). I'm afraid this would miss opportunities. Additionally, the file-level comment of the CGP pass says "This works around limitations in it's basic-block-at-a-time approach. It should eventually be removed."<br></blockquote></span></div></div></div></blockquote></span></div></div></div></blockquote></div></div></div></blockquote>Yes, but it will be "removed" when the entire subsystem is replaced by GlobalISel, and we'll certainly need to make GlobalISel profiling-data aware, so I expect this is the right path forward regardless. I agree, however, that we want general sinking here based on profiling data, not just the specific existing heuristics for loads, GEPs, etc.<br><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px; color: rgb(0, 0, 0); font-weight: normal; font-style: normal; text-decoration: none; font-family: Helvetica,Arial,sans-serif; font-size: 12pt;"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"></blockquote><div><br></div><div><br></div></span><div>Perhaps you can do profile-driven sinking in CGP separately to handle the manually hoisted code situation mentioned by Hal.</div></div></div></div></blockquote><div><br></div></span><div>Do you mean we still use frequency to decide whether to hoist code in LICM, and additionally use 
frequency info to check if we want to sink instructions in CGP?</div></div></div></div></blockquote><div><br></div><div><br></div><div id="DWT13770">yes -- that is the suggestion.</div></div></div></div></blockquote>I'd prefer that we try to sink late first, and only if there are use cases that we can't handle this way, we consider throttling hoisting early. If we come across such use cases, I'd like to understand them better. Hoisting can expose other optimization opportunities, and you lose those opportunities if you don't hoist in the first place.<br><br> -Hal<br><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px; color: rgb(0, 0, 0); font-weight: normal; font-style: normal; text-decoration: none; font-family: Helvetica,Arial,sans-serif; font-size: 12pt;"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><div><br></div><div>David</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="HOEnZb"><font color="#888888"><div><br></div><div>Dehao</div></font></span><span class=""><div> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><font color="#888888"><div><br></div><div>David</div></font></span><span><div> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<br>
I'm not quite clear why it helps to move code out of the loop early and later sink it back inside. Could you give an example or some more context?<br>
<br>
Thanks,<br>
Dehao<br>
<br>
<br>
<a href="http://reviews.llvm.org/D19950" rel="noreferrer" target="_blank">http://reviews.llvm.org/D19950</a><br>
<br>
<br>
<br>
</blockquote></span></div><br></div></div>
</blockquote></span></div><br></div></div>
</blockquote></div><br></div></div>
</blockquote><br><br><br>-- <br><div><span name="x"></span>Hal Finkel<br>Assistant Computational Scientist<br>Leadership Computing Facility<br>Argonne National Laboratory<span name="x"></span><br></div></div></body></html>