<div dir="ltr">Hey Sanjoy,<div>  </div><div class="gmail_extra"><div class="gmail_quote">On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>

<span class="gmail-"><br>

On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <<a href="mailto:chisophugis@gmail.com">chisophugis@gmail.com</a>> wrote:<br>

> The way I interpret Quentin's statement is something like:<br>

><br>

> - Inlining turns an interprocedural problem into an intraprocedural problem<br>

> - Outlining turns an intraprocedural problem into an interprocedural problem<br>

><br>

> Insofar as our intraprocedural analyses and transformations are strictly<br>

> more powerful than interprocedural, then there is a precise sense in which<br>

> inlining exposes optimization opportunities while outlining does not.<br>

<br>

</span>While I think our intra-proc optimizations are *generally* more<br>

powerful, I don't think they are *always* more powerful.  For<br>

instance, LICM (today) won't hoist full regions but it can hoist<br>

single function calls.  If we can extract out a region into a<br>

readnone+nounwind function call then LICM will hoist it to the<br>

preheader if the safety checks pass.<br>

<span class="gmail-"><br>

> Actually, for his internship last summer River wrote a profile-guided<br>

> outliner / partial inliner (it didn't try to do deduplication; so it was<br>

> more like PartialInliner.cpp). IIRC he found that LLVM's interprocedural<br>

> analyses were so bad that there were pretty adverse effects from many of the<br>

> outlining decisions. E.g. if you outline from the left side of a diamond,<br>

> that side basically becomes a black box to most LLVM analyses and forces<br>

> downstream dataflow meet points to give an overly conservative result, even<br>

> though our standard intraprocedural analyses would have happily dug through<br>

> the left side of the diamond if the code had not been outlined.<br>

><br>

> Also, River's patch (the one in this thread) does parameterized outlining.<br>

> For example, two sequences containing stores can be outlined even if the<br>

> corresponding stores have different pointers. The pointer to be loaded from<br>

> is passed as a parameter to the outlined function. In that sense, the<br>

> outlined function's behavior becomes a conservative approximation of both<br>

> which in principle loses precision.<br>

<br>

</span>Can we outline only once we've already done all of these optimizations<br>

that outlining would block?<br></blockquote><div> </div><div>  The outliner can run at any point in the interprocedural pipeline. There are currently two locations: early outlining (before the inliner) and late outlining (practically the last pass to run). It can be configured to run either Early+Late or just Late. </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<span class="gmail-"><br>

> I like your EarlyCSE example and it is interesting that combined with<br>

> functionattrs it can make a "cheap" pass get a transformation that an<br>

> "expensive" pass would otherwise be needed. Are there any cases where we<br>

> only have the "cheap" pass and thus the outlining would be essential for our<br>

> optimization pipeline to get the optimization right?<br>

><br>

> The case that comes to mind for me is cases where we have some cutoff of<br>

> search depth. Reducing a sequence to a single call (+ functionattr<br>

> inference) can essentially summarize the sequence and effectively increase<br>

> search depth, which might give more results. That seems like a bit of a weak<br>

> example though.<br>

<br>

</span>I don't know if River's patch outlines entire control flow regions at<br>

a time, but if it does then we could use cheap basic block scanning<br>

analyses for things that would normally require CFG-level analysis.<br></blockquote><div><br></div><div>  The current patch only supports outlining from within a single basic block. I had a working prototype for region-based outlining, but I kept it out of this patch for simplicity. So it's entirely possible to add that kind of functionality, since I've already tried it.</div><div>Thanks,</div><div>  River Riddle</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

-- Sanjoy<br>

<div class="gmail-HOEnZb"><div class="gmail-h5"><br>

><br>

> -- Sean Silva<br>

><br>

> On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via llvm-dev<br>

> <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

>><br>

>> Hi,<br>

>><br>

>> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via llvm-dev<br>

>> <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

>> > No, I mean in terms of enabling other optimizations in the pipeline like<br>

>> > vectorizer. Outliner does not expose any of that.<br>

>><br>

>> I have not made a lot of effort to understand the full discussion here (so<br>

>> what<br>

>> I say below may be off-base), but I think there are some cases where<br>

>> outlining<br>

>> (especially working with function-attrs) can make optimization easier.<br>

>><br>

>> It can help transforms that duplicate code (like loop unrolling and<br>

>> inlining) be<br>

>> more profitable -- I'm thinking of cases where unrolling/inlining would<br>

>> have to<br>

>> duplicate a lot of code, but after outlining would require duplicating<br>

>> only a<br>

>> few call instructions.<br>

>><br>

>><br>

>> It can help EarlyCSE do things that require GVN today:<br>

>><br>

>> void foo() {<br>

>>   ... complex computation that computes func()<br>

>>   ... complex computation that computes func()<br>

>> }<br>

>><br>

>> outlining=><br>

>><br>

>> int func() { ... }<br>

>><br>

>> void foo() {<br>

>>   int x = func();<br>

>>   int y = func();<br>

>> }<br>

>><br>

>> functionattrs=><br>

>><br>

>> int func() readonly { ... }<br>

>><br>

>> void foo(int a, int b) {<br>

>>   int x = func();<br>

>>   int y = func();<br>

>> }<br>

>><br>

>> earlycse=><br>

>><br>

>> int func(int t) readnone { ... }<br>

>><br>

>> void foo(int a, int b) {<br>

>>   int x = func(a);<br>

>>   int y = x;<br>

>> }<br>

>><br>

>> GVN will catch this, but EarlyCSE is (at least supposed to be!) cheaper.<br>

>><br>

>><br>

>> Once we have an analysis that can prove that certain functions can't trap,<br>

>> outlining can allow LICM etc. to speculate entire outlined regions out of<br>

>> loops.<br>

>><br>

>><br>

>> Generally, I think outlining exposes information that certain regions of<br>

>> the<br>

>> program are doing identical things.  We should expect to get some mileage<br>

>> out of<br>

>> this information.<br>

>><br>

>> -- Sanjoy<br>

>> ______________________________<wbr>_________________<br>

>> LLVM Developers mailing list<br>

>> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

>> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

><br>

><br>


</div></div></blockquote></div><br></div></div>