<br><br><div class="gmail_quote">On Tue, Nov 23, 2010 at 7:40 PM, Reid Kleckner <span dir="ltr"><<a href="mailto:reid.kleckner@gmail.com">reid.kleckner@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im">On Tue, Nov 23, 2010 at 8:07 PM, Xinliang David Li <<a href="mailto:xinliangli@gmail.com">xinliangli@gmail.com</a>> wrote:<br>

> Hi, I browsed the LLVM inliner implementation, and it seems there is room<br>

> for improvement.  (I have not read it too carefully, so correct me if what I<br>

> observed is wrong).<br>

> First the good side of the inliner -- the function level summary and inline<br>

> cost estimation is more elaborate and complete than gcc. For instance, it<br>

> considers callsite arguments and the effects of optimization enabled by<br>

> inlining.<br>

> Now more to the weakness of the inliner:<br>

> 1) It is bottom up.  The inlining is not done in the order based on the<br>

> priority of the callsites.  It may leave important callsites (at top of the<br>

> cg) uninlined due to higher cost after inline cost update. It also eliminates<br>

> the possibility of inline specialization. To change this, the inliner pass<br>

> may not use the pass manager infrastructure.  (I noticed a hack in the<br>

> inliner to work around the problem -- for static functions avoid inlining its<br>

> callees if it causes it to become too big ..)<br>

> 2) There seems to be only one inliner pass.  For calls to small functions,<br>

> it is better to perform early inlining as one of the local (per function)<br>

> optimizations followed by scalar opt clean up. This will sharpen the summary<br>

> information.  (Note the inline summary update does not consider the possible<br>

> cleanup)<br>

> 3) Recursive inlining is not supported.<br>

> 4) A function with an indirect branch is not inlined. What source construct does<br>

> the indirect branch instruction correspond to? A variable jump?<br>

<br>

</div>This corresponds to functions that use GCC's labels-as-values<br>

extension, not things like switch statements or virtual calls, so this<br>

really only applies to things like interpreter loops, which you don't<br>

usually want to inline.<br>
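<br>
As a rough illustration of the construct in question, here is a minimal sketch of a labels-as-values dispatch loop. The opcodes and the function name run are made up for this example, and it relies on the GCC/Clang extension rather than standard C++. The "goto *" statements are what end up as indirect branches:<br>

```cpp
#include <cassert>
#include <vector>

// Hypothetical three-opcode bytecode, invented for this illustration.
enum Op : unsigned char { OP_INC = 0, OP_DEC = 1, OP_HALT = 2 };

// Interprets `code` and returns the final accumulator value.
// Assumes every byte is a valid opcode (0..2).
int run(const std::vector<unsigned char> &code) {
  // The program must end in OP_HALT, otherwise dispatch would run off the end.
  assert(!code.empty() && code.back() == OP_HALT);

  // &&label (GCC/Clang extension) yields the address of a label as a void *.
  static void *const dispatch[] = {&&do_inc, &&do_dec, &&do_halt};

  int acc = 0;
  const unsigned char *pc = code.data();

  // `goto *ptr` jumps to a label chosen at run time.
  goto *dispatch[*pc++];

do_inc:
  ++acc;
  goto *dispatch[*pc++];
do_dec:
  --acc;
  goto *dispatch[*pc++];
do_halt:
  return acc;
}
```

A switch over the opcode would compute the same result without needing the extension or an indirect branch, which is why the restriction mostly affects hand-tuned interpreter loops like this one.<br>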

<div class="im"><br>

> 5) fudge factor prefers functions with vector instructions -- why is that?<br>

<br>

</div>I'm guessing that vectorized instructions are an indicator that the<br>

function is a hotspot, and should therefore be inlined.<br></blockquote><div><br></div><div>This is a reasonable heuristic.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div class="im"><br>

> 6) One heuristic used in the inline-cost computation seems wrong:<br>

><br>

>   // Calls usually take a long time, so they make the inlining gain smaller.<br>

>   InlineCost += CalleeFI->Metrics.NumCalls * InlineConstants::CallPenalty;<br>

> Does it try to block inlining of callees with lots of calls? Note inlining<br>

> such a function only increases static call counts.<br>

<br>

</div>I'm guessing the heuristic is that functions with lots of calls are<br>

heavy and spend a lot of time in their callees, so eliminating the<br>

overhead of one call isn't worth it.<br></blockquote><div><br></div><div>Those calls may sit on cold paths -- penalizing the inlining of such functions can miss opportunities.</div><div><br></div><div>David</div><div>
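<div><br>To make the cold-path concern concrete, here is a small sketch of a flat per-call penalty. The struct, the threshold check, and the constant values are assumptions for illustration only, not the actual LLVM definitions. Only the penalty line itself mirrors the snippet quoted above:<br>

```cpp
// Hypothetical stand-ins for the inliner's data. The names echo the quoted
// snippet, but the fields and constant values are invented for illustration.
struct CodeMetrics {
  unsigned NumInsts; // instructions in the callee
  unsigned NumCalls; // static call sites in the callee, hot or cold alike
};

namespace InlineConstants {
const int InstrCost = 5;    // assumed per-instruction cost
const int CallPenalty = 25; // assumed per-call penalty
} // namespace InlineConstants

// Size-based cost plus the flat per-call penalty from the quoted snippet.
int inlineCost(const CodeMetrics &M) {
  int Cost = static_cast<int>(M.NumInsts) * InlineConstants::InstrCost;
  // Every static call site is charged, with no notion of how often it runs.
  Cost += static_cast<int>(M.NumCalls) * InlineConstants::CallPenalty;
  return Cost;
}

// Hypothetical threshold check: inline only when the cost is below Threshold.
bool shouldInline(const CodeMetrics &M, int Threshold) {
  return inlineCost(M) < Threshold;
}
```

With these assumed constants, a 20-instruction callee costs 100 with no calls and 200 with four calls, so a threshold of 150 rejects the second one even if all four calls sit in an error path that never executes.<br></div>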

 </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<font color="#888888"><br>

Reid<br>

</font></blockquote></div><br>