<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Apr 17, 2015 at 4:24 PM, Ivan Baev <span dir="ltr"><<a href="mailto:ibaev@codeaurora.org" target="_blank">ibaev@codeaurora.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">> On Fri, Apr 17, 2015 at 2:13 PM, Ivan Baev <<a href="mailto:ibaev@codeaurora.org">ibaev@codeaurora.org</a>> wrote:<br>

>> Hi, we've implemented an indirect call promotion llvm pass. The design<br>

>> notes including examples are shown below. This pass complements the<br>

>> indirect call profile infrastructure<br>

>> <a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084271.html" target="_blank">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084271.html</a><br>

><br>

> There are issues with the profiling infrastructure proposal which will<br>

> addressed separately.<br>

<br>

</span>-- Please send these to Betul and me.<br>

<span class=""><br></span></blockquote><div><br></div><div>yes.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">

><br>

> This part looks sane in general to me. See my rely inline.<br>

><br>

><br>

>> We've implemented two heuristics from this paper [1].<br>

>><br>

>> a. Hot call site heuristic: only consider for promotion a call site<br>

>> which<br>

>> contribution (profile count) is more than 0.1% of the total count of all<br>

>> indirect calls executed during the profile runs.<br>

>><br>

>> b. Hot target heuristic: promote the most frequent target at a call site<br>

>> if it gets at least 40% of all receivers at the call site.<br>

><br>

> Is the heuristics a || b, or a && b ?<br>

<br>

</span>-- a && b<br>

<span class=""><br>

><br>

>><br>

>> Only the hottest target from a call site is possibly promoted, similarly<br>

>> to the approach taken in the paper.<br>

>><br>

</span>>> In addition to Aigner & HÃ¶lzle-based heuristics, we add an inline hint<br>

<span class="">>> to<br>

>> the promoted direct call/invoke instruction if it is the single receiver<br>

>> at the call site according to the profile data or the number of times<br>

>> the<br>

>> target is executed at the call site is at least 4% of the total count of<br>

>> all indirect calls.  Once the function entry profile counts become<br>

>> available we will use them to tune the above inline-related heuristic.<br>

>><br>

>><br>

><br>

> I don't think indirectly promoted callsites should be treated any<br>

> differently from original direct callsites -- after promotion, the<br>

> direct callsites have call count info and the same inline heuristics<br>

> should apply.<br>

<br>

</span>--- Agree in general, we should live this decision to inliner, especially<br>

when it becomes a profile-based inliner.<br>

At ICP pass we currenty have the profile counts for indirect call sites<br>

and their receivers (targets) and it is tempting to pass some of this<br>

information to the inliner.<br></blockquote><div><br></div><div>The inliner will get the profile information in the very near future, so any other way to pass the info to inliner is a hack and not recommended. Note that Value profile transformation can easily compute the branch probability for the newly generated branch with target count and site total count.</div><div><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class=""><br>

<br>

>><br>

>>   if (ptr->foo == A::foo)<br>

>> to<br>

>>   if (ptr->_vptr == A::_vtable)<br>

><br>

> You can do that if you know class A is final. In general, you need<br>

> type or vtable profiling to get it.<br>

<br>

</span>-- It is a future enhancement. Could you please provide some more details,<br>

in particular is it valid for C++ programs?<br>

<span class=""><br></span></blockquote><div><br></div><div>It is valid. HP compiler does that. Comparing vtable profiling with call target profiling: while it has its own downsides too. It tends to increase the number of target values to track because of the complicated class hierarchy. For instance, B has a member function 'foo' and is inherited by D1, D2, D3 (not overriding), and D4 (overriding). The number of values to track increases from 2 to 4. The transformation may also become more complicated: if (vptr == D1_vtb || vptr == D2_vtb ....) invoke B::foo(...) ...</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">

><br>

>><br>

>> This will sink one load from the original block into the less frequently<br>

>> executed if.false block. This opportunity was found by Balaram Makam.<br>

>><br>

>><br>

>> 4. New enhancement patch<br>

>> -------------------------<br>

>> Currently our implementation has the following shortcomings:<br>

>> a. Our heuristics do not depend on the global information on function<br>

>> counts. It could be that none of the indirect call sites are<br>

>> contributing<br>

>> highly to the overall calls. Because our current implementation is<br>

>> deciding what to inline based on the indirect call site sum only, it<br>

>> could<br>

>> be inlining functions that are in essence cold when all functions in the<br>

>> source base are considered. This situation will be improved when the<br>

>> function entry profile counts become available in llvm IR.<br>

><br>

> Our plan is to add program level summary data for PGO.  Any global<br>

> decisions need to made based on that because only relative global<br>

> hotness matters.<br>

><br>

>><br>

>> b. Current implementation only transforms the first hot target, the rest<br>

>> of the targets are never considered even if they are relatively hot.<br>

><br>

> This is probably a good thing.  Going beyond 2 can have negative effect.<br>

><br>

<br>

</span>-- With 2 we're getting incremental improvements, and we plan to further<br>

tune it.<br></blockquote><div> </div><div>David</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Thanks for the feedback, David.<br>

Ivan<br>

<span class=""><br>

<br>

>><br>

>> We are evaluating a new solution which depends on the<br>

>> presence/availability of functions counts in clang. We form a sorted<br>

>> multiset of all functions counts. A given indirect target is considered<br>

</span>>> for inlining if the targetâ€™s count at the call site falls within one<br>

<span class="">>> of<br>

>> the ranges that form the top 0-10%, 10-20% or 20-30% of the sorted<br>

</span>>> multiset.  Weâ€™ve added checks which become stricter as the target<br>

<span class="">>> count<br>

>> falls farther away from the top most called 10%, 20% or 30% of all<br>

>> functions respectively.<br>

>><br>

>> Targets that are classified as making calls to one of the top most<br>

>> called<br>

>> 30% of the functions receive inline hints.  Inline hints are<br>

>> communicated<br>

>> from clang down to LLVM in metadata. Then, on the LLVM side the<br>

>> transformation pass uses the metadata field for the hint to add an<br>

>> inline<br>

>> hint at the transformed call site.<br>

><br>

> Again, there is no need to invent indirect call (promoted) specific<br>

> inline heuristics.<br>

><br>

><br>

> thanks,<br>

><br>

> David<br>

><br>

><br>

>><br>

>> -------------------------<br>

</span>>> [1] G. Aigner and U. HÃ¶lzle. Eliminating virtual function calls in C++<br>

<div class="HOEnZb"><div class="h5">>> programs. ECOOP, 1996.<br>

>> [2] X. Li, R. Ashok, R. Hundt. Lightweight Feedback-Directed<br>

>> Cross-Module<br>

>> Optimization. CGO, 2010.<br>

>><br>

<br>

<br>

<br>

_______________________________________________<br>

LLVM Developers mailing list<br>

<a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

</div></div></blockquote></div><br></div></div>