[llvm-dev] [LLVMdev] Path forward on profile guided inlining?
Easwaran Raman via llvm-dev
llvm-dev at lists.llvm.org
Mon Dec 7 15:13:26 PST 2015
(Resending after removing llvmdev at cs.uiuc.edu and using
llvm-dev at lists.llvm.org)
On Mon, Dec 7, 2015 at 3:08 PM, Easwaran Raman <eraman at google.com> wrote:
> Hi Philip,
>
> Is there any update on this? I've been sending patches to get rid of the
> callee-hotness-based inline hints from the frontend and move the logic to
> the inliner. The next step is to use the callsite hotness instead. I also
> want to focus on the infrastructure to enable this. What I've been
> experimenting with is similar to your two alternative approaches:
>
>
>> Alternate Approaches:
>> 1) We could just recompute BFI explicitly in the inliner right before
>> passing the result to ICA for the purposes of prototyping. If this was off
>> by default, this might be a reasonable scheme for investigation. This
>> could likely never be enabled for real uses.
>> 2) We could pre-compute BFI once per function within a given SCC and then
>> try to keep it up to date during inlining. If we cached the call
>> frequencies for the initial call sites, we could adjust the visit order to
>> minimize the number of times we need to recompute a given function's block
>> frequencies. (e.g. we can look at all the original call sites within a
>> function before looking at newly inlined ones)
>>
>>
> My proposal is very similar (perhaps identical) to your option 2 above. I
> don't understand the part where you talk about adjusting the visit order to
> minimize BFI computation.
>
> BFI computation: BFI for a function is computed on demand and cached.
>
> Update: When 'bar' gets inlined into 'foo', the BFI for 'foo' is updated.
> Suppose OldBB in 'bar' is cloned as NewBB in 'foo'. NewBB's block frequency
> can be computed incrementally from OldBB's block frequency, the entry block
> frequency of 'bar', and the frequency of the block containing the 'foo' ->
> 'bar' callsite. Even when the new CGSCC level BFI analysis is in place,
> this incremental update is useful to minimize computation.
>
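To make the incremental update concrete, here is a minimal sketch in Python
rather than against the real LLVM classes. It assumes the natural scaling rule
freq(NewBB) = freq(OldBB) / freq(entry of 'bar') * freq(callsite block); the
message above names the three inputs but does not spell out the formula, so
treat the rule, the function name, and the dict-based representation as
illustrative assumptions.

```python
from fractions import Fraction


def scale_inlined_frequencies(callee_freqs, callee_entry, callsite_block_freq):
    """Return frequencies for the cloned blocks, rescaled into the caller.

    callee_freqs: hypothetical mapping of block name -> frequency in 'bar'.
    callee_entry: name of the entry block of 'bar'.
    callsite_block_freq: frequency of the block in 'foo' holding the call.
    """
    entry_freq = callee_freqs[callee_entry]
    if entry_freq == 0:
        raise ValueError("callee entry frequency must be non-zero")
    # Exact rational arithmetic avoids rounding drift in this toy model.
    scale = Fraction(callsite_block_freq, entry_freq)
    return {block: freq * scale for block, freq in callee_freqs.items()}


# 'bar' has a loop body that runs 10x per entry; the callsite block in 'foo'
# has frequency 4 in foo's own scale.
bar_freqs = {"entry": 8, "loop": 80, "exit": 8}
cloned_freqs = scale_inlined_frequencies(bar_freqs, "entry", 4)
```

The relative shape of 'bar' is preserved (the loop is still 10x its entry),
and only the cloned blocks are touched, so no recomputation over 'foo' is
needed in this model.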
> Invalidation: Once inlining is completed in an SCC (at the end of
> runOnSCC), the entries for functions in that SCC are invalidated, since
> other passes run by the CGSCC pass manager (including those run by the
> function pass manager run under the CGSCC pass manager) might affect the
> computed BFI for the functions in the SCC.
>
> When the new PM infrastructure and a CGSCC based BFI analysis are in place,
> the transition should be easy, assuming the analysis provides
> getBFI(Function *) and invalidateBFI(Function *) interfaces. BFI for a
> function is computed at most twice in this approach. Thoughts?
>
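The compute-on-demand, update, and invalidate scheme above can be modeled
with a small cache. This is only a sketch in Python: BFICache, fake_compute,
and the method names are hypothetical stand-ins loosely mirroring the
getBFI/invalidateBFI interfaces mentioned in the message, not existing LLVM
APIs. The walk-through shows one reading of the "at most twice" claim: a
function's BFI is computed once while its own SCC is processed and, after
invalidation, once more when a later SCC first queries it as a callee.

```python
class BFICache:
    """Toy per-function cache of block frequencies (hypothetical model)."""

    def __init__(self, compute):
        self._compute = compute      # function name -> {block: frequency}
        self._cache = {}
        self.compute_counts = {}     # how often each function was recomputed

    def get_bfi(self, function):
        # Compute on demand, then serve from the cache.
        if function not in self._cache:
            self._cache[function] = self._compute(function)
            count = self.compute_counts.get(function, 0)
            self.compute_counts[function] = count + 1
        return self._cache[function]

    def update_bfi(self, function, new_block_freqs):
        # Incremental update after inlining: add cloned blocks in place.
        self.get_bfi(function).update(new_block_freqs)

    def invalidate_bfi(self, function):
        self._cache.pop(function, None)


def fake_compute(function):
    # Stand-in for a real BFI computation.
    return {"entry": 1}


cache = BFICache(fake_compute)

# SCC {bar} is visited first (bottom-up); repeated queries hit the cache.
cache.get_bfi("bar")
cache.get_bfi("bar")
cache.invalidate_bfi("bar")          # end of runOnSCC for bar's SCC

# SCC {foo}: 'foo' calls 'bar', so both are queried; 'bar' is recomputed once.
cache.get_bfi("foo")
cache.get_bfi("bar")
cache.update_bfi("foo", {"bar.entry.inlined": 1})
cache.invalidate_bfi("foo")          # end of runOnSCC for foo's SCC
```

Only the functions in the SCC being finished are invalidated here, so 'bar'
stays cached for any further callers after its second computation.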
>
> Thanks,
> Easwaran
>
>