[llvm-dev] [LLVMdev] Path forward on profile guided inlining?

Thu Dec 10 17:03:50 PST 2015

Clearly I haven't done a good job explaining my proposal. I don't propose
extending the pass manager to provide function level analysis. I proposed
to do something like this:

/// \brief Get BlockFrequencyInfo for a function.
BlockFrequencyInfo *BlockFrequencyAnalysis::getBlockFrequencyInfo(Function
*F) {
  // BFM is a map from Function * to BlockFrequencyInfo * and
  // caches BlockFrequencyInfo *
  auto Iter = BFM.find(F);
  if (Iter != BFM.end()) return Iter->second;
  // We need to create a BlockFrequencyInfo object for F and store it.
  //
  // DominatorTree, LoopInfo and BranchProbabilityInfo can be thrown
  // away after BFI is computed.

  // First create a dominator tree
  DominatorTree DT;
  DT.recalculate(*F);

  // Then get a LoopInfo object
  LoopInfo LI;
  LI.analyze(DT);

  // Then compute BranchProbabilityInfo
  BranchProbabilityInfo BPI;
  BPI.calculate(*F, LI);

  // Finally compute BlockFrequencyInfo
  BlockFrequencyInfo *BFI = new BlockFrequencyInfo;
  BFI->calculate(*F, BPI, LI);
  BFM[F] = BFI;
  return BFI;
}

This recomputes all analysis needed to compute BFI and doesn't depend on
the analysis framework. This is likely more computation (no benefit when a
dependent analysis is preserved) and is used only by the inliner and not
shared by passes like a true analysis. The invalidation needs to be done
only once per function after it ceases to be a caller in inlining. Within
inliner itself, the BFI data is incrementally updated.

Thanks,
Easwaran

On Thu, Dec 10, 2015 at 4:00 PM, Philip Reames <listmail at philipreames.com>
wrote:

> Given I didn't get any response to my original query, I chose not to
> invest time in this at the time.  I am unlikely to get time for this in the
> near future.
>
> On 12/07/2015 03:13 PM, Easwaran Raman wrote:
>
> (Resending after  removing  <llvmdev at cs.uiuc.edu>llvmdev at cs.uiuc.edu and
> using llvm-dev at lists.llvm.org)
>
> On Mon, Dec 7, 2015 at 3:08 PM, Easwaran Raman <eraman at google.com> wrote:
>
>> Hi Philip,
>>
>>  Is there any update on this? I've been sending patches to get rid of the
>> callee hotness based inline hints from the frontend and move the logic to
>> the inliner. The next step is to use the callsite hotness instead. I also
>> want to focus on the infrastructure to enable this and what I've been
>> experimenting with is similar to your two alternative approaches:
>>
>>
>>> Alternate Approaches:
>>> 1) We could just recompute BFI explicitly in the inliner right before
>>> passing the result to ICA for the purposes of prototyping. If this was off
>>> by default, this might be a reasonable scheme for investigation.  This
>>> could likely never be enabled for real uses.
>>> 2) We could pre-compute BFI once per function within a given SCC and
>>> then try to keep it up to date during inlining.  If we cached the call
>>> frequencies for the initial call sites, we could adjust the visit order to
>>> minimize the number of times we need to recompute a given functions block
>>> frequencies.  (e.g. we can look at all the original call sites within a
>>> function before looking at newly inlined ones)
>>>
>>>
>> My proposal is very similar (perhaps identical) to your option 2 above. I
>> don't understand the part where you talk about adjusting the visit order to
>> minimize BFI computation.
>>
>> BFI computation:  BFI for a function is computed on demand and cached.
>>
> The problem here is that the old pass manager does not have a place/way to
> cache the function level analysis.  You could do this within the inliner
> itself, but the function passes run by the SCC pass manager will invalidate
> everything.  That's the problem I was trying to solve.
>
>
>> Update: When 'bar' gets inlined into 'foo', the BFI for 'foo' is updated.
>> Let OldBB in 'bar' gets cloned as NewBB in 'foo'.  NewBB's block frequency
>> can be incrementally computed from OldBB's block frequency, entry block
>> frequency of 'bar' and the frequency of the block containing the 'foo' ->
>> 'bar' callsite. Even when the new CGSCC level BFI analysis is in place,
>> this incremental update is useful to minimize computation.
>>
>> Invalidation:  Once inlining is completed in an SCC (at the end of
>> runOnSCC), the entries for functions in that SCC are invalidated since
>> other passes run by the CGSCC pass manager (including those run by the
>> function pass manager run under CGSCC pass manager) might affect the
>> computed BFI for the functions in the SCC.
>>
> This sounds like you're essentially planning to extend the old pass
> manager with function level analysis support in the SCC pass manager.  I
> advise against this given the fact that Chandler is actively working on the
> new one and getting patches reviewed for the old one is "challenging" at
> best.  If you take this approach, make sure Chandler - who is pretty much
> the only active reviewer in this area - is on-board with your approach
> before you start.
>
>
>> When the new PM infrastructure and a CGSCC based BFI analysis is in
>> place, the transition should be easy assuming it will provide
>> getBFI(Function *) and invalidateBFI(Function *) interfaces. BFI for a
>> function is computed at most twice in this approach. Thoughts?
>>
>>
>> Thanks,
>> Easwaran
>>
>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu          <http://llvm.cs.uiuc.edu>
>>> http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151210/629a8794/attachment.html>