[PATCH] D65060: [LICM] Make Loop ICM profile aware

Wenlei He via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 24 14:35:54 PDT 2019


wenlei marked an inline comment as done.
wenlei added inline comments.


================
Comment at: llvm/lib/Transforms/Scalar/LICM.cpp:804
+  BasicBlock *SrcBlock = I.getParent();
+  if (BFI->getBlockFreq(SrcBlock) < BFI->getBlockFreq(DstBlock)) {
+    ORE->emit([&]() {
----------------
davidxl wrote:
> wenlei wrote:
> > davidxl wrote:
> > > wenlei wrote:
> > > > vsk wrote:
> > > > > IIRC BFI returns a 0 frequency if information about a block is missing. If information for either block is missing, shouldn't this fall back to the non-PGO behavior?
> > > > Hmm.. how can we differentiate between missing info vs. the block count is really just 0? My understanding is 1) if profile isn't available in general or for this function, BPI would use static info about the CFG to estimate weights (BranchProbabilityInfo::calculate), thus BFI automatically gets a "synthetic" static profile; 2) if information is missing because we don't have profile for that block, that's likely AutoFDO instead of Instr. PGO, and SampleProfileLoader::propagateWeights would try to infer if possible, but that's more of a profile accuracy issue. 
> > > Without profile data, BFI can easily give you wrong results, so you do need real profile data to guide the decision, otherwise this can not be turned on by default.
> > > 
> > > To use real profile data, the right interface to use is ProfileSummaryInfo::getProfileCount -- this interface handles samplePGO and instrumentation based PGO properly.
> > > 
> > > 
> > > Without profile data, BFI can easily give you wrong results, so you do need real profile data to guide the decision, otherwise this can not be turned on by default.
> > 
> > I thought BFI would give us best guess without profile, e.g. it would leverage "__builtin_expect", and recognize loop (calcLoopBranchHeuristics), etc. for a synthetic profile. That's definitely not as accurate as a real profile, but I was following the examples of other passes. The LoopSink pass for example, uses BFI.getBlockFreq too. Did I miss anything? 
> > 
> > > To use real profile data, the right interface to use is ProfileSummaryInfo::getProfileCount -- this interface handles samplePGO and instrumentation based PGO properly.
> > 
> > Sorry I'm a bit confused. IIRC, ProfileSummaryInfo::getProfileCount is specifically for getting counts from call instructions. There we special case for sample profile, because call instructions should have accurate weight from LBR. The main use of that function is to determine if a call site is hot or cold (there's also assertion on that). Here we're not looking for hotness of a call site, so I'm sure if that interface is a good fit..
> LoopSink pass has check:
> 
>   // Enable LoopSink only when runtime profile is available.
>   // With static profile, the sinking decision may be sub-optimal.
>   if (!Preheader->getParent()->hasProfileData())
>     return false;
> 
> 
> Regarding ProfileSummaryInfo::getProfileCount -- thanks for noticing. I think it is a mistake to assert for call instruction. It should do something like:
> 
>   if (hasSampleProfile() && Inst->isCall()) {
>       ...
>       return ...
>   }
>    // use BFI and entry count based method:
>    ....
> 
> but that is something to be fixed independently.
> 
> 
> 
> 
thanks for the pointer to hasProfileData. the same check is now added. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65060/new/

https://reviews.llvm.org/D65060





More information about the llvm-commits mailing list