[PATCH] D34312: [NewPM/Inliner] Reduce threshold for cold callsites in the non-PGO case

Mon Jun 26 12:00:14 PDT 2017

chandlerc added inline comments.

================
Comment at: lib/Analysis/InlineCost.cpp:72-76
+                        cl::desc("Maxmimum block frequency, expressed as"
+                                 " a percentage of caller's entry"
+                                 " frequency, for a callsite to be"
+                                 " cold in the absence of profile "
+                                 " information."));
----------------
Did clang format do this? yuck. You may get better results by mergeing the string literal into a single line string literal and letting clang-format re-break it from there.

================
Comment at: lib/Analysis/InlineCost.cpp:656-659
+  auto CallSiteBB = CS.getInstruction()->getParent();
+  auto CallSiteFreq = CallerBFI->getBlockFreq(CallSiteBB).getFrequency();
+  auto CallerEntryFreq = CallerBFI->getEntryFreq();
+  return CallSiteFreq * 100 <= CallerEntryFreq * ColdCallSiteRelFreq;
----------------
Rather than digging the integers out (and risking overflow) I would expect to use `BlockFrequency` and `BranchProbability` directly?

For example, in MBP we do something along the lines of...

  const BranchProbability ColdProb(1, 5); // 20%
  auto CallSiteFreq = CallerBFI->getBlockFreq(CallSiteBB);
  auto EntryFreq = CallerBFI->getEntryFreq();
  return CallSiteFreq < EntryFreq * ColdProb;

(Clearly the coldness should be different here than it is in MBP, just seems like if the way we scale `BlockFrequency` is with a `BranchProbability`, we should keep doing that here.)

It also might be good to try and cache the scaled entry frequency so that we don't re-compute it for each call site?

https://reviews.llvm.org/D34312