[PATCH] D18622: Replace the use of MaxFunctionCount module flag
Easwaran Raman via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 12 14:20:29 PDT 2016
eraman added inline comments.
================
Comment at: include/llvm/ProfileData/ProfileCommon.h:194
@@ +193,3 @@
+ return nullptr;
+ // Computing profile summary for a module involves parsing a fairly large
+ // metadata and could be expensive. We use a simple cache of the last seen
----------------
eraman wrote:
> davidxl wrote:
> > I suggest doing some compile time measurement before using the caching mechanism. Besides, this won't affect O2 compile time at all.
> Here are some numbers:
> * Each call to computeProfileSummary takes ~1.05 µs, or about 3000 cycles, on a machine with an Intel Xeon E5-2690 clocked at 2.9GHz. I obtained the time by using a NamedRegionTimer to measure 1M calls to computeProfileSummary and taking the user + sys time.
>
> * To put this in perspective, while compiling a large real application, the time taken by computeProfileSummary is 25-30% of the time taken by CallAnalyzer's analyzeCall. So not caching will result in a 25-30% increase in the time spent in analyzeCall (which is now the only client of computeProfileSummary).
>
> Of course, as you point out, this is only in the PGO mode and doesn't affect O2 compiles, but it still seems worthwhile to cache this.
I made the cache into a ManagedStatic and added a SmartScopedLock to control access to it. Now a call to getProfileSummary takes ~323 ns (averaged over 20M calls), 70% of which is system time.
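
For readers following along, here is a rough sketch of the idea: a lock-guarded cache holding only the last seen module. This is not the code in the patch. It uses only the standard library, with std::mutex and a function-local static standing in for SmartScopedLock and ManagedStatic, and the Module and ProfileSummary structs, the NumComputations counter, and the body of computeProfileSummary are hypothetical stand-ins for illustration.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-ins for illustration; not the LLVM types.
struct Module { uint64_t MaxFunctionCount; };
struct ProfileSummary { uint64_t MaxFunctionCount; };

// Counts how often the expensive path runs (illustration only).
static int NumComputations = 0;

// Stand-in for the expensive metadata parse.
static ProfileSummary computeProfileSummary(const Module &M) {
  ++NumComputations;
  return ProfileSummary{M.MaxFunctionCount};
}

// Single-entry cache keyed on the address of the last seen module.
// Caveat: keying on the address assumes a module is not destroyed and
// another one created at the same address between calls.
class SummaryCache {
  std::mutex Lock;
  const Module *CachedModule = nullptr;
  ProfileSummary CachedSummary{0};

public:
  // Returns a copy so the caller never holds a reference into the cache
  // after the lock is released.
  ProfileSummary get(const Module &M) {
    std::lock_guard<std::mutex> Guard(Lock);
    if (CachedModule != &M) {
      CachedSummary = computeProfileSummary(M);
      CachedModule = &M;
    }
    return CachedSummary;
  }
};

// Function-local static: constructed lazily on first use, with
// thread-safe initialization guaranteed since C++11.
static SummaryCache &getCache() {
  static SummaryCache Cache;
  return Cache;
}

static ProfileSummary getProfileSummary(const Module &M) {
  return getCache().get(M);
}
```

Repeated calls for the same module take the lock and return the cached value. A call for a different module recomputes the summary and replaces the single entry.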
http://reviews.llvm.org/D18622