[PATCH] D18622: Replace the use of MaxFunctionCount module flag

Easwaran Raman via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 12 14:20:29 PDT 2016


eraman added inline comments.

================
Comment at: include/llvm/ProfileData/ProfileCommon.h:194
@@ +193,3 @@
+    return nullptr;
+  // Computing profile summary for a module involves parsing a fairly large
+  // metadata and could be expensive. We use a simple cache of the last seen
----------------
eraman wrote:
> davidxl wrote:
> > I suggest doing some compile time measurement before using the caching mechanism. Besides, this won't affect O2 compile time at all.
> Here are some numbers:
> * Each call to computeProfileSummary takes ~1.05 us, or about 3000 cycles, on a machine with an Intel Xeon E5-2690 clocked at 2.9GHz. I measured this by timing 1M calls to computeProfileSummary with a NamedRegionTimer and taking the user + sys time.
> 
> * To put this in perspective, while compiling a large real application, the time taken by computeProfileSummary is 25-30% of the time taken by CallAnalyzer's analyzeCall. So not caching will result in a 25-30% increase in the time taken by analyzeCall (which is now the only client of computeProfileSummary).
> 
> Of course, as you point out, this is only in PGO mode and doesn't affect O2 compiles, but it still seems worthwhile to cache this.
I made the cache a ManagedStatic and added a SmartScopedLock to control accesses to it. Now a call to getProfileSummary takes ~323 ns (averaged over 20M calls), of which 70% is system time.
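
For readers without the patch at hand, here is a minimal, self-contained sketch of a lock-guarded, single-entry cache of the last seen module's summary. It is not the code from D18622. Module, ProfileSummary, SummaryCache, getCache, and NumComputations are hypothetical stand-ins, and std::mutex with a function-local static is used in place of LLVM's ManagedStatic and SmartScopedLock so the example builds without LLVM headers.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-ins for llvm::Module and llvm::ProfileSummary.
struct Module {
  std::string Name;
  uint64_t MaxFunctionCount;
};
struct ProfileSummary {
  uint64_t MaxFunctionCount;
};

// Counts how often the expensive path runs (illustration only).
static int NumComputations = 0;

// Stand-in for the expensive metadata parse done by computeProfileSummary.
static std::unique_ptr<ProfileSummary> computeProfileSummary(const Module &M) {
  ++NumComputations;
  return std::make_unique<ProfileSummary>(ProfileSummary{M.MaxFunctionCount});
}

// Single-entry cache: remembers only the last module seen and its summary.
struct SummaryCache {
  std::mutex Lock;
  const Module *LastModule = nullptr;
  std::unique_ptr<ProfileSummary> LastSummary;
};

// Constructed lazily on first use (assumption: this plays the role that
// ManagedStatic plays in the patch).
static SummaryCache &getCache() {
  static SummaryCache Cache;
  return Cache;
}

ProfileSummary *getProfileSummary(const Module *M) {
  if (!M)
    return nullptr;
  SummaryCache &C = getCache();
  // Every access to the cache is serialized by the lock.
  std::lock_guard<std::mutex> Guard(C.Lock);
  if (C.LastModule != M) {
    // Miss: recompute and overwrite the single cached entry.
    C.LastSummary = computeProfileSummary(*M);
    assert(C.LastSummary && "computeProfileSummary returned null");
    C.LastModule = M;
  }
  return C.LastSummary.get();
}
```

Repeated calls with the same module pointer take the cheap path and pay only for the lock. Two caveats in this sketch: the returned pointer is invalidated as soon as a call with a different module replaces the entry, and keying on the module's address gives a stale hit if a module is destroyed and another is allocated at the same address.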


http://reviews.llvm.org/D18622




