[PATCH] D18622: Replace the use of MaxFunctionCount module flag

Vedant Kumar via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 6 23:21:32 PDT 2016


vsk added inline comments.

================
Comment at: include/llvm/ProfileData/ProfileCommon.h:75
@@ +74,3 @@
+  // Cache of last seen module and its profile summary.
+  static std::pair<Module *, std::unique_ptr<ProfileSummary>> CachedSummary;
+
----------------
eraman wrote:
> vsk wrote:
> > eraman wrote:
> > > vsk wrote:
> > > > eraman wrote:
> > > > > vsk wrote:
> > > > > > eraman wrote:
> > > > > > > vsk wrote:
> > > > > > > > eraman wrote:
> > > > > > > > > vsk wrote:
> > > > > > > > > > eraman wrote:
> > > > > > > > > > > vsk wrote:
> > > > > > > > > > > > Hm, this won't work in multi-threaded environments.
> > > > > > > > > > > > 
> > > > > > > > > > > > You can either sink this into LLVMContext (preferred!), or make it an `llvm::ManagedStatic` defined outside of `ProfileSummary` for more visibility. If you go with the second option, you'll also need a mutex around all cache accesses.
> > > > > > > > > > > > 
> > > > > > > > > > > > Somewhere down the line you might consider making this cache some kind of (Small)?DenseMap.
> > > > > > > > > > > Yes indeed. Perhaps it is better to not cache this and instead let the clients cache the summary?
> > > > > > > > > > > 
> > > > > > > > > > > As to the use of DenseMap, is this ever needed? I believe (and I may be wrong) that the only place multiple modules are simultaneously present is in LTO mode, and there all modules are actually merged into one module, which is where the IPO optimizations take place.
> > > > > > > > > > > 
> > > > > > > > > > > Another crazy thought - we could just cache the ProfileSummary without any key, since all modules compiled in a single process are bound to have the same profile summary (the summary is not module-specific).
> > > > > > > > > > Sure, using a DenseMap is probably overkill. Having a one-entry cache seems useful though.
> > > > > > > > > > 
> > > > > > > > > > Regarding your last point: it's cheap to keep the `Module*` as a key, and keeping it makes the code less "magical" (no hidden assumptions about which Module you're referring to).
> > > > > > > > > Yeah, that's why I didn't go that route.
> > > > > > > > > 
> > > > > > > > > I don't see a way to make this part of the LLVMContextImpl, since ProfileSummary is not part of Core (and we obviously don't want Core to depend on ProfileData). In fact, that is the reason why Module's getProfileSummary returns Metadata instead of ProfileSummary. So the only way to make this work is to make it a ManagedStatic, right?
> > > > > > > > There's one more possibility. What if we cache ProfileSummary objects in the clients of this API, instead of in `getProfileSummary`?
> > > > > > > > 
> > > > > > > > E.g., we could cache the summary in `CallAnalyzer`. An `LLVMContext` should be readily available there, and that seems like the right place to put this.
> > > > > > > That's what I meant by "Perhaps it is better to not cache this and instead let the clients cache the summary". This won't require using LLVMContext, as the cached summary will be part of the client object (in this case it'll live in the Inliner and be passed on to CallAnalyzer).
> > > > > > Ah, sorry for misreading.
> > > > > > 
> > > > > > But wouldn't the code be more generic if clients queried the LLVMContext for the summary? That way, all clients of LLVMContext would have access to the cached summary, not just the inliner.
> > > > > That would require the ProfileSummary definition to be visible to LLVMContext, wouldn't it? Or are you suggesting keeping a (Module *, ProfileSummary *) pair in LLVMContext, letting the client query it, and, if there is no cached entry, computing the summary and updating the cache to make it available to other optimizations? IMO this is not worth the potential savings.
> > > > I'm suggesting keeping a (Module *, ProfileSummary *) pair in LLVMContextImpl and exposing LLVMContext::getProfileSummary(Module *). The latter would take care of computing and caching the summary. We wouldn't have to expose ProfileCommon.h to all users of LLVMContext (we can just forward-declare ProfileSummary).
> > > > 
> > > > ISTM that this approach is identical to caching the summary in the inliner, but is easier to reuse. So, is your concern that it would be trickier to cache the summary in an LLVMContext vs. the inliner? Or would you prefer to not cache the summary anywhere?
> > > Sorry, I am still not getting it. To implement LLVMContext::getProfileSummary(Module *) in the .cpp file - specifically the part that computes the summary - we need the definition of ProfileSummary, not just the forward declaration. It would also have to call getFromMD, which is defined in ProfileSummary.cpp. In short, this would make LLVMCore depend on LLVMProfileData (or make the code in ProfileData part of LLVMCore), which is something we don't want to do. Does this make sense?
> > > 
> > > One thing we could do is keep a ProfileSummary* in Module and use it as a cache that is populated by ProfileSummary::getProfileSummary(Module *). This would only require a forward declaration of ProfileSummary in Module.
> > I'm sorry I missed your response! I understand the problem now. I think your solution (keep a ProfileSummary* in Module) is good.
> Unfortunately, this is not ideal either. Who should own the ProfileSummary object created out of the Metadata? Module cannot, since it doesn't see the ProfileSummary destructor. I think the only options are to a) let the optimizations cache the summary, or b) use a ManagedStatic.
I see: we can't place a unique_ptr<ProfileSummary> into LLVMContext without exposing its class definition.

In that case, I prefer option (b), since it's a slightly more general solution.
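
For reference, a minimal sketch of what option (b) could look like: a one-entry (Module *, ProfileSummary) cache kept in an llvm::ManagedStatic and guarded by a mutex, so that concurrent compilations don't race on it (the original objection to the static member above). The helper name getOrComputeSummary is hypothetical, and the ProfileSummary::getFromMD / Module::getProfileSummary signatures are assumed from the discussion in this thread rather than taken from the final patch.

  #include "llvm/IR/Module.h"
  #include "llvm/ProfileData/ProfileCommon.h"
  #include "llvm/Support/ManagedStatic.h"
  #include <memory>
  #include <mutex>
  #include <utility>

  namespace {
  // One-entry cache: the last module queried and its computed summary.
  using SummaryCache =
      std::pair<llvm::Module *, std::unique_ptr<llvm::ProfileSummary>>;
  llvm::ManagedStatic<SummaryCache> CachedSummary;
  llvm::ManagedStatic<std::mutex> CacheLock;
  } // end anonymous namespace

  // Hypothetical helper: return M's profile summary, recomputing it from the
  // module's summary metadata only when a different module is queried.
  llvm::ProfileSummary *getOrComputeSummary(llvm::Module &M) {
    std::lock_guard<std::mutex> Guard(*CacheLock);
    if (CachedSummary->first != &M) {
      // Assumed per the thread: getFromMD() rebuilds a ProfileSummary object
      // from the Metadata returned by Module::getProfileSummary().
      CachedSummary->second.reset(
          llvm::ProfileSummary::getFromMD(M.getProfileSummary()));
      CachedSummary->first = &M;
    }
    return CachedSummary->second.get();
  }

A client such as the inliner's CallAnalyzer could then call something like getOrComputeSummary(*F.getParent()) and read the maximum function count off the returned summary, with ownership of the ProfileSummary staying inside the ManagedStatic.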


http://reviews.llvm.org/D18622




