[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Tue Aug 9 08:29:51 PDT 2016

When we talk about stale information, what does it mean exactly? This goes back to the point about how what guarantees are made about the analysis information, and when it is invalidated. If some information is wrong for the current IR (during a transform), but still fully correct with regards to the IR that was given to the analysis before the transform, this is information that some transforms actually need. The other case is if the analysis info directly points to the IR that is being modified, and some information is wrong for both the old and current IR, which could cause the transform to fail. What should the policy be for analyses, and how would someone writing a transform know when information becomes stale and how?
- Jan   

    On Monday, August 8, 2016 1:42 PM, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote:

On Sun, Aug 7, 2016 at 4:55 PM, Philip Reames <listmail at philipreames.com> wrote:

  Skimming the thread, this post is the clearest path forward I've seen.  Minor comments inline, but I generally like this framing.  

 On 07/14/2016 08:04 PM, Chandler Carruth via llvm-dev wrote:

 We need better terminology to talk about this. I propose: 
  analysis-dependencies: analysis A uses result of analysis B when *running* an analysis and not used by the result query-dependencies: result of analysis A uses result of analysis B when evaluating a query data-structure-depnedencies: result of analysis A uses data structures from the result of analysis B inside its own data structures 
  I think these are much more precise than "hard" or "soft" and expose more facets. 
  For analysis-depnedencies, I continue to think they work correctly. If a transformation claims it preserves an analysis, it must actually know this to be true. I don't see any actual problems here in practice today, and this isn't actually something changed in the new PM. 
  For data-structure-dependencies, the investigation done by Sean seems to clearly show these to be rare, and I think having their invalidate methods be overridden to test that *all* of the data structures they depend on are preserved is the correct approach.  
 This may end up being too error prone.  That seems to be Sean's concern down thread.  I am also worried about that, but assuming the number of such occurrences are low, it seems reasonable to start with this approach and revisit if needed.  

Chandler's suggestion here doesn't actually avoid the issue.
A simple proof that this approach cannot work is that the issue at hand is one of insufficient invalidation. The `invalidate` callback can only prevent invalidation, so Chandler's suggestion of overriding it can only prevent invalidation -- it cannot trigger additional invalidation and therefore cannot solve a problem of insufficient invalidation.
-- Sean Silva 

  The primary issue seems to be with query-dependencies. These in turn break down into two categories: 
  1) query-dependencies on "immutable" analysis results. These are results that we *never* expect to be invalidated and we just want easy access to. TargetLibraryInfo is the classic example here. 
  2) query-dependencies on normal analysis results. DominatorTree and and AAResults are the primary examples here. 
  I would like to have a simple way to handle #1 by ensuring invalidation doesn't occur. I think this already works, but I'm interested if there is actually an issue here. 
  We *could* handle #2 much like data-structure-dependencies, but I hear Hal and others that this would become burdensome. However, my preference would be to instead handle #2 by making result APIs accept direct handles to the analysis results they rely on. 
  The reason for preferring this approach is because I think it makes the relationship between these things much more clear to users of the analyses.  
 +1 to this.  Having the query dependencies explicit at the call site would generally make the code much easier to understand and thus much more likely to be correct.  I recently ran across an issue in LVI under the old pass manager that looks highly suspicious, but because the code is quite complex (putting it mildly), I wasn't sure  whether it was correct or not.  Having the additional analyzes passed in explicitly through the query would make many of these cases substantially easier to audit.  

  I think the most annoying of these to handle are aggregation-style analyses results like AAResults. There, I think it might make more sense to handle them along the lines of data-structure-dependencies. I don't think we have so many of those that this would be a significant burden IMO.  
  On Thu, Jul 14, 2016 at 7:48 PM Sean Silva <chisophugis at gmail.com> wrote:

   On Thu, Jul 14, 2016 at 7:21 PM, Hal Finkel <hfinkel at anl.gov> wrote:

  Hi Sean,

 Thanks for writing all of this up. I'll go back to my previous position: we need a general dependency graph built as the analysis cache is used. It should have the following properties:

  1. When we call getResult or getCachedResult on an analysis manager, we record a dependency of the current pass on the returned result.
  2. This dependency needs to be stored such that it can be used to invalidate the current result when the returned result is invalidates and so that the dependency can be deleted when the current result is invalidated.

 As I understand the problem, this is a fully-general solution. I see no reason not to have a fully-general solution.

        Yeah, the mechanics of maintaining this fully general mapping are straightforward in the abstract (once we have a single analysis manager for all IRUnitT's). Essentially, the analysis manager maintains a stack of (Analysis, IRUnit) that it is currently computing and pushes/pops the stack as it (re-)enters/exits get{,Cached}Result. If the stack is not empty (suppose  top of stack is `(AnalysisFoo, IRUnitBar)`), then when we go to push `(AnalysisBaz, IRUnitQux)` we record a dependency that the result cached on key `(AnalysisBaz, IRUnitQux)` must be invalidated if the result cached on key `(AnalysisFoo, IRUnitBar)` is invalidated. 
  The difficult part is what guarantees we provide about things being stale or not and how we invalidate when IRUnit's are  deleted or created. For example, suppose a function pass DCE's a call which causes an SCC Foo of 3 functions to no longer be an SCC. When/how do  analyses cached on Foo get invalidated? And is it valid to query them? One of the expected use cases (I'm told) for CGSCC passes is to propagate function-attribute like things, so these are being potentially queried by that same function pass.       
  -- Sean Silva         

 Thanks again,
 Hal

From: "Sean Silva" <chisophugis at gmail.com>
 To: "Chandler Carruth" <chandlerc at gmail.com>
 Cc: "Xinliang David Li" <davidxl at google.com>, "llvm-dev" <llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>, "Tim Amini Golling" <mehdi.amini at apple.com>, "Hal Finkel" <hfinkel at anl.gov>, "Sanjoy Das" <sanjoy at playingwithpointers.co m>, "Pete Cooper" <peter_cooper at apple.com>
 Sent: Thursday, July 14, 2016 4:11:58 AM
 Subject: Re: [PM] I think that the new PM needs to learn about inter-analysis dependencies...

   On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com> wrote:

   On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com> wrote:

   On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth <chandlerc at gmail.com> wrote:

     On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com> wrote:

   On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth <chandlerc at gmail.com> wrote:

   On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com> wrote:

   On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li <davidxl at google.com> wrote:

 On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth <chandlerc at gmail.com> wrote:

   Yea, this is a nasty problem. 
  One important thing to understand is that this is specific to  analyses which hold references to other analyses. While this isn't unheard of, it isn't as common as it could be. Still,  definitely something we need to address.   

   We can call this type of dependencies (holding references)  hard-dependency. The soft dependency refers to the case where analysis 'A' depends on 'B' during computation, but does not need 'B' once  it is computed. 
  There are actually quite a few examples of hard-dependency case. For  instance LoopAccessInfo, LazyValueInfo etc which hold references to  other analyses. 
  Problem involving hard-dependency is actually easier to detect, as it  is usually a compile time problem. Issues involving soft  dependencies are more subtle and can lead to wrong code gen.    

        Did you mean to say that soft-dependency problems are easier to  detect? At least my intuition is that soft-dependency is easier because there is no risk of dangling pointers to other  analyses.

   The issue is that the fact that there is *any* dependency isn't clear. 
  However, I think the only real problem here are these "hard  dependencies" (I don't really like that term though). For others, only an analysis that is *explicitly* preserved survives. So  I'm not worried about the fact that people have to remember this. 
  The question is how often there are cross-data-structure references.  David mentions a few examples, and I'm sure there are more, but  it isn't clear to me yet whether this is pervasive or occasional.   

 I just did a quick run-through ofPassRegistry.def and this is what I found: 
  Module analyses: 0/5 hold pointers to other analyses CallGraph: No pointers to other analyses. LazyCallGraph: No pointers to other analyses. ProfileSummaryAnalysis: No pointers to other analyses. TargetLibraryAnalysis: No pointers to other analyses.
  VerifierAnalysis: No pointers to other analyses. 
  Module alias analyses: 1/1 keeps pointer to other analysis. GlobalsAA: Result keeps pointer to TLI (this is a function analysis). 
  Function analyses: 9/17 keep pointers to other analysis AAManager: Its Result holds TLI pointer and pointers to individual AA  result objects. AssumptionAnalysis: No pointers to other analyses.
  BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and BPI.
  BranchProbabilityAnalysis: Stores no pointers to other analyses. (uses LoopInfo  to "recalculate" though)
  DominatorTreeAnalysis: Stores no pointers to other analyses.
  PostDominatorTreeAnalysis: Stores no pointers to other analyses. DemandedBitsAnalysis: Stores pointers to AssumptionCacheand DominatorTree
  DominanceFrontierAnalysis: Stores no pointers to other analyses. (uses DominatorTreeAnalysis for "recalculate" though).
  LoopInfo:Uses DominatorTreeAnalysis for "recalculate" but stores no pointers.
  LazyValueAnalysis: Stores pointers to AssumptionCache,TargetLibraryInfo, DominatorTree.
  DependenceAnalysis: Stores pointers to AliasAnalysis,ScalarEvolution, LoopInfo MemoryDependenceAnalysis: Stores pointers to AliasAnalysis,AssumptionCache, TargetLibraryInfo, DominatorTree
  MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree
  RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree, DomFrontier
  ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo,AssumptionCache, DominatorTree, LoopInfo TargetLibraryAnalysis: Has no dependencies
  TargetIRAnalysis: Has no dependencies.

  Function alias analyses: 3/5 keep pointers to other analyses BasicAA: Keeps pointers to TargetLibraryInfo,AssumptionCache, DominatorTree, LoopInfo CFLAA: Keeps pointer to TargetLibraryInfo SCEVAA: Keeps pointer to ScalarEvolution ScopedNoAliasAA: No dependencies
  TypeBasedAA: No dependencies

  Total: 13/28 analyses (~50%) hold pointers to other analyses. Of the 15/28 analyses that don't hold pointers, 12/15 simply have no  dependencies. Only 3/15 (BPI, LoopInfo,DominanceFrontier) have dependencies that are used just for a "recalculate" step that retains no pointers. So I think it is fair to say that analyses which hold pointers to  other analyses is not an exceptional case. In fact, analyses that  use other analyses just for a "recalculate" step seems to be the  exceptional case (only 3/28 or about 10%)     

    Interesting! 
  Most of these look like they hold a pointer to the root analysis as  opposed to detailed objects *inside* the analysis? 
  It might make sense to try to handle this very specific pattern in a  special way of overriding the invalidate routines is too error  prone.... We could try to make this work "automatically" but I'm worried this would be challenging to  get right. Open to suggestions of course. 
  Any other ideas about what would make sense to handle this? 
  Does it make sense to override the invalidate routines now and iterate  from there? I feel like you've done a lot of the research  necessary for this already...   

    I'll keep pushing forward tomorrow with building test-suite successfully  using the new PM for the LTO pipeline (I was doing some unrelated LLD stuff for most of today). It will be interesting to see how many `invalidate`  overrides will be needed to avoid these issues for just the LTO pipeline  on test-suite.    

    I spent the better part of today working on this and will continue tomorrow; this problem seems nastier than I thought. For some reason the LTO pipeline (or something about LTO) seems to hit on these issues much more (I'm talking like 40k lines of ASan error reports from building test-suite with the  LTO pipeline in the new PM; per-TU steps still using the old PM). Some notes: 
  - BasicAA's dependence on domtree and loopinfo in the new PM seems to account for quite a few of the problems. - BasicAA and other stuff are marked (by overriding `invalidate` to return false) to never be invalidated because they are "stateless". However they  still hold pointers and so they do need to be invalidated. - CallGraph uses AssertingVH (PR28400) and so I needed a workaround similar to r274656 in various passes. - D21921 is holding up -- I haven't hit any issues with the core logic of that patch. - AAResults holds handles to various AA result objects. This means it pretty much always needs to be invalidated unless you are sure that none of the  AA's will get invalidated. 

  The existing `invalidate` method doesn't have the right semantics for even an error-prone solution :( We are going to need to make some significant changes to  even get basic sanity I think. Perhaps each analysis can expose a "preserve" static function. E.g. instead of `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`. I'm actually not quite sure that that will even work. Once I have test-suite fully building successfully with the LTO pipeline in the new PM I'll be able to give a more confident answer (esp. w.r.t. the manager for different IRUnitT's). But at this point I'm not confident running *any* pass pipeline in the new PM without at least assertions+ASan.

  We may want to have a proper design discussion around this problem though. 
  Also I'd like to have test-suite working (by hook or by crook) with LTO in the new PM so we can get some numbers on the resident set impact of all this caching; if it is really problematic then we may need to start talking front-and-center about different invalidation policies for keeping this in  check instead of leaving it as something that we will be able to patch later.

  The more I think about it, the more I'm convinced that the real "hard" problem that the new PM is exposing us to is having the ability for any pass to ask for any analysis on any IRUnitT (and any specific IRUnit of that IRUnitT) and have the result stored somewhere and then invalidated. This means that  "getAnalysisUsage" is not just a list of passes, but much more complicated and is essentially a set of arbitrary pairs "(analysis, IRUnit)" (and the associated potential tangle of dependencies between the state cached on these tuples). With the old PM, you essentially are looking  at a problem of scheduling the lifetime of analyses of the same IRUnit intermingled with transformation passes on that same IRUnit, so you  only have the "analysis" part of the tuple above, making things much simpler (and handling dependencies is much simpler too).    

  To clarify, it seems like the current new PM is essentially trying to solve the problem of maintaining/updating a mapping: (Analysis, IRUnit) -> AnalysisResult where the AnalysisResult's can have an arbitrary dependency on an arbitrary set of other AnalysisResult's currently maintained in this mapping. In order to invalidate any AnalysisResult you  need to invalidate all AnalysisResult's that transitively depend on it. Therefore the right-hand side of this mapping needs to be something like `(AnalysisResult, SetOfDependents)`. So the mapping is really `(Analysis, IRUnit) -> (AnalysisResult, SetOfDependents)` Also, this mapping can be updated at any point during the execution of a transformation pass (and various other places) and must stay correct as the IR is changed (more on this below). For example, you might have something like: (DominatorTreeAnalysis, function @foo) -> (DominatorTree for @foo, [(DemandedBitsAnalysis, function @foo)]) (AssumptionAnalysis, function @foo) -> (AssumptionCache for @foo, [(DemandedBitsAnalysis, function @foo)]) (DemandedBitsAnalysis, function @foo) -> (DemandedBits for @foo, []) (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar, [(SomeModuleAnalysis, module TheModule)]) (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz, [(SomeModuleAnalysis, module TheModule)]) (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for TheModule, [(SomeFunctionAnalysis, function @baz)]) (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for @baz, []) 
  So for example, when a transformation pass invalidates `(AssumptionAnalysis, function @bar)`, we need to walk `(SomeModuleAnalysis, module TheModule)` and `(SomeFunctionAnalysis, function @baz)` to invalidate them. 

  Compare this with the old PM (although like I said we have outgrown this model). Essentially you take the previous mapping, and require IRUnit to be a constant at any given point in time. Hence the  mapping is essentially
 Analysis -> AnalysisResult Since this is 1:1 there is no real distinction between the Analysis and the AnalysisResult (and as part of transitioning to the new PM this has had to be untangled). This also makes the dependencies simpler since you just have a set of "what analyses have been run at this point". You just need to run the analyses individually and make sure they are in the right order. Also, getAnalysis just takes the Analysis to get the AnalysisResult which makes it simpler -- you just query which analyses are live. 

  Also, the mapping `(Analysis, IRUnit) -> (AnalysisResult, SetOfDependents)` that the new PM is essentially trying to keep is even more complicated because for e.g. Loop and CGSCC passes the IRUnit itself is an object created by an  analysis and subject to invalidation of that analysis as the IR changes underneath it. 
  And then there is the question of at what points must this mapping be valid (i.e. no stale analysis results, no dangling pointers, etc.) and when the transitive invalidation walking happens. Evidently while a transformation pass is running, things might temporarily be stale; what are the "checkpoints" where the mapping is guaranteed to be valid? At the start of each transformation  pass? At least Chandler's D21464 does not stick to this because the IRUnit's (SCC's) are only updated at the end of running potentially many function transformation passes. I.e. all but the first function transformation pass might observe stale IRUnit's (SCC's). 
  One other thing to note is that soft-dependencies (using David's terminology) don't require this kind of dependency tracking. An analysis result can be cached even though its soft-dependencies are not cached. And invalidation of soft-dependencies does not require invalidating the soft-dependents. Actually, this makes it the terminology "soft" and "hard' quite natural; "hard" requires an edge to track the dependency for invalidation purposes, "soft" does not. 
  This is all quite general. Perhaps too much. We clearly need to go beyond the old PM's model, but we may not need to go to the fully general case. Is there a good middle-ground that meets our needs?  What restrictions would we be willing to live with in order to make it easier? The first one on my list is to not have the IRUnit's themselves depend on analyses. Like  Chandler mentioned on D21921 this has the effect of e.g. preventing caching across the intervening module pass in a case like`module(cgscc(require<foo-cgsc c-analysis>),some-module-pass- that-makes-no-changes,cgscc(so me-cgscc-pass-that-uses-foo-cg scc-analysis))` but that seems like a restriction we can live with. 

  Again, sorry for the braindump. 

  -- Sean Silva   
     We've obviously outgrown this model with examples like LAA, AssumptionCacheTracker, etc. that hack around this in the old PM. We may want to have  a fresh re-examination of what problems we are exactly trying to solve. 
  For me, my main concern now is what changes need to be made in order to feel confident running a pipeline in the new PM without assertions+ASan. 

  Sorry for the long post, just brain-dumping before heading home. 
  -- Sean Silva 

  -- Sean Silva  

 -- 
 Hal Finkel
 Assistant Computational Scientist
 Leadership Computing Facility
 Argonne National Laboratory

 ______________________________ _________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/ mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160809/5f162c37/attachment-0001.html>