[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Thu Jul 14 20:07:43 PDT 2016

----- Original Message -----

> From: "Sean Silva" <chisophugis at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Xinliang David Li" <davidxl at google.com>, "llvm-dev"
> <llvm-dev at lists.llvm.org>, "Davide Italiano"
> <dccitaliano at gmail.com>, "Tim Amini Golling"
> <mehdi.amini at apple.com>, "Sanjoy Das"
> <sanjoy at playingwithpointers.com>, "Pete Cooper"
> <peter_cooper at apple.com>, "Chandler Carruth" <chandlerc at gmail.com>
> Sent: Thursday, July 14, 2016 10:04:37 PM
> Subject: Re: [PM] I think that the new PM needs to learn about
> inter-analysis dependencies...

> On Thu, Jul 14, 2016 at 7:58 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:

> > > From: "Sean Silva" < chisophugis at gmail.com >
> > 
> 
> > > To: "Hal Finkel" < hfinkel at anl.gov >
> > 
> 
> > > Cc: "Xinliang David Li" < davidxl at google.com >, "llvm-dev" <
> > > llvm-dev at lists.llvm.org >, "Davide Italiano" <
> > > dccitaliano at gmail.com
> > > >, "Tim Amini Golling" < mehdi.amini at apple.com >, "Sanjoy Das" <
> > > sanjoy at playingwithpointers.com >, "Pete Cooper" <
> > > peter_cooper at apple.com >, "Chandler Carruth" <
> > > chandlerc at gmail.com
> > > >
> > 
> 
> > > Sent: Thursday, July 14, 2016 9:48:44 PM
> > 
> 
> > > Subject: Re: [PM] I think that the new PM needs to learn about
> > > inter-analysis dependencies...
> > 
> 

> > > On Thu, Jul 14, 2016 at 7:21 PM, Hal Finkel < hfinkel at anl.gov >
> > > wrote:
> > 
> 

> > > > Hi Sean,
> > > 
> > 
> 

> > > > Thanks for writing all of this up. I'll go back to my previous
> > > > position: we need a general dependency graph built as the
> > > > analysis
> > > > cache is used. It should have the following properties:
> > > 
> > 
> 

> > > > 1. When we call getResult or getCachedResult on an analysis
> > > > manager,
> > > > we record a dependency of the current pass on the returned
> > > > result.
> > > 
> > 
> 
> > > > 2. This dependency needs to be stored such that it can be used
> > > > to
> > > > invalidate the current result when the returned result is
> > > > invalidates and so that the dependency can be deleted when the
> > > > current result is invalidated.
> > > 
> > 
> 

> > > > As I understand the problem, this is a fully-general solution.
> > > > I
> > > > see
> > > > no reason not to have a fully-general solution.
> > > 
> > 
> 

> > > Yeah, the mechanics of maintaining this fully general mapping are
> > > straightforward in the abstract (once we have a single analysis
> > > manager for all IRUnitT's). Essentially, the analysis manager
> > > maintains a stack of ( Analysis, IRUnit) that it is currently
> > > computing and pushes/pops the stack as it (re-)enters/exits
> > > get{,Cached}Result. If the stack is not empty (suppose top of
> > > stack
> > > is `(AnalysisFoo, IRUnitBar)`), then when we go to push
> > > `(AnalysisBaz, IRUnitQux)` we record a dependency that the result
> > > cached on key `(AnalysisBaz, IRUnitQux)` must be invalidated if
> > > the
> > > result cached on key `(AnalysisFoo, IRUnitBar)` is invalidated.
> > 
> 

> > > The difficult part is what guarantees we provide about things
> > > being
> > > stale or not and how we invalidate when IRUnit's are deleted or
> > > created.
> > 
> 
> > > For example, suppose a function pass DCE's a call which causes an
> > > SCC
> > > Foo of 3 functions to no longer be an SCC. When/how do analyses
> > > cached on Foo get invalidated? And is it valid to query them? One
> > > of
> > > the expected use cases (I'm told) for CGSCC passes is to
> > > propagate
> > > function-attribute like things, so these are being potentially
> > > queried by that same function pass.
> > 
> 
> > I don't understand this example. Just because the function removes
> > the call edge, theoretically breaking the SCC, will that
> > automatically trigger an update of the CG data structure? I'd think
> > that the CG itself won't be updated until the function pass is done
> > (during which time the old CG will still be a valid
> > over-approximation).
> 
> This is the kind of restriction I was talking about. I.e. the
> information computed by a CGSCC analysis must be such that queries
> to it remain a conservative approximation in the face of some
> specified set of transformations to functions in the SCC (such as
> deleting calls). Presumably function passes would then be required
> to adhere to these limitations (or else do invalidation manually).
This sounds reasonable to me. 

-Hal 

> -- Sean Silva

> > -Hal
> 

> > > -- Sean Silva
> > 
> 

> > > > Thanks again,
> > > 
> > 
> 
> > > > Hal
> > > 
> > 
> 

> > > > > From: "Sean Silva" < chisophugis at gmail.com >
> > > > 
> > > 
> > 
> 
> > > > > To: "Chandler Carruth" < chandlerc at gmail.com >
> > > > 
> > > 
> > 
> 
> > > > > Cc: "Xinliang David Li" < davidxl at google.com >, "llvm-dev" <
> > > > > llvm-dev at lists.llvm.org >, "Davide Italiano" <
> > > > > dccitaliano at gmail.com
> > > > > >, "Tim Amini Golling" < mehdi.amini at apple.com >, "Hal
> > > > > >Finkel"
> > > > > ><
> > > > > hfinkel at anl.gov >, "Sanjoy Das" <
> > > > > sanjoy at playingwithpointers.com
> > > > > >,
> > > > > "Pete Cooper" < peter_cooper at apple.com >
> > > > 
> > > 
> > 
> 
> > > > > Sent: Thursday, July 14, 2016 4:11:58 AM
> > > > 
> > > 
> > 
> 
> > > > > Subject: Re: [PM] I think that the new PM needs to learn
> > > > > about
> > > > > inter-analysis dependencies...
> > > > 
> > > 
> > 
> 

> > > > > On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <
> > > > > chisophugis at gmail.com
> > > > > > wrote:
> > > > 
> > > 
> > 
> 

> > > > > > On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <
> > > > > > chisophugis at gmail.com
> > > > > > >
> > > > > > wrote:
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth <
> > > > > > > chandlerc at gmail.com > wrote:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <
> > > > > > > > chisophugis at gmail.com
> > > > > > > > >
> > > > > > > > wrote:
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth <
> > > > > > > > > chandlerc at gmail.com > wrote:
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <
> > > > > > > > > > chisophugis at gmail.com
> > > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > > On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David
> > > > > > > > > > > Li
> > > > > > > > > > > <
> > > > > > > > > > > davidxl at google.com > wrote:
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > > > On Tue, Jul 12, 2016 at 10:57 PM, Chandler
> > > > > > > > > > > > Carruth
> > > > > > > > > > > > <
> > > > > > > > > > > > chandlerc at gmail.com > wrote:
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > > > > Yea, this is a nasty problem.
> > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > > > > One important thing to understand is that
> > > > > > > > > > > > > this
> > > > > > > > > > > > > is
> > > > > > > > > > > > > specific
> > > > > > > > > > > > > to
> > > > > > > > > > > > > analyses which hold references to other
> > > > > > > > > > > > > analyses.
> > > > > > > > > > > > > While
> > > > > > > > > > > > > this
> > > > > > > > > > > > > isn't
> > > > > > > > > > > > > unheard of, it isn't as common as it could
> > > > > > > > > > > > > be.
> > > > > > > > > > > > > Still,
> > > > > > > > > > > > > definitely
> > > > > > > > > > > > > something we need to address.
> > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > > We can call this type of dependencies (holding
> > > > > > > > > > > > references)
> > > > > > > > > > > > hard-dependency. The soft dependency refers to
> > > > > > > > > > > > the
> > > > > > > > > > > > case
> > > > > > > > > > > > where
> > > > > > > > > > > > analysis 'A' depends on 'B' during computation,
> > > > > > > > > > > > but
> > > > > > > > > > > > does
> > > > > > > > > > > > not
> > > > > > > > > > > > need
> > > > > > > > > > > > 'B' once it is computed.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > > > There are actually quite a few examples of
> > > > > > > > > > > > hard-dependency
> > > > > > > > > > > > case.
> > > > > > > > > > > > For
> > > > > > > > > > > > instance LoopAccessInfo, LazyValueInfo etc
> > > > > > > > > > > > which
> > > > > > > > > > > > hold
> > > > > > > > > > > > references
> > > > > > > > > > > > to
> > > > > > > > > > > > other analyses.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > > > Problem involving hard-dependency is actually
> > > > > > > > > > > > easier
> > > > > > > > > > > > to
> > > > > > > > > > > > detect,
> > > > > > > > > > > > as
> > > > > > > > > > > > it
> > > > > > > > > > > > is usually a compile time problem. Issues
> > > > > > > > > > > > involving
> > > > > > > > > > > > soft
> > > > > > > > > > > > dependencies are more subtle and can lead to
> > > > > > > > > > > > wrong
> > > > > > > > > > > > code
> > > > > > > > > > > > gen.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > Did you mean to say that soft-dependency problems
> > > > > > > > > > > are
> > > > > > > > > > > easier
> > > > > > > > > > > to
> > > > > > > > > > > detect? At least my intuition is that
> > > > > > > > > > > soft-dependency
> > > > > > > > > > > is
> > > > > > > > > > > easier
> > > > > > > > > > > because there is no risk of dangling pointers to
> > > > > > > > > > > other
> > > > > > > > > > > analyses.
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > The issue is that the fact that there is *any*
> > > > > > > > > > dependency
> > > > > > > > > > isn't
> > > > > > > > > > clear.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > However, I think the only real problem here are
> > > > > > > > > > these
> > > > > > > > > > "hard
> > > > > > > > > > dependencies" (I don't really like that term
> > > > > > > > > > though).
> > > > > > > > > > For
> > > > > > > > > > others,
> > > > > > > > > > only an analysis that is *explicitly* preserved
> > > > > > > > > > survives.
> > > > > > > > > > So
> > > > > > > > > > I'm
> > > > > > > > > > not
> > > > > > > > > > worried about the fact that people have to remember
> > > > > > > > > > this.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > > The question is how often there are
> > > > > > > > > > cross-data-structure
> > > > > > > > > > references.
> > > > > > > > > > David mentions a few examples, and I'm sure there
> > > > > > > > > > are
> > > > > > > > > > more,
> > > > > > > > > > but
> > > > > > > > > > it
> > > > > > > > > > isn't clear to me yet whether this is pervasive or
> > > > > > > > > > occasional.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > I just did a quick run-through of PassRegistry.def
> > > > > > > > > and
> > > > > > > > > this
> > > > > > > > > is
> > > > > > > > > what
> > > > > > > > > I
> > > > > > > > > found:
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > Module analyses: 0/5 hold pointers to other analyses
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > CallGraph: No pointers to other analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > LazyCallGraph: No pointers to other analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > ProfileSummaryAnalysis: No pointers to other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > TargetLibraryAnalysis: No pointers to other analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > VerifierAnalysis: No pointers to other analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > Module alias analyses: 1/1 keeps pointer to other
> > > > > > > > > analysis.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > GlobalsAA: Result keeps pointer to TLI (this is a
> > > > > > > > > function
> > > > > > > > > analysis).
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > Function analyses: 9/17 keep pointers to other
> > > > > > > > > analysis
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > AAManager: Its Result holds TLI pointer and pointers
> > > > > > > > > to
> > > > > > > > > individual
> > > > > > > > > AA
> > > > > > > > > result objects.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > AssumptionAnalysis: No pointers to other analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > BlockFrequencyAnalysis: Its Result holds pointers to
> > > > > > > > > LoopInfo
> > > > > > > > > and
> > > > > > > > > BPI.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > BranchProbabilityAnalysis: Stores no pointers to
> > > > > > > > > other
> > > > > > > > > analyses.
> > > > > > > > > (uses LoopInfo to "recalculate" though)
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > DominatorTreeAnalysis: Stores no pointers to other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > PostDominatorTreeAnalysis: Stores no pointers to
> > > > > > > > > other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > DemandedBitsAnalysis: Stores pointers to
> > > > > > > > > AssumptionCache
> > > > > > > > > and
> > > > > > > > > DominatorTree
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > DominanceFrontierAnalysis: Stores no pointers to
> > > > > > > > > other
> > > > > > > > > analyses.
> > > > > > > > > (uses DominatorTreeAnalysis for "recalculate"
> > > > > > > > > though).
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > LoopInfo: Uses DominatorTreeAnalysis for
> > > > > > > > > "recalculate"
> > > > > > > > > but
> > > > > > > > > stores
> > > > > > > > > no
> > > > > > > > > pointers.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > LazyValueAnalysis: Stores pointers to
> > > > > > > > > AssumptionCache,
> > > > > > > > > TargetLibraryInfo, DominatorTree.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > DependenceAnalysis: Stores pointers to AliasAnalysis,
> > > > > > > > > ScalarEvolution, LoopInfo
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > MemoryDependenceAnalysis: Stores pointers to
> > > > > > > > > AliasAnalysis,
> > > > > > > > > AssumptionCache, TargetLibraryInfo, DominatorTree
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > MemorySSAAnalysis: Stores pointers to AliasAnalysis,
> > > > > > > > > DominatorTree
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > RegionInfoAnalysis: Stores pointers to DomTree,
> > > > > > > > > PostDomTree,
> > > > > > > > > DomFrontier
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > ScalarEvolutionAnalysis: Stores pointers to
> > > > > > > > > TargetLibraryInfo,
> > > > > > > > > AssumptionCache, DominatorTree, LoopInfo
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > TargetLibraryAnalysis: Has no dependencies
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > TargetIRAnalysis: Has no dependencies.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > Function alias analyses: 3/5 keep pointers to other
> > > > > > > > > analyses
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > BasicAA: Keeps pointers to TargetLibraryInfo,
> > > > > > > > > AssumptionCache,
> > > > > > > > > DominatorTree, LoopInfo
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > CFLAA: Keeps pointer to TargetLibraryInfo
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > SCEVAA: Keeps pointer to ScalarEvolution
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > ScopedNoAliasAA: No dependencies
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > TypeBasedAA: No dependencies
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > > Total: 13/28 analyses (~50%) hold pointers to other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > Of the 15/28 analyses that don't hold pointers, 12/15
> > > > > > > > > simply
> > > > > > > > > have
> > > > > > > > > no
> > > > > > > > > dependencies. Only 3/15 (BPI, LoopInfo,
> > > > > > > > > DominanceFrontier)
> > > > > > > > > have
> > > > > > > > > dependencies that are used just for a "recalculate"
> > > > > > > > > step
> > > > > > > > > that
> > > > > > > > > retains no pointers.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > So I think it is fair to say that analyses which hold
> > > > > > > > > pointers
> > > > > > > > > to
> > > > > > > > > other analyses is not an exceptional case. In fact,
> > > > > > > > > analyses
> > > > > > > > > that
> > > > > > > > > use other analyses just for a "recalculate" step
> > > > > > > > > seems
> > > > > > > > > to
> > > > > > > > > be
> > > > > > > > > the
> > > > > > > > > exceptional case (only 3/28 or about 10%)
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > Interesting!
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > Most of these look like they hold a pointer to the root
> > > > > > > > analysis
> > > > > > > > as
> > > > > > > > opposed to detailed objects *inside* the analysis?
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > It might make sense to try to handle this very specific
> > > > > > > > pattern
> > > > > > > > in
> > > > > > > > a
> > > > > > > > special way of overriding the invalidate routines is
> > > > > > > > too
> > > > > > > > error
> > > > > > > > prone.... We could try to make this work
> > > > > > > > "automatically"
> > > > > > > > but
> > > > > > > > I'm
> > > > > > > > worried this would be challenging to get right. Open to
> > > > > > > > suggestions
> > > > > > > > of course.
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > Any other ideas about what would make sense to handle
> > > > > > > > this?
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > > Does it make sense to override the invalidate routines
> > > > > > > > now
> > > > > > > > and
> > > > > > > > iterate from there? I feel like you've done a lot of
> > > > > > > > the
> > > > > > > > research
> > > > > > > > necessary for this already...
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > I'll keep pushing forward tomorrow with building
> > > > > > > test-suite
> > > > > > > successfully using the new PM for the LTO pipeline (I was
> > > > > > > doing
> > > > > > > some
> > > > > > > unrelated LLD stuff for most of today). It will be
> > > > > > > interesting
> > > > > > > to
> > > > > > > see how many `invalidate` overrides will be needed to
> > > > > > > avoid
> > > > > > > these
> > > > > > > issues for just the LTO pipeline on test-suite.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > I spent the better part of today working on this and will
> > > > > > continue
> > > > > > tomorrow; this problem seems nastier than I thought. For
> > > > > > some
> > > > > > reason
> > > > > > the LTO pipeline (or something about LTO) seems to hit on
> > > > > > these
> > > > > > issues much more (I'm talking like 40k lines of ASan error
> > > > > > reports
> > > > > > from building test-suite with the LTO pipeline in the new
> > > > > > PM;
> > > > > > per-TU
> > > > > > steps still using the old PM). Some notes:
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > - BasicAA's dependence on domtree and loopinfo in the new
> > > > > > PM
> > > > > > seems
> > > > > > to
> > > > > > account for quite a few of the problems.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - BasicAA and other stuff are marked (by overriding
> > > > > > `invalidate`
> > > > > > to
> > > > > > return false) to never be invalidated because they are
> > > > > > "stateless".
> > > > > > However they still hold pointers and so they do need to be
> > > > > > invalidated.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - CallGraph uses AssertingVH (PR28400) and so I needed a
> > > > > > workaround
> > > > > > similar to r274656 in various passes.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - D21921 is holding up -- I haven't hit any issues with the
> > > > > > core
> > > > > > logic of that patch.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - AAResults holds handles to various AA result objects.
> > > > > > This
> > > > > > means
> > > > > > it
> > > > > > pretty much always needs to be invalidated unless you are
> > > > > > sure
> > > > > > that
> > > > > > none of the AA's will get invalidated.
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > The existing `invalidate` method doesn't have the right
> > > > > > semantics
> > > > > > for
> > > > > > even an error-prone solution :( We are going to need to
> > > > > > make
> > > > > > some
> > > > > > significant changes to even get basic sanity I think.
> > > > > > Perhaps
> > > > > > each
> > > > > > analysis can expose a "preserve" static function. E.g.
> > > > > > instead
> > > > > > of
> > > > > > `PA.preserve<Foo>();` you have to do
> > > > > > `Foo::setPreserved(PA);`.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > I'm actually not quite sure that that will even work. Once
> > > > > > I
> > > > > > have
> > > > > > test-suite fully building successfully with the LTO
> > > > > > pipeline
> > > > > > in
> > > > > > the
> > > > > > new PM I'll be able to give a more confident answer (esp.
> > > > > > w.r.t.
> > > > > > the
> > > > > > manager for different IRUnitT's).
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > But at this point I'm not confident running *any* pass
> > > > > > pipeline
> > > > > > in
> > > > > > the new PM without at least assertions+ASan.
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > We may want to have a proper design discussion around this
> > > > > > problem
> > > > > > though.
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > Also I'd like to have test-suite working (by hook or by
> > > > > > crook)
> > > > > > with
> > > > > > LTO in the new PM so we can get some numbers on the
> > > > > > resident
> > > > > > set
> > > > > > impact of all this caching; if it is really problematic
> > > > > > then
> > > > > > we
> > > > > > may
> > > > > > need to start talking front-and-center about different
> > > > > > invalidation
> > > > > > policies for keeping this in check instead of leaving it as
> > > > > > something that we will be able to patch later.
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > The more I think about it, the more I'm convinced that the
> > > > > > real
> > > > > > "hard" problem that the new PM is exposing us to is having
> > > > > > the
> > > > > > ability for any pass to ask for any analysis on any IRUnitT
> > > > > > (and
> > > > > > any
> > > > > > specific IRUnit of that IRUnitT) and have the result stored
> > > > > > somewhere and then invalidated. This means that
> > > > > > "getAnalysisUsage"
> > > > > > is not just a list of passes, but much more complicated and
> > > > > > is
> > > > > > essentially a set of arbitrary pairs "(analysis, IRUnit)"
> > > > > > (and
> > > > > > the
> > > > > > associated potential tangle of dependencies between the
> > > > > > state
> > > > > > cached
> > > > > > on these tuples). With the old PM, you essentially are
> > > > > > looking
> > > > > > at
> > > > > > a
> > > > > > problem of scheduling the lifetime of analyses of the same
> > > > > > IRUnit
> > > > > > intermingled with transformation passes on that same
> > > > > > IRUnit,
> > > > > > so
> > > > > > you
> > > > > > only have the "analysis" part of the tuple above, making
> > > > > > things
> > > > > > much
> > > > > > simpler (and handling dependencies is much simpler too).
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > To clarify, it seems like the current new PM is essentially
> > > > > trying
> > > > > to
> > > > > solve the problem of maintaining/updating a mapping:
> > > > 
> > > 
> > 
> 
> > > > > (Analysis, IRUnit) -> AnalysisResult
> > > > 
> > > 
> > 
> 
> > > > > where the AnalysisResult's can have an arbitrary dependency
> > > > > on
> > > > > an
> > > > > arbitrary set of other AnalysisResult's currently maintained
> > > > > in
> > > > > this
> > > > > mapping. In order to invalidate any AnalysisResult you need
> > > > > to
> > > > > invalidate all AnalysisResult's that transitively depend on
> > > > > it.
> > > > > Therefore the right-hand side of this mapping needs to be
> > > > > something
> > > > > like `(AnalysisResult, SetOfDependents)`.
> > > > 
> > > 
> > 
> 
> > > > > So the mapping is really `(Analysis, IRUnit) ->
> > > > > (AnalysisResult,
> > > > > SetOfDependents)`
> > > > 
> > > 
> > 
> 
> > > > > Also, this mapping can be updated at any point during the
> > > > > execution
> > > > > of a transformation pass (and various other places) and must
> > > > > stay
> > > > > correct as the IR is changed (more on this below).
> > > > 
> > > 
> > 
> 
> > > > > For example, you might have something like:
> > > > 
> > > 
> > 
> 
> > > > > (DominatorTreeAnalysis, function @foo) -> (DominatorTree for
> > > > > @foo,
> > > > > [(DemandedBitsAnalysis, function @foo)])
> > > > 
> > > 
> > 
> 
> > > > > (AssumptionAnalysis, function @foo) -> (AssumptionCache for
> > > > > @foo,
> > > > > [(DemandedBitsAnalysis, function @foo)])
> > > > 
> > > 
> > 
> 
> > > > > (DemandedBitsAnalysis, function @foo) -> (DemandedBits for
> > > > > @foo,
> > > > > [])
> > > > 
> > > 
> > 
> 
> > > > > (AssumptionAnalysis, function @bar) -> (AssumptionCache for
> > > > > @bar,
> > > > > [(SomeModuleAnalysis, module TheModule)])
> > > > 
> > > 
> > 
> 
> > > > > (AssumptionAnalysis, function @baz) -> (AssumptionCache for
> > > > > @baz,
> > > > > [(SomeModuleAnalysis, module TheModule)])
> > > > 
> > > 
> > 
> 
> > > > > (SomeModuleAnalysis, module TheModule) ->
> > > > > (SomeModuleAnalysisResult
> > > > > for TheModule, [(SomeFunctionAnalysis, function @baz)])
> > > > 
> > > 
> > 
> 
> > > > > (SomeFunctionAnalysis, function @baz) ->
> > > > > (SomeFunctionAnalysisResult
> > > > > for @baz, [])
> > > > 
> > > 
> > 
> 

> > > > > So for example, when a transformation pass invalidates
> > > > > `(AssumptionAnalysis, function @bar)`, we need to walk
> > > > > `(SomeModuleAnalysis, module TheModule)` and
> > > > > `(SomeFunctionAnalysis,
> > > > > function @baz)` to invalidate them.
> > > > 
> > > 
> > 
> 

> > > > > Compare this with the old PM (although like I said we have
> > > > > outgrown
> > > > > this model). Essentially you take the previous mapping, and
> > > > > require
> > > > > IRUnit to be a constant at any given point in time. Hence the
> > > > > mapping is essentially
> > > > 
> > > 
> > 
> 
> > > > > Analysis -> AnalysisResult
> > > > 
> > > 
> > 
> 
> > > > > Since this is 1:1 there is no real distinction between the
> > > > > Analysis
> > > > > and the AnalysisResult (and as part of transitioning to the
> > > > > new
> > > > > PM
> > > > > this has had to be untangled).
> > > > 
> > > 
> > 
> 
> > > > > This also makes the dependencies simpler since you just have
> > > > > a
> > > > > set
> > > > > of
> > > > > "what analyses have been run at this point". You just need to
> > > > > run
> > > > > the analyses individually and make sure they are in the right
> > > > > order.
> > > > > Also, getAnalysis just takes the Analysis to get the
> > > > > AnalysisResult
> > > > > which makes it simpler -- you just query which analyses are
> > > > > live.
> > > > 
> > > 
> > 
> 

> > > > > Also, the mapping `(Analysis, IRUnit) -> (AnalysisResult,
> > > > > SetOfDependents)` that the new PM is essentially trying to
> > > > > keep
> > > > > is
> > > > > even more complicated because for e.g. Loop and CGSCC passes
> > > > > the
> > > > > IRUnit itself is an object created by an analysis and subject
> > > > > to
> > > > > invalidation of that analysis as the IR changes underneath
> > > > > it.
> > > > 
> > > 
> > 
> 

> > > > > And then there is the question of at what points must this
> > > > > mapping
> > > > > be
> > > > > valid (i.e. no stale analysis results, no dangling pointers,
> > > > > etc.)
> > > > > and when the transitive invalidation walking happens.
> > > > > Evidently
> > > > > while a transformation pass is running, things might
> > > > > temporarily
> > > > > be
> > > > > stale; what are the "checkpoints" where the mapping is
> > > > > guaranteed
> > > > > to
> > > > > be valid? At the start of each transformation pass? At least
> > > > > Chandler's D21464 does not stick to this because the IRUnit's
> > > > > (SCC's) are only updated at the end of running potentially
> > > > > many
> > > > > function transformation passes. I.e. all but the first
> > > > > function
> > > > > transformation pass might observe stale IRUnit's (SCC's).
> > > > 
> > > 
> > 
> 

> > > > > One other thing to note is that soft-dependencies (using
> > > > > David's
> > > > > terminology) don't require this kind of dependency tracking.
> > > > > An
> > > > > analysis result can be cached even though its
> > > > > soft-dependencies
> > > > > are
> > > > > not cached. And invalidation of soft-dependencies does not
> > > > > require
> > > > > invalidating the soft-dependents. Actually, this makes it the
> > > > > terminology "soft" and "hard' quite natural; "hard" requires
> > > > > an
> > > > > edge
> > > > > to track the dependency for invalidation purposes, "soft"
> > > > > does
> > > > > not.
> > > > 
> > > 
> > 
> 

> > > > > This is all quite general. Perhaps too much. We clearly need
> > > > > to
> > > > > go
> > > > > beyond the old PM's model, but we may not need to go to the
> > > > > fully
> > > > > general case. Is there a good middle-ground that meets our
> > > > > needs?
> > > > > What restrictions would we be willing to live with in order
> > > > > to
> > > > > make
> > > > > it easier? The first one on my list is to not have the
> > > > > IRUnit's
> > > > > themselves depend on analyses. Like Chandler mentioned on
> > > > > D21921
> > > > > this has the effect of e.g. preventing caching across the
> > > > > intervening module pass in a case like
> > > > > `module(cgscc(require<foo-cgscc-analysis>),some-module-pass-that-makes-no-changes,cgscc(some-cgscc-pass-that-uses-foo-cgscc-analysis))`
> > > > > but that seems like a restriction we can live with.
> > > > 
> > > 
> > 
> 

> > > > > Again, sorry for the braindump.
> > > > 
> > > 
> > 
> 

> > > > > -- Sean Silva
> > > > 
> > > 
> > 
> 

> > > > > > We've obviously outgrown this model with examples like LAA,
> > > > > > AssumptionCacheTracker, etc. that hack around this in the
> > > > > > old
> > > > > > PM.
> > > > > > We
> > > > > > may want to have a fresh re-examination of what problems we
> > > > > > are
> > > > > > exactly trying to solve.
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > For me, my main concern now is what changes need to be made
> > > > > > in
> > > > > > order
> > > > > > to feel confident running a pipeline in the new PM without
> > > > > > assertions+ASan.
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > Sorry for the long post, just brain-dumping before heading
> > > > > > home.
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > -- Sean Silva
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > -- Sean Silva
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > --
> > > 
> > 
> 

> > > > Hal Finkel
> > > 
> > 
> 
> > > > Assistant Computational Scientist
> > > 
> > 
> 
> > > > Leadership Computing Facility
> > > 
> > 
> 
> > > > Argonne National Laboratory
> > > 
> > 
> 

> > --
> 

> > Hal Finkel
> 
> > Assistant Computational Scientist
> 
> > Leadership Computing Facility
> 
> > Argonne National Laboratory
> 

-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160714/b3c7e484/attachment-0001.html>