[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Wed Jul 13 00:39:42 PDT 2016

----- Original Message -----

> From: "Sean Silva" <chisophugis at gmail.com>
> To: "Chandler Carruth" <chandlerc at gmail.com>
> Cc: "Xinliang David Li" <davidxl at google.com>, "llvm-dev"
> <llvm-dev at lists.llvm.org>, "Davide Italiano"
> <dccitaliano at gmail.com>, "Tim Amini Golling"
> <mehdi.amini at apple.com>, "Hal Finkel" <hfinkel at anl.gov>, "Sanjoy
> Das" <sanjoy at playingwithpointers.com>, "Pete Cooper"
> <peter_cooper at apple.com>
> Sent: Wednesday, July 13, 2016 2:25:52 AM
> Subject: Re: [PM] I think that the new PM needs to learn about
> inter-analysis dependencies...

> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth <
> chandlerc at gmail.com > wrote:

> > On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <
> > chisophugis at gmail.com > wrote:
> 

> > > On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li <
> > > davidxl at google.com > wrote:
> > 
> 

> > > > On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth <
> > > > chandlerc at gmail.com > wrote:
> > > 
> > 
> 

> > > > > Yea, this is a nasty problem.
> > > > 
> > > 
> > 
> 

> > > > > One important thing to understand is that this is specific to
> > > > > analyses which hold references to other analyses. While this
> > > > > isn't unheard of, it isn't as common as it could be. Still,
> > > > > definitely something we need to address.
> > > > 
> > > 
> > 
> 
> > > > We can call this type of dependency (holding references) a
> > > > hard dependency. A soft dependency refers to the case where
> > > > analysis 'A' depends on 'B' during computation, but does not
> > > > need 'B' once it is computed.
> > > 
> > 
> 

> > > > There are actually quite a few examples of the hard-dependency
> > > > case, for instance LoopAccessInfo, LazyValueInfo, etc., which
> > > > hold references to other analyses.
> > > 
> > 
> 

> > > > Problems involving hard dependencies are actually easier to
> > > > detect, as they are usually compile-time problems. Issues
> > > > involving soft dependencies are more subtle and can lead to
> > > > wrong code gen.
> > > 
> > 
> 
> > > Did you mean to say that soft-dependency problems are easier to
> > > detect? At least my intuition is that soft-dependency is easier
> > > because there is no risk of dangling pointers to other analyses.
> > 
> 

> > The issue is that the fact that there is *any* dependency isn't
> > clear.
> 

> > However, I think the only real problem here is these "hard
> > dependencies" (I don't really like that term though). For others,
> > only an analysis that is *explicitly* preserved survives. So I'm
> > not worried about the fact that people have to remember this.
> 

> > The question is how often there are cross-data-structure
> > references. David mentions a few examples, and I'm sure there are
> > more, but it isn't clear to me yet whether this is pervasive or
> > occasional.
> 
> I just did a quick run-through of PassRegistry.def and this is what I
> found:

> Module analyses: 0/5 hold pointers to other analyses
> CallGraph: No pointers to other analyses.
> LazyCallGraph: No pointers to other analyses.
> ProfileSummaryAnalysis: No pointers to other analyses.
> TargetLibraryAnalysis: No pointers to other analyses.

> VerifierAnalysis: No pointers to other analyses.

> Module alias analyses: 1/1 keeps pointer to other analysis.
> GlobalsAA: Result keeps pointer to TLI (this is a function analysis).

> Function analyses: 9/17 keep pointers to other analysis
> AAManager: Its Result holds TLI pointer and pointers to individual AA
> result objects.
> AssumptionAnalysis: No pointers to other analyses.

> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and
> BPI.

> BranchProbabilityAnalysis: Stores no pointers to other analyses.
> (uses LoopInfo to "recalculate" though)

> DominatorTreeAnalysis: Stores no pointers to other analyses.

> PostDominatorTreeAnalysis: Stores no pointers to other analyses.
> DemandedBitsAnalysis: Stores pointers to AssumptionCache and
> DominatorTree

> DominanceFrontierAnalysis: Stores no pointers to other analyses.
> (uses DominatorTreeAnalysis for "recalculate" though).

> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores no
> pointers.

> LazyValueAnalysis: Stores pointers to AssumptionCache,
> TargetLibraryInfo, DominatorTree.

> DependenceAnalysis: Stores pointers to AliasAnalysis,
> ScalarEvolution, LoopInfo
> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis,
> AssumptionCache, TargetLibraryInfo, DominatorTree

> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree

> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree,
> DomFrontier

> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo,
> AssumptionCache, DominatorTree, LoopInfo
> TargetLibraryAnalysis: Has no dependencies

> TargetIRAnalysis: Has no dependencies.

> Function alias analyses: 3/5 keep pointers to other analyses
> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache,
> DominatorTree, LoopInfo
> CFLAA: Keeps pointer to TargetLibraryInfo
> SCEVAA: Keeps pointer to ScalarEvolution
> ScopedNoAliasAA: No dependencies

> TypeBasedAA: No dependencies

> Total: 13/28 analyses (~50%) hold pointers to other analyses.
> Of the 15/28 analyses that don't hold pointers, 12/15 simply have no
> dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have
> dependencies that are used just for a "recalculate" step that
> retains no pointers.
> So I think it is fair to say that analyses which hold pointers to
> other analyses are not an exceptional case. In fact, analyses that
> use other analyses just for a "recalculate" step seem to be the
> exceptional case (only 3/28 or about 10%).

Interesting. I'm not sure this is the right metric, however. There are lots of analyses that hold pointers to other analyses but don't need to. The analysis handle itself can be reacquired lazily if we care to do so. What's truly problematic is holding pointers into another analysis's data structures. To be concrete, holding a pointer to ScalarEvolution is not a fundamental problem because we could make the analysis reacquire the pointer at the start of every query. Holding SCEV* is the problem. 
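
To make the distinction concrete, here is a minimal sketch (plain C++, not LLVM code) of a client that stores no pointer to the other analysis and instead asks the manager again on every query. ToyManager, ToyResult, LazyClient and queryAcrossInvalidation are hypothetical names made up for illustration; the real AnalysisManager interface differs.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy stand-ins for illustration only; none of these are LLVM classes.
struct ToyResult {
  int Generation; // which (re)computation produced this result
};

class ToyManager {
  std::map<std::string, std::unique_ptr<ToyResult>> Cache;
  int NextGeneration = 0;

public:
  // Return the cached result, computing it first if it is missing.
  ToyResult &get(const std::string &Name) {
    std::unique_ptr<ToyResult> &Slot = Cache[Name];
    if (!Slot)
      Slot = std::make_unique<ToyResult>(ToyResult{NextGeneration++});
    assert(Slot && "result must exist after computation");
    return *Slot;
  }

  // Drop the cached result. Any raw pointer to it dangles from here on.
  void invalidate(const std::string &Name) { Cache.erase(Name); }
};

// Stores only the manager and a name, and reacquires the dependency at the
// start of each query, so there is no handle that can go stale.
class LazyClient {
  ToyManager &AM;
  std::string DepName;

public:
  LazyClient(ToyManager &AM, std::string DepName)
      : AM(AM), DepName(std::move(DepName)) {}

  int query() { return AM.get(DepName).Generation; }
};

// Query once, invalidate the dependency, query again.
// Returns the generation seen by each query.
std::pair<int, int> queryAcrossInvalidation() {
  ToyManager AM;
  LazyClient Client(AM, "scalar-evolution");
  int First = Client.query();
  AM.invalidate("scalar-evolution");
  int Second = Client.query(); // recomputed, not read through a stale pointer
  return {First, Second};
}
```

Calling queryAcrossInvalidation() from a main function yields two different generations. A client that had cached a ToyResult* (the analogue of holding a SCEV*) would instead be reading freed memory on the second query.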

FWIW, I still think this is common enough that we should design a solution that makes it easy to get this right.

-Hal 

> Since I like to visualize things, here is a quick graph of the
> dependencies between analyses which hold pointers to each other.
> Edge A -> B indicates "A can/does hold a pointer to B".
> dot file: http://reviews.llvm.org/P6603
> rendering: http://reviews.llvm.org/F2161058

> (I've been a bit loose with terminology here. A lot of the times that
> I say "analysis" I mean "the analysis' result object" or I use the
> name of the analysis interchangeably with the analysis result
> object)

> -- Sean Silva

> > And even then it isn't clear how onerous explicitly managing this
> > in invalidate overrides will be.
> 

> > > -- Sean Silva
> > 
> 

> > > > David
> > > 
> > 
> 

> > > > > Some ideas about mitigating and fixing it below.
> > > > 
> > > 
> > 
> 

> > > > > On Tue, Jul 12, 2016 at 6:15 PM Sean Silva <
> > > > > chisophugis at gmail.com > wrote:
> > > > 
> > > 
> > 
> 

> > > > > > How should we solve this? I see two potential solutions:
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > 1. Analyses must somehow list the analyses they depend on
> > > > > > (either by overriding "invalidate" to make sure that they
> > > > > > invalidate them, or something "declarative" that would
> > > > > > allow the AnalysisManager to walk the transitive
> > > > > > dependencies).
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > I think this is the right approach. I would personally start
> > > > > by overriding the invalidate callback everywhere that it is
> > > > > necessary, and see how bad that becomes.
> > > > 
> > > 
> > 
> 

> > > > > If it becomes common and burdensome, then we can change the
> > > > > way invalidation works such that the analysis manager is
> > > > > aware of the preserved analysis set in more detail, and have
> > > > > it build up the necessary data structures to know in advance
> > > > > whether it must make an explicit invalidate call.
> > > > 
> > > 
> > 
> 

> > > > > However, I suspect this may not be *too* bad for two reasons:
> > > > 
> > > 
> > 
> 

> > > > > a) As I mentioned above, I'm hoping there aren't *too* many
> > > > > handles between different analyses. But I've not done a
> > > > > careful examination, so we can check this.
> > > > 
> > > 
> > 
> 

> > > > > b) For many analyses that might trigger this, I think we have
> > > > > a simpler option. If the analysis is *immutable* for any
> > > > > reason -- that is, it overrides its invalidate routine to
> > > > > always return "false" the way TargetLibraryInfo should
> > > > > (although I'm not sure it does currently), we shouldn't need
> > > > > to do this as it shouldn't be getting cleared out. Does this
> > > > > make sense? Do others see anything I'm missing with that
> > > > > approach?
> > > > 
> > > 
> > 
> 

> > > > > > 2. The AnalysisManager must do a somewhat complicated dance
> > > > > > to track when analyses call back into it in order to get
> > > > > > other analyses.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > I would really rather avoid this, as currently the analysis
> > > > > manager's logic here is very simple, and in many cases we
> > > > > only need the analyses to *compute* our result, not to embed
> > > > > it. I'm thinking of cases like Dominators being used to build
> > > > > LoopInfo, where there isn't a stale handle.
> > > > 
> > > 
> > 
> 

> > > > > There is another aspect of course in that if something is
> > > > > preserving LoopInfo, it really should be preserving
> > > > > Dominators too...
> > > > 
> > > 
> > 
> 
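
For reference, option 1 quoted above (a declarative list of held analyses that the manager walks transitively) could look roughly like the following sketch. It is plain C++ rather than LLVM code. DepGraph, surveyedEdges and invalidationClosure are hypothetical names, the edge list is only a subset of the survey quoted earlier in this thread (written with result-object names), and the Immutable parameter is an assumption that models analyses whose invalidate always returns "false".

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Edge A -> B means "A's result holds a pointer to B's result".
using DepGraph = std::map<std::string, std::vector<std::string>>;

// A subset of the edges reported in the survey quoted in this thread.
DepGraph surveyedEdges() {
  return {
      {"ScalarEvolution",
       {"TargetLibraryInfo", "AssumptionCache", "DominatorTree", "LoopInfo"}},
      {"SCEVAA", {"ScalarEvolution"}},
      {"BasicAA",
       {"TargetLibraryInfo", "AssumptionCache", "DominatorTree", "LoopInfo"}},
      {"MemorySSA", {"AliasAnalysis", "DominatorTree"}},
      {"BlockFrequencyInfo", {"LoopInfo", "BranchProbabilityInfo"}},
  };
}

// Compute everything that must be dropped when Root is invalidated: Root
// itself plus every analysis that reaches Root through "holds" edges.
// If Root is immutable it is never cleared, so nothing is dropped.
std::set<std::string>
invalidationClosure(const DepGraph &Holds, const std::string &Root,
                    const std::set<std::string> &Immutable) {
  assert(!Root.empty() && "Root must name an analysis");
  std::set<std::string> Dropped;
  if (Immutable.count(Root))
    return Dropped;

  // Reverse the edges: for each analysis, which analyses hold it?
  std::map<std::string, std::vector<std::string>> HeldBy;
  for (const auto &Entry : Holds)
    for (const std::string &Dep : Entry.second)
      HeldBy[Dep].push_back(Entry.first);

  // Worklist traversal over the reversed edges.
  std::vector<std::string> Worklist{Root};
  while (!Worklist.empty()) {
    std::string Current = Worklist.back();
    Worklist.pop_back();
    if (!Dropped.insert(Current).second)
      continue; // already visited
    auto It = HeldBy.find(Current);
    if (It == HeldBy.end())
      continue; // nothing holds a pointer to Current
    for (const std::string &Holder : It->second)
      Worklist.push_back(Holder);
  }
  return Dropped;
}
```

With these edges, invalidating LoopInfo also drops ScalarEvolution, BasicAA, BlockFrequencyInfo and, through ScalarEvolution, SCEVAA. Marking TargetLibraryInfo as immutable means a request to invalidate it drops nothing, which is the simpler option described in point (b).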
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 