[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Thu Jul 14 20:30:38 PDT 2016

On Thu, Jul 14, 2016 at 8:04 PM, Chandler Carruth <chandlerc at gmail.com>
wrote:

> We need better terminology to talk about this. I propose:
>
> analysis-dependencies: analysis A uses result of analysis B when *running*
> an analysis and not used by the result
> query-dependencies: result of analysis A uses result of analysis B when
> evaluating a query
> data-structure-depnedencies: result of analysis A uses data structures
> from the result of analysis B inside its own data structures
>
> I think these are much more precise than "hard" or "soft" and expose more
> facets.
>
> For analysis-depnedencies, I continue to think they work correctly. If a
> transformation claims it preserves an analysis, it must actually know this
> to be true. I don't see any actual problems here in practice today, and
> this isn't actually something changed in the new PM.
>
> For data-structure-dependencies, the investigation done by Sean seems to
> clearly show these to be rare, and I think having their invalidate methods
> be overridden to test that *all* of the data structures they depend on are
> preserved is the correct approach.
>

This is not enough for basic sanity, such as being able to pass an
arbitrary pass pipeline to opt and know it isn't going to have a
use-after-free bug due to analysis caching problems. Also presumably this
checking would only be enabled with assertions.
By itself, this leaves us in a position where assertions+ASan is needed to
have basic confidence when running a pipeline. That is not a world I want
to live in.

>
> The primary issue seems to be with query-dependencies. These in turn break
> down into two categories:
>
> 1) query-dependencies on "immutable" analysis results. These are results
> that we *never* expect to be invalidated and we just want easy access to.
> TargetLibraryInfo is the classic example here.
>
> 2) query-dependencies on normal analysis results. DominatorTree and and
> AAResults are the primary examples here.
>
> I would like to have a simple way to handle #1 by ensuring invalidation
> doesn't occur. I think this already works, but I'm interested if there is
> actually an issue here.
>

I believe this works as intended. Unfortunately I can't really see any use
of the `invalidate` callback beyond this w.r.t. the problems we are
discussion in this thread (I say this because I tried to fix the issues I
was running into by using it).

-- Sean Silva

>
> We *could* handle #2 much like data-structure-dependencies, but I hear Hal
> and others that this would become burdensome. However, my preference would
> be to instead handle #2 by making result APIs accept direct handles to the
> analysis results they rely on.
>
> The reason for preferring this approach is because I think it makes the
> relationship between these things much more clear to users of the analyses.
>
> I think the most annoying of these to handle are aggregation-style
> analyses results like AAResults. There, I think it might make more sense to
> handle them along the lines of data-structure-dependencies. I don't think
> we have so many of those that this would be a significant burden IMO.
>
> On Thu, Jul 14, 2016 at 7:48 PM Sean Silva <chisophugis at gmail.com> wrote:
>
>> On Thu, Jul 14, 2016 at 7:21 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>> Hi Sean,
>>>
>>> Thanks for writing all of this up. I'll go back to my previous position:
>>> we need a general dependency graph built as the analysis cache is used. It
>>> should have the following properties:
>>>
>>>  1. When we call getResult or getCachedResult on an analysis manager, we
>>> record a dependency of the current pass on the returned result.
>>>  2. This dependency needs to be stored such that it can be used to
>>> invalidate the current result when the returned result is invalidates and
>>> so that the dependency can be deleted when the current result is
>>> invalidated.
>>>
>>> As I understand the problem, this is a fully-general solution. I see no
>>> reason not to have a fully-general solution.
>>>
>>
>> Yeah, the mechanics of maintaining this fully general mapping are
>> straightforward in the abstract (once we have a single analysis manager for
>> all IRUnitT's). Essentially, the analysis manager maintains a stack of (Analysis,
>> IRUnit) that it is currently computing and pushes/pops the stack as it
>> (re-)enters/exits get{,Cached}Result. If the stack is not empty (suppose
>> top of stack is `(AnalysisFoo, IRUnitBar)`), then when we go to push
>> `(AnalysisBaz, IRUnitQux)` we record a dependency that the result cached on
>> key `(AnalysisBaz, IRUnitQux)` must be invalidated if the result cached on
>> key `(AnalysisFoo, IRUnitBar)` is invalidated.
>>
>> The difficult part is what guarantees we provide about things being stale
>> or not and how we invalidate when IRUnit's are deleted or created.
>> For example, suppose a function pass DCE's a call which causes an SCC Foo
>> of 3 functions to no longer be an SCC. When/how do analyses cached on Foo
>> get invalidated? And is it valid to query them? One of the expected use
>> cases (I'm told) for CGSCC passes is to propagate function-attribute like
>> things, so these are being potentially queried by that same function pass.
>>
>> -- Sean Silva
>>
>>
>>>
>>> Thanks again,
>>> Hal
>>>
>>> ------------------------------
>>>
>>> *From: *"Sean Silva" <chisophugis at gmail.com>
>>> *To: *"Chandler Carruth" <chandlerc at gmail.com>
>>> *Cc: *"Xinliang David Li" <davidxl at google.com>, "llvm-dev" <
>>> llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>,
>>> "Tim Amini Golling" <mehdi.amini at apple.com>, "Hal Finkel" <
>>> hfinkel at anl.gov>, "Sanjoy Das" <sanjoy at playingwithpointers.com>, "Pete
>>> Cooper" <peter_cooper at apple.com>
>>> *Sent: *Thursday, July 14, 2016 4:11:58 AM
>>> *Subject: *Re: [PM] I think that the new PM needs to learn about
>>> inter-analysis dependencies...
>>>
>>>
>>>
>>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth <
>>>>> chandlerc at gmail.com> wrote:
>>>>>
>>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth <
>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>
>>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li <
>>>>>>>>> davidxl at google.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth <
>>>>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yea, this is a nasty problem.
>>>>>>>>>>>
>>>>>>>>>>> One important thing to understand is that this is specific to
>>>>>>>>>>> analyses which hold references to other analyses. While this isn't unheard
>>>>>>>>>>> of, it isn't as common as it could be. Still, definitely something we need
>>>>>>>>>>> to address.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We can call this type of dependencies (holding references)
>>>>>>>>>> hard-dependency. The soft dependency refers to the case where analysis 'A'
>>>>>>>>>> depends on 'B' during computation, but does not need 'B' once it is
>>>>>>>>>> computed.
>>>>>>>>>>
>>>>>>>>>> There are actually quite a few examples of hard-dependency case.
>>>>>>>>>> For instance LoopAccessInfo, LazyValueInfo etc which hold references to
>>>>>>>>>> other analyses.
>>>>>>>>>>
>>>>>>>>>> Problem involving hard-dependency is actually easier to detect,
>>>>>>>>>> as it is usually a compile time problem. Issues involving soft dependencies
>>>>>>>>>> are more subtle and can lead to wrong code gen.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Did you mean to say that soft-dependency problems are easier to
>>>>>>>>> detect? At least my intuition is that soft-dependency is easier because
>>>>>>>>> there is no risk of dangling pointers to other analyses.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The issue is that the fact that there is *any* dependency isn't
>>>>>>>> clear.
>>>>>>>>
>>>>>>>> However, I think the only real problem here are these "hard
>>>>>>>> dependencies" (I don't really like that term though). For others, only an
>>>>>>>> analysis that is *explicitly* preserved survives. So I'm not worried about
>>>>>>>> the fact that people have to remember this.
>>>>>>>>
>>>>>>>> The question is how often there are cross-data-structure
>>>>>>>> references. David mentions a few examples, and I'm sure there are more, but
>>>>>>>> it isn't clear to me yet whether this is pervasive or occasional.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I just did a quick run-through of PassRegistry.def and this is what
>>>>>>> I found:
>>>>>>>
>>>>>>> Module analyses: 0/5 hold pointers to other analyses
>>>>>>> CallGraph: No pointers to other analyses.
>>>>>>> LazyCallGraph: No pointers to other analyses.
>>>>>>> ProfileSummaryAnalysis: No pointers to other analyses.
>>>>>>> TargetLibraryAnalysis: No pointers to other analyses.
>>>>>>> VerifierAnalysis: No pointers to other analyses.
>>>>>>>
>>>>>>> Module alias analyses: 1/1 keeps pointer to other analysis.
>>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a function analysis).
>>>>>>>
>>>>>>> Function analyses: 9/17 keep pointers to other analysis
>>>>>>> AAManager: Its Result holds TLI pointer and pointers to individual
>>>>>>> AA result objects.
>>>>>>> AssumptionAnalysis: No pointers to other analyses.
>>>>>>> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and
>>>>>>> BPI.
>>>>>>> BranchProbabilityAnalysis: Stores no pointers to other analyses.
>>>>>>> (uses LoopInfo to "recalculate" though)
>>>>>>> DominatorTreeAnalysis: Stores no pointers to other analyses.
>>>>>>> PostDominatorTreeAnalysis: Stores no pointers to other analyses.
>>>>>>> DemandedBitsAnalysis: Stores pointers to AssumptionCache
>>>>>>> and DominatorTree
>>>>>>> DominanceFrontierAnalysis: Stores no pointers to other analyses.
>>>>>>> (uses DominatorTreeAnalysis for "recalculate" though).
>>>>>>> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores no
>>>>>>> pointers.
>>>>>>> LazyValueAnalysis: Stores pointers to AssumptionCache,
>>>>>>> TargetLibraryInfo, DominatorTree.
>>>>>>> DependenceAnalysis: Stores pointers to AliasAnalysis,
>>>>>>> ScalarEvolution, LoopInfo
>>>>>>> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis,
>>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree
>>>>>>> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree
>>>>>>> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree,
>>>>>>> DomFrontier
>>>>>>> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo,
>>>>>>> AssumptionCache, DominatorTree, LoopInfo
>>>>>>> TargetLibraryAnalysis: Has no dependencies
>>>>>>> TargetIRAnalysis: Has no dependencies.
>>>>>>>
>>>>>>> Function alias analyses: 3/5 keep pointers to other analyses
>>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache,
>>>>>>> DominatorTree, LoopInfo
>>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo
>>>>>>> SCEVAA: Keeps pointer to ScalarEvolution
>>>>>>> ScopedNoAliasAA: No dependencies
>>>>>>> TypeBasedAA: No dependencies
>>>>>>>
>>>>>>>
>>>>>>> Total: 13/28 analyses (~50%) hold pointers to other analyses.
>>>>>>> Of the 15/28 analyses that don't hold pointers, 12/15 simply have no
>>>>>>> dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have
>>>>>>> dependencies that are used just for a "recalculate" step that retains no
>>>>>>> pointers.
>>>>>>> So I think it is fair to say that analyses which hold pointers to
>>>>>>> other analyses is not an exceptional case. In fact, analyses that use other
>>>>>>> analyses just for a "recalculate" step seems to be the exceptional case
>>>>>>> (only 3/28 or about 10%)
>>>>>>>
>>>>>>
>>>>>> Interesting!
>>>>>>
>>>>>> Most of these look like they hold a pointer to the root analysis as
>>>>>> opposed to detailed objects *inside* the analysis?
>>>>>>
>>>>>> It might make sense to try to handle this very specific pattern in a
>>>>>> special way of overriding the invalidate routines is too error prone.... We
>>>>>> could try to make this work "automatically" but I'm worried this would be
>>>>>> challenging to get right. Open to suggestions of course.
>>>>>>
>>>>>> Any other ideas about what would make sense to handle this?
>>>>>>
>>>>>> Does it make sense to override the invalidate routines now and
>>>>>> iterate from there? I feel like you've done a lot of the research necessary
>>>>>> for this already...
>>>>>>
>>>>>
>>>>> I'll keep pushing forward tomorrow with building test-suite
>>>>> successfully using the new PM for the LTO pipeline (I was doing some
>>>>> unrelated LLD stuff for most of today). It will be interesting to see how
>>>>> many `invalidate` overrides will be needed to avoid these issues for just
>>>>> the LTO pipeline on test-suite.
>>>>>
>>>>
>>>> I spent the better part of today working on this and will continue
>>>> tomorrow; this problem seems nastier than I thought. For some reason the
>>>> LTO pipeline (or something about LTO) seems to hit on these issues much
>>>> more (I'm talking like 40k lines of ASan error reports from building
>>>> test-suite with the LTO pipeline in the new PM; per-TU steps still using
>>>> the old PM). Some notes:
>>>>
>>>> - BasicAA's dependence on domtree and loopinfo in the new PM seems to
>>>> account for quite a few of the problems.
>>>> - BasicAA and other stuff are marked (by overriding `invalidate` to
>>>> return false) to never be invalidated because they are "stateless". However
>>>> they still hold pointers and so they do need to be invalidated.
>>>> - CallGraph uses AssertingVH (PR28400) and so I needed a workaround
>>>> similar to r274656 in various passes.
>>>> - D21921 is holding up -- I haven't hit any issues with the core logic
>>>> of that patch.
>>>> - AAResults holds handles to various AA result objects. This means it
>>>> pretty much always needs to be invalidated unless you are sure that none of
>>>> the AA's will get invalidated.
>>>>
>>>>
>>>> The existing `invalidate` method doesn't have the right semantics for
>>>> even an error-prone solution :( We are going to need to make some
>>>> significant changes to even get basic sanity I think. Perhaps each analysis
>>>> can expose a "preserve" static function. E.g. instead of
>>>> `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`.
>>>> I'm actually not quite sure that that will even work. Once I have
>>>> test-suite fully building successfully with the LTO pipeline in the new PM
>>>> I'll be able to give a more confident answer (esp. w.r.t. the manager for
>>>> different IRUnitT's).
>>>> But at this point I'm not confident running *any* pass pipeline in the
>>>> new PM without at least assertions+ASan.
>>>>
>>>> We may want to have a proper design discussion around this problem
>>>> though.
>>>>
>>>> Also I'd like to have test-suite working (by hook or by crook) with LTO
>>>> in the new PM so we can get some numbers on the resident set impact of all
>>>> this caching; if it is really problematic then we may need to start talking
>>>> front-and-center about different invalidation policies for keeping this in
>>>> check instead of leaving it as something that we will be able to patch
>>>> later.
>>>>
>>>>
>>>>
>>>> The more I think about it, the more I'm convinced that the real "hard"
>>>> problem that the new PM is exposing us to is having the ability for any
>>>> pass to ask for any analysis on any IRUnitT (and any specific IRUnit of
>>>> that IRUnitT) and have the result stored somewhere and then invalidated.
>>>> This means that "getAnalysisUsage" is not just a list of passes, but much
>>>> more complicated and is essentially a set of arbitrary pairs "(analysis,
>>>> IRUnit)" (and the associated potential tangle of dependencies between the
>>>> state cached on these tuples). With the old PM, you essentially are looking
>>>> at a problem of scheduling the lifetime of analyses of the same IRUnit
>>>> intermingled with transformation passes on that same IRUnit, so you only
>>>> have the "analysis" part of the tuple above, making things much simpler
>>>> (and handling dependencies is much simpler too).
>>>>
>>>
>>> To clarify, it seems like the current new PM is essentially trying to
>>> solve the problem of maintaining/updating a mapping:
>>> (Analysis, IRUnit) -> AnalysisResult
>>> where the AnalysisResult's can have an arbitrary dependency on an
>>> arbitrary set of other AnalysisResult's currently maintained in this
>>> mapping. In order to invalidate any AnalysisResult you need to invalidate
>>> all AnalysisResult's that transitively depend on it. Therefore the
>>> right-hand side of this mapping needs to be something like
>>> `(AnalysisResult, SetOfDependents)`.
>>> So the mapping is really `(Analysis, IRUnit) -> (AnalysisResult,
>>> SetOfDependents)`
>>> Also, this mapping can be updated at any point during the execution of a
>>> transformation pass (and various other places) and must stay correct as the
>>> IR is changed (more on this below).
>>> For example, you might have something like:
>>> (DominatorTreeAnalysis, function @foo) -> (DominatorTree for @foo,
>>> [(DemandedBitsAnalysis, function @foo)])
>>> (AssumptionAnalysis, function @foo) -> (AssumptionCache for @foo,
>>> [(DemandedBitsAnalysis, function @foo)])
>>> (DemandedBitsAnalysis, function @foo) -> (DemandedBits for @foo, [])
>>> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar,
>>> [(SomeModuleAnalysis, module TheModule)])
>>> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz,
>>> [(SomeModuleAnalysis, module TheModule)])
>>> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for
>>> TheModule, [(SomeFunctionAnalysis, function @baz)])
>>> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for
>>> @baz, [])
>>>
>>> So for example, when a transformation pass invalidates
>>> `(AssumptionAnalysis, function @bar)`, we need to walk
>>> `(SomeModuleAnalysis, module TheModule)` and `(SomeFunctionAnalysis,
>>> function @baz)` to invalidate them.
>>>
>>>
>>> Compare this with the old PM (although like I said we have outgrown this
>>> model). Essentially you take the previous mapping, and require IRUnit to be
>>> a constant at any given point in time. Hence the mapping is essentially
>>> Analysis -> AnalysisResult
>>> Since this is 1:1 there is no real distinction between the Analysis and
>>> the AnalysisResult (and as part of transitioning to the new PM this has had
>>> to be untangled).
>>> This also makes the dependencies simpler since you just have a set of
>>> "what analyses have been run at this point". You just need to run the
>>> analyses individually and make sure they are in the right order. Also,
>>> getAnalysis just takes the Analysis to get the AnalysisResult which makes
>>> it simpler -- you just query which analyses are live.
>>>
>>>
>>> Also, the mapping `(Analysis, IRUnit) -> (AnalysisResult,
>>> SetOfDependents)` that the new PM is essentially trying to keep is even
>>> more complicated because for e.g. Loop and CGSCC passes the IRUnit itself
>>> is an object created by an analysis and subject to invalidation of that
>>> analysis as the IR changes underneath it.
>>>
>>> And then there is the question of at what points must this mapping be
>>> valid (i.e. no stale analysis results, no dangling pointers, etc.) and when
>>> the transitive invalidation walking happens. Evidently while a
>>> transformation pass is running, things might temporarily be stale; what are
>>> the "checkpoints" where the mapping is guaranteed to be valid? At the start
>>> of each transformation pass? At least Chandler's D21464 does not stick to
>>> this because the IRUnit's (SCC's) are only updated at the end of running
>>> potentially many function transformation passes. I.e. all but the first
>>> function transformation pass might observe stale IRUnit's (SCC's).
>>>
>>> One other thing to note is that soft-dependencies (using David's
>>> terminology) don't require this kind of dependency tracking. An analysis
>>> result can be cached even though its soft-dependencies are not cached. And
>>> invalidation of soft-dependencies does not require invalidating the
>>> soft-dependents. Actually, this makes it the terminology "soft" and "hard'
>>> quite natural; "hard" requires an edge to track the dependency for
>>> invalidation purposes, "soft" does not.
>>>
>>> This is all quite general. Perhaps too much. We clearly need to go
>>> beyond the old PM's model, but we may not need to go to the fully general
>>> case. Is there a good middle-ground that meets our needs? What restrictions
>>> would we be willing to live with in order to make it easier? The first one
>>> on my list is to not have the IRUnit's themselves depend on analyses. Like
>>> Chandler mentioned on D21921 this has the effect of e.g. preventing caching
>>> across the intervening module pass in a case like
>>> `module(cgscc(require<foo-cgscc-analysis>),some-module-pass-that-makes-no-changes,cgscc(some-cgscc-pass-that-uses-foo-cgscc-analysis))`
>>> but that seems like a restriction we can live with.
>>>
>>>
>>> Again, sorry for the braindump.
>>>
>>>
>>> -- Sean Silva
>>>
>>>
>>>> We've obviously outgrown this model with examples like LAA,
>>>> AssumptionCacheTracker, etc. that hack around this in the old PM. We may
>>>> want to have a fresh re-examination of what problems we are exactly trying
>>>> to solve.
>>>>
>>>> For me, my main concern now is what changes need to be made in order to
>>>> feel confident running a pipeline in the new PM without assertions+ASan.
>>>>
>>>>
>>>> Sorry for the long post, just brain-dumping before heading home.
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160714/60d371af/attachment.html>