[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 15 03:11:04 PDT 2016


On Thu, Jul 14, 2016 at 8:41 PM, Sean Silva <chisophugis at gmail.com> wrote:

>
>
> On Thu, Jul 14, 2016 at 7:43 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>>
>> ------------------------------
>>
>> *From: *"Mehdi Amini" <mehdi.amini at apple.com>
>> *To: *"Hal Finkel" <hfinkel at anl.gov>
>> *Cc: *"Sean Silva" <chisophugis at gmail.com>, "Xinliang David Li" <
>> davidxl at google.com>, "llvm-dev" <llvm-dev at lists.llvm.org>, "Davide
>> Italiano" <dccitaliano at gmail.com>, "Sanjoy Das" <
>> sanjoy at playingwithpointers.com>, "Pete Cooper" <peter_cooper at apple.com>,
>> "Chandler Carruth" <chandlerc at gmail.com>
>> *Sent: *Thursday, July 14, 2016 9:37:38 PM
>> *Subject: *Re: [PM] I think that the new PM needs to learn about
>> inter-analysis dependencies...
>>
>>
>>
>> Sent from my iPhone
>>
>> On Jul 14, 2016, at 7:21 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>> Hi Sean,
>>
>> Thanks for writing all of this up. I'll go back to my previous position:
>> we need a general dependency graph built as the analysis cache is used.
>>
>>
>> I share the same feeling.
>>
>> It should have the following properties:
>>
>>  1. When we call getResult or getCachedResult on an analysis manager, we
>> record a dependency of the current pass on the returned result.
>>
>>
>> Because of the hard vs soft dependency, I'm not sure if this has to be
>> implicit or if it'd be better made explicit though, i.e. with a separate
>> API call.
>>
>> Can you explain? You still need to invalidate for soft dependencies.
>>
>
> You don't necessarily need to. For example, a pass may manually modify
> LoopInfo to keep it up to date but not update DominatorTree.
> On the other hand, invalidation in the case of soft dependencies is still
> correct, so we may decide to not worry about the hard/soft distinction
> unless there are real compile time savings to be gotten by adding special
> handling for soft dependencies.
> I.e. the PreservedAnalyses set is about what is preserved, not about what
> is invalidated. For a PreservedAnalyses set PA, the set that needs to be
> invalidated is something like:
> union(transitiveHardDeps(x) for x in ~PA) & ~PA
>

The `& ~PA` is spurious here. (I think when I started writing this I was
thinking in terms of transitively following soft deps too or something like
that)
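
To make that concrete, here is a rough sketch of the corrected computation
(stand-in types and names, not the real analysis manager API): seed a worklist
with everything outside the preserved set and chase the hard-dependent edges
transitively.

  #include "llvm/ADT/ArrayRef.h"
  #include "llvm/ADT/DenseMap.h"
  #include "llvm/ADT/SmallPtrSet.h"
  #include "llvm/ADT/SmallVector.h"

  // Hypothetical stand-ins: an opaque analysis identity, and a map from each
  // analysis to the analyses whose cached results hold pointers into it.
  using AnalysisID = const void *;
  using HardDependentsMap =
      llvm::DenseMap<AnalysisID, llvm::SmallVector<AnalysisID, 4>>;

  llvm::SmallPtrSet<AnalysisID, 16>
  computeInvalidationSet(llvm::ArrayRef<AnalysisID> NotPreserved,
                         const HardDependentsMap &HardDependents) {
    llvm::SmallPtrSet<AnalysisID, 16> ToInvalidate;
    llvm::SmallVector<AnalysisID, 16> Worklist(NotPreserved.begin(),
                                               NotPreserved.end());
    while (!Worklist.empty()) {
      AnalysisID ID = Worklist.pop_back_val();
      if (!ToInvalidate.insert(ID).second)
        continue; // Already scheduled for invalidation.
      // Anything whose cached result holds a pointer into ID's result has to
      // be invalidated as well.
      auto It = HardDependents.find(ID);
      if (It != HardDependents.end())
        Worklist.append(It->second.begin(), It->second.end());
    }
    return ToInvalidate;
  }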

Now that I think about it, we will probably want (for debugging /
development) some sort of debug output along the lines of "your
transformation pass marked analysis Foo as preserved, but it was still
invalidated because analysis Bar was not preserved".
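
Something like the following could hang off the walk sketched above
(`Preserved` and `getAnalysisName` are hypothetical here):

  // Hypothetical diagnostic layered on the previous sketch: flag analyses the
  // pass claimed to preserve but that get dropped anyway because some hard
  // dependency was not preserved.
  for (AnalysisID ID : ToInvalidate)
    if (Preserved.count(ID))
      llvm::dbgs() << "analysis '" << getAnalysisName(ID)
                   << "' was marked preserved, but is being invalidated "
                      "because an analysis it depends on was not preserved\n";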

-- Sean Silva


>
> I.e. in response to Mehdi's question, the soft case can be phrased as an
> explicit opt-out of dependency tracking.
>
> -- Sean Silva
>
>
>>
>>
>>  -Hal
>>
>>
>> --
>> Mehdi
>>
>>
>>  2. This dependency needs to be stored such that it can be used to
>> invalidate the current result when the returned result is invalidated and
>> so that the dependency can be deleted when the current result is
>> invalidated.
>>
>> As I understand the problem, this is a fully-general solution. I see no
>> reason not to have a fully-general solution.
>>
>> Thanks again,
>> Hal
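
FWIW, here is roughly how I picture points 1 and 2 fitting together, with
simplified stand-in types rather than the real AnalysisManager: the equivalent
of getResult/getCachedResult records an edge from the queried result to
whichever cached result is being computed on top of it, and invalidation walks
(and erases) those edges.

  #include "llvm/ADT/SmallVector.h"
  #include <map>

  using AnalysisID = const void *; // Hypothetical opaque analysis identity.

  struct ResultNode {
    void *Result = nullptr;                        // The cached analysis result.
    llvm::SmallVector<ResultNode *, 4> Dependents; // Results holding pointers into us.
  };

  class DependencyTrackingManagerSketch {
    // std::map so node addresses stay stable while we hand out pointers.
    std::map<AnalysisID, ResultNode> Results;
    ResultNode *CurrentlyComputing = nullptr; // Set while a result is being built.

  public:
    // Point 1: the getResult/getCachedResult equivalent records the edge for
    // the inter-analysis case (one result queried while another is being
    // computed on top of it).
    ResultNode &getResultNode(AnalysisID ID) {
      ResultNode &N = Results[ID];
      if (CurrentlyComputing && CurrentlyComputing != &N)
        N.Dependents.push_back(CurrentlyComputing);
      return N;
    }

    // Point 2: invalidating a result also invalidates everything that
    // recorded itself as a dependent, and the edges die with the results.
    void invalidateNode(ResultNode &N) {
      if (!N.Result)
        return; // Already invalidated; also guards against any cycle.
      N.Result = nullptr; // Placeholder for destroying the real result object.
      for (ResultNode *Dep : N.Dependents)
        invalidateNode(*Dep);
      N.Dependents.clear();
    }
  };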
>>
>> ------------------------------
>>
>> *From: *"Sean Silva" <chisophugis at gmail.com>
>> *To: *"Chandler Carruth" <chandlerc at gmail.com>
>> *Cc: *"Xinliang David Li" <davidxl at google.com>, "llvm-dev" <
>> llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>,
>> "Tim Amini Golling" <mehdi.amini at apple.com>, "Hal Finkel" <
>> hfinkel at anl.gov>, "Sanjoy Das" <sanjoy at playingwithpointers.com>, "Pete
>> Cooper" <peter_cooper at apple.com>
>> *Sent: *Thursday, July 14, 2016 4:11:58 AM
>> *Subject: *Re: [PM] I think that the new PM needs to learn about
>> inter-analysis dependencies...
>>
>>
>>
>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth <chandlerc at gmail.com
>>>> > wrote:
>>>>
>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth <
>>>>>> chandlerc at gmail.com> wrote:
>>>>>>
>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li <
>>>>>>>> davidxl at google.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth <
>>>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yea, this is a nasty problem.
>>>>>>>>>>
>>>>>>>>>> One important thing to understand is that this is specific to
>>>>>>>>>> analyses which hold references to other analyses. While this isn't unheard
>>>>>>>>>> of, it isn't as common as it could be. Still, definitely something we need
>>>>>>>>>> to address.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We can call this type of dependency (holding references) a
>>>>>>>>> hard-dependency. The soft dependency refers to the case where analysis 'A'
>>>>>>>>> depends on 'B' during computation, but does not need 'B' once it is
>>>>>>>>> computed.
>>>>>>>>>
>>>>>>>>> There are actually quite a few examples of the hard-dependency case.
>>>>>>>>> For instance LoopAccessInfo, LazyValueInfo, etc., which hold references to
>>>>>>>>> other analyses.
>>>>>>>>>
>>>>>>>>> Problems involving hard-dependencies are actually easier to detect, as
>>>>>>>>> they are usually compile time problems. Issues involving soft dependencies
>>>>>>>>> are more subtle and can lead to wrong code gen.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Did you mean to say that soft-dependency problems are easier to
>>>>>>>> detect? At least my intuition is that soft-dependency is easier because
>>>>>>>> there is no risk of dangling pointers to other analyses.
>>>>>>>>
>>>>>>>
>>>>>>> The issue is that the fact that there is *any* dependency isn't
>>>>>>> clear.
>>>>>>>
>>>>>>> However, I think the only real problem here is these "hard
>>>>>>> dependencies" (I don't really like that term though). For others, only an
>>>>>>> analysis that is *explicitly* preserved survives. So I'm not worried about
>>>>>>> the fact that people have to remember this.
>>>>>>>
>>>>>>> The question is how often there are cross-data-structure references.
>>>>>>> David mentions a few examples, and I'm sure there are more, but it isn't
>>>>>>> clear to me yet whether this is pervasive or occasional.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> I just did a quick run-through of PassRegistry.def and this is what I
>>>>>> found:
>>>>>>
>>>>>> Module analyses: 0/5 hold pointers to other analyses
>>>>>> CallGraph: No pointers to other analyses.
>>>>>> LazyCallGraph: No pointers to other analyses.
>>>>>> ProfileSummaryAnalysis: No pointers to other analyses.
>>>>>> TargetLibraryAnalysis: No pointers to other analyses.
>>>>>> VerifierAnalysis: No pointers to other analyses.
>>>>>>
>>>>>> Module alias analyses: 1/1 keeps pointer to other analysis.
>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a function analysis).
>>>>>>
>>>>>> Function analyses: 9/17 keep pointers to other analyses
>>>>>> AAManager: Its Result holds TLI pointer and pointers to individual AA
>>>>>> result objects.
>>>>>> AssumptionAnalysis: No pointers to other analyses.
>>>>>> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and BPI.
>>>>>> BranchProbabilityAnalysis: Stores no pointers to other analyses.
>>>>>> (uses LoopInfo to "recalculate" though)
>>>>>> DominatorTreeAnalysis: Stores no pointers to other analyses.
>>>>>> PostDominatorTreeAnalysis: Stores no pointers to other analyses.
>>>>>> DemandedBitsAnalysis: Stores pointers to AssumptionCache
>>>>>> and DominatorTree
>>>>>> DominanceFrontierAnalysis: Stores no pointers to other analyses.
>>>>>> (uses DominatorTreeAnalysis for "recalculate" though).
>>>>>> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores no
>>>>>> pointers.
>>>>>> LazyValueAnalysis: Stores pointers to AssumptionCache,
>>>>>> TargetLibraryInfo, DominatorTree.
>>>>>> DependenceAnalysis: Stores pointers to AliasAnalysis,
>>>>>> ScalarEvolution, LoopInfo
>>>>>> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis,
>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree
>>>>>> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree
>>>>>> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree,
>>>>>> DomFrontier
>>>>>> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo,
>>>>>> AssumptionCache, DominatorTree, LoopInfo
>>>>>> TargetLibraryAnalysis: Has no dependencies
>>>>>> TargetIRAnalysis: Has no dependencies.
>>>>>>
>>>>>> Function alias analyses: 3/5 keep pointers to other analyses
>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache,
>>>>>> DominatorTree, LoopInfo
>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo
>>>>>> SCEVAA: Keeps pointer to ScalarEvolution
>>>>>> ScopedNoAliasAA: No dependencies
>>>>>> TypeBasedAA: No dependencies
>>>>>>
>>>>>>
>>>>>> Total: 13/28 analyses (~50%) hold pointers to other analyses.
>>>>>> Of the 15/28 analyses that don't hold pointers, 12/15 simply have no
>>>>>> dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have
>>>>>> dependencies that are used just for a "recalculate" step that retains no
>>>>>> pointers.
>>>>>> So I think it is fair to say that analyses which hold pointers to
>>>>>> other analyses are not an exceptional case. In fact, analyses that use other
>>>>>> analyses just for a "recalculate" step seem to be the exceptional case
>>>>>> (only 3/28, or about 10%).
>>>>>>
>>>>>
>>>>> Interesting!
>>>>>
>>>>> Most of these look like they hold a pointer to the root analysis as
>>>>> opposed to detailed objects *inside* the analysis?
>>>>>
>>>>> It might make sense to try to handle this very specific pattern in a
>>>>> special way if overriding the invalidate routines is too error-prone... We
>>>>> could try to make this work "automatically" but I'm worried this would be
>>>>> challenging to get right. Open to suggestions of course.
>>>>>
>>>>> Any other ideas about what would make sense to handle this?
>>>>>
>>>>> Does it make sense to override the invalidate routines now and iterate
>>>>> from there? I feel like you've done a lot of the research necessary for
>>>>> this already...
>>>>>
>>>>
>>>> I'll keep pushing forward tomorrow with building test-suite
>>>> successfully using the new PM for the LTO pipeline (I was doing some
>>>> unrelated LLD stuff for most of today). It will be interesting to see how
>>>> many `invalidate` overrides will be needed to avoid these issues for just
>>>> the LTO pipeline on test-suite.
>>>>
>>>
>>> I spent the better part of today working on this and will continue
>>> tomorrow; this problem seems nastier than I thought. For some reason the
>>> LTO pipeline (or something about LTO) seems to hit on these issues much
>>> more (I'm talking like 40k lines of ASan error reports from building
>>> test-suite with the LTO pipeline in the new PM; per-TU steps still using
>>> the old PM). Some notes:
>>>
>>> - BasicAA's dependence on domtree and loopinfo in the new PM seems to
>>> account for quite a few of the problems.
>>> - BasicAA and other stuff are marked (by overriding `invalidate` to
>>> return false) to never be invalidated because they are "stateless". However
>>> they still hold pointers and so they do need to be invalidated (see the
>>> sketch after this list).
>>> - CallGraph uses AssertingVH (PR28400) and so I needed a workaround
>>> similar to r274656 in various passes.
>>> - D21921 is holding up -- I haven't hit any issues with the core logic
>>> of that patch.
>>> - AAResults holds handles to various AA result objects. This means it
>>> pretty much always needs to be invalidated unless you are sure that none of
>>> the AA's will get invalidated.
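
For reference, the pattern behind that second bullet looks roughly like the
sketch below (illustrative class name, simplified from what the real AA
results look like, and modulo the exact PreservedAnalyses query spelling):

  #include "llvm/Analysis/LoopInfo.h"
  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/PassManager.h"

  using namespace llvm;

  // Illustrative only: a "stateless" result that nonetheless borrows pointers
  // from other analyses. Unconditionally returning false from invalidate() is
  // exactly what leaves these dangling; it at least has to check that the
  // analyses it points into survived.
  class HypotheticalAAResult {
    DominatorTree *DT; // Owned by DominatorTreeAnalysis' cached result.
    LoopInfo *LI;      // Owned by LoopAnalysis' cached result.

  public:
    HypotheticalAAResult(DominatorTree *DT, LoopInfo *LI) : DT(DT), LI(LI) {}

    bool invalidate(Function &, const PreservedAnalyses &PA) {
      // Stay cached only if both owners were explicitly preserved; otherwise
      // the borrowed pointers may be about to dangle.
      return !(PA.preserved<DominatorTreeAnalysis>() &&
               PA.preserved<LoopAnalysis>());
    }
  };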
>>>
>>>
>>> The existing `invalidate` method doesn't have the right semantics for
>>> even an error-prone solution :( We are going to need to make some
>>> significant changes to even get basic sanity I think. Perhaps each analysis
>>> can expose a "preserve" static function. E.g. instead of
>>> `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`.
>>> I'm actually not quite sure that that will even work. Once I have
>>> test-suite fully building successfully with the LTO pipeline in the new PM
>>> I'll be able to give a more confident answer (esp. w.r.t. the manager for
>>> different IRUnitT's).
>>> But at this point I'm not confident running *any* pass pipeline in the
>>> new PM without at least assertions+ASan.
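
To spell out the static-function idea, something along these lines (the helper
name is hypothetical; it uses LVI's hard dependencies from the survey earlier
in the thread) would at least put the knowledge of the hard deps in one place
and catch mistakes at the point of preservation:

  #include "llvm/Analysis/AssumptionCache.h"
  #include "llvm/Analysis/LazyValueInfo.h"
  #include "llvm/Analysis/TargetLibraryInfo.h"
  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/PassManager.h"
  #include <cassert>

  using namespace llvm;

  // Hypothetical shape for a Foo::setPreserved(PA)-style hook. The analysis
  // is the one that knows its result borrows pointers from DT/AC/TLI, so it
  // can refuse to be marked preserved unless those survive too.
  void setLazyValueInfoPreserved(PreservedAnalyses &PA) {
    assert(PA.preserved<DominatorTreeAnalysis>() &&
           PA.preserved<AssumptionAnalysis>() &&
           PA.preserved<TargetLibraryAnalysis>() &&
           "preserving LazyValueAnalysis while invalidating one of its hard "
           "dependencies would leave dangling pointers");
    PA.preserve<LazyValueAnalysis>();
  }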
>>>
>>> We may want to have a proper design discussion around this problem
>>> though.
>>>
>>> Also I'd like to have test-suite working (by hook or by crook) with LTO
>>> in the new PM so we can get some numbers on the resident set impact of all
>>> this caching; if it is really problematic then we may need to start talking
>>> front-and-center about different invalidation policies for keeping this in
>>> check instead of leaving it as something that we will be able to patch
>>> later.
>>>
>>>
>>>
>>> The more I think about it, the more I'm convinced that the real "hard"
>>> problem that the new PM is exposing us to is having the ability for any
>>> pass to ask for any analysis on any IRUnitT (and any specific IRUnit of
>>> that IRUnitT) and have the result stored somewhere and then invalidated.
>>> This means that "getAnalysisUsage" is not just a list of passes, but much
>>> more complicated and is essentially a set of arbitrary pairs "(analysis,
>>> IRUnit)" (and the associated potential tangle of dependencies between the
>>> state cached on these tuples). With the old PM, you essentially are looking
>>> at a problem of scheduling the lifetime of analyses of the same IRUnit
>>> intermingled with transformation passes on that same IRUnit, so you only
>>> have the "analysis" part of the tuple above, making things much simpler
>>> (and handling dependencies is much simpler too).
>>>
>>
>> To clarify, it seems like the current new PM is essentially trying to
>> solve the problem of maintaining/updating a mapping:
>> (Analysis, IRUnit) -> AnalysisResult
>> where the AnalysisResult's can have an arbitrary dependency on an
>> arbitrary set of other AnalysisResult's currently maintained in this
>> mapping. In order to invalidate any AnalysisResult you need to invalidate
>> all AnalysisResult's that transitively depend on it. Therefore the
>> right-hand side of this mapping needs to be something like
>> `(AnalysisResult, SetOfDependents)`.
>> So the mapping is really `(Analysis, IRUnit) -> (AnalysisResult,
>> SetOfDependents)`
>> Also, this mapping can be updated at any point during the execution of a
>> transformation pass (and various other places) and must stay correct as the
>> IR is changed (more on this below).
>> For example, you might have something like:
>> (DominatorTreeAnalysis, function @foo) -> (DominatorTree for @foo,
>> [(DemandedBitsAnalysis, function @foo)])
>> (AssumptionAnalysis, function @foo) -> (AssumptionCache for @foo,
>> [(DemandedBitsAnalysis, function @foo)])
>> (DemandedBitsAnalysis, function @foo) -> (DemandedBits for @foo, [])
>> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar,
>> [(SomeModuleAnalysis, module TheModule)])
>> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz,
>> [(SomeModuleAnalysis, module TheModule)])
>> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for
>> TheModule, [(SomeFunctionAnalysis, function @baz)])
>> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for
>> @baz, [])
>>
>> So for example, when a transformation pass invalidates
>> `(AssumptionAnalysis, function @bar)`, we need to walk
>> `(SomeModuleAnalysis, module TheModule)` and `(SomeFunctionAnalysis,
>> function @baz)` to invalidate them.
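
In (pseudo-)code form, with purely stand-in types since none of this matches
the real analysis manager:

  #include <map>
  #include <memory>
  #include <set>
  #include <string>
  #include <utility>

  // Stand-in keys: in reality these would be analysis IDs and IRUnit handles,
  // not strings; strings just keep the example close to the prose above.
  using MapKey = std::pair<std::string /*Analysis*/, std::string /*IRUnit*/>;

  struct CachedResult {
    std::shared_ptr<void> Result; // The analysis result itself.
    std::set<MapKey> Dependents;  // Entries whose results hold pointers into us.
  };

  struct AnalysisCacheSketch {
    std::map<MapKey, CachedResult> Map;

    // Invalidating one (Analysis, IRUnit) entry walks the dependent edges
    // transitively, exactly like the (AssumptionAnalysis, @bar) example.
    void invalidate(const MapKey &K) {
      auto It = Map.find(K);
      if (It == Map.end())
        return;
      // Detach the dependents first so a (hypothetical) cycle cannot loop.
      std::set<MapKey> Deps = std::move(It->second.Dependents);
      Map.erase(It);
      for (const MapKey &D : Deps)
        invalidate(D);
    }
  };

With the entries above, invalidate({"AssumptionAnalysis", "function @bar"})
would also end up erasing the (SomeModuleAnalysis, module TheModule) and
(SomeFunctionAnalysis, function @baz) entries.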
>>
>>
>> Compare this with the old PM (although like I said we have outgrown this
>> model). Essentially you take the previous mapping, and require IRUnit to be
>> a constant at any given point in time. Hence the mapping is essentially
>> Analysis -> AnalysisResult
>> Since this is 1:1 there is no real distinction between the Analysis and
>> the AnalysisResult (and as part of transitioning to the new PM this has had
>> to be untangled).
>> This also makes the dependencies simpler since you just have a set of
>> "what analyses have been run at this point". You just need to run the
>> analyses individually and make sure they are in the right order. Also,
>> getAnalysis just takes the Analysis to get the AnalysisResult which makes
>> it simpler -- you just query which analyses are live.
>>
>>
>> Also, the mapping `(Analysis, IRUnit) -> (AnalysisResult,
>> SetOfDependents)` that the new PM is essentially trying to keep is even
>> more complicated because for e.g. Loop and CGSCC passes the IRUnit itself
>> is an object created by an analysis and subject to invalidation of that
>> analysis as the IR changes underneath it.
>>
>> And then there is the question of at what points must this mapping be
>> valid (i.e. no stale analysis results, no dangling pointers, etc.) and when
>> the transitive invalidation walking happens. Evidently while a
>> transformation pass is running, things might temporarily be stale; what are
>> the "checkpoints" where the mapping is guaranteed to be valid? At the start
>> of each transformation pass? At least Chandler's D21464 does not stick to
>> this because the IRUnit's (SCC's) are only updated at the end of running
>> potentially many function transformation passes. I.e. all but the first
>> function transformation pass might observe stale IRUnit's (SCC's).
>>
>> One other thing to note is that soft-dependencies (using David's
>> terminology) don't require this kind of dependency tracking. An analysis
>> result can be cached even though its soft-dependencies are not cached. And
>> invalidation of soft-dependencies does not require invalidating the
>> soft-dependents. Actually, this makes the terminology "soft" and "hard"
>> quite natural; "hard" requires an edge to track the dependency for
>> invalidation purposes, "soft" does not.
>>
>> This is all quite general. Perhaps too much. We clearly need to go beyond
>> the old PM's model, but we may not need to go to the fully general case. Is
>> there a good middle-ground that meets our needs? What restrictions would we
>> be willing to live with in order to make it easier? The first one on my
>> list is to not have the IRUnit's themselves depend on analyses. Like
>> Chandler mentioned on D21921 this has the effect of e.g. preventing caching
>> across the intervening module pass in a case like
>> `module(cgscc(require<foo-cgscc-analysis>),some-module-pass-that-makes-no-changes,cgscc(some-cgscc-pass-that-uses-foo-cgscc-analysis))`
>> but that seems like a restriction we can live with.
>>
>>
>> Again, sorry for the braindump.
>>
>>
>> -- Sean Silva
>>
>>
>>> We've obviously outgrown this model with examples like LAA,
>>> AssumptionCacheTracker, etc. that hack around this in the old PM. We may
>>> want to have a fresh re-examination of what problems we are exactly trying
>>> to solve.
>>>
>>> For me, my main concern now is what changes need to be made in order to
>>> feel confident running a pipeline in the new PM without assertions+ASan.
>>>
>>>
>>> Sorry for the long post, just brain-dumping before heading home.
>>>
>>> -- Sean Silva
>>>
>>>
>>>
>>>
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>>
>
>