[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Mon Jul 25 12:47:19 PDT 2016

On Mon, Jul 25, 2016 at 9:27 AM, Hal Finkel <hfinkel at anl.gov> wrote:

>
> ------------------------------
>
> *From: *"Sean Silva" <chisophugis at gmail.com>
> *To: *"Chandler Carruth" <chandlerc at gmail.com>
> *Cc: *"Xinliang David Li" <davidxl at google.com>, "llvm-dev" <
> llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>, "Tim
> Amini Golling" <mehdi.amini at apple.com>, "Hal Finkel" <hfinkel at anl.gov>,
> "Sanjoy Das" <sanjoy at playingwithpointers.com>, "Pete Cooper" <
> peter_cooper at apple.com>
> *Sent: *Thursday, July 21, 2016 3:59:28 AM
> *Subject: *Re: [PM] I think that the new PM needs to learn about
> inter-analysis dependencies...
>
> We did some basic sanity checking that memory usage didn't go out of
> control (it doesn't; at least with with a simple
> preserves-all/preserves-none invalidation scheme and the current LTO
> pipeline). Also, I did some basic sanity checking for compile time. The
> simple preserves-all/preserves-none invalidation scheme seems marginally
> slower, but no real conclusions (besides a simple sanity check) can be
> drawn without the real analysis preservation semantics in place.
>
> I'll start working on fixing the analysis managers. There seem to
> basically be two parts (although they may need to be done simultaneously to
> make sure all the pieces fit together):
> - unify all the analysis managers into a single analysis manager for all
> IRUnitT's (requires type-erasing the IRUnit)
> - introduce the dependency tracking machinery
>
> I think I gave a reasonable outline in the two posts above:
> - the one starting with "To clarify, it seems like the current new PM is
> essentially trying to solve the problem of maintaining/updating a mapping"
> - the one starting with "Yeah, the mechanics of maintaining this fully
> general mapping are straightforward in the abstract"
>
> I'm happy to do a centralized writeup if anybody wants. Just let me know.
>
> As far as changes to the code, the updates to the new PM passes should
> hopefully be mechanical (despite there being many of them). However, from
> what I can tell, fixing this problem will require touching most lines of
> the analysis manager machinery so the diff will probably be a bit scary,
> even though I think we can keep the same basic structure (i.e. a per-IRUnit
> std::list holding one analysis result (at a stable address) per element,
> combined with a DenseMap from (Analysis, IRUnit) to an element of the
> per-IRUnit storage list (see AnalysisResultListMapT and AnalysisResultMapT
> in include/llvm/IR/PassManager.h)).
> The main changes to the analysis manager will be:
> - type-erasing the IRUnit
> - the elements of the AnalysisResultListMapT will need to keep track of
> any dependents
> - the analysis manager will need to update those dependencies as it is
> re-entered by analyses getting results of other analyses
> - the analysis manager will need to walk the dependencies to do transitive
> invalidation
>
> I think the most robust approach is for analysis dependencies to be
> implicitly constructed by the analysis manager via tracking entry/exit from
> get{,Cached}Result.
> One alternative is for analyses to explicitly pass in their ID to
> getResult to indicate the dependency, but that seems error-prone (and also
> verbose). The issue is that we will need a getResult API that does not
> track dependencies for use by transformation passes (since there is no
> dependency to track in that case); so an innocuous copy-paste from a
> transform pass to an analysis can result in a failure to track dependencies
> and risk of use-after-free (we could fight this with the type system, but I
> think that would get a bit verbose (I'm willing to try it though if people
> would prefer))
> One restriction of the implicit tracking approach is that it requires all
> calls into the analysis manager to occur in the `run` method of the
> analysis (so that the dependencies are implicitly tracked via re-entrance
> to get{,Cached}Result); is this a reasonable restriction?
>
> What's the potential use case for getting an analysis outside of something
> in, or called by, run()?
>

The main thing I had in mind was that if an analysis happens to stash a
pointer directly to the analysis manager, then it may call into the
analysis manager in the query path. For example, you would have this
situation if a module analysis has a query path that wants to lazily
compute a function analysis (i.e. compute the function analysis outside of
the `run` call).

>
>
> One annoying problem is that I think that the dependency links will need
> to be bidirectional. To use the example analysis cache from my other post:
> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar,
> [(SomeModuleAnalysis, module TheModule)])
> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz,
> [(SomeModuleAnalysis, module TheModule)])
> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for
> TheModule, [(SomeFunctionAnalysis, function @baz)])
> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for
> @baz, [])
>
> if we delete function @baz, then the dependent list  [(SomeFunctionAnalysis,
> function @baz)] for `(SomeModuleAnalysis, module TheModule)` will now
> have a stale pointer to function @baz. I think that in practice we can
> check to see if `(SomeFunctionAnalysis, function @baz)` is in our hash
> table and if it isn't then just ignore the dependency as "this dependent
> analysis result has already been freed". In the worst case (memory
> allocator reuses the memory for another function) we may spuriously free an
> analysis result for a different function. However it is still unsettling
> (and may actually be UB in C++?).
> Ideally we would track bidirectional links; that way when we remove an
> analysis result we also have it remove itself from dependence lists of all
> of its dependencies.
>
> I think that bidirectional tracking is preferable. I don't see how to
> avoid UB otherwise, and spuriously freeing things based on allocator
> address reuse will likely lead to non-deterministic behavior (e.g. because
> a re-run analysis might be more accurate than a preserved one).
>

Yeah, it doesn't seem onerous to maintain the bidirectional tracking (I
have a design mocked up in my log). The problem actually ends up being
similar to use/user tracking we have in the IR data structures (although
it's different enough that there isn't really any code to share).

-- Sean Silva

>
> Thanks again,
> Hal
>
> -- Sean Silva
>
> On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>>
>> On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>> It looks like there is really no sane fix within the current
>>> infrastructure. I've had to essentially trigger invalidation (except in the
>>> PreservedAnalyses::all() case) in the function pass manager and function to
>>> loop adapters.
>>>
>>
>> invalidation of *everything* I mean.
>>
>> -- Sean Silva
>>
>>
>>>
>>> So we basically need to get the analysis manager dependency tracking
>>> fixed.
>>>
>>> Davide and I will get measurements on the resident set impact of all
>>> this caching (even with the overconservative invalidation for now) to see
>>> the impact. If there is a big rss impact then we probably want to consider
>>> that problem at the same time as the rewrite of the analysis manager.
>>>
>>> -- Sean Silva
>>>
>>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth <
>>>>> chandlerc at gmail.com> wrote:
>>>>>
>>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth <
>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>
>>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li <
>>>>>>>>> davidxl at google.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth <
>>>>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yea, this is a nasty problem.
>>>>>>>>>>>
>>>>>>>>>>> One important thing to understand is that this is specific to
>>>>>>>>>>> analyses which hold references to other analyses. While this isn't unheard
>>>>>>>>>>> of, it isn't as common as it could be. Still, definitely something we need
>>>>>>>>>>> to address.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We can call this type of dependencies (holding references)
>>>>>>>>>> hard-dependency. The soft dependency refers to the case where analysis 'A'
>>>>>>>>>> depends on 'B' during computation, but does not need 'B' once it is
>>>>>>>>>> computed.
>>>>>>>>>>
>>>>>>>>>> There are actually quite a few examples of hard-dependency case.
>>>>>>>>>> For instance LoopAccessInfo, LazyValueInfo etc which hold references to
>>>>>>>>>> other analyses.
>>>>>>>>>>
>>>>>>>>>> Problem involving hard-dependency is actually easier to detect,
>>>>>>>>>> as it is usually a compile time problem. Issues involving soft dependencies
>>>>>>>>>> are more subtle and can lead to wrong code gen.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Did you mean to say that soft-dependency problems are easier to
>>>>>>>>> detect? At least my intuition is that soft-dependency is easier because
>>>>>>>>> there is no risk of dangling pointers to other analyses.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The issue is that the fact that there is *any* dependency isn't
>>>>>>>> clear.
>>>>>>>>
>>>>>>>> However, I think the only real problem here are these "hard
>>>>>>>> dependencies" (I don't really like that term though). For others, only an
>>>>>>>> analysis that is *explicitly* preserved survives. So I'm not worried about
>>>>>>>> the fact that people have to remember this.
>>>>>>>>
>>>>>>>> The question is how often there are cross-data-structure
>>>>>>>> references. David mentions a few examples, and I'm sure there are more, but
>>>>>>>> it isn't clear to me yet whether this is pervasive or occasional.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I just did a quick run-through of PassRegistry.def and this is what
>>>>>>> I found:
>>>>>>>
>>>>>>> Module analyses: 0/5 hold pointers to other analyses
>>>>>>> CallGraph: No pointers to other analyses.
>>>>>>> LazyCallGraph: No pointers to other analyses.
>>>>>>> ProfileSummaryAnalysis: No pointers to other analyses.
>>>>>>> TargetLibraryAnalysis: No pointers to other analyses.
>>>>>>> VerifierAnalysis: No pointers to other analyses.
>>>>>>>
>>>>>>> Module alias analyses: 1/1 keeps pointer to other analysis.
>>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a function analysis).
>>>>>>>
>>>>>>> Function analyses: 9/17 keep pointers to other analysis
>>>>>>> AAManager: Its Result holds TLI pointer and pointers to individual
>>>>>>> AA result objects.
>>>>>>> AssumptionAnalysis: No pointers to other analyses.
>>>>>>> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and
>>>>>>> BPI.
>>>>>>> BranchProbabilityAnalysis: Stores no pointers to other analyses.
>>>>>>> (uses LoopInfo to "recalculate" though)
>>>>>>> DominatorTreeAnalysis: Stores no pointers to other analyses.
>>>>>>> PostDominatorTreeAnalysis: Stores no pointers to other analyses.
>>>>>>> DemandedBitsAnalysis: Stores pointers to AssumptionCache
>>>>>>> and DominatorTree
>>>>>>> DominanceFrontierAnalysis: Stores no pointers to other analyses.
>>>>>>> (uses DominatorTreeAnalysis for "recalculate" though).
>>>>>>> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores no
>>>>>>> pointers.
>>>>>>> LazyValueAnalysis: Stores pointers to AssumptionCache,
>>>>>>> TargetLibraryInfo, DominatorTree.
>>>>>>> DependenceAnalysis: Stores pointers to AliasAnalysis,
>>>>>>> ScalarEvolution, LoopInfo
>>>>>>> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis,
>>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree
>>>>>>> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree
>>>>>>> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree,
>>>>>>> DomFrontier
>>>>>>> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo,
>>>>>>> AssumptionCache, DominatorTree, LoopInfo
>>>>>>> TargetLibraryAnalysis: Has no dependencies
>>>>>>> TargetIRAnalysis: Has no dependencies.
>>>>>>>
>>>>>>> Function alias analyses: 3/5 keep pointers to other analyses
>>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache,
>>>>>>> DominatorTree, LoopInfo
>>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo
>>>>>>> SCEVAA: Keeps pointer to ScalarEvolution
>>>>>>> ScopedNoAliasAA: No dependencies
>>>>>>> TypeBasedAA: No dependencies
>>>>>>>
>>>>>>>
>>>>>>> Total: 13/28 analyses (~50%) hold pointers to other analyses.
>>>>>>> Of the 15/28 analyses that don't hold pointers, 12/15 simply have no
>>>>>>> dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have
>>>>>>> dependencies that are used just for a "recalculate" step that retains no
>>>>>>> pointers.
>>>>>>> So I think it is fair to say that analyses which hold pointers to
>>>>>>> other analyses is not an exceptional case. In fact, analyses that use other
>>>>>>> analyses just for a "recalculate" step seems to be the exceptional case
>>>>>>> (only 3/28 or about 10%)
>>>>>>>
>>>>>>
>>>>>> Interesting!
>>>>>>
>>>>>> Most of these look like they hold a pointer to the root analysis as
>>>>>> opposed to detailed objects *inside* the analysis?
>>>>>>
>>>>>> It might make sense to try to handle this very specific pattern in a
>>>>>> special way of overriding the invalidate routines is too error prone.... We
>>>>>> could try to make this work "automatically" but I'm worried this would be
>>>>>> challenging to get right. Open to suggestions of course.
>>>>>>
>>>>>> Any other ideas about what would make sense to handle this?
>>>>>>
>>>>>> Does it make sense to override the invalidate routines now and
>>>>>> iterate from there? I feel like you've done a lot of the research necessary
>>>>>> for this already...
>>>>>>
>>>>>
>>>>> I'll keep pushing forward tomorrow with building test-suite
>>>>> successfully using the new PM for the LTO pipeline (I was doing some
>>>>> unrelated LLD stuff for most of today). It will be interesting to see how
>>>>> many `invalidate` overrides will be needed to avoid these issues for just
>>>>> the LTO pipeline on test-suite.
>>>>>
>>>>
>>>> I spent the better part of today working on this and will continue
>>>> tomorrow; this problem seems nastier than I thought. For some reason the
>>>> LTO pipeline (or something about LTO) seems to hit on these issues much
>>>> more (I'm talking like 40k lines of ASan error reports from building
>>>> test-suite with the LTO pipeline in the new PM; per-TU steps still using
>>>> the old PM). Some notes:
>>>>
>>>> - BasicAA's dependence on domtree and loopinfo in the new PM seems to
>>>> account for quite a few of the problems.
>>>> - BasicAA and other stuff are marked (by overriding `invalidate` to
>>>> return false) to never be invalidated because they are "stateless". However
>>>> they still hold pointers and so they do need to be invalidated.
>>>> - CallGraph uses AssertingVH (PR28400) and so I needed a workaround
>>>> similar to r274656 in various passes.
>>>> - D21921 is holding up -- I haven't hit any issues with the core logic
>>>> of that patch.
>>>> - AAResults holds handles to various AA result objects. This means it
>>>> pretty much always needs to be invalidated unless you are sure that none of
>>>> the AA's will get invalidated.
>>>>
>>>>
>>>> The existing `invalidate` method doesn't have the right semantics for
>>>> even an error-prone solution :( We are going to need to make some
>>>> significant changes to even get basic sanity I think. Perhaps each analysis
>>>> can expose a "preserve" static function. E.g. instead of
>>>> `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`.
>>>> I'm actually not quite sure that that will even work. Once I have
>>>> test-suite fully building successfully with the LTO pipeline in the new PM
>>>> I'll be able to give a more confident answer (esp. w.r.t. the manager for
>>>> different IRUnitT's).
>>>> But at this point I'm not confident running *any* pass pipeline in the
>>>> new PM without at least assertions+ASan.
>>>>
>>>> We may want to have a proper design discussion around this problem
>>>> though.
>>>>
>>>> Also I'd like to have test-suite working (by hook or by crook) with LTO
>>>> in the new PM so we can get some numbers on the resident set impact of all
>>>> this caching; if it is really problematic then we may need to start talking
>>>> front-and-center about different invalidation policies for keeping this in
>>>> check instead of leaving it as something that we will be able to patch
>>>> later.
>>>>
>>>>
>>>>
>>>> The more I think about it, the more I'm convinced that the real "hard"
>>>> problem that the new PM is exposing us to is having the ability for any
>>>> pass to ask for any analysis on any IRUnitT (and any specific IRUnit of
>>>> that IRUnitT) and have the result stored somewhere and then invalidated.
>>>> This means that "getAnalysisUsage" is not just a list of passes, but much
>>>> more complicated and is essentially a set of arbitrary pairs "(analysis,
>>>> IRUnit)" (and the associated potential tangle of dependencies between the
>>>> state cached on these tuples). With the old PM, you essentially are looking
>>>> at a problem of scheduling the lifetime of analyses of the same IRUnit
>>>> intermingled with transformation passes on that same IRUnit, so you only
>>>> have the "analysis" part of the tuple above, making things much simpler
>>>> (and handling dependencies is much simpler too). We've obviously outgrown
>>>> this model with examples like LAA, AssumptionCacheTracker, etc. that hack
>>>> around this in the old PM. We may want to have a fresh re-examination of
>>>> what problems we are exactly trying to solve.
>>>>
>>>> For me, my main concern now is what changes need to be made in order to
>>>> feel confident running a pipeline in the new PM without assertions+ASan.
>>>>
>>>>
>>>> Sorry for the long post, just brain-dumping before heading home.
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160725/ddd6e413/attachment.html>