[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Mon Jul 25 13:17:16 PDT 2016

On Mon, Jul 25, 2016 at 1:03 PM, Hal Finkel <hfinkel at anl.gov> wrote:

>
> ------------------------------
>
> *From: *"Sean Silva" <chisophugis at gmail.com>
> *To: *"Hal Finkel" <hfinkel at anl.gov>
> *Cc: *"Xinliang David Li" <davidxl at google.com>, "llvm-dev" <
> llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>, "Tim
> Amini Golling" <mehdi.amini at apple.com>, "Sanjoy Das" <
> sanjoy at playingwithpointers.com>, "Pete Cooper" <peter_cooper at apple.com>,
> "Chandler Carruth" <chandlerc at gmail.com>
> *Sent: *Monday, July 25, 2016 2:47:19 PM
>
> *Subject: *Re: [PM] I think that the new PM needs to learn about
> inter-analysis dependencies...
>
>
>
> On Mon, Jul 25, 2016 at 9:27 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>>
>> ------------------------------
>>
>> *From: *"Sean Silva" <chisophugis at gmail.com>
>> *To: *"Chandler Carruth" <chandlerc at gmail.com>
>> *Cc: *"Xinliang David Li" <davidxl at google.com>, "llvm-dev" <
>> llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>,
>> "Tim Amini Golling" <mehdi.amini at apple.com>, "Hal Finkel" <
>> hfinkel at anl.gov>, "Sanjoy Das" <sanjoy at playingwithpointers.com>, "Pete
>> Cooper" <peter_cooper at apple.com>
>> *Sent: *Thursday, July 21, 2016 3:59:28 AM
>> *Subject: *Re: [PM] I think that the new PM needs to learn about
>> inter-analysis dependencies...
>>
>> We did some basic sanity checking that memory usage didn't go out of
>> control (it doesn't; at least with with a simple
>> preserves-all/preserves-none invalidation scheme and the current LTO
>> pipeline). Also, I did some basic sanity checking for compile time. The
>> simple preserves-all/preserves-none invalidation scheme seems marginally
>> slower, but no real conclusions (besides a simple sanity check) can be
>> drawn without the real analysis preservation semantics in place.
>>
>> I'll start working on fixing the analysis managers. There seem to
>> basically be two parts (although they may need to be done simultaneously to
>> make sure all the pieces fit together):
>> - unify all the analysis managers into a single analysis manager for all
>> IRUnitT's (requires type-erasing the IRUnit)
>> - introduce the dependency tracking machinery
>>
>> I think I gave a reasonable outline in the two posts above:
>> - the one starting with "To clarify, it seems like the current new PM is
>> essentially trying to solve the problem of maintaining/updating a mapping"
>> - the one starting with "Yeah, the mechanics of maintaining this fully
>> general mapping are straightforward in the abstract"
>>
>> I'm happy to do a centralized writeup if anybody wants. Just let me know.
>>
>> As far as changes to the code, the updates to the new PM passes should
>> hopefully be mechanical (despite there being many of them). However, from
>> what I can tell, fixing this problem will require touching most lines of
>> the analysis manager machinery so the diff will probably be a bit scary,
>> even though I think we can keep the same basic structure (i.e. a per-IRUnit
>> std::list holding one analysis result (at a stable address) per element,
>> combined with a DenseMap from (Analysis, IRUnit) to an element of the
>> per-IRUnit storage list (see AnalysisResultListMapT and AnalysisResultMapT
>> in include/llvm/IR/PassManager.h)).
>> The main changes to the analysis manager will be:
>> - type-erasing the IRUnit
>> - the elements of the AnalysisResultListMapT will need to keep track of
>> any dependents
>> - the analysis manager will need to update those dependencies as it is
>> re-entered by analyses getting results of other analyses
>> - the analysis manager will need to walk the dependencies to do
>> transitive invalidation
>>
>> I think the most robust approach is for analysis dependencies to be
>> implicitly constructed by the analysis manager via tracking entry/exit from
>> get{,Cached}Result.
>> One alternative is for analyses to explicitly pass in their ID to
>> getResult to indicate the dependency, but that seems error-prone (and also
>> verbose). The issue is that we will need a getResult API that does not
>> track dependencies for use by transformation passes (since there is no
>> dependency to track in that case); so an innocuous copy-paste from a
>> transform pass to an analysis can result in a failure to track dependencies
>> and risk of use-after-free (we could fight this with the type system, but I
>> think that would get a bit verbose (I'm willing to try it though if people
>> would prefer))
>> One restriction of the implicit tracking approach is that it requires all
>> calls into the analysis manager to occur in the `run` method of the
>> analysis (so that the dependencies are implicitly tracked via re-entrance
>> to get{,Cached}Result); is this a reasonable restriction?
>>
>> What's the potential use case for getting an analysis outside of
>> something in, or called by, run()?
>>
>
> The main thing I had in mind was that if an analysis happens to stash a
> pointer directly to the analysis manager, then it may call into the
> analysis manager in the query path. For example, you would have this
> situation if a module analysis has a query path that wants to lazily
> compute a function analysis (i.e. compute the function analysis outside of
> the `run` call).
>
> Yes, I'm pretty sure that we want to support that (a module analysis
> lazily getting function-level analysis results on the functions in the
> module). To put it another way, we definitely want a module transformation
> to be able to do that, and if an analysis can't, that limits our ability to
> properly organize/reuse code.
>
> We might have a slightly different interface for that, however, if it
> can't be implicit in that case.
>

Thankfully, the solution is straightforward enough. We just need a query
API that takes an explicit dependency.

As a strawman design that we may want to move to long-term, we could make
the FooAnalysis's `run` method take an AnalysisManagerRef<FooAnalysis>,
which wraps the raw analysis manager (all of whose methods take an explicit
dependency) and automatically passes in the dependency on FooAnalysis.
Analysis would be free to hold the AnalysisManagerRef and query it at any
time. It also removes the need for the analysis manager to track when it is
re-entered.
This is probably not worth doing until we have more experience with this
analysis management stuff (e.g. actually use it in production).

-- Sean Silva

>
>
>  -Hal
>
>
>
>>
>>
>> One annoying problem is that I think that the dependency links will need
>> to be bidirectional. To use the example analysis cache from my other post:
>> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar,
>> [(SomeModuleAnalysis, module TheModule)])
>> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz,
>> [(SomeModuleAnalysis, module TheModule)])
>> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for
>> TheModule, [(SomeFunctionAnalysis, function @baz)])
>> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for
>> @baz, [])
>>
>> if we delete function @baz, then the dependent list  [(SomeFunctionAnalysis,
>> function @baz)] for `(SomeModuleAnalysis, module TheModule)` will now
>> have a stale pointer to function @baz. I think that in practice we can
>> check to see if `(SomeFunctionAnalysis, function @baz)` is in our hash
>> table and if it isn't then just ignore the dependency as "this dependent
>> analysis result has already been freed". In the worst case (memory
>> allocator reuses the memory for another function) we may spuriously free an
>> analysis result for a different function. However it is still unsettling
>> (and may actually be UB in C++?).
>> Ideally we would track bidirectional links; that way when we remove an
>> analysis result we also have it remove itself from dependence lists of all
>> of its dependencies.
>>
>> I think that bidirectional tracking is preferable. I don't see how to
>> avoid UB otherwise, and spuriously freeing things based on allocator
>> address reuse will likely lead to non-deterministic behavior (e.g. because
>> a re-run analysis might be more accurate than a preserved one).
>>
>
> Yeah, it doesn't seem onerous to maintain the bidirectional tracking (I
> have a design mocked up in my log). The problem actually ends up being
> similar to use/user tracking we have in the IR data structures (although
> it's different enough that there isn't really any code to share).
>
> -- Sean Silva
>
>
>>
>> Thanks again,
>> Hal
>>
>> -- Sean Silva
>>
>> On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva <chisophugis at gmail.com>
>>> wrote:
>>>
>>>> It looks like there is really no sane fix within the current
>>>> infrastructure. I've had to essentially trigger invalidation (except in the
>>>> PreservedAnalyses::all() case) in the function pass manager and function to
>>>> loop adapters.
>>>>
>>>
>>> invalidation of *everything* I mean.
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>> So we basically need to get the analysis manager dependency tracking
>>>> fixed.
>>>>
>>>> Davide and I will get measurements on the resident set impact of all
>>>> this caching (even with the overconservative invalidation for now) to see
>>>> the impact. If there is a big rss impact then we probably want to consider
>>>> that problem at the same time as the rewrite of the analysis manager.
>>>>
>>>> -- Sean Silva
>>>>
>>>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth <
>>>>>> chandlerc at gmail.com> wrote:
>>>>>>
>>>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth <
>>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li <
>>>>>>>>>> davidxl at google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth <
>>>>>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yea, this is a nasty problem.
>>>>>>>>>>>>
>>>>>>>>>>>> One important thing to understand is that this is specific to
>>>>>>>>>>>> analyses which hold references to other analyses. While this isn't unheard
>>>>>>>>>>>> of, it isn't as common as it could be. Still, definitely something we need
>>>>>>>>>>>> to address.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> We can call this type of dependencies (holding references)
>>>>>>>>>>> hard-dependency. The soft dependency refers to the case where analysis 'A'
>>>>>>>>>>> depends on 'B' during computation, but does not need 'B' once it is
>>>>>>>>>>> computed.
>>>>>>>>>>>
>>>>>>>>>>> There are actually quite a few examples of hard-dependency case.
>>>>>>>>>>> For instance LoopAccessInfo, LazyValueInfo etc which hold references to
>>>>>>>>>>> other analyses.
>>>>>>>>>>>
>>>>>>>>>>> Problem involving hard-dependency is actually easier to detect,
>>>>>>>>>>> as it is usually a compile time problem. Issues involving soft dependencies
>>>>>>>>>>> are more subtle and can lead to wrong code gen.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Did you mean to say that soft-dependency problems are easier to
>>>>>>>>>> detect? At least my intuition is that soft-dependency is easier because
>>>>>>>>>> there is no risk of dangling pointers to other analyses.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The issue is that the fact that there is *any* dependency isn't
>>>>>>>>> clear.
>>>>>>>>>
>>>>>>>>> However, I think the only real problem here are these "hard
>>>>>>>>> dependencies" (I don't really like that term though). For others, only an
>>>>>>>>> analysis that is *explicitly* preserved survives. So I'm not worried about
>>>>>>>>> the fact that people have to remember this.
>>>>>>>>>
>>>>>>>>> The question is how often there are cross-data-structure
>>>>>>>>> references. David mentions a few examples, and I'm sure there are more, but
>>>>>>>>> it isn't clear to me yet whether this is pervasive or occasional.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I just did a quick run-through of PassRegistry.def and this is what
>>>>>>>> I found:
>>>>>>>>
>>>>>>>> Module analyses: 0/5 hold pointers to other analyses
>>>>>>>> CallGraph: No pointers to other analyses.
>>>>>>>> LazyCallGraph: No pointers to other analyses.
>>>>>>>> ProfileSummaryAnalysis: No pointers to other analyses.
>>>>>>>> TargetLibraryAnalysis: No pointers to other analyses.
>>>>>>>> VerifierAnalysis: No pointers to other analyses.
>>>>>>>>
>>>>>>>> Module alias analyses: 1/1 keeps pointer to other analysis.
>>>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a function
>>>>>>>> analysis).
>>>>>>>>
>>>>>>>> Function analyses: 9/17 keep pointers to other analysis
>>>>>>>> AAManager: Its Result holds TLI pointer and pointers to individual
>>>>>>>> AA result objects.
>>>>>>>> AssumptionAnalysis: No pointers to other analyses.
>>>>>>>> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and
>>>>>>>> BPI.
>>>>>>>> BranchProbabilityAnalysis: Stores no pointers to other analyses.
>>>>>>>> (uses LoopInfo to "recalculate" though)
>>>>>>>> DominatorTreeAnalysis: Stores no pointers to other analyses.
>>>>>>>> PostDominatorTreeAnalysis: Stores no pointers to other analyses.
>>>>>>>> DemandedBitsAnalysis: Stores pointers to AssumptionCache
>>>>>>>> and DominatorTree
>>>>>>>> DominanceFrontierAnalysis: Stores no pointers to other analyses.
>>>>>>>> (uses DominatorTreeAnalysis for "recalculate" though).
>>>>>>>> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores
>>>>>>>> no pointers.
>>>>>>>> LazyValueAnalysis: Stores pointers to AssumptionCache,
>>>>>>>> TargetLibraryInfo, DominatorTree.
>>>>>>>> DependenceAnalysis: Stores pointers to AliasAnalysis,
>>>>>>>> ScalarEvolution, LoopInfo
>>>>>>>> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis,
>>>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree
>>>>>>>> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree
>>>>>>>> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree,
>>>>>>>> DomFrontier
>>>>>>>> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo,
>>>>>>>> AssumptionCache, DominatorTree, LoopInfo
>>>>>>>> TargetLibraryAnalysis: Has no dependencies
>>>>>>>> TargetIRAnalysis: Has no dependencies.
>>>>>>>>
>>>>>>>> Function alias analyses: 3/5 keep pointers to other analyses
>>>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache,
>>>>>>>> DominatorTree, LoopInfo
>>>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo
>>>>>>>> SCEVAA: Keeps pointer to ScalarEvolution
>>>>>>>> ScopedNoAliasAA: No dependencies
>>>>>>>> TypeBasedAA: No dependencies
>>>>>>>>
>>>>>>>>
>>>>>>>> Total: 13/28 analyses (~50%) hold pointers to other analyses.
>>>>>>>> Of the 15/28 analyses that don't hold pointers, 12/15 simply have
>>>>>>>> no dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have
>>>>>>>> dependencies that are used just for a "recalculate" step that retains no
>>>>>>>> pointers.
>>>>>>>> So I think it is fair to say that analyses which hold pointers to
>>>>>>>> other analyses is not an exceptional case. In fact, analyses that use other
>>>>>>>> analyses just for a "recalculate" step seems to be the exceptional case
>>>>>>>> (only 3/28 or about 10%)
>>>>>>>>
>>>>>>>
>>>>>>> Interesting!
>>>>>>>
>>>>>>> Most of these look like they hold a pointer to the root analysis as
>>>>>>> opposed to detailed objects *inside* the analysis?
>>>>>>>
>>>>>>> It might make sense to try to handle this very specific pattern in a
>>>>>>> special way of overriding the invalidate routines is too error prone.... We
>>>>>>> could try to make this work "automatically" but I'm worried this would be
>>>>>>> challenging to get right. Open to suggestions of course.
>>>>>>>
>>>>>>> Any other ideas about what would make sense to handle this?
>>>>>>>
>>>>>>> Does it make sense to override the invalidate routines now and
>>>>>>> iterate from there? I feel like you've done a lot of the research necessary
>>>>>>> for this already...
>>>>>>>
>>>>>>
>>>>>> I'll keep pushing forward tomorrow with building test-suite
>>>>>> successfully using the new PM for the LTO pipeline (I was doing some
>>>>>> unrelated LLD stuff for most of today). It will be interesting to see how
>>>>>> many `invalidate` overrides will be needed to avoid these issues for just
>>>>>> the LTO pipeline on test-suite.
>>>>>>
>>>>>
>>>>> I spent the better part of today working on this and will continue
>>>>> tomorrow; this problem seems nastier than I thought. For some reason the
>>>>> LTO pipeline (or something about LTO) seems to hit on these issues much
>>>>> more (I'm talking like 40k lines of ASan error reports from building
>>>>> test-suite with the LTO pipeline in the new PM; per-TU steps still using
>>>>> the old PM). Some notes:
>>>>>
>>>>> - BasicAA's dependence on domtree and loopinfo in the new PM seems to
>>>>> account for quite a few of the problems.
>>>>> - BasicAA and other stuff are marked (by overriding `invalidate` to
>>>>> return false) to never be invalidated because they are "stateless". However
>>>>> they still hold pointers and so they do need to be invalidated.
>>>>> - CallGraph uses AssertingVH (PR28400) and so I needed a workaround
>>>>> similar to r274656 in various passes.
>>>>> - D21921 is holding up -- I haven't hit any issues with the core
>>>>> logic of that patch.
>>>>> - AAResults holds handles to various AA result objects. This means it
>>>>> pretty much always needs to be invalidated unless you are sure that none of
>>>>> the AA's will get invalidated.
>>>>>
>>>>>
>>>>> The existing `invalidate` method doesn't have the right semantics for
>>>>> even an error-prone solution :( We are going to need to make some
>>>>> significant changes to even get basic sanity I think. Perhaps each analysis
>>>>> can expose a "preserve" static function. E.g. instead of
>>>>> `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`.
>>>>> I'm actually not quite sure that that will even work. Once I have
>>>>> test-suite fully building successfully with the LTO pipeline in the new PM
>>>>> I'll be able to give a more confident answer (esp. w.r.t. the manager for
>>>>> different IRUnitT's).
>>>>> But at this point I'm not confident running *any* pass pipeline in the
>>>>> new PM without at least assertions+ASan.
>>>>>
>>>>> We may want to have a proper design discussion around this problem
>>>>> though.
>>>>>
>>>>> Also I'd like to have test-suite working (by hook or by crook) with
>>>>> LTO in the new PM so we can get some numbers on the resident set impact of
>>>>> all this caching; if it is really problematic then we may need to start
>>>>> talking front-and-center about different invalidation policies for keeping
>>>>> this in check instead of leaving it as something that we will be able to
>>>>> patch later.
>>>>>
>>>>>
>>>>>
>>>>> The more I think about it, the more I'm convinced that the real "hard"
>>>>> problem that the new PM is exposing us to is having the ability for any
>>>>> pass to ask for any analysis on any IRUnitT (and any specific IRUnit of
>>>>> that IRUnitT) and have the result stored somewhere and then invalidated.
>>>>> This means that "getAnalysisUsage" is not just a list of passes, but much
>>>>> more complicated and is essentially a set of arbitrary pairs "(analysis,
>>>>> IRUnit)" (and the associated potential tangle of dependencies between the
>>>>> state cached on these tuples). With the old PM, you essentially are looking
>>>>> at a problem of scheduling the lifetime of analyses of the same IRUnit
>>>>> intermingled with transformation passes on that same IRUnit, so you only
>>>>> have the "analysis" part of the tuple above, making things much simpler
>>>>> (and handling dependencies is much simpler too). We've obviously outgrown
>>>>> this model with examples like LAA, AssumptionCacheTracker, etc. that hack
>>>>> around this in the old PM. We may want to have a fresh re-examination of
>>>>> what problems we are exactly trying to solve.
>>>>>
>>>>> For me, my main concern now is what changes need to be made in order
>>>>> to feel confident running a pipeline in the new PM without assertions+ASan.
>>>>>
>>>>>
>>>>> Sorry for the long post, just brain-dumping before heading home.
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> -- Sean Silva
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160725/040a45da/attachment.html>