[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 26 23:26:17 PDT 2016


On Tue, Jul 26, 2016 at 7:11 AM, Hal Finkel <hfinkel at anl.gov> wrote:

>
> ------------------------------
>
> *From: *"Sean Silva" <chisophugis at gmail.com>
> *To: *"Chandler Carruth" <chandlerc at google.com>
> *Cc: *"Hal J. Finkel" <hfinkel at anl.gov>, "llvm-dev" <
> llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>,
> "Xinliang David Li" <davidxl at google.com>
> *Sent: *Monday, July 25, 2016 8:32:06 PM
> *Subject: *Re: [llvm-dev] [PM] I think that the new PM needs to learn
> about inter-analysis dependencies...
>
>
>
>
> On Mon, Jul 25, 2016 at 3:48 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
>
>> On Mon, Jul 25, 2016 at 3:40 PM Finkel, Hal J. via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> *Sent from my Verizon Wireless 4G LTE DROID*
>>>
>>> On Jul 25, 2016 6:16 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>> >
>>> >
>>> >
>>> > On Mon, Jul 25, 2016 at 9:27 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> >>
>>> >>
>>> >> ________________________________
>>> >>>
>>> >>> From: "Sean Silva" <chisophugis at gmail.com>
>>> >>> To: "Chandler Carruth" <chandlerc at gmail.com>
>>> >>> Cc: "Xinliang David Li" <davidxl at google.com>, "llvm-dev" <
>>> llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>,
>>> "Tim Amini Golling" <mehdi.amini at apple.com>, "Hal Finkel" <
>>> hfinkel at anl.gov>, "Sanjoy Das" <sanjoy at playingwithpointers.com>, "Pete
>>> Cooper" <peter_cooper at apple.com>
>>> >>> Sent: Friday, July 22, 2016 3:55:52 AM
>>> >>> Subject: Re: [PM] I think that the new PM needs to learn about
>>> inter-analysis dependencies...
>>> >>>
>>> >>> The more closely I look at this, the more it seems like there may be
>>> a useful incremental step in the transition to the new PM: use the new PM
>>> analysis machinery in the old PM. If this is possible, it will simplify the
>>> old PM and (hopefully) allow an incremental transition to the new PM
>>> instead of a flag day transition for the switch.
>>> >>>
>>> >>> I.e., AFAICT, the new PM transition is essentially about 2 mostly
>>> orthogonal aspects of running optimization pipelines:
>>> >>> 1. Analysis computation and analysis result lifetime management
>>> (including avoiding analysis recomputation)
>>> >>> 2. Running transformation passes over their respective IRUnit's in
>>> some order
>>> >>>
>>> >>> These are conflated in the old PM. In reality, the only interaction
>>> between them (with the new PM machinery for 1.) is a small number of places
>>> within 2. which need to call 'invalidate'.
>>> >>>
>>> >>> I'm pretty sure that 2. is fairly similar in the new PM and old PM
>>> (the main difference is that the notion of "adapters" is split out in the
>>> new PM). The analysis handling seems to be what makes the old PM so
>>> difficult to understand (e.g. it is the cause of the multiple inheritance
>>> in the implementation). Trying to unify analyses and transformations (and
>>> some questionable (in hindsight) implementation decisions) seems to be the
>>> main "problem" with the design of the old PM AFAICT (there are other
>>> issues, but they are more "nice to have").
>>> >>>
>>> >>> IMO it is an anti-pattern to think of analyses as "passes". There
>>> are just "analyses" and "transformations" and they are two separate things.
>>> In fact, the `run` method on analyses should probably be called
>>> `computeResult` or something like that to avoid confusion.
>>> >>
>>> >> This makes sense to me.
>>> >>
>>> >> We do currently have some "in between" passes, like LCSSA, which are
>>> transformations, but are required by other passes, and transform the IR but
>>> whose preservation represents properties of the IR. The particulars of how
>>> we handle LCSSA aside (e.g. I think we should preserve it more, perhaps
>>> everywhere), how are we planning on handling this class of things?
>>> >
>>> >
>>> > The new PM doesn't currently have a concept like this. As you
>>> mentioned, it is a weird cross between a transformation and an analysis: it
>>> can be "invalidated" like an analysis, but "recomputing" it actually
>>> mutates the IR like a transformation.
>>> >
>>> > I'd like to preface the below with the following:
>>> > No matter how we ultimately address this requirement, my preference is
>>> that we do so in a way that applies to the old PM. This is a case where the
>>> old PM supports a richer set of functionality than the new PM. By
>>> incrementally refactoring the old PM away from its use of this extra
>>> capability and towards whatever "new" way there is to do it, we will
>>> understand better what it is that we actually need.
>>> >
>>> > (and sorry for the brain dump in the rest of this post)
>>> >
>>> >
>>> >
>>> > I have not seen any mention of a solution to this problem besides "we
>>> shouldn't do that", which is sort of a cop-out. Here is a strawman proposal:
>>> >
>>> > If it isn't too expensive, one simple alternative is to have passes
>>> just make a call to a utility function to put things in LCSSA if they need
>>> it (this utility would also have to invalidate analyses).
>>> > If that ends up being expensive, we can have a dummy "indicator"
>>> analysis IRIsInLCSSAForm which, if cached, means "don't bother to call the
>>> utility function". We could maybe just use the LCSSA pass directly to do
>>> the transformation. LCSSA could have IRIsInLCSSAForm as a member typedef
>>> `IndicatorT` so it can be accessed generically. We could then support an
>>> API like:
>>>
>>> I think this idea makes sense. My understanding is: There is nothing
>>> that prevents an analysis result from exposing a utility that transforms
>>> IR, and the result can certainly cache whether or not this transformation
>>> has been performed.
>>>
>>
>> Somewhat agreed, but I don't actually think this problem is as bad as it
>> seems in practice.
>>
>> We only have two places that do this (loop simplify and lcssa) and they
>> both *can* be modeled as "check if it is form X, and if not, put it in form
>> X" or as "check if it is form X, and if not, give up on transform". This
>> has been discussed several times, and the direction things have been
>> leaning for a long time has been:
>>
>
>> - Make LCSSA increasingly fundamental to the IR and always present, *or*
>> don't require LCSSA at all for transforms. Either of these solve the
>> problem.
>>
>> - Check for loop-simplified form if necessary, and skip the
>> transformation if not present. Because simplified form is simple to check
>> this seems to work well.
>>
>
> Personally, I would find it very disturbing if a transformation ever just
> silently does nothing. Especially if it depends on whether some set of
> previous transformations makes particular changes.
>
> In some sense, this is always true, and it is not particularly disturbing.
> Optimizations enable other optimizations, and so on. I think you're
> referring here to a more-specific situation: transformations have hard
> dependencies on other transformations, and those dependencies are not
> declared, or automatically satisfied, making them hard to use and debug.
> i.e. composing an opt command line to replicate a problem will be
> somewhat-unfortunately subtle.
>

Actually, I think that a loop pass not being able to do anything because its
input is not in LCSSA form is a less problematic case. I think your argument
is pretty solid that transformations already interact in subtle ways.

I'm more concerned with the "must be cached at a higher level" issue, as in
my practical testing of the new PM this has led to substantial wasted
time. Essentially, what this leads to is that you write out a pipeline
(e.g. insert a pass somewhere in the pipeline) and rebuild test-suite, a
bunch of errors spew, then you have to go to the source code of the pass
that is emitting the "must be cached at a higher level", see what else it
needs, try adding that, and repeat. It's pretty miserable and wastes a
lot of time.
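(For the record, the manual workaround is to pre-seed the cache with explicit
`require<...>` entries in the pipeline string, so the loop pass can later find
the function-level result via getCachedResult. Illustrative only; the exact
analysis names depend on what is registered in the pass registry:)

```
opt -passes='require<aa>,require<scalar-evolution>,loop(licm)' input.ll
```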

(also, it doesn't help that the pass pipeline parser's diagnostics are
currently at the level of "error parsing pass pipeline" with no other
information; yeah, yeah, patches welcome etc. just haven't got around to it)

-- Sean Silva


>
>
> When I talked to you in person at the social (last one or the one before
> IIRC), you also mentioned that you think that silently doing nothing is the
> solution for the case when an analysis on a larger IRUnit is not cached.
>
>
>> Anyways, I don't think we have to solve this problem 100% to make
>> progress on the pass manager. At no point have I felt particularly blocked
>> on this.
>>
>
>>
>>>
>>>
>>> >
>>> > ```
>>> > FooTransformation.cpp:
>>> >
>>> > PreservedAnalyses FooTransformation::run(Function &F, AnalysisManager
>>> &AM) {
>>> >   // Must be called before getting analyses, as it might invalidate
>>> some.
>>> >   canonicalizeIR<LCSSA>(F, AM);
>>> >
>>> >   ...
>>> > }
>>> >
>>> >
>>> > include/IR/Canonicalization.h:
>>> >
>>> > template <typename CanonicalizationT, typename IRUnitT>
>>> > void canonicalizeIR(IRUnitT &IR, AnalysisManager &AM) {
>>> > using IndicatorT = typename CanonicalizationT::IndicatorT;
>>> >   if (AM.getCachedResult<IndicatorT>(IR))
>>> >     return;
>>> >   CanonicalizationT C;
>>> >   PreservedAnalyses PA = C.run(IR, AM);
>>> >   AM.invalidate(IR, PA);
>>> >   (void)AM.getResult<IndicatorT>(IR);
>>> > }
>>> >
>>> > ```
>>> >
>>> >
>>> > One place that talks about this problem of "requiring a
>>> transformation" is http://llvm.org/devmtg/2014-04/PDFs/Talks/Passes.pdf
>>> on slide 17.
>>> >
>>> > One reason it provides for "we shouldn't do that" is that if you think
>>> about these things as "canonicalize the IR into a specific form", then when
>>> you have N>1 such dependencies (like some passes do on LoopSimplify and
>>> LCSSA), one must have a subset of the requirements of the other. I.e. you
>>> can't have two canonicalizations that "fight" each other. Using an explicit
>>> mutation API like the strawman above is a bit less bulletproof than
>>> scheduling based on statically known interferences between
>>> canonicalizations (e.g. CanonicalizationA may invalidate CanonicalizationB,
>>> but not the reverse, so it would automatically know to run
>>> CanonicalizationA before CanonicalizationB), but given that we have
>>> relatively few "canonicalizations" (to give them a name) that use this
>>> feature of the old PM, it may be livable (at least in the middle-end, it
>>> seems like there is just LCSSA, LoopSimplify, BreakCriticalEdges, and
>>> LowerSwitch in calls to addPreservedID/addRequiredID).
>>> >
>>> > I don't find the "Causes rampant re-running of invalidated analyses"
>>> argument in that slide convincing. If a pass needs the IR in LCSSA then it
>>> needs it. There isn't much we can do about that.
>>> >
>>> >
>>> >
>>> >
>>> > One invariant I'd like to preserve in the new pass manager is that
>>> whatever pipeline is provided on the opt command line, we end up running
>>> something "valid"; so a cop-out like "if a pass needs LCSSA, you need to
>>> make sure to add LCSSA at an appropriate place before it in the pipeline"
>>> is not something I think is reasonable (way too error-prone).
>>> >
>>> > Small rant:
>>> >
>>> > We already are in this error-prone situation in the new PM with the
>>> need to call `getCachedResult` to access analyses from a larger IRUnitT
>>> (e.g. the situation I explained in the post-commit thread of r274712);
>>>
>>> Yea, I don't like this either. I think we both agree that we need a
>>> better solution to this. I think we should fix this now and then deal with
>>> potential concurrency issues when we actually have a design for that so we
>>> know what that means.
>>>
>>
>> FWIW, I strongly disagree.
>>
>
> Thankfully at least right now we just flat-out assert/crash, alerting us to
> the issue. I don't want to live in a world where passes start to silently
> become no-ops (possibly in a way that only manifests if e.g. a particular
> other pass in the pipeline happens to make changes and hence invalidate a
> particular analysis).
>
> That would mean living in a world where e.g. you do your build of
> test-suite with an experimental pipeline and halfway through get a message
> like "warning: licm is doing nothing" (if you even get that) and have to go
> and figure out which `require<...>` you need to put at a higher level, or
> figuring out which loop pass invalidated something that licm needed.
> This is exactly the current situation, but instead of a message you just
> get an assertion failure / crash.
>
> I'm happy to have a mode which crashes, for us, to help with test-case
> reduction, etc. I don't think we should have this by default, however. We
> can have warnings and collect statistics instead.
>
> Thanks again,
> Hal
>
>
> FWIW, I've been running realistic pipelines and actually using the new PM
> "for real" (e.g. bisecting a realistic pass pipeline on the opt command
> line to find where a bug is coming from, testing different optimization
> pipelines, etc.) and this is definitely one of the main issues.
> I think once you start testing out the new PM "for real" you will change
> your position.
> (Correct me if I'm wrong, but I have to assume that you haven't yet
> because you would have run into the showstopping bug that started this
> thread (filed as PR28622), or PR28400, or even simply PR28577.)
>
> -- Sean Silva
>
>
>
>>
>> I think it would be better to iterate on this once we understand how the
>> new pass manager works. I think exposing the fact that these things are
>> cached is really important and useful, and it makes querying across IR unit
>> boundaries significantly more clear at the call site.
>>
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>