[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

Andrey Bokhanko via llvm-dev llvm-dev at lists.llvm.org
Tue Aug 1 01:17:51 PDT 2017


...and adding my $0.02 to the "IR outline + inline + MIR outline" idea: my
gut feeling (yes, only a "feeling" -- and one coming from my gut, not my
head!) is that having inlining correct wrong IR outlining decisions, and
then MIR outlining correct wrong inlining decisions, is absolutely
unrealistic -- a heuristics nightmare at best.

The inliner's heuristics are already complex enough and not 100%
bulletproof; if we add IR outliner heuristics to the mix -- and then a bit
of MIR outliner heuristics (which are more precise but, as demonstrated
above, still not 100% precise) on top... you can imagine.

Yours,
Andrey


On Tue, Aug 1, 2017 at 10:07 AM, Andrey Bokhanko <andreybokhanko at gmail.com>
wrote:

> All,
>
> +1 to what Mehdi said.
>
> It's a fair concern to question whether we need yet another outlining
> pass. I believe this concern has been addressed by River -- both with
> theoretical arguments and practical data (benchmark numbers).
>
> Jessica's pipeline proposal is completely orthogonal. It's not fair to ask
> River to implement / fit into what she suggested. Sure, it's a valid topic
> to discuss -- but a completely orthogonal one. If anything, accepting
> River's implementation would enable us to experiment with pipeline changes
> of this ilk!
>
> Yours,
> Andrey
> ===
> Compiler Architect
> NXP
>
>
> On Tue, Aug 1, 2017 at 7:38 AM, Mehdi AMINI via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> 2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> Apologies for joining this discussion late, but I had a few notes from
>>> this thread that I really wanted to chime in on.
>>>
>>> River,
>>>
>>> I don't mean to put you on the spot, but I do want to start with a
>>> semantic issue. In several places in the thread you used the words "we"
>>> and "our" to imply that you're not alone in writing this (which is totally
>>> fine), but your initial message presented this as entirely your own work.
>>> So, when you said things like "we feel there's an advantage to being at
>>> the IR level", can you please clarify who "we" is?
>>>
>>> Given that there are a number of disagreements and opinions floating
>>> around I think it benefits us all to speak clearly about who is taking what
>>> stances.
>>>
>>> One particular disagreement that I think very much needs to be revisited
>>> in this thread is Jessica's proposal of a pipeline of:
>>>
>>>    1. IR outline
>>>    2. Inline
>>>    3. MIR outline
>>>
>>> In your response to that proposal you dismissed it out of hand with
>>> "feelings" but not data. Given that the proposal came from Jessica (a
>>> community member with significant relevant experience in outlining), and it
>>> was also recognized as interesting by Eric Christopher (a long-time member
>>> of the community with wide reaching expertise), I think dismissing it may
>>> have been a little premature.
>>>
>>
>> It isn't clear to me how much the *exact* pipeline and ordering of passes
>> matters for deciding whether "having an outliner at the IR level" is a
>> good idea.
>>
>>
>>
>>> I also want to visit a few procedural notes.
>>>
>>> Mehdi commented on the thread that it wouldn't be fair to ask for a
>>> comparative study because the MIR outliner didn't have one. While I don't
>>> think anyone is asking for a comparative study, I want to point out that I
>>> think it is completely fair.
>>>
>>> If a new contributor approached the community with a new SROA pass and
>>> wanted to land it in-tree it would be appropriate to ask for a comparative
>>> analysis against the existing pass. How is this different?
>>>
>>
>> It seems quite different to me because there is no outliner at the IR
>> level, so the two don't provide the same functionality. The "Why at the IR
>> level" section of the original email, combined with the performance
>> numbers, seems largely enough to me to explain why it isn't redundant with
>> the Machine-level outliner.
>> I'd consider this work for inclusion upstream purely on its technical
>> merit at this point.
>> Discussing inclusion in any of the default pipelines is a different story.
>>
>> Similarly, last year the IR-level PGO was implemented even though we
>> already had a PGO implementation, because 1) it provided a generic
>> solution for other frontends (just as here it could be said that it
>> provides a generic solution for targets) and 2) it supported cases that
>> FE-PGO didn't (especially around better counter context thanks to
>> pre-inlining and such).
>>
>>
>>
>>>
>>> Adding a new IR outliner is a different situation from when the MIR one
>>> was added. When the MIR outliner was introduced there was no in-tree
>>> analog.
>>>
>>
>> We still usually discuss design extensively. Skipping the IR-level option
>> didn't seem obvious to me, to say the least, and it wasn't really discussed
>> or considered much upstream.
>> If the idea is that implementing a concept at the machine level may
>> preclude a future implementation at the IR level, then we should be *a
>> lot* more picky before accepting such contributions.
>> In this case, had I anticipated any push-back on an IR-level
>> implementation based only on the fact that we now have a Machine-level
>> one, I'd likely have pushed back on the machine-level one.
>>
>>
>>
>>> When someone comes to the community with something that has no existing
>>> in-tree analog, it isn't necessarily fair to ask them to implement it in
>>> multiple different ways to prove their solution is the best.
>>>
>>
>> It may or may not be fair, but there is a tradeoff in how much effort we
>> would require them to convince the community that this is *the* right way
>> to go, depending on what it implies for future approaches.
>>
>> --
>> Mehdi
>>
>>
>>> However, as a community, we do still exercise the right to reject
>>> contributions we disagree with, and we frequently request changes to the
>>> implementation (as is shown every time someone tries to add SPIR-V support).
>>>
>>> In the LLVM community we have a long history of approaching large
>>> contributions (especially ones from new contributors) with scrutiny and
>>> discussion. It would be a disservice to the project to forget that.
>>>
>>> River, as a last note. I see that you've started uploading patches to
>>> Phabricator, and I know you're relatively new to the community. When
>>> uploading patches it helps to include appropriate reviewers so that the
>>> right people see the patches as they come in. To that end can you please
>>> include Jessica as a reviewer? Given her relevant domain experience I think
>>> her feedback on the patches will be very valuable.
>>>
>>> Thank you,
>>> -Chris
>>>
>>> On Jul 26, 2017, at 1:52 PM, River Riddle via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> Hey Sanjoy,
>>>
>>> On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>> > The way I interpret Quentin's statement is something like:
>>>> >
>>>> > - Inlining turns an interprocedural problem into an intraprocedural
>>>> >   problem
>>>> > - Outlining turns an intraprocedural problem into an interprocedural
>>>> >   problem
>>>> >
>>>> > Insofar as our intraprocedural analyses and transformations are strictly
>>>> > more powerful than interprocedural ones, there is a precise sense in
>>>> > which inlining exposes optimization opportunities while outlining does
>>>> > not.
>>>>
>>>> While I think our intra-proc optimizations are *generally* more
>>>> powerful, I don't think they are *always* more powerful.  For
>>>> instance, LICM (today) won't hoist full regions but it can hoist
>>>> single function calls.  If we can extract out a region into a
>>>> readnone+nounwind function call then LICM will hoist it to the
>>>> preheader if the safety checks pass.
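>>>>
>>>> As a contrived C++ sketch of that point (popcount8 and fill are just
>>>> illustrative names, nothing from the patch): the invariant computation
>>>> below contains a loop, so LICM can't hoist it instruction by instruction,
>>>> but once it is collapsed into a single readnone+nounwind call it can be
>>>> hoisted to the preheader, subject to those same safety checks.
>>>>
>>>> // hypothetical outlined helper; assume functionattrs can prove it
>>>> // readnone+nounwind
>>>> static int popcount8(unsigned v) {
>>>>   int c = 0;
>>>>   for (int i = 0; i < 8; ++i)
>>>>     c += (v >> i) & 1;
>>>>   return c;
>>>> }
>>>>
>>>> void fill(int *out, int n, unsigned mask) {
>>>>   for (int i = 0; i < n; ++i)
>>>>     out[i] = popcount8(mask) * i;  // loop-invariant call, hoistable
>>>> }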
>>>>
>>>> > Actually, for his internship last summer River wrote a profile-guided
>>>> > outliner / partial inliner (it didn't try to do deduplication, so it was
>>>> > more like PartialInliner.cpp). IIRC he found that LLVM's interprocedural
>>>> > analyses were so bad that many of the outlining decisions had pretty
>>>> > adverse effects. E.g. if you outline from the left side of a diamond,
>>>> > that side basically becomes a black box to most LLVM analyses and forces
>>>> > downstream dataflow meet points to give an overly conservative result,
>>>> > even though our standard intraprocedural analyses would have happily dug
>>>> > through the left side of the diamond if the code had not been outlined.
>>>> >
>>>> > Also, River's patch (the one in this thread) does parameterized
>>>> > outlining. For example, two sequences containing stores can be outlined
>>>> > even if the corresponding stores have different pointers: the pointer to
>>>> > be stored to is passed as a parameter to the outlined function. In that
>>>> > sense, the outlined function's behavior becomes a conservative
>>>> > approximation of both, which in principle loses precision.
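>>>> >
>>>> > Roughly, as a made-up C++ illustration (not River's actual code):
>>>> >
>>>> > /* before: two store sequences that differ only in the pointer */
>>>> > void before(int *p, int *q) {
>>>> >   p[0] = 1; p[1] = 2; p[2] = 3;
>>>> >   q[0] = 1; q[1] = 2; q[2] = 3;
>>>> > }
>>>> >
>>>> > /* after: the differing pointer becomes a parameter */
>>>> > static void outlined(int *dst) { dst[0] = 1; dst[1] = 2; dst[2] = 3; }
>>>> >
>>>> > void after(int *p, int *q) {
>>>> >   outlined(p);
>>>> >   outlined(q);
>>>> > }
>>>> >
>>>> > Inside outlined(), analyses only see stores through one unknown pointer,
>>>> > so facts that held separately for p and q get merged.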
>>>>
>>>> Can we outline only once we've already done all of these optimizations
>>>> that outlining would block?
>>>>
>>>
>>>   The outliner is able to run at any point in the interprocedural
>>> pipeline. There are currently two locations: early outlining (pre-inliner)
>>> and late outlining (practically the last pass to run). It is configured to
>>> run either Early+Late, or just Late.
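>>>
>>>   For illustration only, here is a rough C++ sketch of how those two
>>> positions could be wired up using the legacy PassManagerBuilder extension
>>> points; createCodeSizeOutlinerPass is a placeholder name, not necessarily
>>> how the actual patch registers the pass:
>>>
>>> #include "llvm/IR/LegacyPassManager.h"
>>> #include "llvm/Pass.h"
>>> #include "llvm/Transforms/IPO/PassManagerBuilder.h"
>>> using namespace llvm;
>>>
>>> Pass *createCodeSizeOutlinerPass();  // placeholder factory for the pass
>>>
>>> void addOutliner(PassManagerBuilder &PMB, bool EarlyAndLate) {
>>>   if (EarlyAndLate)
>>>     // Early: added to the module pipeline before the inliner runs.
>>>     PMB.addExtension(PassManagerBuilder::EP_ModuleOptimizerEarly,
>>>                      [](const PassManagerBuilder &,
>>>                         legacy::PassManagerBase &PM) {
>>>                        PM.add(createCodeSizeOutlinerPass());
>>>                      });
>>>   // Late: added at the end of the optimization pipeline.
>>>   PMB.addExtension(PassManagerBuilder::EP_OptimizerLast,
>>>                    [](const PassManagerBuilder &,
>>>                       legacy::PassManagerBase &PM) {
>>>                      PM.add(createCodeSizeOutlinerPass());
>>>                    });
>>> }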
>>>
>>>
>>>> > I like your EarlyCSE example, and it is interesting that, combined with
>>>> > functionattrs, it can let a "cheap" pass get a transformation for which
>>>> > an "expensive" pass would otherwise be needed. Are there any cases where
>>>> > we only have the "cheap" pass, and thus the outlining would be essential
>>>> > for our optimization pipeline to get the optimization right?
>>>> >
>>>> > The case that comes to mind for me is where we have some cutoff on
>>>> > search depth. Reducing a sequence to a single call (+ functionattr
>>>> > inference) can essentially summarize the sequence and effectively
>>>> > increase search depth, which might give more results. That seems like a
>>>> > bit of a weak example though.
>>>>
>>>> I don't know if River's patch outlines entire control flow regions at
>>>> a time, but if it does then we could use cheap basic block scanning
>>>> analyses for things that would normally require CFG-level analysis.
>>>>
>>>
>>>   The current patch just supports outlining from within a single block.
>>> Although I had a working prototype for region-based outlining, I kept it
>>> out of this patch for simplicity. So it's entirely possible to add that
>>> kind of functionality -- I've already tried it.
>>> Thanks,
>>>   River Riddle
>>>
>>>
>>>>
>>>> -- Sanjoy
>>>>
>>>> >
>>>> > -- Sean Silva
>>>> >
>>>> > On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via llvm-dev
>>>> > <llvm-dev at lists.llvm.org> wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via llvm-dev
>>>> >> <llvm-dev at lists.llvm.org> wrote:
>>>> >> > No, I mean in terms of enabling other optimizations in the pipeline,
>>>> >> > like the vectorizer. The outliner does not expose any of that.
>>>> >>
>>>> >> I have not made a lot of effort to understand the full discussion here
>>>> >> (so what I say below may be off-base), but I think there are some cases
>>>> >> where outlining (especially working with function-attrs) can make
>>>> >> optimization easier.
>>>> >>
>>>> >> It can help transforms that duplicate code (like loop unrolling and
>>>> >> inlining) be more profitable -- I'm thinking of cases where
>>>> >> unrolling/inlining would have to duplicate a lot of code, but after
>>>> >> outlining would require duplicating only a few call instructions.
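>>>> >>
>>>> >> For instance (a made-up C++ sketch; checked_div and handle_error are
>>>> >> just illustrative names, not code from the patch):
>>>> >>
>>>> >> void handle_error(const char *msg);  // hypothetical outlined cold path
>>>> >>
>>>> >> static inline int checked_div(int a, int b) {
>>>> >>   if (b == 0) {             // formerly a large block of error handling
>>>> >>     handle_error("divide by zero");
>>>> >>     return 0;
>>>> >>   }
>>>> >>   return a / b;
>>>> >> }
>>>> >>
>>>> >> With the cold path reduced to one call, checked_div becomes cheap
>>>> >> enough to inline at every call site, and an unrolled loop around it
>>>> >> duplicates only the call.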
>>>> >>
>>>> >>
>>>> >> It can help EarlyCSE do things that require GVN today:
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   ... complex computation that will become func(a) ...
>>>> >>   ... the same complex computation, also becoming func(a) ...
>>>> >> }
>>>> >>
>>>> >> outlining=>
>>>> >>
>>>> >> int func(int t) { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func(a);
>>>> >>   int y = func(a);
>>>> >> }
>>>> >>
>>>> >> functionattrs=>
>>>> >>
>>>> >> int func(int t) readnone { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func(a);
>>>> >>   int y = func(a);
>>>> >> }
>>>> >>
>>>> >> earlycse=>
>>>> >>
>>>> >> int func(int t) readnone { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func(a);
>>>> >>   int y = x;
>>>> >> }
>>>> >>
>>>> >> GVN will catch this, but EarlyCSE is (at least supposed to be!) cheaper.
>>>> >>
>>>> >>
>>>> >> Once we have an analysis that can prove that certain functions can't
>>>> >> trap, outlining can allow LICM etc. to speculate entire outlined
>>>> >> regions out of loops.
>>>> >>
>>>> >>
>>>> >> Generally, I think outlining exposes information that certain regions
>>>> >> of the program are doing identical things.  We should expect to get
>>>> >> some mileage out of this information.
>>>> >>
>>>> >> -- Sanjoy