[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

Thu Apr 22 10:21:05 PDT 2021

On Thu, Apr 22, 2021 at 10:04 AM Johannes Doerfert
<johannesdoerfert at gmail.com> wrote:
>
>
> On 4/22/21 11:49 AM, David Blaikie wrote:
> > There seems to be a bunch of confusion and probably some
> > conflation/ambiguity about whether we're talking about IR constructs
> > on the C attributes.
> >
> > Johannes - I assume your claim is restricted mostly to the IR? That
> > having optnone not require or imply noinline improves orthogonality of
> > features and that there are reasonable use cases where one might want
> > optnone while allowing inlining (of the optnone function) or optnone
> > while disallowing inlining (of the optnone function)
>
> Yes, IR level is my concern. I'd be in favor of matching it in C
> but I don't care enough to go through the discussions.
>
> My latest use case: `s/optnone//g` should not result in a verifier
> error when you just want to try out an optimization on a single function.

I guess you meant adding optnone, rather than removing it? (removing
optnone shouldn't cause any verifier errors, does it?)

> > Paul - I think you're mostly thinking about/interested in the specific
> > source level/end user use case that motivated the initial
> > implementation of optnone. Where, I tend to agree - inlining an
> > optnone function is not advantageous to the user. Though it's possible
> > Johannes 's argument could be generalized from IR to C and still
> > apply: orthogonal features are more powerful and the user can always
> > compose them together to get what they want. (good chance they're
> > using attributes behind macros for ease of use anyway - they're a bit
> > verbose to write by hand all the time)
>
> I'd give the user the capability building blocks and let them work
> with that, but again, this is not my main concern right now.
>
>
> > There's also the -O0 use of optnone these days (clang puts optnone on
> > all functions when compiling with -O0 - the intent being to treat such
> > functions as though they were compiled in a separate object file
> > without optimizations (that's me projecting what I /think/ the mental
> > model should be) - which, similarly, I think will probably want to
> > keep the current behavior (no ipa/inlining and no optimization -
> > however that's phrased).
> >
> > Essentially the high level use cases of optnone all look like "imagine
> > if I compiled this in a separate object file without LTO" - apparently
> > even noipa+optnone isn't enough for that, maybe (I need to test that
> > more, based on a comment from Johannes earlier that inexact
> > definitions don't stop the inliner... )? Sounds like maybe it's more
> > like what I'm thinking of is "what if this function had weak linkage"
> > (ie: could be replaced by a totally different one)?
>
> Yes, weak should do the trick. That said, if you want "separate
> object file without LTO", go with `noinline` + `noipa`. This will
> make the call edges optimization barriers. If you want to also not
> optimize the function, add `optnone` as required.

Starts to feel like a long list to get what seems like one wholistic
concept ("treat it as though it were a separate non-LTO object file" /
"treat it as though this function had weak linkage"), but it's
probably OK/a fine thing to do (not like there's a high cost to having
multiple attributes since the attribute lists are shared, etc) - just
psychologically for me, there seems to be one core concept and
stitching it together from several attributes makes me worry that
there are gaps (as there have been/what's motivating this discussion -
though it certainly sounds like there will be fewer gaps after this
work, for sure).

> FWIW, if all functions are `optnone`, `noinline` and `noipa` are
> not needed, that is the -O0 case. More specifically, if your caller
> is `optnone`, `noinline` and `noipa` are not needed (for that caller).

Right - though LTO is the case that motivated adding optnone for -O0,
so it would be respected even under LTO - so in that case we'd want
noinline and noipa, by the sounds of it.

- Dave

> ~ Johannes
>
>
> >
> > On Thu, Apr 22, 2021 at 9:23 AM Johannes Doerfert
> > <johannesdoerfert at gmail.com> wrote:
> >>
> >> On 4/22/21 11:05 AM, paul.robinson at sony.com wrote:
> >>>> On 4/22/21 9:43 AM, paul.robinson at sony.com wrote:
> >>>>>> That said, I believe it is a mistake that `optnone` requires
> >>>>>> `noinline`. There is no reason for it to do so on the IR level.
> >>>>>> If you argue C-level `optnone` should imply `noinline`, that is
> >>>>>> a something worth discussing, though on the IR level we can
> >>>>>> decouple them. Use case, for example, the not-optimized version
> >>>>>> is called from functions that are `optnone` themselves while
> >>>>>> other call sites are inlined and your function is optimized. So
> >>>>>> you can use the two attributes to do context sensitive `optnone`.
> >>>>> The original intent for `optnone` was to imitate the -O0 pipeline
> >>>>> to the extent that was feasible.  The -O0 pipeline (as constructed
> >>>>> by Clang) runs just the always-inliner, not the regular inliner;
> >>>>> so, functions marked `optnone` should not be inlined.  The way
> >>>>> to achieve that effect most simply is to have `optnone` require
> >>>>> `noinline` and that's what we did.
> >>>>>
> >>>>> If we have `optnone` stop requiring `noinline` and teach the
> >>>>> inliner to inline an `optnone` callee only into an `optnone` caller,
> >>>>> then we are violating the intent that `optnone` imitate -O0, because
> >>>>> that inlining would not have happened at -O0.
> >>>> I think I initially read this wrong, hence the part below.
> >>>> After reading it again, I have one question: Why would the
> >>>> inliner inline something that is not `always_inline` into
> >>>> an `optnone` caller? That would violate the idea of `optnone`,
> >>>> IMHO, regardless if the callee is `optnone` or not. That is
> >>>> why I don't believe `noinline` on the callee is necessary
> >>>> for your use case.
> >>> The inliner should be ignoring `optnone` callers, so it would
> >>> never inline *anything* into an `optnone` caller.  (Other than
> >>> an `alwaysinline` function.)
> >> My point is, it already does ignore `optnone` callers
> >> and inlines only `alwaysinline` calls into them:
> >>
> >> https://clang.godbolt.org/z/fznbjTEd5
> >>
> >>
> >>> I had read this:
> >>>
> >>>>>> I believe it is a mistake that `optnone` requires `noinline`.
> >>> and the case that came to mind is inlining an `optnone` callee
> >>> into a not-`optnone` caller.  The inlined copy would then be
> >>> treated to further optimization, which violates the idea of
> >>> `optnone`.
> >> But that is a composition issue. If you do not want to
> >> inline a `optnone` callee into an non-`optnone` caller,
> >> then add `noinline` to the callee. If you don't mind if
> >> it is inlined into non-`optnone` callers and optimized
> >> in there, then don't. My last email contained an example
> >> to show the different cases, you can mix and match IR
> >> attributes to get what you want. Requiring them to be
> >> tied is not improving anything but just restricting the
> >> options.
> >>
> >>
> >>> Now, the inliner already knows to avoid `noinline` callees, so
> >>> attaching `noinline` to `optnone` functions was (at the time)
> >>> considered an optimal way to avoid the problematic case.  We
> >>> could instead teach the inliner to skip `optnone` callees, and
> >>> that would allow us to eliminate the requirement that `optnone`
> >>> functions must also be `noinline`.  I am unclear why redefining
> >>> `optnone` to _imply_ `noinline` (rather than _require_ `noinline`)
> >>> is better, but then I don't work much with attributes.
> >> The inliner will not inline calls into an `optnone` caller
> >> if it is not necessary. As said before, that would violate
> >> the `optnone` idea for the caller, no matter what the callee
> >> looks like. So requiring  `noinline` on the callee seems
> >> to me like a workaround or an oversight.
> >>
> >> It is better to not require them together because you can
> >> actually describe more distinct scenarios. Please take
> >> another look at my example in the last email, it shows
> >> what is possible if you split them. Furthermore, `optnone`
> >> does by design imply `noinline` for the call sites in the
> >> caller, or at least nobody argued that it shouldn't. Thus,
> >> requiring `noinline` on the callee is simply unnecessary
> >> as it does not add any value.
> >>
> >>
> >>> The notion of allowing an `optnone` caller to inline an `optnone`
> >>> callee sounds like it would also violate the intent of `optnone`
> >>> in that it should imitate -O0, where inlining is confined to
> >>> `alwaysinline` callees, and `optnone` is defined to conflict with
> >>> `alwaysinline` (because if you always inline something, you are
> >>> allowing it to have subsequent optimizations same as the caller,
> >>> which conflicts with `optnone`).
> >> Nobody said `optnone` callers should inline calls that are
> >> not always_inline, at least so far I have not seen that
> >> argument be made anywhere. I'll just skip this paragraph.
> >>
> >>
> >>> So, if you want to undo the _requirement_ that `optnone` must
> >>> have `noinline`, but then redefine `optnone` such that it can't
> >>> be inlined anywhere, you've done something that seems to have no
> >>> practical effect.  Maybe that helps Attributor in some way, but
> >>> I don't see any other reason to be making this change.
> >> I do not want to say `optnone` cannot be inlined. `noinline`
> >> says it cannot be inlined. If you want it to not be inlined,
> >> use `noinline`, if you want it to not be optimized in it's
> >> own function, use `optnone`, if you want both, use both.
> >>
> >> The practical effect was literally showcased in my last email,
> >> please go back and look at the example.
> >>
> >> I don't know why the Attributor has to do with this, I'm happy
> >> to hear your thoughts on that though :)
> >>
> >> ~ Johannes
> >>
> >>
> >>> --paulr
> >>>
> >>