[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

Thu Apr 22 10:32:04 PDT 2021

On 4/22/21 12:21 PM, David Blaikie wrote:
> On Thu, Apr 22, 2021 at 10:04 AM Johannes Doerfert
> <johannesdoerfert at gmail.com> wrote:
>>
>> On 4/22/21 11:49 AM, David Blaikie wrote:
>>> There seems to be a bunch of confusion and probably some
>>> conflation/ambiguity about whether we're talking about IR constructs
>>> on the C attributes.
>>>
>>> Johannes - I assume your claim is restricted mostly to the IR? That
>>> having optnone not require or imply noinline improves orthogonality of
>>> features and that there are reasonable use cases where one might want
>>> optnone while allowing inlining (of the optnone function) or optnone
>>> while disallowing inlining (of the optnone function)
>> Yes, IR level is my concern. I'd be in favor of matching it in C
>> but I don't care enough to go through the discussions.
>>
>> My latest use case: `s/optnone//g` should not result in a verifier
>> error when you just want to try out an optimization on a single function.
> I guess you meant adding optnone, rather than removing it? (removing
> optnone shouldn't cause any verifier errors, does it?)

Yes, right, my bad. Removing `noinline` or adding `optnone` can
get you in trouble.

>>> Paul - I think you're mostly thinking about/interested in the specific
>>> source level/end user use case that motivated the initial
>>> implementation of optnone. Where, I tend to agree - inlining an
>>> optnone function is not advantageous to the user. Though it's possible
>>> Johannes 's argument could be generalized from IR to C and still
>>> apply: orthogonal features are more powerful and the user can always
>>> compose them together to get what they want. (good chance they're
>>> using attributes behind macros for ease of use anyway - they're a bit
>>> verbose to write by hand all the time)
>> I'd give the user the capability building blocks and let them work
>> with that, but again, this is not my main concern right now.
>>
>>
>>> There's also the -O0 use of optnone these days (clang puts optnone on
>>> all functions when compiling with -O0 - the intent being to treat such
>>> functions as though they were compiled in a separate object file
>>> without optimizations (that's me projecting what I /think/ the mental
>>> model should be) - which, similarly, I think will probably want to
>>> keep the current behavior (no ipa/inlining and no optimization -
>>> however that's phrased).
>>>
>>> Essentially the high level use cases of optnone all look like "imagine
>>> if I compiled this in a separate object file without LTO" - apparently
>>> even noipa+optnone isn't enough for that, maybe (I need to test that
>>> more, based on a comment from Johannes earlier that inexact
>>> definitions don't stop the inliner... )? Sounds like maybe it's more
>>> like what I'm thinking of is "what if this function had weak linkage"
>>> (ie: could be replaced by a totally different one)?
>> Yes, weak should do the trick. That said, if you want "separate
>> object file without LTO", go with `noinline` + `noipa`. This will
>> make the call edges optimization barriers. If you want to also not
>> optimize the function, add `optnone` as required.
> Starts to feel like a long list to get what seems like one wholistic
> concept ("treat it as though it were a separate non-LTO object file" /
> "treat it as though this function had weak linkage"), but it's
> probably OK/a fine thing to do (not like there's a high cost to having
> multiple attributes since the attribute lists are shared, etc) - just
> psychologically for me, there seems to be one core concept and
> stitching it together from several attributes makes me worry that
> there are gaps (as there have been/what's motivating this discussion -
> though it certainly sounds like there will be fewer gaps after this
> work, for sure).

If you want "separate non-LTO" behavior by design, put it in a
different file and don't compile that file with LTO ;)

Inside a single TU there is no "separate non-LTO" idea, it is
a single TU after all. To get the same effect we build it from
blocks that have a meaning in the single TU case. Sure, there
might be gaps left, that usually means there is something missing
in the single TU case as well so a new attribute is in order.

>> FWIW, if all functions are `optnone`, `noinline` and `noipa` are
>> not needed, that is the -O0 case. More specifically, if your caller
>> is `optnone`, `noinline` and `noipa` are not needed (for that caller).
> Right - though LTO is the case that motivated adding optnone for -O0,
> so it would be respected even under LTO - so in that case we'd want
> noinline and noipa, by the sounds of it.

If you run only one file with -O0 and then LTO it with other files
that do not have -O0, you probably want to add `noinline` + `noipa`
at the "entry points" that you want to debug. I think -O0 could
reasonably add all three arguments anyway, if you say O0 they all
make sense (to me). If you want more control, you need to seed them
manually.

~ Johannes

>
> - Dave
>
>> ~ Johannes
>>
>>
>>> On Thu, Apr 22, 2021 at 9:23 AM Johannes Doerfert
>>> <johannesdoerfert at gmail.com> wrote:
>>>> On 4/22/21 11:05 AM, paul.robinson at sony.com wrote:
>>>>>> On 4/22/21 9:43 AM, paul.robinson at sony.com wrote:
>>>>>>>> That said, I believe it is a mistake that `optnone` requires
>>>>>>>> `noinline`. There is no reason for it to do so on the IR level.
>>>>>>>> If you argue C-level `optnone` should imply `noinline`, that is
>>>>>>>> a something worth discussing, though on the IR level we can
>>>>>>>> decouple them. Use case, for example, the not-optimized version
>>>>>>>> is called from functions that are `optnone` themselves while
>>>>>>>> other call sites are inlined and your function is optimized. So
>>>>>>>> you can use the two attributes to do context sensitive `optnone`.
>>>>>>> The original intent for `optnone` was to imitate the -O0 pipeline
>>>>>>> to the extent that was feasible.  The -O0 pipeline (as constructed
>>>>>>> by Clang) runs just the always-inliner, not the regular inliner;
>>>>>>> so, functions marked `optnone` should not be inlined.  The way
>>>>>>> to achieve that effect most simply is to have `optnone` require
>>>>>>> `noinline` and that's what we did.
>>>>>>>
>>>>>>> If we have `optnone` stop requiring `noinline` and teach the
>>>>>>> inliner to inline an `optnone` callee only into an `optnone` caller,
>>>>>>> then we are violating the intent that `optnone` imitate -O0, because
>>>>>>> that inlining would not have happened at -O0.
>>>>>> I think I initially read this wrong, hence the part below.
>>>>>> After reading it again, I have one question: Why would the
>>>>>> inliner inline something that is not `always_inline` into
>>>>>> an `optnone` caller? That would violate the idea of `optnone`,
>>>>>> IMHO, regardless if the callee is `optnone` or not. That is
>>>>>> why I don't believe `noinline` on the callee is necessary
>>>>>> for your use case.
>>>>> The inliner should be ignoring `optnone` callers, so it would
>>>>> never inline *anything* into an `optnone` caller.  (Other than
>>>>> an `alwaysinline` function.)
>>>> My point is, it already does ignore `optnone` callers
>>>> and inlines only `alwaysinline` calls into them:
>>>>
>>>> https://clang.godbolt.org/z/fznbjTEd5
>>>>
>>>>
>>>>> I had read this:
>>>>>
>>>>>>>> I believe it is a mistake that `optnone` requires `noinline`.
>>>>> and the case that came to mind is inlining an `optnone` callee
>>>>> into a not-`optnone` caller.  The inlined copy would then be
>>>>> treated to further optimization, which violates the idea of
>>>>> `optnone`.
>>>> But that is a composition issue. If you do not want to
>>>> inline a `optnone` callee into an non-`optnone` caller,
>>>> then add `noinline` to the callee. If you don't mind if
>>>> it is inlined into non-`optnone` callers and optimized
>>>> in there, then don't. My last email contained an example
>>>> to show the different cases, you can mix and match IR
>>>> attributes to get what you want. Requiring them to be
>>>> tied is not improving anything but just restricting the
>>>> options.
>>>>
>>>>
>>>>> Now, the inliner already knows to avoid `noinline` callees, so
>>>>> attaching `noinline` to `optnone` functions was (at the time)
>>>>> considered an optimal way to avoid the problematic case.  We
>>>>> could instead teach the inliner to skip `optnone` callees, and
>>>>> that would allow us to eliminate the requirement that `optnone`
>>>>> functions must also be `noinline`.  I am unclear why redefining
>>>>> `optnone` to _imply_ `noinline` (rather than _require_ `noinline`)
>>>>> is better, but then I don't work much with attributes.
>>>> The inliner will not inline calls into an `optnone` caller
>>>> if it is not necessary. As said before, that would violate
>>>> the `optnone` idea for the caller, no matter what the callee
>>>> looks like. So requiring  `noinline` on the callee seems
>>>> to me like a workaround or an oversight.
>>>>
>>>> It is better to not require them together because you can
>>>> actually describe more distinct scenarios. Please take
>>>> another look at my example in the last email, it shows
>>>> what is possible if you split them. Furthermore, `optnone`
>>>> does by design imply `noinline` for the call sites in the
>>>> caller, or at least nobody argued that it shouldn't. Thus,
>>>> requiring `noinline` on the callee is simply unnecessary
>>>> as it does not add any value.
>>>>
>>>>
>>>>> The notion of allowing an `optnone` caller to inline an `optnone`
>>>>> callee sounds like it would also violate the intent of `optnone`
>>>>> in that it should imitate -O0, where inlining is confined to
>>>>> `alwaysinline` callees, and `optnone` is defined to conflict with
>>>>> `alwaysinline` (because if you always inline something, you are
>>>>> allowing it to have subsequent optimizations same as the caller,
>>>>> which conflicts with `optnone`).
>>>> Nobody said `optnone` callers should inline calls that are
>>>> not always_inline, at least so far I have not seen that
>>>> argument be made anywhere. I'll just skip this paragraph.
>>>>
>>>>
>>>>> So, if you want to undo the _requirement_ that `optnone` must
>>>>> have `noinline`, but then redefine `optnone` such that it can't
>>>>> be inlined anywhere, you've done something that seems to have no
>>>>> practical effect.  Maybe that helps Attributor in some way, but
>>>>> I don't see any other reason to be making this change.
>>>> I do not want to say `optnone` cannot be inlined. `noinline`
>>>> says it cannot be inlined. If you want it to not be inlined,
>>>> use `noinline`, if you want it to not be optimized in it's
>>>> own function, use `optnone`, if you want both, use both.
>>>>
>>>> The practical effect was literally showcased in my last email,
>>>> please go back and look at the example.
>>>>
>>>> I don't know why the Attributor has to do with this, I'm happy
>>>> to hear your thoughts on that though :)
>>>>
>>>> ~ Johannes
>>>>
>>>>
>>>>> --paulr
>>>>>