[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

Mon Apr 19 14:41:33 PDT 2021

On Sun, Apr 18, 2021 at 10:30 PM Johannes Doerfert
<johannesdoerfert at gmail.com> wrote:
>
>
> On 4/18/21 10:51 PM, David Blaikie wrote:
> > On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert <
> > johannesdoerfert at gmail.com> wrote:
> >
> >> I'm very much in favor of `noipa`. It comes up every few months
> >> and it would be widely useful.
> >
> > Out of curiosity, what sort of uses do you have in mind for it?
>
> Most times people basically want `noinline` to also mean "no
> interprocedural optimization", but without `optnone`. So, your
> function is optimized but actually called and the call result
> is used, no constants are propagated etc.
>
> Example:
>
> ```
> __attribute__((noipa))
> void foo() { return 1 + 2; }
> void bar() { return foo(); }
> ```
> should become
>
> ```
> __attribute__((noipa))
> void foo() { return 3; }
> void bar() { return foo(); }
> ```
> which it does not right now.

I'm curious what the use case is you've come across (the justification
for the GCC implementation of noipa was mostly for compiler testing -
which is my interest in having these semantics (under optnone or
otherwise) - so just curious what other use cases I should have in
mind, etc)

> >> I'd expose it via Clang and -O0 could
> >> set it as well (for the LTO case).
> >>
> >> When it comes to inexact definitions, optnone functions, and existing
> >> attributes,
> >> I'd be in favor of 1) always allowing the use of existing attributes,
> >>
> > I'm not sure what you mean by this ^ - could you rephrase/elaborate?
> >
> >
> >> and 2) not deriving new ones for an inexact or optnone definition.
> >>
> > Also this ^ I'm similarly confused/unclear about.
>
> So if you have a call of F, and F has attribute A, we can use
> that fact at the call site, regardless of the definition of F.
> F could be `optnone` or with non-exact linkage, but the information
> attached to it is still usable.

+1 SGTM.

> If we go for the above we can never derive/attach information
> for a non-exact linkage definitions. That way we prevent IPO from
> using information that might be invalid if the definition is replaced.

Yup, sounds good.

> It is all about where you disturb the ipo deduction in this case, I think
> it is more beneficial to not attach new things but an argument could be
> made to allow that but no propagation.

Allow adding them, but never using them? Yeah, that doesn't seem
especially helpful/useful - the attributes are entirely for IPO, so if
you want to block IPO it seems best not to add them.

> Both have benefits, its' not 100%
> clear what is more desirable at the end of the day.
>
>
> >
> >
> >> This is how the Attributor determines if it a function level attribute
> >> could
> >> be derived or if we should only stick with the existing information:
> >>
> >>       /// Determine whether the function \p F is IPO amendable
> >>       ///
> >>       /// If a function is exactly defined or it has alwaysinline attribute
> >>       /// and is viable to be inlined, we say it is IPO amendable
> >>       bool isFunctionIPOAmendable(const Function &F) {
> >>         return F.hasExactDefinition() ||
> >> InfoCache.InlineableFunctions.count(&F);
> >>       }
> >>
> >> So, if the above check doesn't hold we will not add new attributes but
> >> we will
> >> still use existing ones. This seems to me the right way to allow
> >> users/frontends
> >> to provide information selectively.
> >>
> > Yep, that sounds right to me (if you put attributes on an optnone/noipa
> > function, they should be usable/used - but none should be discovered/added
> > later by inspection of the implementation of such a function) - currently
> > doesn't seem to be the case for the (old pass manager?) FunctionAttrs pass,
> > so I have to figure some things out there.
>
> That is what I tried to say above, I think.
>
> In the end, I want to know that foo does not access memory but
> bar could for all we know:
>
> ```
> __attribute__((pure, optnone))         // or non-exact linkage
> void pure_optnone() { /* empty */ }
>
> __attribute__((optnone))               // or non-exact linkage
> void optnone() { /* empty */ }
>
> void foo() { pure_optnone(); }
>
> void bar() { optnone(); }
> ```

Got it,

I'll see about posting an implementation of noipa and switching
__attribute__((optnone)) over to lower to LLVM's optnone+noipa rather
than optnone+noinline.

Happy if someone wants to add clang support for an
__attribute__((noipa)) lowering to that LLVM noipa once it's in (maybe
I'll do it, guess it's probably fairly cheap/easy).

- Dave

> ~ Johannes
>
>
> >
> >
> >> That said, right now the Attributor will not propagate any information
> >> from an
> >> optnone function or derive new information. Nevertheless, I'd be in
> >> favor to allow
> >> existing information to be used for IPO.
> >>
> > *nod* I think I'm with you there.
> >
> > - Dave
> >
> >
> >> ~ Johannes
> >>
> >>
> >> On 4/18/21 8:40 PM, David Blaikie via llvm-dev wrote:
> >>> Prototyping the idea of "isDefinitionExact" returning false for optnone
> >>> (whether or not we split it out into noipo or not) I've tripped over
> >>> something it seems I created 5 years ago:
> >>>
> >>> I added some IPC support for optnone to GlobalsModRef:
> >>>
> >> https://github.com/llvm/llvm-project/commit/c662b501508200076e581beb9345a7631173a1d8#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1
> >>> - so it wouldn't conclude properties of an optnone function.
> >>>
> >>> But I then made a follow-up commit (without a lot of context as to why,
> >>> unfortunately :/ ) that allowed GlobasModRef to use existing attributes
> >> on
> >>> an optnone function:
> >>>
> >> https://github.com/llvm/llvm-project/commit/7a9b788830da0a426fb0ff0a4cec6d592bb026e9#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1
> >>> But it seems making the function definition inexact, breaks the unit
> >>> testing added in the latter commit. I suppose then it's an open question
> >>> whether existing attributes on an inexact definition should be used at
> >> all?
> >>> (I don't know what motivated me to support them for optnone)
> >>>
> >>> Oh, and here's a change from Chandler around the same time similarly
> >>> blocking some ipo for optnone:
> >>>
> >> https://github.com/llvm/llvm-project/commit/0fb998110abcf3d67495d12f854a1576b182d811#diff-cc618a9485181a9246c4e0367dc9f1a19d3cb6811d1e488713f53a753d3da60c
> >>> - in this case preventing FunctionAttrs from deriving the attributes for
> >> an
> >>> optnone function. That functionality looks like it can be subsumed by the
> >>> inexact approach - applying inexact to optnone and removing the change in
> >>> Chandler's patch still passes the tests. (hmm, tested - not quite, but
> >> more
> >>> work to do there)
> >>>
> >>> On Sun, Apr 18, 2021 at 10:06 AM David Blaikie <dblaikie at gmail.com>
> >> wrote:
> >>>> On Sun, Apr 18, 2021 at 9:43 AM Roman Lebedev <lebedev.ri at gmail.com>
> >>>> wrote:
> >>>>
> >>>>> There's 'noipa' attribute in GCC, currently it is not supported by
> >> clang.
> >>>>> Theoretically, how would one implement it?
> >>>>>
> >>>> If we wanted to do this really robustly, I guess we might have to
> >>>> introduce some sort of "here's the usual way to check if this is a
> >>>> definition/get the body of the function" (which for noipa it says
> >> "there is
> >>>> no body/don't look here") and "no, really, I need the definition" (for
> >>>> actual code generation).
> >>>>
> >>>> Though I'm not advocating for that - I'm OK with a more
> >> ad-hoc/best-effort
> >>>> implementation targeting the -O0/debugging assistant
> >>>> __attribute__((optnone)) kind of use case - happy to fix cases as they
> >> come
> >>>> up to improve the user experience for these situations.
> >>>>
> >>>> Maybe we could get away with generalizing this by having an optnone (or
> >>>> noipa) function appear "interposable" even though it doesn't have a real
> >>>> interposable linkage? That should hinder/disable any IPA.
> >>>>
> >>>> Hmm, looks like GlobalValue::isDefinitionExact would be best to return
> >>>> false in this case (whatever we end up naming it) /maybe/
> >>>> mayBeDerefined should return false too.
> >>>>
> >>>> Yeah, I guess if we can implement such a robust generalization, then
> >> it'd
> >>>> probably be OK/easy enough to implement both noipa and optnone implies
> >>>> noipa the same as it implies noinline (well, I guess noipa would subsume
> >>>> the noinline implication - if the function isn't exact, the inliner
> >> won't
> >>>> inline it so there wouldn't be any need for the explicit noinline)
> >>>>
> >>>>
> >>>>> With your proposal, clang `noipa` attribute could be lowered
> >>>>> to `optnone` on the whole function, To me that seems like
> >>>>> too much of a hammer, should that be the path forward.
> >>>>>
> >>>> I agree that lowering noipa to optnone would be a very aggressive form
> >> of
> >>>> noipa - likely if we want to support noipa it would be to support it
> >>>> separately and maybe either lower -O0 (& maybe
> >> __attribute__((optnone))) to
> >>>> both optnone+noipa+noinline (since optnone already implies noinline) or
> >>>> make optnone imply ipa/be a superset of it implicitly (if we do have
> >> noipa
> >>>> it's probably best to have "optnone requires noipa" the same way
> >> "optnone
> >>>> requires noinline" rather than an implicit superset sort of thing).
> >>>>
> >>>> I think that'd certainly be appropriate for -O0, and I'd argue it'd be
> >>>> appropriate for __attribute__((optnone)) because I think it'd be what
> >>>> people expect/is consistent with the motivation for the attribute (for
> >>>> debuggability - so you wouldn't want a caller to not fill in
> >>>> parameters/pass in garbage because it knows the implementation doesn't
> >>>> matter, or not use the result because it knows what the result should
> >> be).
> >>>>
> >>>>> Would it not be best to not conflate the two,
> >>>>> and just introduce the `noipa` attribute?
> >>>>>
> >>>> I think we'd still want to conflate them for user-facing functionality,
> >>>> even if they were separable at the IR level.
> >>>>
> >>>> - Dave
> >>>>
> >>>>
> >>>>> Roman
> >>>>>
> >>>>> On Sun, Apr 18, 2021 at 7:37 PM David Blaikie <dblaikie at gmail.com>
> >> wrote:
> >>>>>> While trying to reproduce some debug info thing (I don't have the
> >> exact
> >>>>> example at the moment - but I think it was more aggressive than the
> >> example
> >>>>> I have now, but something like this:
> >>>>>> __attribute__((optnone)) int f1() {
> >>>>>>     return 3;
> >>>>>> }
> >>>>>> int main() {
> >>>>>>     return f1();
> >>>>>> }
> >>>>>>
> >>>>>>
> >>>>>> (actually I think in my case I had a variable to hold the return value
> >>>>> from f1, with the intent that this variable's location couldn't use a
> >>>>> constant - a load from a volatile variable would probably have provided
> >>>>> similar functionality in this case)
> >>>>>> LLVM (& specifically Sparse Conditional Constant Propagation,
> >>>>> llvm/lib/Transforms/Scalar/SCCP.cpp) optimizes this code noting that f1
> >>>>> always returns 3, so rather than using the return value from the call
> >> to
> >>>>> f1, it ends up hardcoding the return value:
> >>>>>> define dso_local i32 @main() local_unnamed_addr #1 {
> >>>>>>
> >>>>>> entry:
> >>>>>>
> >>>>>>     %call = tail call i32 @_Z2f1v()
> >>>>>>
> >>>>>>     ret i32 3
> >>>>>>
> >>>>>> }
> >>>>>>
> >>>>>>
> >>>>>> I consider this a bug - in that optnone is used to implement -O0 for
> >>>>> LTO, so it seemed to me that the correct behavior is for an optnone
> >>>>> function to behave as though it were compiled in another object file
> >>>>> outside the purview of optimizations - interprocedural or
> >> intraprocedural.
> >>>>>> So I sent https://reviews.llvm.org/D100353 to fix that.
> >>>>>>
> >>>>>> Florian pointed out that this wasn't quite specified in the LangRef,
> >>>>> which says this about optnone:
> >>>>>> This function attribute indicates that most optimization passes will
> >>>>> skip this function, with the exception of interprocedural optimization
> >>>>> passes. Code generation defaults to the “fast” instruction selector.
> >> This
> >>>>> attribute cannot be used together with the alwaysinline attribute; this
> >>>>> attribute is also incompatible with the minsize attribute and the
> >> optsize
> >>>>> attribute.
> >>>>>> This attribute requires the noinline attribute to be specified on the
> >>>>> function as well, so the function is never inlined into any caller.
> >> Only
> >>>>> functions with the alwaysinline attribute are valid candidates for
> >> inlining
> >>>>> into the body of this function.
> >>>>>> So the spec of optnone is unclear (or arguably explicitly disallows)
> >>>>> whether interprocedural optimizations should treat optnone functions
> >> in any
> >>>>> particular way.
> >>>>>> So I was going to update the wording to rephrase this to say
> >>>>> "Interprocedural optimizations should treat this function as though it
> >> were
> >>>>> defined in an isolated module/object." (perhaps "interprocedural
> >>>>> optimizations should treat optnone functions as opaque" or "as though
> >> they
> >>>>> were only declarations")
> >>>>>> The choice of this direction was based on my (possibly incorrect or
> >>>>> debatable) understanding of optnone, that it was equivalent to the
> >> function
> >>>>> being in a separate/non-lto object. (this seems consistent with the way
> >>>>> optnone is used to implement -O0 under lto - you could imagine a user
> >>>>> debugging a binary, using -O0 for the code they're interested in
> >> debugging,
> >>>>> and potentially using an interactive debugger to change some state in
> >> the
> >>>>> function causing it to return a different value - which would get quite
> >>>>> confusing if the return value was effectively hardcoded into the
> >> caller)
> >>>>>> What're folks thoughts on this?
> >>>>>>
> >>>>>> - Dave
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev