[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 21 19:29:24 PDT 2021


Implemented a first-pass at adding noipa IR/bitcode and the basic
functionality, noipa implying "may be unrefined"/not
is-definition-exact. https://reviews.llvm.org/D101011

On Mon, Apr 19, 2021 at 4:32 PM Johannes Doerfert
<johannesdoerfert at gmail.com> wrote:
>
>
> On 4/19/21 4:41 PM, David Blaikie wrote:
> > On Sun, Apr 18, 2021 at 10:30 PM Johannes Doerfert
> > <johannesdoerfert at gmail.com>  wrote:
> >> On 4/18/21 10:51 PM, David Blaikie wrote:
> >>> On Sun, Apr 18, 2021 at 8:29 PM Johannes Doerfert <
> >>> johannesdoerfert at gmail.com> wrote:
> >>>
> >>>> I'm very much in favor of `noipa`. It comes up every few months
> >>>> and it would be widely useful.
> >>> Out of curiosity, what sort of uses do you have in mind for it?
> >> Most times people basically want `noinline` to also mean "no
> >> interprocedural optimization", but without `optnone`. So, your
> >> function is optimized but actually called and the call result
> >> is used, no constants are propagated etc.
> >>
> >> Example:
> >>
> >> ```
> >> __attribute__((noipa))
> >> void foo() { return 1 + 2; }
> >> void bar() { return foo(); }
> >> ```
> >> should become
> >>
> >> ```
> >> __attribute__((noipa))
> >> void foo() { return 3; }
> >> void bar() { return foo(); }
> >> ```
> >> which it does not right now.
> > I'm curious what the use case is you've come across (the justification
> > for the GCC implementation of noipa was mostly for compiler testing -
> > which is my interest in having these semantics (under optnone or
> > otherwise) - so just curious what other use cases I should have in
> > mind, etc)
>
> I looked for `noipa` in my inbox, here are some results that
> show different use cases people brought up since March 2020:
>
> https://reviews.llvm.org/D75815#1939277
> https://bugs.llvm.org/show_bug.cgi?id=46463
> https://reviews.llvm.org/D93838#2472155
> https://reviews.llvm.org/D97971#2608302
>
> Another use case is runtime call detection in the presence of definitions.
> So, we detect `malloc` and also various OpenMP runtime calls, which works
> fine because those are usually declarations. However, sometimes they are
> not and then we can easily end up with signatures that do not match what we
> expect anymore. At least that happens if we link in the OpenMP GPU runtime
> into an application.

Ah, thanks for all the links/context!

> >>>> I'd expose it via Clang and -O0 could
> >>>> set it as well (for the LTO case).
> >>>>
> >>>> When it comes to inexact definitions, optnone functions, and existing
> >>>> attributes,
> >>>> I'd be in favor of 1) always allowing the use of existing attributes,
> >>>>
> >>> I'm not sure what you mean by this ^ - could you rephrase/elaborate?
> >>>
> >>>
> >>>> and 2) not deriving new ones for an inexact or optnone definition.
> >>>>
> >>> Also this ^ I'm similarly confused/unclear about.
> >> So if you have a call of F, and F has attribute A, we can use
> >> that fact at the call site, regardless of the definition of F.
> >> F could be `optnone` or with non-exact linkage, but the information
> >> attached to it is still usable.
> > +1 SGTM.
> >
> >> If we go for the above we can never derive/attach information
> >> for a non-exact linkage definitions. That way we prevent IPO from
> >> using information that might be invalid if the definition is replaced.
> > Yup, sounds good.
> >
> >> It is all about where you disturb the ipo deduction in this case, I think
> >> it is more beneficial to not attach new things but an argument could be
> >> made to allow that but no propagation.
> > Allow adding them, but never using them? Yeah, that doesn't seem
> > especially helpful/useful - the attributes are entirely for IPO, so if
> > you want to block IPO it seems best not to add them.
>
> We could use them *inside* the function, but we can make that work
> differently as well. IPO seems the more important target.

Ah, right. Yeah, agreed.

> >> Both have benefits, its' not 100%
> >> clear what is more desirable at the end of the day.
> >>
> >>
> >>>> This is how the Attributor determines if it a function level attribute
> >>>> could
> >>>> be derived or if we should only stick with the existing information:
> >>>>
> >>>>        /// Determine whether the function \p F is IPO amendable
> >>>>        ///
> >>>>        /// If a function is exactly defined or it has alwaysinline attribute
> >>>>        /// and is viable to be inlined, we say it is IPO amendable
> >>>>        bool isFunctionIPOAmendable(const Function &F) {
> >>>>          return F.hasExactDefinition() ||
> >>>> InfoCache.InlineableFunctions.count(&F);
> >>>>        }
> >>>>
> >>>> So, if the above check doesn't hold we will not add new attributes but
> >>>> we will
> >>>> still use existing ones. This seems to me the right way to allow
> >>>> users/frontends
> >>>> to provide information selectively.
> >>>>
> >>> Yep, that sounds right to me (if you put attributes on an optnone/noipa
> >>> function, they should be usable/used - but none should be discovered/added
> >>> later by inspection of the implementation of such a function) - currently
> >>> doesn't seem to be the case for the (old pass manager?) FunctionAttrs pass,
> >>> so I have to figure some things out there.
> >> That is what I tried to say above, I think.
> >>
> >> In the end, I want to know that foo does not access memory but
> >> bar could for all we know:
> >>
> >> ```
> >> __attribute__((pure, optnone))         // or non-exact linkage
> >> void pure_optnone() { /* empty */ }
> >>
> >> __attribute__((optnone))               // or non-exact linkage
> >> void optnone() { /* empty */ }
> >>
> >> void foo() { pure_optnone(); }
> >>
> >> void bar() { optnone(); }
> >> ```
> > Got it,
> >
> > I'll see about posting an implementation of noipa and switching
> > __attribute__((optnone)) over to lower to LLVM's optnone+noipa rather
> > than optnone+noinline.
>
> FWIW, I think `noipa` should not imply `noinline`, unsure if you
> had that in mind or not.

Do you think it should require that noipa also carries noinline? (the
way optnone currently requires noinline) Or should we let the
non-inlining fall out naturally from the non-exact definition
property?

> > Happy if someone wants to add clang support for an
> > __attribute__((noipa)) lowering to that LLVM noipa once it's in (maybe
> > I'll do it, guess it's probably fairly cheap/easy).
>
> Agreed, I won't volunteer right now, I doubt that I'll get to it
> anytime soon. That said, I actually would like to use `noipa`, see
> above.
>
> ~ Johannes
>
>
> > - Dave
> >
> >> ~ Johannes
> >>
> >>
> >>>> That said, right now the Attributor will not propagate any information
> >>>> from an
> >>>> optnone function or derive new information. Nevertheless, I'd be in
> >>>> favor to allow
> >>>> existing information to be used for IPO.
> >>>>
> >>> *nod* I think I'm with you there.
> >>>
> >>> - Dave
> >>>
> >>>
> >>>> ~ Johannes
> >>>>
> >>>>
> >>>> On 4/18/21 8:40 PM, David Blaikie via llvm-dev wrote:
> >>>>> Prototyping the idea of "isDefinitionExact" returning false for optnone
> >>>>> (whether or not we split it out into noipo or not) I've tripped over
> >>>>> something it seems I created 5 years ago:
> >>>>>
> >>>>> I added some IPC support for optnone to GlobalsModRef:
> >>>>>
> >>>> https://github.com/llvm/llvm-project/commit/c662b501508200076e581beb9345a7631173a1d8#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1
> >>>>> - so it wouldn't conclude properties of an optnone function.
> >>>>>
> >>>>> But I then made a follow-up commit (without a lot of context as to why,
> >>>>> unfortunately :/ ) that allowed GlobasModRef to use existing attributes
> >>>> on
> >>>>> an optnone function:
> >>>>>
> >>>> https://github.com/llvm/llvm-project/commit/7a9b788830da0a426fb0ff0a4cec6d592bb026e9#diff-55664e96a7ce3533b46f12c6906acecb2bd9a599e2b79c97506af4b1b4873fa1
> >>>>> But it seems making the function definition inexact, breaks the unit
> >>>>> testing added in the latter commit. I suppose then it's an open question
> >>>>> whether existing attributes on an inexact definition should be used at
> >>>> all?
> >>>>> (I don't know what motivated me to support them for optnone)
> >>>>>
> >>>>> Oh, and here's a change from Chandler around the same time similarly
> >>>>> blocking some ipo for optnone:
> >>>>>
> >>>> https://github.com/llvm/llvm-project/commit/0fb998110abcf3d67495d12f854a1576b182d811#diff-cc618a9485181a9246c4e0367dc9f1a19d3cb6811d1e488713f53a753d3da60c
> >>>>> - in this case preventing FunctionAttrs from deriving the attributes for
> >>>> an
> >>>>> optnone function. That functionality looks like it can be subsumed by the
> >>>>> inexact approach - applying inexact to optnone and removing the change in
> >>>>> Chandler's patch still passes the tests. (hmm, tested - not quite, but
> >>>> more
> >>>>> work to do there)
> >>>>>
> >>>>> On Sun, Apr 18, 2021 at 10:06 AM David Blaikie<dblaikie at gmail.com>
> >>>> wrote:
> >>>>>> On Sun, Apr 18, 2021 at 9:43 AM Roman Lebedev<lebedev.ri at gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> There's 'noipa' attribute in GCC, currently it is not supported by
> >>>> clang.
> >>>>>>> Theoretically, how would one implement it?
> >>>>>>>
> >>>>>> If we wanted to do this really robustly, I guess we might have to
> >>>>>> introduce some sort of "here's the usual way to check if this is a
> >>>>>> definition/get the body of the function" (which for noipa it says
> >>>> "there is
> >>>>>> no body/don't look here") and "no, really, I need the definition" (for
> >>>>>> actual code generation).
> >>>>>>
> >>>>>> Though I'm not advocating for that - I'm OK with a more
> >>>> ad-hoc/best-effort
> >>>>>> implementation targeting the -O0/debugging assistant
> >>>>>> __attribute__((optnone)) kind of use case - happy to fix cases as they
> >>>> come
> >>>>>> up to improve the user experience for these situations.
> >>>>>>
> >>>>>> Maybe we could get away with generalizing this by having an optnone (or
> >>>>>> noipa) function appear "interposable" even though it doesn't have a real
> >>>>>> interposable linkage? That should hinder/disable any IPA.
> >>>>>>
> >>>>>> Hmm, looks like GlobalValue::isDefinitionExact would be best to return
> >>>>>> false in this case (whatever we end up naming it) /maybe/
> >>>>>> mayBeDerefined should return false too.
> >>>>>>
> >>>>>> Yeah, I guess if we can implement such a robust generalization, then
> >>>> it'd
> >>>>>> probably be OK/easy enough to implement both noipa and optnone implies
> >>>>>> noipa the same as it implies noinline (well, I guess noipa would subsume
> >>>>>> the noinline implication - if the function isn't exact, the inliner
> >>>> won't
> >>>>>> inline it so there wouldn't be any need for the explicit noinline)
> >>>>>>
> >>>>>>
> >>>>>>> With your proposal, clang `noipa` attribute could be lowered
> >>>>>>> to `optnone` on the whole function, To me that seems like
> >>>>>>> too much of a hammer, should that be the path forward.
> >>>>>>>
> >>>>>> I agree that lowering noipa to optnone would be a very aggressive form
> >>>> of
> >>>>>> noipa - likely if we want to support noipa it would be to support it
> >>>>>> separately and maybe either lower -O0 (& maybe
> >>>> __attribute__((optnone))) to
> >>>>>> both optnone+noipa+noinline (since optnone already implies noinline) or
> >>>>>> make optnone imply ipa/be a superset of it implicitly (if we do have
> >>>> noipa
> >>>>>> it's probably best to have "optnone requires noipa" the same way
> >>>> "optnone
> >>>>>> requires noinline" rather than an implicit superset sort of thing).
> >>>>>>
> >>>>>> I think that'd certainly be appropriate for -O0, and I'd argue it'd be
> >>>>>> appropriate for __attribute__((optnone)) because I think it'd be what
> >>>>>> people expect/is consistent with the motivation for the attribute (for
> >>>>>> debuggability - so you wouldn't want a caller to not fill in
> >>>>>> parameters/pass in garbage because it knows the implementation doesn't
> >>>>>> matter, or not use the result because it knows what the result should
> >>>> be).
> >>>>>>> Would it not be best to not conflate the two,
> >>>>>>> and just introduce the `noipa` attribute?
> >>>>>>>
> >>>>>> I think we'd still want to conflate them for user-facing functionality,
> >>>>>> even if they were separable at the IR level.
> >>>>>>
> >>>>>> - Dave
> >>>>>>
> >>>>>>
> >>>>>>> Roman
> >>>>>>>
> >>>>>>> On Sun, Apr 18, 2021 at 7:37 PM David Blaikie<dblaikie at gmail.com>
> >>>> wrote:
> >>>>>>>> While trying to reproduce some debug info thing (I don't have the
> >>>> exact
> >>>>>>> example at the moment - but I think it was more aggressive than the
> >>>> example
> >>>>>>> I have now, but something like this:
> >>>>>>>> __attribute__((optnone)) int f1() {
> >>>>>>>>      return 3;
> >>>>>>>> }
> >>>>>>>> int main() {
> >>>>>>>>      return f1();
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> (actually I think in my case I had a variable to hold the return value
> >>>>>>> from f1, with the intent that this variable's location couldn't use a
> >>>>>>> constant - a load from a volatile variable would probably have provided
> >>>>>>> similar functionality in this case)
> >>>>>>>> LLVM (& specifically Sparse Conditional Constant Propagation,
> >>>>>>> llvm/lib/Transforms/Scalar/SCCP.cpp) optimizes this code noting that f1
> >>>>>>> always returns 3, so rather than using the return value from the call
> >>>> to
> >>>>>>> f1, it ends up hardcoding the return value:
> >>>>>>>> define dso_local i32 @main() local_unnamed_addr #1 {
> >>>>>>>>
> >>>>>>>> entry:
> >>>>>>>>
> >>>>>>>>      %call = tail call i32 @_Z2f1v()
> >>>>>>>>
> >>>>>>>>      ret i32 3
> >>>>>>>>
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I consider this a bug - in that optnone is used to implement -O0 for
> >>>>>>> LTO, so it seemed to me that the correct behavior is for an optnone
> >>>>>>> function to behave as though it were compiled in another object file
> >>>>>>> outside the purview of optimizations - interprocedural or
> >>>> intraprocedural.
> >>>>>>>> So I senthttps://reviews.llvm.org/D100353  to fix that.
> >>>>>>>>
> >>>>>>>> Florian pointed out that this wasn't quite specified in the LangRef,
> >>>>>>> which says this about optnone:
> >>>>>>>> This function attribute indicates that most optimization passes will
> >>>>>>> skip this function, with the exception of interprocedural optimization
> >>>>>>> passes. Code generation defaults to the “fast” instruction selector.
> >>>> This
> >>>>>>> attribute cannot be used together with the alwaysinline attribute; this
> >>>>>>> attribute is also incompatible with the minsize attribute and the
> >>>> optsize
> >>>>>>> attribute.
> >>>>>>>> This attribute requires the noinline attribute to be specified on the
> >>>>>>> function as well, so the function is never inlined into any caller.
> >>>> Only
> >>>>>>> functions with the alwaysinline attribute are valid candidates for
> >>>> inlining
> >>>>>>> into the body of this function.
> >>>>>>>> So the spec of optnone is unclear (or arguably explicitly disallows)
> >>>>>>> whether interprocedural optimizations should treat optnone functions
> >>>> in any
> >>>>>>> particular way.
> >>>>>>>> So I was going to update the wording to rephrase this to say
> >>>>>>> "Interprocedural optimizations should treat this function as though it
> >>>> were
> >>>>>>> defined in an isolated module/object." (perhaps "interprocedural
> >>>>>>> optimizations should treat optnone functions as opaque" or "as though
> >>>> they
> >>>>>>> were only declarations")
> >>>>>>>> The choice of this direction was based on my (possibly incorrect or
> >>>>>>> debatable) understanding of optnone, that it was equivalent to the
> >>>> function
> >>>>>>> being in a separate/non-lto object. (this seems consistent with the way
> >>>>>>> optnone is used to implement -O0 under lto - you could imagine a user
> >>>>>>> debugging a binary, using -O0 for the code they're interested in
> >>>> debugging,
> >>>>>>> and potentially using an interactive debugger to change some state in
> >>>> the
> >>>>>>> function causing it to return a different value - which would get quite
> >>>>>>> confusing if the return value was effectively hardcoded into the
> >>>> caller)
> >>>>>>>> What're folks thoughts on this?
> >>>>>>>>
> >>>>>>>> - Dave
> >>>>> _______________________________________________
> >>>>> LLVM Developers mailing list
> >>>>> llvm-dev at lists.llvm.org
> >>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


More information about the llvm-dev mailing list