[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations
llvm-dev at lists.llvm.org
Thu Apr 22 16:00:32 PDT 2021
I've taken some time to try to understand your viewpoint,
and I will give some more of the history as I remember it;
hopefully that will help. And a suggestion at the end.
> The point is, I never suggested to change the meaning of
> `optnone` (in IR). I argue to change the requirement for
> it to always go with `noinline`. `optnone` itself is not
> changed, you get exactly the same behavior you got before,
> and `noinline` is also not changed. They are simply not
> required to go together.
I can understand looking at the as-implemented handling of
'optnone' and taking that as the intended meaning. So from that
perspective, lacking the history, it does look to you like you are
not suggesting a change in its meaning.
Actually, removing the requirement to tie them together *is*
something that I would consider a semantic change, will require an
update to the LangRef and verifier, and so on. This would be more
obvious if we had originally done either of two things with the
same net semantic effect as we have now:
- make optnone *imply* noinline (like 'naked' does IIUC) instead
of *requiring* it.
- make the inliner check for optnone on the callee, instead of
simply tying the two attributes together.
I hope this helps explain why I see optnone+noinline as simply
two parts of one unified feature.
The meaning of 'optnone' was always intended as, don't optimize,
in as many ways as we can manage. I'd rather have not had the
coupling with noinline, but it was the best way forward at the
time to achieve the effect we needed. I spent too many months of
my life getting this accepted at all...
Maybe my definition of optnone in the LangRef is inadequate; but
I am quite sure I understand the original intent.
Which includes this:
When it comes to the interaction of optnone and inlining, there
were of course four cases to consider: caller and callee, and each
might or might not have optnone.
1) caller - N, callee - N
2) caller - N, callee - Y
3) caller - Y, callee - N
4) caller - Y, callee - Y
1) normal case. Inlining and other opts are okay.
2) callee is optnone, so we didn't want it inlined and optimized.
3) caller is optnone, so no inlining happens.
4) caller is optnone, so no inlining happens.
Cases 3/4 are handled by checking for optnone in the inliner pass,
just like a hundred other passes do. This is boilerplate and was
added in bulk to all those passes. Having it all be boilerplate
was important in getting the reviews accepted.
Case 2 was handled by coupling optnone to noinline. As I said,
this was a practical thing done at the time, not because we ever
intended the semantics of optnone to mean anything else. It got
the overall feature accepted, which was the key thing for Sony.
So, what we implemented achieved the effect we wanted, even if
that implementation didn't mean the new attribute *by itself* had
exactly all the effects we wanted.
Yes, decoupling optnone (as implemented) from noinline would allow
case 2 to inline bar into foo, and optimize the result, if that
somehow seems desirable. But, that result is very much against
the original intent of optnone, and is why I have been giving you
such a hard time about this.
I will continue to insist that something called 'optnone' cannot
properly fail to apply to all instances, inlined or not. But if
we're willing to rename the attribute, that problem is solved.
I'm assuming there would be resistance to doing either of the two
things I mentioned at the top (make optnone *imply* noinline, or
modify the inliner to check for optnone on the callee) as that
doesn't seem to be the direction people are moving in.
So the suggestion is:
- reword the definition of optnone to be something that would be
better named "nolocalopt";
- ideally, actually rename the attribute (because I still say
that "none" does not mean "unless we've inlined it");
- and yes, if you must, decouple it from noinline and remove that
paragraph from the LangRef description.
Clang will still pass both IR attributes, so end users won't see
any feature regressions. David can add 'noipa' and we can make
Clang pass that as well.
More information about the llvm-dev