[llvm-dev] Revisiting/refining the definition of optnone with interprocedural transformations

Thu Apr 22 16:00:32 PDT 2021

Hi Johannes,

I've taken some time to try to understand your viewpoint,
and I will give some more of the history as I remember it;
hopefully that will help.  And a suggestion at the end.

> The point is, I never suggested to change the meaning of
> `optnone` (in IR). I argue to change the requirement for
> it to always go with `noinline`. `optnone` itself is not
> changed, you get exactly the same behavior you got before,
> and `noinline` is also not changed. They are simply not
> required to go together.

Okay.

I can understand looking at the as-implemented handling of
'optnone' and taking that as the intended meaning. So from that 
perspective, lacking the history, it does look to you like you are 
not suggesting a change in its meaning.

Actually, removing the requirement to tie them together *is* 
something that I would consider a semantic change, will require an 
update to the LangRef and verifier, and so on.  This would be more 
obvious if we had originally done either of two things with the 
same net semantic effect as we have now:
- make optnone *imply* noinline (like 'naked' does IIUC) instead
  of *requiring* it.
- make the inliner check for optnone on the callee, instead of 
  simply tying the two attributes together.

I hope this helps explain why I see optnone+noinline as simply
two parts of one unified feature.

History:

The meaning of 'optnone' was always intended as, don't optimize,
in as many ways as we can manage.  I'd rather have not had the 
coupling with noinline, but it was the best way forward at the 
time to achieve the effect we needed.  I spent too many months of 
my life getting this accepted at all...
Maybe my definition of optnone in the LangRef is inadequate; but 
I am quite sure I understand the original intent.

Which includes this:

When it comes to the interaction of optnone and inlining, there 
were of course four cases to consider: caller and callee, and each 
might or might not have optnone.

1) caller - N, callee - N
2) caller - N, callee - Y
3) caller - Y, callee - N
4) caller - Y, callee - Y

1) normal case. Inlining and other opts are okay.
2) callee is optnone, so we didn't want it inlined and optimized.
3) caller is optnone, so no inlining happens.
4) caller is optnone, so no inlining happens.

Cases 3/4 are handled by checking for optnone in the inliner pass,
just like a hundred other passes do.  This is boilerplate and was 
added in bulk to all those passes.  Having it all be boilerplate 
was important in getting the reviews accepted.

Case 2 was handled by coupling optnone to noinline.  As I said,
this was a practical thing done at the time, not because we ever 
intended the semantics of optnone to mean anything else.  It got 
the overall feature accepted, which was the key thing for Sony.

So, what we implemented achieved the effect we wanted, even if
that implementation didn't mean the new attribute *by itself* had 
exactly all the effects we wanted.

Yes, decoupling optnone (as implemented) from noinline would allow
case 2 to inline bar into foo, and optimize the result, if that 
somehow seems desirable.  But, that result is very much against 
the original intent of optnone, and is why I have been giving you 
such a hard time about this.

I will continue to insist that something called 'optnone' cannot
properly fail to apply to all instances, inlined or not.  But if
we're willing to rename the attribute, that problem is solved.

I'm assuming there would be resistance to doing either of the two
things I mentioned at the top (make optnone *imply* noinline, or
modify the inliner to check for optnone on the callee) as that
doesn't seem to be the direction people are moving in.

So the suggestion is:
- reword the definition of optnone to be something that would be
  better named "nolocalopt";
- ideally, actually rename the attribute (because I still say
  that "none" does not mean "unless we've inlined it");
- and yes, if you must, decouple it from noinline and remove that 
  paragraph from the LangRef description.

Clang will still pass both IR attributes, so end users won't see
any feature regressions.  David can add 'noipa' and we can make
Clang pass that as well.
--paulr