[PATCH] Teach DeadArgElimination not to eliminate return values of functions with 'returned' arguments

Wed Jun 26 00:41:37 PDT 2013

Stephen Lin wrote:
> On Mon, Jun 24, 2013 at 2:38 PM, Stephen Lin<swlin at post.harvard.edu>  wrote:
>> On Mon, Jun 24, 2013 at 11:23 AM, Stephen Lin<swlin at post.harvard.edu>  wrote:
>>>> No, I don't have any alternatives, other than interprocedural coordination
>>>> of code generation. Do you have any alternatives?
>>>
>>> The only thing I can think of is a few more heuristics to avoid
>>> keeping the return value in cases where it is "obviously" not useful
>>> to the caller.
>>
>> After some thought, my feeling is that some other pass (or passes),
>> run before DeadArgumentElimination, should:
>>
>> 1. Infer 'returned' where it is implied by the callee's IR and looks
>> like it could be useful in the caller(s) (based on some
>> target-independent heuristics, which might not be 100% accurate but
>> can make a reasonable guess)
>> 2. Remove 'returned' if it's present but obviously not useful to any
>> caller (note that this itself won't eliminate the return value, only
>> the argument's 'returned' attribute, so wouldn't require the function
>> to be internalized.)
>> 3. Canonicalize uses of return values and uses of 'returned' arguments
>> dominated by the call (in some manner TBD)
>>
>> In other words, 'returned' should be treated as a hint that should
>> only be added by a pass if it has positive reason to believe that it
>> will be useful, and can be removed if it looks like it will not be,
>> rather than just being automatically applied automatically wherever
>> the semantics hold. I think this is the best compromise approach that
>> will allow the attribute to be taken advantage of without forcing
>> inter-procedural coordination and complicated control flow analysis at
>> code gen.
>>
>> Given that the 'returned' attribute will, for the near future, only
>> generated by clang in a specific place, this doesn't seem necessary
>> yet, but should be done eventually. In the meantime, I think the
>> current patch is the best tactical approach to at least get the
>> attribute working for ARM C++ ABI 'this'-returns (in particular,
>> allowing elision of 'this' save/restores across constructor or
>> destructor calls, which the current front-end only implementation does
>> not do) without breaking.
>
> Anyone still have a significant objection to the patch given the
> above? I would like to get this in, if not...

Well, yes. I'm not here to veto the patch, but I am going to take you up 
on your offer to raise an objection. :) I've been holding off on 
commenting further because I don't yet have the constructive suggestions 
that ought to follow my objection, and I think I might if I thought 
about it longer.

Let's explore the things we could do with the 'returned' attribute. If 
we have internal-linkage functions that return void, we could change 
them to return one of their arguments and mark it 'returned'. When is 
this profitable? You could even have some heuristics to decide which one 
of the arguments is most likely to be useful to the caller, and put 
'returned' on that one.

There's no reason that we need to limit ourselves to a single register. 
Keep the void return type, and let the x86 calling convention have a 
rule where we use AX, BX, CX, DX in that order for up to four 'returned' 
arguments. When is it profitable? You might not want to use all four if 
the caller has higher register pressure. It depends on the register 
pressure in the caller, and register allocation particulars in the callee.

The 'returned' attribute is interprocedural register allocation done at 
the IR level, encoded with parameter attributes.

This is a major architectural change to llvm. I don't think anybody 
realized the implications of the 'returned' patch when it was under review.

We don't have the necessary framework to decide when or where 'returned' 
belongs. We can't know whether it will require expensive extra copies in 
the callee or if the caller will be under high register pressure, not at 
the IR level. Heuristics would be an approximation of codegen, with a 
tradeoff in complexity vs. accuracy  equal to how much of codegen we 
want to reimplement.

I do understand how it solves your problem in the cross-TU case, where 
you have this great ABI guarantee that the argument gets returned, which 
you can use in the register allocator. It isn't even a 
calling-convention on its own, it's more of a modifier that can be 
applied to any calling convention. I can see how it ended up as a 
parameter attribute, much like zeroext is a ABI-level parameter 
attribute. But once we start to apply its semantics in the same-TU case, 
it becomes an llvm IR codified hack to work around the lack of 
interprocedural register allocation. That's not okay.

Anyhow, as I said I'm not going to block this patch because I don't have 
ideas on what would be a better way to do it, but please get an OK from 
Dan first.

Nick