[PATCH] Teach DeadArgElimination not to eliminate return values of functions with 'returned' arguments

Wed Jun 26 12:58:15 PDT 2013

On Jun 26, 2013, at 12:41 AM, Nick Lewycky <nicholas at mxc.ca> wrote:

> Stephen Lin wrote:
>> On Mon, Jun 24, 2013 at 2:38 PM, Stephen Lin<swlin at post.harvard.edu>  wrote:
>>> On Mon, Jun 24, 2013 at 11:23 AM, Stephen Lin<swlin at post.harvard.edu>  wrote:
>>>>> No, I don't have any alternatives, other than interprocedural coordination
>>>>> of code generation. Do you have any alternatives?
>>>> 
>>>> The only thing I can think of is a few more heuristics to avoid
>>>> keeping the return value in cases where it is "obviously" not useful
>>>> to the caller.
>>> 
>>> After some thought, my feeling is that some other pass (or passes),
>>> run before DeadArgumentElimination, should:
>>> 
>>> 1. Infer 'returned' where it is implied by the callee's IR and looks
>>> like it could be useful in the caller(s) (based on some
>>> target-independent heuristics, which might not be 100% accurate but
>>> can make a reasonable guess)
>>> 2. Remove 'returned' if it's present but obviously not useful to any
>>> caller (note that this itself won't eliminate the return value, only
>>> the argument's 'returned' attribute, so wouldn't require the function
>>> to be internalized.)
>>> 3. Canonicalize uses of return values and uses of 'returned' arguments
>>> dominated by the call (in some manner TBD)
>>> 
>>> In other words, 'returned' should be treated as a hint that should
>>> only be added by a pass if it has positive reason to believe that it
>>> will be useful, and can be removed if it looks like it will not be,
>>> rather than just being automatically applied automatically wherever
>>> the semantics hold. I think this is the best compromise approach that
>>> will allow the attribute to be taken advantage of without forcing
>>> inter-procedural coordination and complicated control flow analysis at
>>> code gen.
>>> 
>>> Given that the 'returned' attribute will, for the near future, only
>>> generated by clang in a specific place, this doesn't seem necessary
>>> yet, but should be done eventually. In the meantime, I think the
>>> current patch is the best tactical approach to at least get the
>>> attribute working for ARM C++ ABI 'this'-returns (in particular,
>>> allowing elision of 'this' save/restores across constructor or
>>> destructor calls, which the current front-end only implementation does
>>> not do) without breaking.
>> 
>> Anyone still have a significant objection to the patch given the
>> above? I would like to get this in, if not...
> 
> Well, yes. I'm not here to veto the patch, but I am going to take you up on your offer to raise an objection. :) I've been holding off on commenting further because I don't yet have the constructive suggestions that ought to follow my objection, and I think I might if I thought about it longer.
> 
> Let's explore the things we could do with the 'returned' attribute. If we have internal-linkage functions that return void, we could change them to return one of their arguments and mark it 'returned'. When is this profitable? You could even have some heuristics to decide which one of the arguments is most likely to be useful to the caller, and put 'returned' on that one.
> 
> There's no reason that we need to limit ourselves to a single register. Keep the void return type, and let the x86 calling convention have a rule where we use AX, BX, CX, DX in that order for up to four 'returned' arguments. When is it profitable? You might not want to use all four if the caller has higher register pressure. It depends on the register pressure in the caller, and register allocation particulars in the callee.
> 
> The 'returned' attribute is interprocedural register allocation done at the IR level, encoded with parameter attributes.
> 
> This is a major architectural change to llvm. I don't think anybody realized the implications of the 'returned' patch when it was under review.

FYI, this was discussed inside the Apple team and Chris has approved this approach. Also, I'm not sure why you consider this a "major architectural change".

I agree there are opportunities to further expand this optimization. But I'm not sure that's the relevant discussion here. The question is whether the approach for the most common case (which is important for C++ and x86, arm calling conventions).

> 
> We don't have the necessary framework to decide when or where 'returned' belongs. We can't know whether it will require expensive extra copies in the callee or if the caller will be under high register pressure, not at the IR level. Heuristics would be an approximation of codegen, with a tradeoff in complexity vs. accuracy  equal to how much of codegen we want to reimplement.
> 
> I do understand how it solves your problem in the cross-TU case, where you have this great ABI guarantee that the argument gets returned, which you can use in the register allocator. It isn't even a calling-convention on its own, it's more of a modifier that can be applied to any calling convention. I can see how it ended up as a parameter attribute, much like zeroext is a ABI-level parameter attribute. But once we start to apply its semantics in the same-TU case, it becomes an llvm IR codified hack to work around the lack of interprocedural register allocation. That's not okay.

We shouldn't require knowledge of the target and the calling conventions to perform optimization at the IR level. I believe the important thing here is for DAE to be consistent. It should either keep the argument and the "returned" attribute or remove the argument as well as the attribute. The rest of the work should be left to codegen. In today's implementations, the frontend is responsible for some of the ABI lowering. It has detailed knowledge about target calling conventions. It's a reasonable and obvious place to decide on where to place the "returned" attribute.

> 
> Anyhow, as I said I'm not going to block this patch because I don't have ideas on what would be a better way to do it, but please get an OK from Dan first.

This is a long thread so I didn't read all of it. If I understand correctly, Dan's proposal is to change the function prototype to return void and recognize the optimization opportunity at codegen. At the codegen level, the handling of function arguments and return values are scattered all over the place. It's going to add more complexity and I really do not see how it would make the optimization more powerful.

Evan

> 
> Nick
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits