[PATCH] Teach DeadArgElimination not to eliminate return values of functions with 'returned' arguments

Stephen Lin swlin at post.harvard.edu
Fri Jun 21 15:27:34 PDT 2013


Hi Nick,

Thanks for the feedback!

> Okay, so I haven't read the thread yet. I will. :) But my immediate reaction
> is that this sounds like a calling convention.
>
> Suppose we manage to internalize the constructor (ie., all callers are
> visible), then the calling convention can be changed to fastcc which means
> "llvm can do anything it wants, even different things for different
> functions". If there's an optimization which we could be doing, but that
> fastcc isn't doing it then we ought to fix that! That will apply to many
> more functions than just constructors and more architectures than arm.
>

Yes, in fact internally, for now, the ARM backend more or less treats
the specific case of 'this' as the first parameter as a new calling
convention, with the extra wrinkle that the same register is used to
pass an argument and is preserved by the call (that's currently not
possible via tablegen definitions).

Ideally, CodeGen would generically handle cases where a return value is
guaranteed to alias an argument (even if the two are not in the same
register), and every use of either value that is dominated by the call
would pick whichever one is cheaper given register availability, etc.
However, this would require (in Jakob's estimation, and I believe him
after looking into it a bit) a significant reworking of the register
allocator, at a minimum. It's not as simple as canonicalizing the IR to
choose one over the other; you can come up with scenarios where either
one is better, depending on other uses, the existence of intervening
calls, etc.

> If we don't internalize it, we'll see the 'armthisreturn' calling convention
> and the caller and callee both know that the argument is returned. But
> notice that in this case, it isn't possible for deadargelim to remove the
> return type, since we don't have all the callers, so deadargelim must
> conservatively assume the return value is used.
>
>
>> Admittedly, there are situations where it would be better to drop the
>> return value. For example, if the argument and return value are both
>> superfluous (i.e. the constructor/destructor doesn't do anything with
>> the 'this' pointer but pass it straight through) and the caller isn't
>> going to have the 'this' pointer of the callee handy in a register
>> anyway, then it would be better to get rid of both; however, it is
>> rare for a constructor or destructor to never use 'this'. Furthermore,
>> if there's significant register pressure in either the caller or the
>> callee, then maybe 'this' is going to have to be save/restored on both
>> sides anyway, and dropping the return value will keep it from having
>> to be restored on the callee side before being returned, etc.
>
>
> What about implicit constructors or other empty constructors? Normally those
> will get inlined, but supposing that inlining is disabled (for instance, to
> prevent code size growth which I hear is important on ARM) we'd like
> deadargelim to be able to nuke the argument and return value together. I
> think that should be a test case.

I updated my patch to remove both when the return value is the only
use of the argument (i.e., if the callee does nothing with the argument
except return it, and the return value is dead at the IR level, then
both are eliminated).
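
As a rough illustration of that rule, here is a minimal C++ sketch. It
is a toy model, not the code in the patch: Action, ReturnedArgSummary,
and decide are hypothetical names, and the summary fields stand in for
whatever liveness information the pass actually tracks.

```cpp
#include <cassert>

// Toy model only: these names are hypothetical and are not LLVM's API.
enum class Action { KeepBoth, RemoveBoth };

// Summary of one function whose argument carries the 'returned' attribute.
struct ReturnedArgSummary {
    bool returnValueDead;   // no caller uses the return value at the IR level
    unsigned otherArgUses;  // uses of the argument other than the 'ret' itself
};

// Remove the argument and the return value together only when the 'ret' is
// the argument's sole use and nobody reads the result; otherwise keep both
// so that CodeGen can still take advantage of 'returned'.
Action decide(const ReturnedArgSummary &s) {
    if (s.returnValueDead && s.otherArgUses == 0)
        return Action::RemoveBoth;
    return Action::KeepBoth;
}

// Usage example covering the three interesting cases.
void exampleDecisions() {
    assert(decide({true, 0}) == Action::RemoveBoth);  // pure pass-through
    assert(decide({true, 2}) == Action::KeepBoth);    // callee uses the argument
    assert(decide({false, 0}) == Action::KeepBoth);   // some caller reads the result
}
```

The sketch deliberately ignores everything target-specific; it only
captures the "sole use" condition described above.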

To be honest, it's not even clear that this is the right thing to do in
general: even if the constructor does not use the 'this' pointer, it's
conceivable that the caller would prefer that 'this' be preserved across
the call in R0 so that it doesn't have to save it itself. I suppose if
everything is internalized and we're using fastcc, then in theory we can
choose the registers used to pass arguments such that the 'this' pointer
doesn't have to be saved/restored across the call, whatever register it
happens to be in already. I am not yet familiar enough with that
infrastructure to know if or how well that works. But I think that, at
that point, taking away both is the more reasonable default in these
cases.

>
>
>> Unfortunately, as far as I can tell, there's not really any way for an
>> IR-level pass to detect these kinds of situations without lots of
>> target and backend-implementation specific details, and, since for
>> now, the attribute is only being used in a very specific situation
>> where (I believe) it's usually more profitable to allow code
>> generation to take advantage of the extra information provided by
>> 'returned', I decided to go with the default of leaving it in always.
>>
>> That said, if you think there's a better approach to this that's more
>> robust and retains the optimization benefits for the ARM C++ ABI,
>> please let me know!
>
>
> There's an elephant you never mentioned in this email: using the 'returned'
> attribute for IR-level optimizations. What things are possible there? If the
> answer really is "nothing" then I'm going to press that this doesn't belong
> and should be a calling convention. If there are cool things we can do with
> it at the IR-optimizing level, then that may be enough to justify its
> existence ABI issues aside.

Part of the implementation of this attribute overlaps with the
implementation of calling conventions and call lowering, but I think
the two are actually orthogonal, which is why we want an attribute. In
theory (and, in practice, with the IR attribute), you could combine any
existing calling convention with any argument position that is
guaranteed to be returned; the former describes what happens at the ABI
boundary and the latter describes semantics.

I also think there are definitely IR-level optimizations that can be
done taking advantage of the fact that an argument and return value
are aliases of each other.  The main difficulty of taking full
advantage of this at the IR level, though, is that it's non-obvious
which alias is better to choose from without information only
available fairly late in CodeGen, and there's a big risk of making
code worse rather than better by inadvertently causing an extra spill,
etc.
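
To make the aliasing concrete, here is a small source-level C++ sketch.
Widget, initWidget, and useAfterCall are hypothetical names used only
for illustration; initWidget stands in for a constructor that returns
'this'.

```cpp
#include <cassert>

// Hypothetical type and functions, for illustration only.
struct Widget {
    int value;
};

// Stands in for a constructor that returns 'this': the returned pointer is
// always the pointer that was passed in.
Widget *initWidget(Widget *self) {
    self->value = 42;
    return self;
}

int useAfterCall() {
    Widget w{0};
    Widget *alias = initWidget(&w);
    assert(alias == &w);  // the guarantee that 'returned' expresses
    // Both names refer to the same object after the call. Which one is
    // cheaper to keep live is a register-allocation question that cannot be
    // answered from the source alone.
    return alias->value + w.value;
}
```

After the call, every use of &w could be rewritten to use alias or vice
versa without changing behavior; the hard part is knowing which choice
avoids a spill.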

However, even without that, we could have an IR pass that analyzes
functions and automatically places 'returned' on arguments when the
semantics are detected, so that CodeGen can take advantage of this
information; this is much easier at the IR level than at the CodeGen
level (but we should only do this if/when we have more general support
for it in CodeGen, and we would also need to revisit things like this
patch at the same time).
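
A minimal sketch of the detection step such a pass might perform, in
C++. This is a toy model under stated assumptions rather than a
proposal for the real implementation: kNotAnArgument and
inferReturnedArg are hypothetical names, and each function is reduced
to a list recording what each of its 'ret' instructions returns.

```cpp
#include <cassert>
#include <vector>

// Toy model, not LLVM's API: each entry describes one 'ret' in a function
// and holds the index of the argument it returns, or kNotAnArgument if it
// returns anything else.
const int kNotAnArgument = -1;

// Returns the index of the argument that every 'ret' returns, or
// kNotAnArgument if there is no such argument.
int inferReturnedArg(const std::vector<int> &returnSites) {
    if (returnSites.empty())
        return kNotAnArgument;  // no returns at all: infer nothing
    const int candidate = returnSites.front();
    for (int site : returnSites) {
        if (site != candidate)
            return kNotAnArgument;  // two returns disagree
    }
    return candidate;  // may itself be kNotAnArgument
}

// Usage example.
void exampleInference() {
    assert(inferReturnedArg({0, 0, 0}) == 0);                    // always returns arg 0
    assert(inferReturnedArg({0, 1}) == kNotAnArgument);          // returns differ
    assert(inferReturnedArg({kNotAnArgument, 0}) == kNotAnArgument);
    assert(inferReturnedArg({}) == kNotAnArgument);
}
```

A real pass would also have to look through casts and check that the
argument and return types are compatible; the sketch leaves all of that
out.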

Let me know what you think,

Stephen


