[llvm] r179925 - Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter).

Stephen Lin stephenwlin at gmail.com
Wed Apr 24 00:27:49 PDT 2013


Hi Evan, Jakob,

By writing some more tests, I found a corner case bug in the
implementation below that I've resolved, but I wanted to check with
you two as to whether you think my resolution is the most appropriate
one.

Basically, the issue is that the following:

  declare i32 @identity32(i32 returned %x)
  declare i16 @paramzext16(i16 zeroext returned %x)

  define i16 @test_matched_param(i16 %x) {
  entry:
    %call = tail call i16 @paramzext16(i16 %x)
    %b = zext i16 %call to i32
    %call2 = tail call i32 @identity32(i32 %b)
    %call3 = tail call i16 @paramzext16(i16 %call)
    ret i16 %call3
  }

gets optimized to:

    test_matched_param:
    @ BB#0:
           push    {r11, lr}
           uxth    r0, r0
           bl      paramzext16
           bl      identity32
           pop     {r11, lr}
           b       paramzext16

whereas the correct output and (as far as I can tell, optimal) output is:

   test_matched_param:
    @ BB#0:
           push    {r11, lr}
           uxth    r0, r0
           bl      paramzext16
           uxth    r0, r0
           bl      identity32
           pop     {r11, lr}
           b       paramzext16

In other words, the code generator incorrectly assumes that r0 is
completely preserved across the call to paramzext16, which is not
guaranteed (only the lower 16-bits are preserved; at least, that's the
most defensible interpretation of 'returned' here)

As far as I can tell this cannot be fixed in
ARMTargetLowering::LowerCall (with its current interface), since by
that point the register has already been extended to i32 and there's
no information that the original value was i16. So I resolved this by
changing the target-independent code in TargetLowering::LowerCallTo to
not pass 'returned' to the target-specific LowerCall implementations
unless either 1) the size of the register holding the passed argument
is equal to the size of the actual argument type or 2) the register is
extended using any-extend, in which case the high bits will be treated
as unknown even if the register is preserved across the call or 3) the
return value and 'returned' parameter have compatible extension
methods.

This is sufficient to fix the particular example and I believe is an
adequate solution for the general case: basically, it means that
'returned' being passed to the target-specific LowerCall means that
the actual full register value is returned unchanged, which is a
subset of the cases where there's a semantic 'returned' at the IR
level. The major limitation of this is that, in the case that the
parameter is zero/sign extended but the return value is not, it treats
the entire value as clobbered by the call when it's only the extended
bits that are possibly clobbered. This would be fixable with some
refactoring, but (as far as I can tell) would require changing the
interface to the target implementations of LowerCall to either pass
more information forward or back.

The issue can be seen by changing the line:
    %b = zext i16 %call to i32
to
    %b = zext i16 %x to i32

which results in significantly worse code, even though technically the
result should be the same because %x and %call should always be equal
in their lower 16 bits. Since it doesn't seem worth it to fix this for
now, I put a FIXME note about it in the code and in a test.

I considered a few other ways to resolve this, but all of them seemed
to have some issues (I can go into them in more detail, though, if
you'd like more details.)

Please let me know if you have any suggestions or concerns.

-Stephen

P.S. I also removed checks for sign and zero extension in
"sameNoopInput" that I added in r180118; they turned out to be
superfluous, which makes sense in retrospect now that I better
understand the semantics of the two.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix-returned-param-ext.patch
Type: application/octet-stream
Size: 12494 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130424/6e5ece07/attachment.obj>


More information about the llvm-commits mailing list