[llvm] r242714 - [ARM] Refactor the prologue/epilogue emission to be more robust.
Quentin Colombet via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 30 12:41:24 PST 2015
> On Nov 30, 2015, at 11:14 AM, Quentin Colombet via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>
> Hi Tim and Jon,
>
>> On Nov 30, 2015, at 10:51 AM, Jonathan Roelofs <jonathan at codesourcery.com> wrote:
>>
>>
>>
>> On 11/30/15 10:42 AM, Quentin Colombet wrote:
>>> Hi Jon,
>>>
>>>
>>>> On Nov 25, 2015, at 11:25 AM, Jonathan Roelofs
>>>> <jonathan at codesourcery.com <mailto:jonathan at codesourcery.com>> wrote:
>>>>
>>>>
>>>>
>>>> On 7/20/15 3:42 PM, Quentin Colombet wrote:
>>>>> Author: qcolombet
>>>>> Date: Mon Jul 20 16:42:14 2015
>>>>> New Revision: 242714
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=242714&view=rev
>>>>> Log:
>>>>> [ARM] Refactor the prologue/epilogue emission to be more robust.
>>>>>
>>>>> This is the first step toward supporting shrink-wrapping for this target.
>>>>>
>>>>> The changes could be summarized by these items:
>>>>> - Expand the tail-call return as part of the expand pseudo pass.
>>>>> - Get rid of the assumptions that the epilogue is the exit block:
>>>>> * Do not assume which registers are free in the epilogue. (This
>>>>> indirectly
>>>>> improve the lowering of the code for the segmented stacks, see
>>>>> the test
>>>>> cases.)
>>>>> * Take into account that the basic block can be empty.
>>>>>
>>>>> Related to <rdar://problem/20821730>
>>>>>
>>>>
>>>> snip
>>>>
>>>>> Modified: llvm/trunk/test/CodeGen/ARM/thumb1_return_sequence.ll
>>>>> URL:http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/thumb1_return_sequence.ll?rev=242714&r1=242713&r2=242714&view=diff
>>>>> ==============================================================================
>>>>> --- llvm/trunk/test/CodeGen/ARM/thumb1_return_sequence.ll (original)
>>>>> +++ llvm/trunk/test/CodeGen/ARM/thumb1_return_sequence.ll Mon Jul 20
>>>>> 16:42:14 2015
>>>>> @@ -23,11 +23,9 @@ entry:
>>>>> ; --------
>>>>> ; CHECK-V4T: add sp,
>>>>> ; CHECK-V4T-NEXT: pop {[[SAVED]]}
>>>>> -; CHECK-V4T-NEXT: mov r12, r3
>>>>> -; CHECK-V4T-NEXT: pop {r3}
>>>>> -; CHECK-V4T-NEXT: mov lr, r3
>>>>> -; CHECK-V4T-NEXT: mov r3, r12
>>>>> -; CHECK-V4T: bx lr
>>>>> +; We do not have any SP update to insert so we can just optimize
>>>>> +; the pop sequence.
>>>>> +; CHECK-V4T-NEXT: pop {pc}
>>>>
>>>> This ^ is an invalid optimization. The awkward sequence there is
>>>> necessary because:
>>>>
>>>> Arch constraints:
>>>> * v4t can't change the thumb bit via a `pop { pc }` insruction
>>>> * v4t can only pop into lo registers and the pc
>>>> (and bit[0] of the value popped into pc must be 0 if pc is in
>>>> the list)
>>>> * v4t can't move from a lo register to a lo register
>>>>
>>>> AAPCS constraints:
>>>> * r3 needs to be saved off in case it is used as part of the return
>>>> value.
>>>> * r12 is the only register we're allowed to clobber.
>>>
>>> Maybe the test or check line is just poorly written because the code
>>> that does this optimization is guarded under STI.hasV5TOps(), i.e.,
>>> this should not happen for v4t only arch.
>>
>> The problem is in the changes under the `if (ArgRegsSaveSize || IsV4PopReturn)` guard:
>>
>> https://github.com/llvm-mirror/llvm/commit/b2dab382ce01d602a757906160fccd9129c768f7#diff-5daa733fad8e67c69c0c5291dbb05ebbR392
>>
>> which did have `STI.hasV4TOps() && !STI.hasV5TOps()` as part of its condition. This was conservatively correct, but sub-optimal for v5t+.
>
> Thanks for pointing that out.
> I see the problem now.
>
> I’ll push a patch shortly.
Should be fixed by Committed revision 254325.
Q.
>
> Cheers,
> -Quentin
>>
>>>
>>> Do you have an actual test case where this fails?
>>
>> No, I don't. I can't find the original testcases that motivated r214928 either (other than the ones in the commit).
>>
>>
>> Jon
>>
>>>
>>> Thanks,
>>> -Quentin
>>>
>>>>
>>>>> ; CHECK-V5T: pop {[[SAVED]], pc}
>>>>> }
>>>>>
>>>>> @@ -53,19 +51,19 @@ entry:
>>>>> ; Epilogue
>>>>> ; --------
>>>>> ; CHECK-V4T: pop {[[SAVED]]}
>>>>> -; CHECK-V4T-NEXT: mov r12, r3
>>>>> -; CHECK-V4T-NEXT: pop {r3}
>>>>> +; CHECK-V4T-NEXT: mov r12, [[POP_REG:r[0-7]]]
>>>>> +; CHECK-V4T-NEXT: pop {[[POP_REG]]}
>>>>> ; CHECK-V4T-NEXT: add sp,
>>>>> -; CHECK-V4T-NEXT: mov lr, r3
>>>>> -; CHECK-V4T-NEXT: mov r3, r12
>>>>> +; CHECK-V4T-NEXT: mov lr, [[POP_REG]]
>>>>> +; CHECK-V4T-NEXT: mov [[POP_REG]], r12
>>>>> ; CHECK-V4T: bx lr
>>>>> ; CHECK-V5T: add sp,
>>>>> ; CHECK-V5T-NEXT: pop {[[SAVED]]}
>>>>> -; CHECK-V5T-NEXT: mov r12, r3
>>>>> -; CHECK-V5T-NEXT: pop {r3}
>>>>> +; CHECK-V5T-NEXT: mov r12, [[POP_REG:r[0-7]]]
>>>>> +; CHECK-V5T-NEXT: pop {[[POP_REG]]}
>>>>> ; CHECK-V5T-NEXT: add sp,
>>>>> -; CHECK-V5T-NEXT: mov lr, r3
>>>>> -; CHECK-V5T-NEXT: mov r3, r12
>>>>> +; CHECK-V5T-NEXT: mov lr, [[POP_REG]]
>>>>> +; CHECK-V5T-NEXT: mov [[POP_REG]], r12
>>>>> ; CHECK-V5T-NEXT: bx lr
>>>>> }
>>>>>
>>>>> @@ -95,8 +93,7 @@ entry:
>>>>> ; Epilogue
>>>>> ; --------
>>>>> ; CHECK-V4T: pop {[[SAVED]]}
>>>>> -; CHECK-V4T: pop {r3}
>>>>> -; CHECK-V4T: bx r3
>>>>> +; CHECK-V4T: pop {pc}
>>>>
>>>> This ^ is also invalid for the same reason.
>>>>
>>>>> ; CHECK-V5T: pop {[[SAVED]], pc}
>>>>> }
>>>>>
>>>>> @@ -148,14 +145,18 @@ entry:
>>>>> ; --------
>>>>> ; CHECK-V4T: add sp,
>>>>> ; CHECK-V4T-NEXT: pop {[[SAVED]]}
>>>>> -; CHECK-V4T-NEXT: pop {r3}
>>>>> +; Only r1 to r3 are available to pop LR.
>>>>> +; r0 is used for the return value.
>>>>> +; CHECK-V4T-NEXT: pop {[[POP_REG:r[1-3]]]}
>>>>> ; CHECK-V4T-NEXT: add sp,
>>>>> -; CHECK-V4T-NEXT: bx r3
>>>>> +; CHECK-V4T-NEXT: bx [[POP_REG]]
>>>>> ; CHECK-V5T: add sp,
>>>>> ; CHECK-V5T-NEXT: pop {[[SAVED]]}
>>>>> -; CHECK-V5T-NEXT: pop {r3}
>>>>> +; Only r1 to r3 are available to pop LR.
>>>>> +; r0 is used for the return value.
>>>>> +; CHECK-V5T-NEXT: pop {[[POP_REG:r[1-3]]]}
>>>>> ; CHECK-V5T-NEXT: add sp,
>>>>> -; CHECK-V5T-NEXT: bx r3
>>>>> +; CHECK-V5T-NEXT: bx [[POP_REG]]
>>>>> }
>>>>>
>>>>> ; CHECK-V4T-LABEL: noframe
>>>>> @@ -191,13 +192,17 @@ entry:
>>>>> ; Epilogue
>>>>> ; --------
>>>>> ; CHECK-V4T: pop {[[SAVED]]}
>>>>> -; CHECK-V4T-NEXT: pop {r3}
>>>>> +; Only r1 to r3 are available to pop LR.
>>>>> +; r0 is used for the return value.
>>>>> +; CHECK-V4T-NEXT: pop {[[POP_REG:r[1-3]]]}
>>>>> ; CHECK-V4T-NEXT: add sp,
>>>>> -; CHECK-V4T-NEXT: bx r3
>>>>> +; CHECK-V4T-NEXT: bx [[POP_REG]]
>>>>> ; CHECK-V5T: pop {[[SAVED]]}
>>>>> -; CHECK-V5T-NEXT: pop {r3}
>>>>> +; Only r1 to r3 are available to pop LR.
>>>>> +; r0 is used for the return value.
>>>>> +; CHECK-V5T-NEXT: pop {[[POP_REG:r[1-3]]]}
>>>>> ; CHECK-V5T-NEXT: add sp,
>>>>> -; CHECK-V5T-NEXT: bx r3
>>>>> +; CHECK-V5T-NEXT: bx [[POP_REG]]
>>>>> }
>>>>>
>>>>> declare void @llvm.va_start(i8*) nounwind
>>>>>
>>>>
>>>> snip
>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>
>>>> --
>>>> Jon Roelofs
>>>> jonathan at codesourcery.com <mailto:jonathan at codesourcery.com>
>>>> CodeSourcery / Mentor Embedded
>>>
>>
>> --
>> Jon Roelofs
>> jonathan at codesourcery.com
>> CodeSourcery / Mentor Embedded
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20151130/ed4378d4/attachment.html>
More information about the llvm-commits
mailing list