[llvm] r242714 - [ARM] Refactor the prologue/epilogue emission to be more robust.
Quentin Colombet via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 30 11:14:00 PST 2015
Hi Tim and Jon,
> On Nov 30, 2015, at 10:51 AM, Jonathan Roelofs <jonathan at codesourcery.com> wrote:
>
>
>
> On 11/30/15 10:42 AM, Quentin Colombet wrote:
>> Hi Jon,
>>
>>
>>> On Nov 25, 2015, at 11:25 AM, Jonathan Roelofs
>>> <jonathan at codesourcery.com <mailto:jonathan at codesourcery.com>> wrote:
>>>
>>>
>>>
>>> On 7/20/15 3:42 PM, Quentin Colombet wrote:
>>>> Author: qcolombet
>>>> Date: Mon Jul 20 16:42:14 2015
>>>> New Revision: 242714
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=242714&view=rev
>>>> Log:
>>>> [ARM] Refactor the prologue/epilogue emission to be more robust.
>>>>
>>>> This is the first step toward supporting shrink-wrapping for this target.
>>>>
>>>> The changes could be summarized by these items:
>>>> - Expand the tail-call return as part of the expand pseudo pass.
>>>> - Get rid of the assumptions that the epilogue is the exit block:
>>>> * Do not assume which registers are free in the epilogue. (This
>>>> indirectly
>>>> improve the lowering of the code for the segmented stacks, see
>>>> the test
>>>> cases.)
>>>> * Take into account that the basic block can be empty.
>>>>
>>>> Related to <rdar://problem/20821730>
>>>>
>>>
>>> snip
>>>
>>>> Modified: llvm/trunk/test/CodeGen/ARM/thumb1_return_sequence.ll
>>>> URL:http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/thumb1_return_sequence.ll?rev=242714&r1=242713&r2=242714&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/test/CodeGen/ARM/thumb1_return_sequence.ll (original)
>>>> +++ llvm/trunk/test/CodeGen/ARM/thumb1_return_sequence.ll Mon Jul 20
>>>> 16:42:14 2015
>>>> @@ -23,11 +23,9 @@ entry:
>>>> ; --------
>>>> ; CHECK-V4T: add sp,
>>>> ; CHECK-V4T-NEXT: pop {[[SAVED]]}
>>>> -; CHECK-V4T-NEXT: mov r12, r3
>>>> -; CHECK-V4T-NEXT: pop {r3}
>>>> -; CHECK-V4T-NEXT: mov lr, r3
>>>> -; CHECK-V4T-NEXT: mov r3, r12
>>>> -; CHECK-V4T: bx lr
>>>> +; We do not have any SP update to insert so we can just optimize
>>>> +; the pop sequence.
>>>> +; CHECK-V4T-NEXT: pop {pc}
>>>
>>> This ^ is an invalid optimization. The awkward sequence there is
>>> necessary because:
>>>
>>> Arch constraints:
>>> * v4t can't change the thumb bit via a `pop { pc }` insruction
>>> * v4t can only pop into lo registers and the pc
>>> (and bit[0] of the value popped into pc must be 0 if pc is in
>>> the list)
>>> * v4t can't move from a lo register to a lo register
>>>
>>> AAPCS constraints:
>>> * r3 needs to be saved off in case it is used as part of the return
>>> value.
>>> * r12 is the only register we're allowed to clobber.
>>
>> Maybe the test or check line is just poorly written because the code
>> that does this optimization is guarded under STI.hasV5TOps(), i.e.,
>> this should not happen for v4t only arch.
>
> The problem is in the changes under the `if (ArgRegsSaveSize || IsV4PopReturn)` guard:
>
> https://github.com/llvm-mirror/llvm/commit/b2dab382ce01d602a757906160fccd9129c768f7#diff-5daa733fad8e67c69c0c5291dbb05ebbR392
>
> which did have `STI.hasV4TOps() && !STI.hasV5TOps()` as part of its condition. This was conservatively correct, but sub-optimal for v5t+.
Thanks for pointing that out.
I see the problem now.
I’ll push a patch shortly.
Cheers,
-Quentin
>
>>
>> Do you have an actual test case where this fails?
>
> No, I don't. I can't find the original testcases that motivated r214928 either (other than the ones in the commit).
>
>
> Jon
>
>>
>> Thanks,
>> -Quentin
>>
>>>
>>>> ; CHECK-V5T: pop {[[SAVED]], pc}
>>>> }
>>>>
>>>> @@ -53,19 +51,19 @@ entry:
>>>> ; Epilogue
>>>> ; --------
>>>> ; CHECK-V4T: pop {[[SAVED]]}
>>>> -; CHECK-V4T-NEXT: mov r12, r3
>>>> -; CHECK-V4T-NEXT: pop {r3}
>>>> +; CHECK-V4T-NEXT: mov r12, [[POP_REG:r[0-7]]]
>>>> +; CHECK-V4T-NEXT: pop {[[POP_REG]]}
>>>> ; CHECK-V4T-NEXT: add sp,
>>>> -; CHECK-V4T-NEXT: mov lr, r3
>>>> -; CHECK-V4T-NEXT: mov r3, r12
>>>> +; CHECK-V4T-NEXT: mov lr, [[POP_REG]]
>>>> +; CHECK-V4T-NEXT: mov [[POP_REG]], r12
>>>> ; CHECK-V4T: bx lr
>>>> ; CHECK-V5T: add sp,
>>>> ; CHECK-V5T-NEXT: pop {[[SAVED]]}
>>>> -; CHECK-V5T-NEXT: mov r12, r3
>>>> -; CHECK-V5T-NEXT: pop {r3}
>>>> +; CHECK-V5T-NEXT: mov r12, [[POP_REG:r[0-7]]]
>>>> +; CHECK-V5T-NEXT: pop {[[POP_REG]]}
>>>> ; CHECK-V5T-NEXT: add sp,
>>>> -; CHECK-V5T-NEXT: mov lr, r3
>>>> -; CHECK-V5T-NEXT: mov r3, r12
>>>> +; CHECK-V5T-NEXT: mov lr, [[POP_REG]]
>>>> +; CHECK-V5T-NEXT: mov [[POP_REG]], r12
>>>> ; CHECK-V5T-NEXT: bx lr
>>>> }
>>>>
>>>> @@ -95,8 +93,7 @@ entry:
>>>> ; Epilogue
>>>> ; --------
>>>> ; CHECK-V4T: pop {[[SAVED]]}
>>>> -; CHECK-V4T: pop {r3}
>>>> -; CHECK-V4T: bx r3
>>>> +; CHECK-V4T: pop {pc}
>>>
>>> This ^ is also invalid for the same reason.
>>>
>>>> ; CHECK-V5T: pop {[[SAVED]], pc}
>>>> }
>>>>
>>>> @@ -148,14 +145,18 @@ entry:
>>>> ; --------
>>>> ; CHECK-V4T: add sp,
>>>> ; CHECK-V4T-NEXT: pop {[[SAVED]]}
>>>> -; CHECK-V4T-NEXT: pop {r3}
>>>> +; Only r1 to r3 are available to pop LR.
>>>> +; r0 is used for the return value.
>>>> +; CHECK-V4T-NEXT: pop {[[POP_REG:r[1-3]]]}
>>>> ; CHECK-V4T-NEXT: add sp,
>>>> -; CHECK-V4T-NEXT: bx r3
>>>> +; CHECK-V4T-NEXT: bx [[POP_REG]]
>>>> ; CHECK-V5T: add sp,
>>>> ; CHECK-V5T-NEXT: pop {[[SAVED]]}
>>>> -; CHECK-V5T-NEXT: pop {r3}
>>>> +; Only r1 to r3 are available to pop LR.
>>>> +; r0 is used for the return value.
>>>> +; CHECK-V5T-NEXT: pop {[[POP_REG:r[1-3]]]}
>>>> ; CHECK-V5T-NEXT: add sp,
>>>> -; CHECK-V5T-NEXT: bx r3
>>>> +; CHECK-V5T-NEXT: bx [[POP_REG]]
>>>> }
>>>>
>>>> ; CHECK-V4T-LABEL: noframe
>>>> @@ -191,13 +192,17 @@ entry:
>>>> ; Epilogue
>>>> ; --------
>>>> ; CHECK-V4T: pop {[[SAVED]]}
>>>> -; CHECK-V4T-NEXT: pop {r3}
>>>> +; Only r1 to r3 are available to pop LR.
>>>> +; r0 is used for the return value.
>>>> +; CHECK-V4T-NEXT: pop {[[POP_REG:r[1-3]]]}
>>>> ; CHECK-V4T-NEXT: add sp,
>>>> -; CHECK-V4T-NEXT: bx r3
>>>> +; CHECK-V4T-NEXT: bx [[POP_REG]]
>>>> ; CHECK-V5T: pop {[[SAVED]]}
>>>> -; CHECK-V5T-NEXT: pop {r3}
>>>> +; Only r1 to r3 are available to pop LR.
>>>> +; r0 is used for the return value.
>>>> +; CHECK-V5T-NEXT: pop {[[POP_REG:r[1-3]]]}
>>>> ; CHECK-V5T-NEXT: add sp,
>>>> -; CHECK-V5T-NEXT: bx r3
>>>> +; CHECK-V5T-NEXT: bx [[POP_REG]]
>>>> }
>>>>
>>>> declare void @llvm.va_start(i8*) nounwind
>>>>
>>>
>>> snip
>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>
>>> --
>>> Jon Roelofs
>>> jonathan at codesourcery.com <mailto:jonathan at codesourcery.com>
>>> CodeSourcery / Mentor Embedded
>>
>
> --
> Jon Roelofs
> jonathan at codesourcery.com
> CodeSourcery / Mentor Embedded
More information about the llvm-commits
mailing list