Fix PR15293: ARM codegen ice - expected larger existing stack allocation
Manman Ren
mren at apple.com
Tue Apr 16 11:58:35 PDT 2013
Hi Stepan,
AAPCS-C.4-vs-VFP-2013-04-16.patch looks good except that the testing case.
Should it have 9 doubles and 1 integer instead of a single double and a single integer?
The testing case passes even without the fix.
I am not sure about the renaming patch though. Functions get|setVarArgsRegSaveSize are used for both vararg and byval.
It does not sound right to replace VarARgs with ByVal.
Thanks for working on this.
Manman
On Apr 15, 2013, at 9:23 AM, Stepan Dyatkovskiy wrote:
> Hi Manman,
>
> >> *4.* One new change I've made is
> >> CC_ARM_AAPCS_Common rule.
> > Can we fix the above issue in a separate patch with additional testing case?
>
> Well.. We can use next case: Co-Processor register candidates may be
> either in VFP or in stack, so when all VFP are allocated stack is used. We can use stack without GPR allocation in that case, passing 9 f64 params, for example: first eight goes to d0-d7, night goes to stack. Now as 10th parameter we pass i32, according to AAPCS, 5.5 Parameters Passing, Stage C, C.4:
>
> [quote]
> If the size in words of the argument is not more than r4 minus NCRN, the argument is copied into
> core registers, starting at the NCRN. The NCRN is incremented by the number of registers used.
> Successive registers hold the parts of the argument they would hold if its value were loaded into
> those registers from memory using an LDM instruction. The argument has now been allocated.
> [/quote]
>
> It must to R0 in our case. But currently it goes to stack.
> I also checked this case with GCC - it works as in AAPCS written.
>
> About refactoring. I propose do it in two stages, first rename "getVarArgsRegSaveSize" to "get/setByValRegsSaveSize", and detach "StoreByValRegs" functionality from "VarArgStyleRegisters" method.
>
> I attached first refactor patch, Calling-Convention patch with test-case, and calling-convention .c test case.
> Try run .c file with "arm-linux-gnueabihf-gcc" and "./arm-linux-gnueabihf-clang -mcpu=cortex-a9".
>
> P.S.: My beagleboard is very very slow, "release" build takes about 15 hours, test-suite may be 15-20 hours too. I'll try ask somebody to present me faster device, but it take some time.
>
> -Stepan.
>
> Manman Ren wrote:
>>
>> On Apr 12, 2013, at 5:15 AM, Stepan Dyatkovskiy wrote:
>>
>>> Hello Manman,
>>>
>>> I reattached full patch with fixes.
>>>
>>> 1. I've removedskipWastedByValRegs, addByValWastedRegs and Wasted flag.
>>> 2. Fixed usage of "offset" variable in LowerFormalArguments
>>> 3. And fixed regression test: added function with two byval structures.
>> The above looks good.
>>>
>>> *4.* One new change I've made is
>>> CC_ARM_AAPCS_Common rule.
>>>
>>> Currently, for align == 4 and parameter type <=i32, we try to assign it to the first unallocated register from [r0-r3] range, but only if getNextStackOffset() == 0.
>>> I suppose that "getNextStackOffset() != 0" is treated as fact? that all registers are used? and params goes to the stack now. After byval structures will be implemented, the next case is possible:
>>>
>>> void @f(%i32_struct* byval, i32)
>>>
>>> Here, we need to allocate stack for first parameter, but after we still have unallocated registers. And second parameter should be sent to r1. Current CC_ARM_AAPCS_Common implementation will send second parameter to the stack.
>>> Though, I'm still not sure about this change, just because may be there are more reasons to check nextStackOffset value.
>> Can we fix the above issue in a separate patch with additional testing case?
>>>
>>>>>> + // for byval parameters.
>>>>>> + {
>>>>>> + unsigned InRegsParamsCount = State->getInRegsParamsCount();
>>>>>> + if (InRegsParamsCount) {
>>>>>> + unsigned RB, RE;
>>>>>> + State->getInRegsParamInfo(InRegsParamsCount-1, RB, RE);
>>>>>> + assert(reg >= RE);
>>>>>> + }
>>>>>> + }
>>>>> Is this only for assertion? Is this necessary?
>>>> Yes. I've just moved this check from "if" statement, since it must be
>>>> true always.
>>> How do you think, do we need this assertion?
>> I don't really think it is necessary :)
>>>
>>> I didn't present refactor patch, just because main patch didn't get LGTM community and hypothetically may contain some mistakes..
>> LGTM otherwise.
>>
>> Thanks,
>> Manman
>>>
>>> -Stepan
>>>
>>> Stepan Dyatkovskiy wrote:
>>>> Manman Ren wrote:
>>>>>
>>>>> Hi Stepan,
>>>>>
>>>>> One high-level question: do we need to insert the wasted record to
>>>>> ByValRegs and later on skip those records?
>>>>> It seems to me that we can only insert the records for the actual byval
>>>>> parameters.
>>>>> If that is true, we can remove addByValWastedRegs and
>>>>> skipWastedByValRegs.
>>>> Hm.. yes its reasonable. I'll try to fix.
>>>>
>>>>>
>>>>>> +
>>>>>> + // "offset" value would be usefull if byval parameter outsides
>>>>>> + // the registers area. Note, in that case nextInRegsParam()
>>>>>> will not
>>>>>> + // skip any waste registers, since parameter fillup the whole
>>>>>> rest
>>>>>> + // of registers set.
>>>>>> + unsigned RegsUsed = RegEnd - RegBegin;
>>>>>> + if (!CCInfo.nextInRegsParam() && Flags.getByValSize() >
>>>>>> 4*RegsUsed )
>>>>>> + offset = RegEnd - RegBegin;
>>>>>> }
>>>>>>
>>>>>> - if (Flags.getByValSize() - 4*offset > 0) {
>>>>>> + if (offset) {
>>>>>
>>>>> This still does not look right.
>>>>> There are 3 cases:
>>>>> If the byval parameter is in registers partially, we should create a
>>>>> COPY_STRUCT_BYVAL with the correct offset;
>>>>> If the byval parameter is all in registers, there is no need to create a
>>>>> COPY_STRUCT_BYVAL;
>>>>> If the byval parameter is all on stack, we should create a
>>>>> COPY_STRUCT_BYVAL with offset equal to 0.
>>>>>
>>>>> When looking at the above code, offset is default to zero, if
>>>>> CurByValIdx >= ByValArgsCount (this byval parameter is all on stack),
>>>>> offset will still be zero,
>>>>> and we will not create a COPY_STRUCT_BYVAL.
>>>>> The correct logic seems to be always updating offset inside the if and
>>>>> check Flags.getByValSize() > 4*offset
>>>>> if (CurByValIdx < ByValArgsCount) {
>>>>> …
>>>>> offset = RegEnd - RegBegin; //number of registers occupied by this
>>>>> byval parameter
>>>>> CCInfo.nextInRegsParam(); // go to the next byval parameter that is
>>>>> in registers partially or all
>>>>> }
>>>>> if (Flags.getByValSize() > 4*offset) {
>>>>> // Part of the byval parameter is on stack.
>>>>> ...
>>>> Oh.. I missed third case. Then we keep first variant here.
>>>>
>>>>>> - if ((!State->isFirstByValRegValid()) &&
>>>>>> - (ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>>> + if ((ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>>> + // All registers reserved for byval parameters should be allocated.
>>>>>> + // Check that first unallocated register is higher that last one
>>>>>> used
>>>>>> + // for byval parameters.
>>>>>> + {
>>>>>> + unsigned InRegsParamsCount = State->getInRegsParamsCount();
>>>>>> + if (InRegsParamsCount) {
>>>>>> + unsigned RB, RE;
>>>>>> + State->getInRegsParamInfo(InRegsParamsCount-1, RB, RE);
>>>>>> + assert(reg >= RE);
>>>>>> + }
>>>>>> + }
>>>>> Is this only for assertion? Is this necessary?
>>>> Yes. I've just moved this check from "if" statement, since it must be
>>>> true always.
>>>>
>>>>>> -void
>>>>>> -ARMTargetLowering::VarArgStyleRegisters(CCState &CCInfo, SelectionDAG
>>>>>> &DAG,
>>>>>> - DebugLoc dl, SDValue &Chain,
>>>>>> - const Value *OrigArg,
>>>>>> - unsigned OffsetFromOrigArg,
>>>>>> - unsigned ArgOffset,
>>>>>> - bool ForceMutable) const {
>>>>>> +// Return: The frame index registers were stored into.
>>>>>> +int
>>>>>> +ARMTargetLowering::StoreByValRegs(CCState &CCInfo, SelectionDAG &DAG,
>>>>>> + DebugLoc dl, SDValue &Chain,
>>>>>> + const Value *OrigArg,
>>>>>> + unsigned InRegsParamRecordIdx,
>>>>>> + unsigned OffsetFromOrigArg,
>>>>>> + unsigned ArgOffset,
>>>>>> + bool ForceMutable) const {
>>>>>
>>>>> It would be much cleaner if we can move the refactoring of
>>>>> VarArgStyleRegisters to a separate patch.
>>>> For sure.
>>>>
>>>> P.S.: Today I had some troubles with electricity, so I'll try present
>>>> new patch tomorrow.
>>>>
>>>> -Stepan.
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>> <pr15293-2013-04-12.patch>
>>
>
> <stack-offset.c><VarArgsRegSaveSize-2013-04-16.patch><AAPCS-C.4-vs-VFP-2013-04-16.patch>
More information about the llvm-commits
mailing list