Fix PR15293: ARM codegen ice - expected larger existing stack allocation

Manman Ren mren at apple.com
Tue Apr 16 11:58:35 PDT 2013


Hi Stepan,

AAPCS-C.4-vs-VFP-2013-04-16.patch looks good except that the testing case.
Should it have 9 doubles and 1 integer instead of a single double and a single integer?
The testing case passes even without the fix.

I am not sure about the renaming patch though. Functions get|setVarArgsRegSaveSize are used for both vararg and byval.
It does not sound right to replace VarARgs with ByVal.

Thanks for working on this.
Manman

On Apr 15, 2013, at 9:23 AM, Stepan Dyatkovskiy wrote:

> Hi Manman,
> 
> >> *4.* One new change I've made is
> >> CC_ARM_AAPCS_Common rule.
> > Can we fix the above issue in a separate patch with additional testing case?
> 
> Well.. We can use next case: Co-Processor register candidates may be
> either in VFP or in stack, so when all VFP are allocated stack is used. We can use stack without GPR allocation in that case, passing 9 f64 params, for example: first eight goes to d0-d7, night goes to stack. Now as 10th parameter we pass i32, according to AAPCS, 5.5 Parameters Passing, Stage C, C.4:
> 
> [quote]
> If the size in words of the argument is not more than r4 minus NCRN, the argument is copied into
> core registers, starting at the NCRN. The NCRN is incremented by the number of registers used.
> Successive registers hold the parts of the argument they would hold if its value were loaded into
> those registers from memory using an LDM instruction. The argument has now been allocated.
> [/quote]
> 
> It must to R0 in our case. But currently it goes to stack.
> I also checked this case with GCC - it works as in AAPCS written.
> 
> About refactoring. I propose do it in two stages, first rename "getVarArgsRegSaveSize" to "get/setByValRegsSaveSize", and detach "StoreByValRegs" functionality from "VarArgStyleRegisters" method.
> 
> I attached first refactor patch, Calling-Convention patch with test-case, and calling-convention .c test case.
> Try run .c file with "arm-linux-gnueabihf-gcc" and "./arm-linux-gnueabihf-clang -mcpu=cortex-a9".
> 
> P.S.: My beagleboard is very very slow, "release" build takes about 15 hours, test-suite may be 15-20 hours too. I'll try ask somebody to present me faster device, but it take some time.
> 
> -Stepan.
> 
> Manman Ren wrote:
>> 
>> On Apr 12, 2013, at 5:15 AM, Stepan Dyatkovskiy wrote:
>> 
>>> Hello Manman,
>>> 
>>> I reattached full patch with fixes.
>>> 
>>> 1. I've removedskipWastedByValRegs, addByValWastedRegs and Wasted flag.
>>> 2. Fixed usage of "offset" variable in LowerFormalArguments
>>> 3. And fixed regression test: added function with two byval structures.
>> The above looks good.
>>> 
>>> *4.* One new change I've made is
>>> CC_ARM_AAPCS_Common rule.
>>> 
>>> Currently, for align == 4 and parameter type <=i32, we try to assign it to the first unallocated register from [r0-r3] range, but only if getNextStackOffset() == 0.
>>> I suppose that "getNextStackOffset() != 0" is treated as fact? that all registers are used? and params goes to the stack now. After byval structures will be implemented, the next case is possible:
>>> 
>>> void @f(%i32_struct* byval, i32)
>>> 
>>> Here, we need to allocate stack for first parameter, but after we still have unallocated registers. And second parameter should be sent to r1. Current CC_ARM_AAPCS_Common implementation will send second parameter to the stack.
>>> Though, I'm still not sure about this change, just because may be there are more reasons to check nextStackOffset value.
>> Can we fix the above issue in a separate patch with additional testing case?
>>> 
>>>>>> +    // for byval parameters.
>>>>>> +    {
>>>>>> +      unsigned InRegsParamsCount = State->getInRegsParamsCount();
>>>>>> +      if (InRegsParamsCount) {
>>>>>> +        unsigned RB, RE;
>>>>>> +        State->getInRegsParamInfo(InRegsParamsCount-1, RB, RE);
>>>>>> +        assert(reg >= RE);
>>>>>> +      }
>>>>>> +    }
>>>>> Is this only for assertion? Is this necessary?
>>>> Yes. I've just moved this check from "if" statement, since it must be
>>>> true always.
>>> How do you think, do we need this assertion?
>> I don't really think it is necessary :)
>>> 
>>> I didn't present refactor patch, just because main patch didn't get LGTM community and hypothetically may contain some mistakes..
>> LGTM otherwise.
>> 
>> Thanks,
>> Manman
>>> 
>>> -Stepan
>>> 
>>> Stepan Dyatkovskiy wrote:
>>>> Manman Ren wrote:
>>>>> 
>>>>> Hi Stepan,
>>>>> 
>>>>> One high-level question: do we need to insert the wasted record to
>>>>> ByValRegs and later on skip those records?
>>>>> It seems to me that we can only insert the records for the actual byval
>>>>> parameters.
>>>>> If that is true, we can remove addByValWastedRegs and
>>>>> skipWastedByValRegs.
>>>> Hm.. yes its reasonable. I'll try to fix.
>>>> 
>>>>> 
>>>>>> +
>>>>>> +        // "offset" value would be usefull if byval parameter outsides
>>>>>> +        // the registers area. Note, in that case nextInRegsParam()
>>>>>> will not
>>>>>> +        // skip any waste registers, since parameter fillup the whole
>>>>>> rest
>>>>>> +        // of registers set.
>>>>>> +        unsigned RegsUsed = RegEnd - RegBegin;
>>>>>> +        if (!CCInfo.nextInRegsParam() && Flags.getByValSize() >
>>>>>> 4*RegsUsed )
>>>>>> +          offset = RegEnd - RegBegin;
>>>>>>       }
>>>>>> 
>>>>>> -      if (Flags.getByValSize() - 4*offset > 0) {
>>>>>> +      if (offset) {
>>>>> 
>>>>> This still does not look right.
>>>>> There are 3 cases:
>>>>> If the byval parameter is in registers partially, we should create a
>>>>> COPY_STRUCT_BYVAL with the correct offset;
>>>>> If the byval parameter is all in registers, there is no need to create a
>>>>> COPY_STRUCT_BYVAL;
>>>>> If the byval parameter is all on stack, we should create a
>>>>> COPY_STRUCT_BYVAL with offset equal to 0.
>>>>> 
>>>>> When looking at the above code, offset is default to zero, if
>>>>> CurByValIdx >= ByValArgsCount (this byval parameter is all on stack),
>>>>> offset will still be zero,
>>>>> and we will not create a COPY_STRUCT_BYVAL.
>>>>> The correct logic seems to be always updating offset inside the if and
>>>>> check Flags.getByValSize() > 4*offset
>>>>> if (CurByValIdx < ByValArgsCount) {
>>>>>>>>>>   offset = RegEnd - RegBegin; //number of registers occupied by this
>>>>> byval parameter
>>>>>   CCInfo.nextInRegsParam(); // go to the next byval parameter that is
>>>>> in registers partially or all
>>>>> }
>>>>> if (Flags.getByValSize() > 4*offset) {
>>>>>   // Part of the byval parameter is on stack.
>>>>>   ...
>>>> Oh.. I missed third case. Then we keep first variant here.
>>>> 
>>>>>> -  if ((!State->isFirstByValRegValid()) &&
>>>>>> -      (ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>>> +  if ((ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>>> +    // All registers reserved for byval parameters should be allocated.
>>>>>> +    // Check that first unallocated register is higher that last one
>>>>>> used
>>>>>> +    // for byval parameters.
>>>>>> +    {
>>>>>> +      unsigned InRegsParamsCount = State->getInRegsParamsCount();
>>>>>> +      if (InRegsParamsCount) {
>>>>>> +        unsigned RB, RE;
>>>>>> +        State->getInRegsParamInfo(InRegsParamsCount-1, RB, RE);
>>>>>> +        assert(reg >= RE);
>>>>>> +      }
>>>>>> +    }
>>>>> Is this only for assertion? Is this necessary?
>>>> Yes. I've just moved this check from "if" statement, since it must be
>>>> true always.
>>>> 
>>>>>> -void
>>>>>> -ARMTargetLowering::VarArgStyleRegisters(CCState &CCInfo, SelectionDAG
>>>>>> &DAG,
>>>>>> -                                        DebugLoc dl, SDValue &Chain,
>>>>>> -                                        const Value *OrigArg,
>>>>>> -                                        unsigned OffsetFromOrigArg,
>>>>>> -                                        unsigned ArgOffset,
>>>>>> -                                        bool ForceMutable) const {
>>>>>> +// Return: The frame index registers were stored into.
>>>>>> +int
>>>>>> +ARMTargetLowering::StoreByValRegs(CCState &CCInfo, SelectionDAG &DAG,
>>>>>> +                                  DebugLoc dl, SDValue &Chain,
>>>>>> +                                  const Value *OrigArg,
>>>>>> +                                  unsigned InRegsParamRecordIdx,
>>>>>> +                                  unsigned OffsetFromOrigArg,
>>>>>> +                                  unsigned ArgOffset,
>>>>>> +                                  bool ForceMutable) const {
>>>>> 
>>>>> It would be much cleaner if we can move the refactoring of
>>>>> VarArgStyleRegisters to a separate patch.
>>>> For sure.
>>>> 
>>>> P.S.: Today I had some troubles with electricity, so I'll try present
>>>> new patch tomorrow.
>>>> 
>>>> -Stepan.
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>>> <pr15293-2013-04-12.patch>
>> 
> 
> <stack-offset.c><VarArgsRegSaveSize-2013-04-16.patch><AAPCS-C.4-vs-VFP-2013-04-16.patch>





More information about the llvm-commits mailing list