Fix PR15293: ARM codegen ice - expected larger existing stack allocation

Stepan Dyatkovskiy stpworld at narod.ru
Mon Apr 15 09:23:54 PDT 2013


Hi Manman,

 >> *4.* One new change I've made is
 >> CC_ARM_AAPCS_Common rule.
 > Can we fix the above issue in a separate patch with additional 
testing case?

Well.. We can use next case: Co-Processor register candidates may be
either in VFP or in stack, so when all VFP are allocated stack is used. 
We can use stack without GPR allocation in that case, passing 9 f64 
params, for example: first eight goes to d0-d7, night goes to stack. Now 
as 10th parameter we pass i32, according to AAPCS, 5.5 Parameters 
Passing, Stage C, C.4:

[quote]
If the size in words of the argument is not more than r4 minus NCRN, the 
argument is copied into
core registers, starting at the NCRN. The NCRN is incremented by the 
number of registers used.
Successive registers hold the parts of the argument they would hold if 
its value were loaded into
those registers from memory using an LDM instruction. The argument has 
now been allocated.
[/quote]

It must to R0 in our case. But currently it goes to stack.
I also checked this case with GCC - it works as in AAPCS written.

About refactoring. I propose do it in two stages, first rename 
"getVarArgsRegSaveSize" to "get/setByValRegsSaveSize", and detach 
"StoreByValRegs" functionality from "VarArgStyleRegisters" method.

I attached first refactor patch, Calling-Convention patch with 
test-case, and calling-convention .c test case.
Try run .c file with "arm-linux-gnueabihf-gcc" and 
"./arm-linux-gnueabihf-clang -mcpu=cortex-a9".

P.S.: My beagleboard is very very slow, "release" build takes about 15 
hours, test-suite may be 15-20 hours too. I'll try ask somebody to 
present me faster device, but it take some time.

-Stepan.

Manman Ren wrote:
>
> On Apr 12, 2013, at 5:15 AM, Stepan Dyatkovskiy wrote:
>
>> Hello Manman,
>>
>> I reattached full patch with fixes.
>>
>> 1. I've removedskipWastedByValRegs, addByValWastedRegs and Wasted flag.
>> 2. Fixed usage of "offset" variable in LowerFormalArguments
>> 3. And fixed regression test: added function with two byval structures.
> The above looks good.
>>
>> *4.* One new change I've made is
>> CC_ARM_AAPCS_Common rule.
>>
>> Currently, for align == 4 and parameter type <=i32, we try to assign it to the first unallocated register from [r0-r3] range, but only if getNextStackOffset() == 0.
>> I suppose that "getNextStackOffset() != 0" is treated as fact? that all registers are used? and params goes to the stack now. After byval structures will be implemented, the next case is possible:
>>
>> void @f(%i32_struct* byval, i32)
>>
>> Here, we need to allocate stack for first parameter, but after we still have unallocated registers. And second parameter should be sent to r1. Current CC_ARM_AAPCS_Common implementation will send second parameter to the stack.
>> Though, I'm still not sure about this change, just because may be there are more reasons to check nextStackOffset value.
> Can we fix the above issue in a separate patch with additional testing case?
>>
>>>>> +    // for byval parameters.
>>>>> +    {
>>>>> +      unsigned InRegsParamsCount = State->getInRegsParamsCount();
>>>>> +      if (InRegsParamsCount) {
>>>>> +        unsigned RB, RE;
>>>>> +        State->getInRegsParamInfo(InRegsParamsCount-1, RB, RE);
>>>>> +        assert(reg >= RE);
>>>>> +      }
>>>>> +    }
>>>> Is this only for assertion? Is this necessary?
>>> Yes. I've just moved this check from "if" statement, since it must be
>>> true always.
>> How do you think, do we need this assertion?
> I don't really think it is necessary :)
>>
>> I didn't present refactor patch, just because main patch didn't get LGTM community and hypothetically may contain some mistakes..
> LGTM otherwise.
>
> Thanks,
> Manman
>>
>> -Stepan
>>
>> Stepan Dyatkovskiy wrote:
>>> Manman Ren wrote:
>>>>
>>>> Hi Stepan,
>>>>
>>>> One high-level question: do we need to insert the wasted record to
>>>> ByValRegs and later on skip those records?
>>>> It seems to me that we can only insert the records for the actual byval
>>>> parameters.
>>>> If that is true, we can remove addByValWastedRegs and
>>>> skipWastedByValRegs.
>>> Hm.. yes its reasonable. I'll try to fix.
>>>
>>>>
>>>>> +
>>>>> +        // "offset" value would be usefull if byval parameter outsides
>>>>> +        // the registers area. Note, in that case nextInRegsParam()
>>>>> will not
>>>>> +        // skip any waste registers, since parameter fillup the whole
>>>>> rest
>>>>> +        // of registers set.
>>>>> +        unsigned RegsUsed = RegEnd - RegBegin;
>>>>> +        if (!CCInfo.nextInRegsParam() && Flags.getByValSize() >
>>>>> 4*RegsUsed )
>>>>> +          offset = RegEnd - RegBegin;
>>>>>        }
>>>>>
>>>>> -      if (Flags.getByValSize() - 4*offset > 0) {
>>>>> +      if (offset) {
>>>>
>>>> This still does not look right.
>>>> There are 3 cases:
>>>> If the byval parameter is in registers partially, we should create a
>>>> COPY_STRUCT_BYVAL with the correct offset;
>>>> If the byval parameter is all in registers, there is no need to create a
>>>> COPY_STRUCT_BYVAL;
>>>> If the byval parameter is all on stack, we should create a
>>>> COPY_STRUCT_BYVAL with offset equal to 0.
>>>>
>>>> When looking at the above code, offset is default to zero, if
>>>> CurByValIdx >= ByValArgsCount (this byval parameter is all on stack),
>>>> offset will still be zero,
>>>> and we will not create a COPY_STRUCT_BYVAL.
>>>> The correct logic seems to be always updating offset inside the if and
>>>> check Flags.getByValSize() > 4*offset
>>>> if (CurByValIdx < ByValArgsCount) {
>>>>>>>>    offset = RegEnd - RegBegin; //number of registers occupied by this
>>>> byval parameter
>>>>    CCInfo.nextInRegsParam(); // go to the next byval parameter that is
>>>> in registers partially or all
>>>> }
>>>> if (Flags.getByValSize() > 4*offset) {
>>>>    // Part of the byval parameter is on stack.
>>>>    ...
>>> Oh.. I missed third case. Then we keep first variant here.
>>>
>>>>> -  if ((!State->isFirstByValRegValid()) &&
>>>>> -      (ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>> +  if ((ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>> +    // All registers reserved for byval parameters should be allocated.
>>>>> +    // Check that first unallocated register is higher that last one
>>>>> used
>>>>> +    // for byval parameters.
>>>>> +    {
>>>>> +      unsigned InRegsParamsCount = State->getInRegsParamsCount();
>>>>> +      if (InRegsParamsCount) {
>>>>> +        unsigned RB, RE;
>>>>> +        State->getInRegsParamInfo(InRegsParamsCount-1, RB, RE);
>>>>> +        assert(reg >= RE);
>>>>> +      }
>>>>> +    }
>>>> Is this only for assertion? Is this necessary?
>>> Yes. I've just moved this check from "if" statement, since it must be
>>> true always.
>>>
>>>>> -void
>>>>> -ARMTargetLowering::VarArgStyleRegisters(CCState &CCInfo, SelectionDAG
>>>>> &DAG,
>>>>> -                                        DebugLoc dl, SDValue &Chain,
>>>>> -                                        const Value *OrigArg,
>>>>> -                                        unsigned OffsetFromOrigArg,
>>>>> -                                        unsigned ArgOffset,
>>>>> -                                        bool ForceMutable) const {
>>>>> +// Return: The frame index registers were stored into.
>>>>> +int
>>>>> +ARMTargetLowering::StoreByValRegs(CCState &CCInfo, SelectionDAG &DAG,
>>>>> +                                  DebugLoc dl, SDValue &Chain,
>>>>> +                                  const Value *OrigArg,
>>>>> +                                  unsigned InRegsParamRecordIdx,
>>>>> +                                  unsigned OffsetFromOrigArg,
>>>>> +                                  unsigned ArgOffset,
>>>>> +                                  bool ForceMutable) const {
>>>>
>>>> It would be much cleaner if we can move the refactoring of
>>>> VarArgStyleRegisters to a separate patch.
>>> For sure.
>>>
>>> P.S.: Today I had some troubles with electricity, so I'll try present
>>> new patch tomorrow.
>>>
>>> -Stepan.
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>> <pr15293-2013-04-12.patch>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: stack-offset.c
Type: text/x-csrc
Size: 346 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130415/c262d810/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: VarArgsRegSaveSize-2013-04-16.patch
Type: text/x-diff
Size: 5061 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130415/c262d810/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AAPCS-C.4-vs-VFP-2013-04-16.patch
Type: text/x-diff
Size: 3511 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130415/c262d810/attachment-0001.patch>


More information about the llvm-commits mailing list