Fix PR15293: ARM codegen ice - expected larger existing stack allocation

Manman Ren mren at apple.com
Wed May 1 12:30:29 PDT 2013


+  // NSAA should be evaluted (NSAA means "next stacked argument address").
typo "evaluated"

+  // So: NextStackOffset = NSAAOffset + SizeOfByValParamsStoredInRegs.
+  // Then: NSAAOffset = NextStackOffset - SizeOfByValParamsStoredInRegs.
+  unsigned NSAAOffset = State->getNextStackOffset();
+  if (State->getCallOrPrologue() != Call) {
+    for (unsigned i = 0, e = State->getInRegsParamsCount(); i != e; ++i) {
+      unsigned RB, RE;
+      State->getInRegsParamInfo(i, RB, RE);
+      assert(NSAAOffset >= (RE-RB)*4 &&
+             "Stack offset for byval regs doesn't introduced anymore?");
+      NSAAOffset -= (RE-RB)*4;
+    }
+  }
+
+  if ((!Subtarget->isAAPCS_ABI() || NSAAOffset == 0) &&
I don't quite get the above section to update NSAAOffset. Could you give an example?

Should we use up all GPRArgRegs when NSAAOffset != 0 so arguments after the byval will not be allocated to registers?

Otherwise LGTM.

Thanks for working on this,
Manman

On May 1, 2013, at 11:36 AM, Stepan Dyatkovskiy wrote:

> Hello,
> Please find patch in attachment that fixes byval support in ARM backend for llvm level.
> 
> -Stepan.
> 
> Stepan Dyatkovskiy wrote:
>> Commited in r180774. Now I'm working on milestone patch, that fixes
>> byval support for llvm level. I published it before, now I need to
>> update it. Hope it come soon.
>> 
>> -Stepan.
>> 
>> Manman Ren wrote:
>>> 
>>> LGTM,
>>> 
>>> Thanks,
>>> Manman
>>> 
>>> On Apr 28, 2013, at 10:50 AM, Stepan Dyatkovskiy wrote:
>>> 
>>>> Hi Manman,
>>>> Just for keeping things actual: I reattached patch with new names.
>>>> -Stepan
>>>> 
>>>> Stepan Dyatkovskiy wrote:
>>>>> Hi Manman,
>>>>> 
>>>>>> Since it represents the stack area needed to save parameters that are
>>>>> actually passed in registers, how about RegsSaveSizeForParams or just
>>>>> RegsSaveSize?
>>>>> 
>>>>> In AAPCS R0-R3 are called like:
>>>>> "Argument / result / scratch register".
>>>>> So maybe we'll use "set/getArgRegsSaveSize" names?
>>>>> 
>>>>>> Also it is weird to have "unsigned VARegSaveSize =
>>>>> AFI->getByValRegsSaveSize();"
>>>>> 
>>>>> Yea :-) Here replace "VAReg" with "ArgRegs": ArgRegsSaveSize,
>>>>> ArgRegsSize.
>>>>> 
>>>>> -Stepan
>>>>> 
>>>>> Manman Ren wrote:
>>>>>> 
>>>>>> Hi Stepan,
>>>>>> 
>>>>>> Thanks for working on this.
>>>>>> 
>>>>>> I put in a comment about the refactor patch sometime ago:
>>>>>>>> I am not sure about the renaming patch though. Functions
>>>>>>>> get|setVarArgsRegSaveSize are used for both vararg and byval.
>>>>>>>> It does not sound right to replace VarARgs with ByVal.
>>>>>>> 
>>>>>>> Actually it is used not for vararg function, as I found it main role
>>>>>>> of this param in emitPrologue, it is used for proper stack
>>>>>>> allocation.
>>>>>>> Stack size should enough for storing all registers, used for byval
>>>>>>> parameters. And yes, if it is VA function, we must also extend this
>>>>>>> stack area for all unallocated R0-R3 regs storage.
>>>>>> 
>>>>>> Agree. That is why I think ByVal is not a great name either.
>>>>>> Since it represents the stack area needed to save parameters that are
>>>>>> actually passed in registers, how about RegsSaveSizeForParams or just
>>>>>> RegsSaveSize?
>>>>>> Also it is weird to have "unsigned VARegSaveSize =
>>>>>> AFI->getByValRegsSaveSize();" where we assign return value of
>>>>>> getByValRegsSaveSize to a variable "VARegSaveSize".
>>>>>> 
>>>>>> Manman
>>>>>> 
>>>>>> On Apr 26, 2013, at 11:12 AM, Stepan Dyatkovskiy wrote:
>>>>>> 
>>>>>>> Hi Manman and Renato,
>>>>>>> Patch was tested with test-suite all test have bee passed.
>>>>>>> -Stepan
>>>>>>> 
>>>>>>> Stepan Dyatkovskiy wrote:
>>>>>>>> Hi all,
>>>>>>>> 
>>>>>>>> AAPCS, 5.5 Parameters Passing, Stage C, C4 and C5 have been fixed,
>>>>>>>> r180011.
>>>>>>>> 
>>>>>>>> So, next patch is in attachment: VarArgsStyleRegisters refactoring.
>>>>>>>> 
>>>>>>>> -Stepan.
>>>>>>>> 
>>>>>>>> Stepan Dyatkovskiy wrote:
>>>>>>>>> Hello, Manman.
>>>>>>>>> That's kind of snowball!
>>>>>>>>> Today test-suite finished with two new failures on my beagle.
>>>>>>>>> It helped me find out one more hidden logic dependency in ARM
>>>>>>>>> backend.
>>>>>>>>> 
>>>>>>>>> May be its not a mistake, but look:
>>>>>>>>> 
>>>>>>>>> In ARMCallingConv.h, "f64AssignAAPCS" allocates registers for f64.
>>>>>>>>> This
>>>>>>>>> method is called for VA functions, since there is no CPRCs, and
>>>>>>>>> *any*
>>>>>>>>> parameters could be allocated in R0-R3. Either R0,R1 or R2,R3
>>>>>>>>> could be
>>>>>>>>> allocated here. If all the registers except R3 were allocated
>>>>>>>>> somewhere
>>>>>>>>> before, f64AssignAAPCS just moves whole f64 to stack, it doesn't
>>>>>>>>> waste
>>>>>>>>> R3, so R3 could be allocated later. But logic seems to be same
>>>>>>>>> as in
>>>>>>>>> ARMTargetLowering::HandleByVal.
>>>>>>>>> 
>>>>>>>>> My main fix in calling convention was:
>>>>>>>>> 
>>>>>>>>> def CC_ARM_AAPCS_Common : CallingConv<[
>>>>>>>>> ...
>>>>>>>>> -  CCIfType<[i32], CCIf<"State.getNextStackOffset() == 0 &&"
>>>>>>>>> -                       "ArgFlags.getOrigAlign() != 8",
>>>>>>>>> +  CCIfType<[i32], CCIf<"ArgFlags.getOrigAlign() != 8",
>>>>>>>>>                         CCAssignToReg<[R0, R1, R2, R3]>>>,
>>>>>>>>> 
>>>>>>>>> Now I got some picture why "State.getNextStackOffset() == 0" was
>>>>>>>>> added:
>>>>>>>>> due to f64AssignAAPCS implementation. May be the are more reasons:
>>>>>>>>> next
>>>>>>>>> test-suite results will show it then :-)
>>>>>>>>> 
>>>>>>>>> Anyway, original CCIfType brakes C.4 and C.5, while my fix
>>>>>>>>> brakes some
>>>>>>>>> implicit statement defined in f64AssignAAPCS:
>>>>>>>>> 
>>>>>>>>> "stack-offset != 0 doesn't mean all registers are allocated"
>>>>>>>>> 
>>>>>>>>> So I've fixed f64AssignAAPCS too. Once we waste registers in
>>>>>>>>> HandleByVal
>>>>>>>>> (when we send parameter to stack), I think it would be good to keep
>>>>>>>>> same
>>>>>>>>> behaviour for f64AssignAAPCS since its HandleByVal's analogue
>>>>>>>>> for f64
>>>>>>>>> params.
>>>>>>>>> 
>>>>>>>>> I attached new patch, now with 3 tests :-)
>>>>>>>>> (3rd test is 2013-04-21-AAPCS-VA-C.1.cp.ll)
>>>>>>>>> 
>>>>>>>>> -Stepan.
>>>>>>>>> 
>>>>>>>>> Manman Ren wrote:
>>>>>>>>>> 
>>>>>>>>>> LGTM,
>>>>>>>>>> 
>>>>>>>>>> Thanks a lot,
>>>>>>>>>> Manman
>>>>>>>>>> 
>>>>>>>>>> On Apr 18, 2013, at 8:41 PM, Stepan Dyatkovskiy wrote:
>>>>>>>>>> 
>>>>>>>>>>> Fixed :-) New patch (+2 tests) is attached.
>>>>>>>>>>> -Stepan
>>>>>>>>>>> 
>>>>>>>>>>> Manman Ren wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> "AAPCS-C.4-vs-VFP-2013-04-17.patch" looks good to me.
>>>>>>>>>>>> 
>>>>>>>>>>>> As a side note, we may have issues with 9 floats, 3 ints and a
>>>>>>>>>>>> small
>>>>>>>>>>>> byval.
>>>>>>>>>>>> The standard says we should split the byval only when NSAA is
>>>>>>>>>>>> equal
>>>>>>>>>>>> to SP (C.5), in the case of 9 floats, 3 ints and a small byval,
>>>>>>>>>>>> we should not split the byval since we advanced NSAA with the
>>>>>>>>>>>> 9th
>>>>>>>>>>>> float.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Manman
>>>>>>>>>>>> 
>>>>>>>>>>>> On Apr 16, 2013, at 8:51 PM, Stepan Dyatkovskiy wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Manman,
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> AAPCS-C.4-vs-VFP-2013-04-16.patch looks good except that the
>>>>>>>>>>>>>> testing case.
>>>>>>>>>>>>>> Should it have 9 doubles and 1 integer instead of a single
>>>>>>>>>>>>>> double
>>>>>>>>>>>>>> and a single integer?
>>>>>>>>>>>>>> The testing case passes even without the fix.
>>>>>>>>>>>>> That was my midnight mistake, sorry.
>>>>>>>>>>>>> I reattached patch with proper test. In the end of test
>>>>>>>>>>>>> file, I've
>>>>>>>>>>>>> added comments for @doFoo: small description of how parameters
>>>>>>>>>>>>> must
>>>>>>>>>>>>> be passed:
>>>>>>>>>>>>> first eight doubles --> D0-D7
>>>>>>>>>>>>> 9th double --> Stack
>>>>>>>>>>>>> 10th param (first i32) --> R0.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> It is important to have 9 double params here, since we must
>>>>>>>>>>>>> fill-up
>>>>>>>>>>>>> all possible VFPs, then get compiler to use stack offset,
>>>>>>>>>>>>> and only
>>>>>>>>>>>>> after start use GPRs.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I am not sure about the renaming patch though. Functions
>>>>>>>>>>>>>> get|setVarArgsRegSaveSize are used for both vararg and byval.
>>>>>>>>>>>>>> It does not sound right to replace VarARgs with ByVal.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Actually it is used not for vararg function, as I found it main
>>>>>>>>>>>>> role of this param in emitPrologue, it is used for proper stack
>>>>>>>>>>>>> allocation.
>>>>>>>>>>>>> Stack size should enough for storing all registers, used for
>>>>>>>>>>>>> byval
>>>>>>>>>>>>> parameters. And yes, if it is VA function, we must also extend
>>>>>>>>>>>>> this
>>>>>>>>>>>>> stack area for all unallocated R0-R3 regs storage.
>>>>>>>>>>>> Agree. That is why I think ByVal is not a great name either.
>>>>>>>>>>>> Since it represents the stack area needed to save parameters
>>>>>>>>>>>> that
>>>>>>>>>>>> are actually passed in registers, how about
>>>>>>>>>>>> RegsSaveSizeForParams or
>>>>>>>>>>>> just RegsSaveSize?
>>>>>>>>>>>> Also it is weird to have "unsigned VARegSaveSize =
>>>>>>>>>>>> AFI->getByValRegsSaveSize();" where we assign return value of
>>>>>>>>>>>> getByValRegsSaveSize to a variable "VARegSaveSize".
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks for working on this.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -Stepan.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for working on this.
>>>>>>>>>>>>>> Manman
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Apr 15, 2013, at 9:23 AM, Stepan Dyatkovskiy wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Manman,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> *4.* One new change I've made is
>>>>>>>>>>>>>>>>> CC_ARM_AAPCS_Common rule.
>>>>>>>>>>>>>>>> Can we fix the above issue in a separate patch with
>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>> testing case?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Well.. We can use next case: Co-Processor register candidates
>>>>>>>>>>>>>>> may be
>>>>>>>>>>>>>>> either in VFP or in stack, so when all VFP are allocated
>>>>>>>>>>>>>>> stack is
>>>>>>>>>>>>>>> used. We can use stack without GPR allocation in that case,
>>>>>>>>>>>>>>> passing 9 f64 params, for example: first eight goes to d0-d7,
>>>>>>>>>>>>>>> night goes to stack. Now as 10th parameter we pass i32,
>>>>>>>>>>>>>>> according
>>>>>>>>>>>>>>> to AAPCS, 5.5 Parameters Passing, Stage C, C.4:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> [quote]
>>>>>>>>>>>>>>> If the size in words of the argument is not more than r4
>>>>>>>>>>>>>>> minus
>>>>>>>>>>>>>>> NCRN, the argument is copied into
>>>>>>>>>>>>>>> core registers, starting at the NCRN. The NCRN is
>>>>>>>>>>>>>>> incremented by
>>>>>>>>>>>>>>> the number of registers used.
>>>>>>>>>>>>>>> Successive registers hold the parts of the argument they
>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>> hold if its value were loaded into
>>>>>>>>>>>>>>> those registers from memory using an LDM instruction. The
>>>>>>>>>>>>>>> argument has now been allocated.
>>>>>>>>>>>>>>> [/quote]
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> It must to R0 in our case. But currently it goes to stack.
>>>>>>>>>>>>>>> I also checked this case with GCC - it works as in AAPCS
>>>>>>>>>>>>>>> written.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> About refactoring. I propose do it in two stages, first
>>>>>>>>>>>>>>> rename
>>>>>>>>>>>>>>> "getVarArgsRegSaveSize" to "get/setByValRegsSaveSize", and
>>>>>>>>>>>>>>> detach
>>>>>>>>>>>>>>> "StoreByValRegs" functionality from "VarArgStyleRegisters"
>>>>>>>>>>>>>>> method.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I attached first refactor patch, Calling-Convention patch
>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>> test-case, and calling-convention .c test case.
>>>>>>>>>>>>>>> Try run .c file with "arm-linux-gnueabihf-gcc" and
>>>>>>>>>>>>>>> "./arm-linux-gnueabihf-clang -mcpu=cortex-a9".
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> P.S.: My beagleboard is very very slow, "release" build takes
>>>>>>>>>>>>>>> about 15 hours, test-suite may be 15-20 hours too. I'll
>>>>>>>>>>>>>>> try ask
>>>>>>>>>>>>>>> somebody to present me faster device, but it take some time.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> -Stepan.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Manman Ren wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Apr 12, 2013, at 5:15 AM, Stepan Dyatkovskiy wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hello Manman,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I reattached full patch with fixes.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 1. I've removedskipWastedByValRegs, addByValWastedRegs and
>>>>>>>>>>>>>>>>> Wasted flag.
>>>>>>>>>>>>>>>>> 2. Fixed usage of "offset" variable in LowerFormalArguments
>>>>>>>>>>>>>>>>> 3. And fixed regression test: added function with two byval
>>>>>>>>>>>>>>>>> structures.
>>>>>>>>>>>>>>>> The above looks good.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> *4.* One new change I've made is
>>>>>>>>>>>>>>>>> CC_ARM_AAPCS_Common rule.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Currently, for align == 4 and parameter type <=i32, we
>>>>>>>>>>>>>>>>> try to
>>>>>>>>>>>>>>>>> assign it to the first unallocated register from [r0-r3]
>>>>>>>>>>>>>>>>> range,
>>>>>>>>>>>>>>>>> but only if getNextStackOffset() == 0.
>>>>>>>>>>>>>>>>> I suppose that "getNextStackOffset() != 0" is treated as
>>>>>>>>>>>>>>>>> fact?
>>>>>>>>>>>>>>>>> that all registers are used? and params goes to the
>>>>>>>>>>>>>>>>> stack now.
>>>>>>>>>>>>>>>>> After byval structures will be implemented, the next
>>>>>>>>>>>>>>>>> case is
>>>>>>>>>>>>>>>>> possible:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> void @f(%i32_struct* byval, i32)
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Here, we need to allocate stack for first parameter, but
>>>>>>>>>>>>>>>>> after
>>>>>>>>>>>>>>>>> we still have unallocated registers. And second parameter
>>>>>>>>>>>>>>>>> should be sent to r1. Current CC_ARM_AAPCS_Common
>>>>>>>>>>>>>>>>> implementation will send second parameter to the stack.
>>>>>>>>>>>>>>>>> Though, I'm still not sure about this change, just
>>>>>>>>>>>>>>>>> because may
>>>>>>>>>>>>>>>>> be there are more reasons to check nextStackOffset value.
>>>>>>>>>>>>>>>> Can we fix the above issue in a separate patch with
>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>> testing case?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> +    // for byval parameters.
>>>>>>>>>>>>>>>>>>>> +    {
>>>>>>>>>>>>>>>>>>>> +      unsigned InRegsParamsCount =
>>>>>>>>>>>>>>>>>>>> State->getInRegsParamsCount();
>>>>>>>>>>>>>>>>>>>> +      if (InRegsParamsCount) {
>>>>>>>>>>>>>>>>>>>> +        unsigned RB, RE;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> State->getInRegsParamInfo(InRegsParamsCount-1, RB,
>>>>>>>>>>>>>>>>>>>> RE);
>>>>>>>>>>>>>>>>>>>> +        assert(reg >= RE);
>>>>>>>>>>>>>>>>>>>> +      }
>>>>>>>>>>>>>>>>>>>> +    }
>>>>>>>>>>>>>>>>>>> Is this only for assertion? Is this necessary?
>>>>>>>>>>>>>>>>>> Yes. I've just moved this check from "if" statement,
>>>>>>>>>>>>>>>>>> since it
>>>>>>>>>>>>>>>>>> must be
>>>>>>>>>>>>>>>>>> true always.
>>>>>>>>>>>>>>>>> How do you think, do we need this assertion?
>>>>>>>>>>>>>>>> I don't really think it is necessary :)
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I didn't present refactor patch, just because main patch
>>>>>>>>>>>>>>>>> didn't
>>>>>>>>>>>>>>>>> get LGTM community and hypothetically may contain some
>>>>>>>>>>>>>>>>> mistakes..
>>>>>>>>>>>>>>>> LGTM otherwise.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Manman
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> -Stepan
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Stepan Dyatkovskiy wrote:
>>>>>>>>>>>>>>>>>> Manman Ren wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hi Stepan,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> One high-level question: do we need to insert the wasted
>>>>>>>>>>>>>>>>>>> record to
>>>>>>>>>>>>>>>>>>> ByValRegs and later on skip those records?
>>>>>>>>>>>>>>>>>>> It seems to me that we can only insert the records for
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> actual byval
>>>>>>>>>>>>>>>>>>> parameters.
>>>>>>>>>>>>>>>>>>> If that is true, we can remove addByValWastedRegs and
>>>>>>>>>>>>>>>>>>> skipWastedByValRegs.
>>>>>>>>>>>>>>>>>> Hm.. yes its reasonable. I'll try to fix.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +        // "offset" value would be usefull if byval
>>>>>>>>>>>>>>>>>>>> parameter outsides
>>>>>>>>>>>>>>>>>>>> +        // the registers area. Note, in that case
>>>>>>>>>>>>>>>>>>>> nextInRegsParam()
>>>>>>>>>>>>>>>>>>>> will not
>>>>>>>>>>>>>>>>>>>> +        // skip any waste registers, since parameter
>>>>>>>>>>>>>>>>>>>> fillup
>>>>>>>>>>>>>>>>>>>> the whole
>>>>>>>>>>>>>>>>>>>> rest
>>>>>>>>>>>>>>>>>>>> +        // of registers set.
>>>>>>>>>>>>>>>>>>>> +        unsigned RegsUsed = RegEnd - RegBegin;
>>>>>>>>>>>>>>>>>>>> +        if (!CCInfo.nextInRegsParam() &&
>>>>>>>>>>>>>>>>>>>> Flags.getByValSize() >
>>>>>>>>>>>>>>>>>>>> 4*RegsUsed )
>>>>>>>>>>>>>>>>>>>> +          offset = RegEnd - RegBegin;
>>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> -      if (Flags.getByValSize() - 4*offset > 0) {
>>>>>>>>>>>>>>>>>>>> +      if (offset) {
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> This still does not look right.
>>>>>>>>>>>>>>>>>>> There are 3 cases:
>>>>>>>>>>>>>>>>>>> If the byval parameter is in registers partially, we
>>>>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>>> create a
>>>>>>>>>>>>>>>>>>> COPY_STRUCT_BYVAL with the correct offset;
>>>>>>>>>>>>>>>>>>> If the byval parameter is all in registers, there is
>>>>>>>>>>>>>>>>>>> no need
>>>>>>>>>>>>>>>>>>> to create a
>>>>>>>>>>>>>>>>>>> COPY_STRUCT_BYVAL;
>>>>>>>>>>>>>>>>>>> If the byval parameter is all on stack, we should
>>>>>>>>>>>>>>>>>>> create a
>>>>>>>>>>>>>>>>>>> COPY_STRUCT_BYVAL with offset equal to 0.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> When looking at the above code, offset is default to
>>>>>>>>>>>>>>>>>>> zero, if
>>>>>>>>>>>>>>>>>>> CurByValIdx >= ByValArgsCount (this byval parameter is
>>>>>>>>>>>>>>>>>>> all on
>>>>>>>>>>>>>>>>>>> stack),
>>>>>>>>>>>>>>>>>>> offset will still be zero,
>>>>>>>>>>>>>>>>>>> and we will not create a COPY_STRUCT_BYVAL.
>>>>>>>>>>>>>>>>>>> The correct logic seems to be always updating offset
>>>>>>>>>>>>>>>>>>> inside
>>>>>>>>>>>>>>>>>>> the if and
>>>>>>>>>>>>>>>>>>> check Flags.getByValSize() > 4*offset
>>>>>>>>>>>>>>>>>>> if (CurByValIdx < ByValArgsCount) {
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   offset = RegEnd - RegBegin; //number of registers
>>>>>>>>>>>>>>>>>>> occupied
>>>>>>>>>>>>>>>>>>> by this
>>>>>>>>>>>>>>>>>>> byval parameter
>>>>>>>>>>>>>>>>>>>   CCInfo.nextInRegsParam(); // go to the next byval
>>>>>>>>>>>>>>>>>>> parameter that is
>>>>>>>>>>>>>>>>>>> in registers partially or all
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>> if (Flags.getByValSize() > 4*offset) {
>>>>>>>>>>>>>>>>>>>   // Part of the byval parameter is on stack.
>>>>>>>>>>>>>>>>>>>   ...
>>>>>>>>>>>>>>>>>> Oh.. I missed third case. Then we keep first variant here.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> -  if ((!State->isFirstByValRegValid()) &&
>>>>>>>>>>>>>>>>>>>> -      (ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>>>>>>>>>>>>>>>>> +  if ((ARM::R0 <= reg) && (reg <= ARM::R3)) {
>>>>>>>>>>>>>>>>>>>> +    // All registers reserved for byval parameters
>>>>>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>>>> be allocated.
>>>>>>>>>>>>>>>>>>>> +    // Check that first unallocated register is higher
>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>> last one
>>>>>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>>>>>> +    // for byval parameters.
>>>>>>>>>>>>>>>>>>>> +    {
>>>>>>>>>>>>>>>>>>>> +      unsigned InRegsParamsCount =
>>>>>>>>>>>>>>>>>>>> State->getInRegsParamsCount();
>>>>>>>>>>>>>>>>>>>> +      if (InRegsParamsCount) {
>>>>>>>>>>>>>>>>>>>> +        unsigned RB, RE;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> State->getInRegsParamInfo(InRegsParamsCount-1, RB,
>>>>>>>>>>>>>>>>>>>> RE);
>>>>>>>>>>>>>>>>>>>> +        assert(reg >= RE);
>>>>>>>>>>>>>>>>>>>> +      }
>>>>>>>>>>>>>>>>>>>> +    }
>>>>>>>>>>>>>>>>>>> Is this only for assertion? Is this necessary?
>>>>>>>>>>>>>>>>>> Yes. I've just moved this check from "if" statement,
>>>>>>>>>>>>>>>>>> since it
>>>>>>>>>>>>>>>>>> must be
>>>>>>>>>>>>>>>>>> true always.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> -void
>>>>>>>>>>>>>>>>>>>> -ARMTargetLowering::VarArgStyleRegisters(CCState
>>>>>>>>>>>>>>>>>>>> &CCInfo,
>>>>>>>>>>>>>>>>>>>> SelectionDAG
>>>>>>>>>>>>>>>>>>>> &DAG,
>>>>>>>>>>>>>>>>>>>> -                                        DebugLoc dl,
>>>>>>>>>>>>>>>>>>>> SDValue &Chain,
>>>>>>>>>>>>>>>>>>>> -                                        const Value
>>>>>>>>>>>>>>>>>>>> *OrigArg,
>>>>>>>>>>>>>>>>>>>> -                                        unsigned
>>>>>>>>>>>>>>>>>>>> OffsetFromOrigArg,
>>>>>>>>>>>>>>>>>>>> -                                        unsigned
>>>>>>>>>>>>>>>>>>>> ArgOffset,
>>>>>>>>>>>>>>>>>>>> -                                        bool
>>>>>>>>>>>>>>>>>>>> ForceMutable)
>>>>>>>>>>>>>>>>>>>> const {
>>>>>>>>>>>>>>>>>>>> +// Return: The frame index registers were stored into.
>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>> +ARMTargetLowering::StoreByValRegs(CCState &CCInfo,
>>>>>>>>>>>>>>>>>>>> SelectionDAG &DAG,
>>>>>>>>>>>>>>>>>>>> +                                  DebugLoc dl, SDValue
>>>>>>>>>>>>>>>>>>>> &Chain,
>>>>>>>>>>>>>>>>>>>> +                                  const Value *OrigArg,
>>>>>>>>>>>>>>>>>>>> +                                  unsigned
>>>>>>>>>>>>>>>>>>>> InRegsParamRecordIdx,
>>>>>>>>>>>>>>>>>>>> +                                  unsigned
>>>>>>>>>>>>>>>>>>>> OffsetFromOrigArg,
>>>>>>>>>>>>>>>>>>>> +                                  unsigned ArgOffset,
>>>>>>>>>>>>>>>>>>>> +                                  bool ForceMutable)
>>>>>>>>>>>>>>>>>>>> const {
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> It would be much cleaner if we can move the
>>>>>>>>>>>>>>>>>>> refactoring of
>>>>>>>>>>>>>>>>>>> VarArgStyleRegisters to a separate patch.
>>>>>>>>>>>>>>>>>> For sure.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> P.S.: Today I had some troubles with electricity, so
>>>>>>>>>>>>>>>>>> I'll try
>>>>>>>>>>>>>>>>>> present
>>>>>>>>>>>>>>>>>> new patch tomorrow.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> -Stepan.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>>>>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>>>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> <pr15293-2013-04-12.patch>
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> <stack-offset.c><VarArgsRegSaveSize-2013-04-16.patch><AAPCS-C.4-vs-VFP-2013-04-16.patch>
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> <AAPCS-C.4-vs-VFP-2013-04-17.patch>
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> <AAPCS-C.4-vs-VFP-2013-04-19.patch>
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> <StoreByValRegs-2013-04-17.patch>
>>>>>> 
>>>>> 
>>>> 
>>>> <StoreByValRegs-2013-04-28.patch>
>>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> <pr15293-2013-05-01.patch>





More information about the llvm-commits mailing list